Core science

Cresset applications are based on robust science that delivers results you can rely on. Our goal is to help you discover, design and optimize the best possible molecules for your project.

We believe that computational methods are an excellent way of arriving at a better understanding of the properties and behaviors of chemical structures and proteins. Excellent science is at the heart of everything we do, and we always prioritize scientific rigor over ease of use and faster runtimes.

Introduction to Cresset technology

Cresset technology centers on the application of the XED force field to the design of new small molecule bioactive compounds. The XED approach improves traditional molecular mechanics by using a complex description of atoms to model charge away from atomic centers. This enables a more detailed description of electrostatics and excellent reproduction of intermolecular interactions. Developed by Dr Andy Vinter and refined by Cresset, the XED force field correctly models substituent effects on aromatics, charge density changes in complex aromatics and the intermolecular interactions of small molecules, water and proteins. Read more about the XED force field.

The XED force field calculates excellent electrostatics

Electrostatics is the most important factor affecting molecular recognition, although shape and hydrophobicity also play a part. Cresset’s approach describes the electrostatic environment around a ligand or protein as a molecular interaction potential (MIP) or field. The MIP describes all of the energetically important interactions that a ligand can make with a protein, and viewing the MIP of a ligand or protein provides clear insights as to why some ligands bind more strongly than others. Describing molecules in terms of electrostatics rather than structure enables us to sensibly compare molecules from different series. Read more about fields on ligands and fields on proteins.

Electrostatic similarity calculations inform small molecule discovery

Cresset has a patented method to compare the molecular interaction potentials of two molecules and compute a similarity. Field similarity is used to align molecules, to score compounds in virtual screening and to elucidate pharmacophores.

Read more about field similarity.

The XED force field

Cresset’s main focus is the description of molecules in terms of electrostatics. For this to be effective, the electrostatic model needs to be accurate. Quantum mechanics calculations can give very accurate electrostatic potentials, but are still too slow in most cases. As a result, an accurate method of computing electrostatic potentials in a molecular mechanics context is needed.

Most standard force fields use the atom-centred charge (ACC) approximation: the electrostatics of the molecule are approximated by a set of point partial charges placed on the nuclei. Many methods are available to compute these partial charges (Gasteiger-Hückel, AM1-BCC, etc.), but the underlying model is one point charge per atom. This method can work well for the gross long-distance electrostatic potential (e.g., dipole moments), but performs poorly when describing the electrostatic potential near the molecular surface. This is because atoms are not charged spheres: they have lone pairs, pi orbitals, sigma holes and so forth. In addition, atoms and molecules are polarizable and change their electrostatic behavior in response to external electric fields. The ACC model captures none of these effects. Newer force fields such as AMOEBA address this problem by placing explicit multipoles and polarization functions on the atoms, which gives a much more realistic electrostatic potential. These force fields perform well on proteins but have issues with parameter transferability, which makes them largely unsuitable for ligand modeling.
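The limitation of the ACC model can be seen in a minimal Python sketch. The function name `acc_potential`, the two-site fragment and its charges are illustrative assumptions, not values from any real force field. The sketch shows that a single atom-centred charge produces the same potential at every point equidistant from the nucleus, so the atom itself carries no lone-pair directionality.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), as commonly used in molecular mechanics.
COULOMB_K = 332.0637

def acc_potential(atoms, point):
    """Electrostatic potential at `point` from atom-centred point charges.

    `atoms` is a list of ((x, y, z), partial_charge) tuples.
    """
    total = 0.0
    for position, charge in atoms:
        total += COULOMB_K * charge / math.dist(position, point)
    return total

# Hypothetical carbonyl-like fragment: carbon at the origin, oxygen 1.2 Angstrom away.
# The charges are illustrative only.
carbonyl = [((0.0, 0.0, 0.0), 0.5), ((1.2, 0.0, 0.0), -0.5)]
oxygen_only = [carbonyl[1]]

# Two probe positions, both 2 Angstrom from the oxygen nucleus: one on the C=O
# axis and one at 60 degrees to it, roughly where a lone pair would point.
on_axis = (3.2, 0.0, 0.0)
angle = math.radians(60.0)
off_axis = (1.2 + 2.0 * math.cos(angle), 2.0 * math.sin(angle), 0.0)

# The oxygen contribution is identical in both directions.
print(acc_potential(oxygen_only, on_axis), acc_potential(oxygen_only, off_axis))
# Any difference in the total comes only from the distance to the carbon.
print(acc_potential(carbonyl, on_axis), acc_potential(carbonyl, off_axis))
```

In this model the only way to create direction dependence around an atom is through the charges on its neighbours.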


XED model for benzene and for acetone, showing the additional charge points.

The XED force field was the first major effort to solve these electrostatic problems, and did so not by placing explicit multipoles on atoms but by placing additional monopoles around them. The technique was originally introduced by Hunter and Sanders, JACS 1989, to model aromatic-aromatic interactions, and was extended into a full general-purpose force field by Cresset’s founder Andy Vinter, JCAMD, 1994.

Additional monopole points, or XEDs (eXtended Electron Distributions), are treated within the force field as atoms with zero van der Waals radii. They are not placed in a rigid geometry with respect to their parent atom. Instead, they come with bond stretching and angle bending potentials and can move under the influence of external (and intramolecular) electrostatic potentials, allowing the direct modeling of polarizability. The more complex internal electrostatic model allows for intramolecular electrostatic/orbital interactions such as the anomeric effect to be modeled without the introduction of specific torsional parameters: the anomeric effect falls naturally out of the electrostatic model and does not need to be added post hoc.
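How a movable charge point gives rise to polarizability can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a single charged point held to its parent atom by one harmonic spring in a uniform external field. The function names, the spring constant and the charge are hypothetical and are not XED parameters, and the real force field uses bond stretching and angle bending terms rather than this single spring.

```python
def relaxed_displacement(charge, spring_k, field):
    """Displacement x that minimises U(x) = 0.5*spring_k*x**2 - charge*field*x."""
    if spring_k <= 0:
        raise ValueError("spring_k must be positive")
    return charge * field / spring_k

def induced_dipole(charge, spring_k, field):
    """Dipole change caused by the displaced charge: charge * displacement."""
    return charge * relaxed_displacement(charge, spring_k, field)

# Hypothetical values in arbitrary but consistent units.
weak = induced_dipole(charge=-0.5, spring_k=100.0, field=1.0)
strong = induced_dipole(charge=-0.5, spring_k=100.0, field=2.0)
print(weak, strong)  # the induced dipole grows linearly with the applied field
```

In this toy model the effective polarizability is `charge**2 / spring_k`, so a softer spring or a larger charge gives a more polarizable site.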

The XED force field has been demonstrated to provide quantitatively superior results for the energetics of aromatic-aromatic interactions. Cresset’s academic collaborators have used the XED model to study and predict specific intermolecular interactions. For example, Professor Chris Hunter studied the dimerization of a series of di-aryl amides using NMR and computational chemistry (Substituent Effects on Aromatic Stacking Interactions). He found that the XED model accurately predicted the experimentally observed association constants.


Dimerization of a series of di-aryl amides (left), experimentally observed association constants (right).

The superior modeling of aromatic interactions was used to design aromatic ‘zippers’ for linking collagen mimic fibres (Thrombogenic Collagen-Mimetic Peptides: Self-Assembly of Triple Helix-Based Fibrils Driven by Hydrophobic Interactions).

The XED force field provides both qualitatively and quantitatively correct results for the interaction of phenyl and pentafluorophenyl groups.

Over the last 20 years the XED force field has undergone numerous improvements. Unlike most other force fields, the XED force field is parameterised where possible against experimental data (microwave conformation energies, small molecule crystal structures, etc.) rather than relying purely on ab initio calculations. XED 3, released in 2012, offers an improved treatment of nitrogen, amongst many other enhancements. Rather than having to assign separate types for trigonal and tetrahedral nitrogen, the XED 3 force field determines on the fly the degree of pyramidalization that is appropriate in any given molecular environment, allowing for a continuum from completely flat N to completely pyramidal N. In addition, XED 3 has an improved description of halogens, correctly describing the ‘sigma hole’ in the heavier halogens and giving good results for halogen bonding. Cresset continues to develop and improve the force field.

Calculating fields to assess molecular interactions

Cresset’s fundamental 3D ligand similarity technology compares molecules in terms of their molecular electrostatic interaction potentials (MIPs), or ‘fields’. The MIP of a molecule is a scalar field where the value at each point in space is the interaction energy of a charged probe atom (with the van der Waals parameters of oxygen) with the molecule. Fields are calculated using the XED force field. As the interaction energies are poorly defined inside the molecule, the field value is set to zero wherever the van der Waals interaction energy is positive and larger than the absolute value of the electrostatic energy.
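The masking rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the molecule is reduced to atom-centred point charges rather than the XED description, the van der Waals term is a generic 12-6 Lennard-Jones potential, and the probe charge of +1 and the parameters `epsilon` and `r_min` are hypothetical placeholders. Returning the electrostatic energy outside the masked region is also an assumption about what is reported as the field value.

```python
import math

COULOMB_K = 332.0637  # kcal*Angstrom/(mol*e^2)

def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones energy with well depth `epsilon` at separation `r_min`."""
    ratio = (r_min / r) ** 6
    return epsilon * (ratio * ratio - 2.0 * ratio)

def field_value(atoms, point, probe_charge=1.0, epsilon=0.15, r_min=3.4):
    """Electrostatic field value at `point`, zeroed inside the molecule.

    `atoms` is a list of ((x, y, z), charge) tuples. All default parameters
    are hypothetical and chosen only for illustration.
    """
    e_elec = 0.0
    e_vdw = 0.0
    for position, charge in atoms:
        r = math.dist(position, point)
        e_elec += COULOMB_K * probe_charge * charge / r
        e_vdw += lennard_jones(r, epsilon, r_min)
    # Masking rule from the text: repulsive vdW energy that outweighs the
    # electrostatic energy means the probe is effectively inside the molecule.
    if e_vdw > 0.0 and e_vdw > abs(e_elec):
        return 0.0
    return e_elec

atom = [((0.0, 0.0, 0.0), -0.5)]
print(field_value(atom, (1.0, 0.0, 0.0)))  # probe overlaps the atom: masked
print(field_value(atom, (4.0, 0.0, 0.0)))  # probe outside: electrostatic energy
```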

Dealing with a full 3D scalar potential is computationally difficult. You can sample the values on a grid, but you then have issues with gauge variance, grid spacing and so forth, leading to irreproducibility. Instead, Cresset asks “Where are the maxima/minima of the fields?”. Each such extremum is termed a ‘field point’, and the set of field points is uniquely defined for any given molecular conformation. The field points are usually displayed as colored spheres, where the visual extent of each field point is determined by the magnitude of the field: stronger fields get larger spheres. This allows you to see at a glance where the molecule can make a locally maximal electrostatic interaction with another molecule. Full definitions of the fields and the algorithm used to compute the field point positions are detailed in the paper Molecular Field Extrema as Descriptors of Biological Activity: Definition and Validation.
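The idea of locating an extremum directly, rather than tabulating the field on a grid, can be illustrated with a generic hill climb. This is not the published field point algorithm, which is defined in the paper cited above. The toy field and the function names are assumptions made only for this sketch.

```python
import math

def toy_field(point):
    """Hypothetical smooth field: one Gaussian bump centred at (1, 2, 0)."""
    x, y, z = point
    return 3.0 * math.exp(-((x - 1.0) ** 2 + (y - 2.0) ** 2 + z ** 2))

def climb_to_maximum(field, start, step=0.5, tol=1e-6):
    """Move uphill along the axes until no step of size >= tol improves the field."""
    current = list(start)
    best = field(current)
    while step >= tol:
        improved = False
        for axis in range(3):
            for delta in (step, -step):
                trial = list(current)
                trial[axis] += delta
                value = field(trial)
                if value > best:
                    current, best = trial, value
                    improved = True
        if not improved:
            step /= 2.0  # refine the search once no neighbour is better
    return tuple(current), best

position, magnitude = climb_to_maximum(toy_field, (0.0, 0.0, 0.0))
print(position, magnitude)
```

A minimum can be found with the same routine by passing the negated field, for example `lambda p: -toy_field(p)`. The magnitude returned at the extremum is the quantity that would set the size of the displayed sphere.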


Molecular field extrema applied to Sildenafil extracted from PDB code 1UDT.

Cresset’s fields have been validated as part of the development of the XED force field, and have also been extensively compared to experimental data from small molecule crystal structures. The distribution of H-bond donors and acceptors around a functional group is a good proxy for the interaction potential, and the field point patterns obtained are consistent with this information. Fields are also particularly useful for describing the properties of aromatic systems (building on the XED force field’s excellent description of these): the field surface around an aromatic ring holds a wealth of information about how electron rich/poor it is, how its charge density is arranged, and how strong a π-stacking interaction it could make. A few example rings are shown below.


Isostar plot of oxazole, pyridine, fluorobenzene.

Whenever you are describing molecules in terms of electrostatics, it is critical to handle formal charge states correctly. Cresset has a complex rule-based system for assigning formal charges which, while not a complete pKa estimator, will correctly assign the protonation state for the vast majority of drug-like molecules at pH 7. However, just assigning the formal charge state is not enough. Solvation is much more important for ions than for neutral molecules, so additional effort needs to be made to account for that. Our full algorithm for handling formal charges in small molecules is detailed in the blog post How to calculate the electrostatic environment around charged molecules.

Fields on proteins

Cresset’s fields and field points have been extensively validated on small molecule structures. Extending this approach to proteins is more difficult, as much more attention needs to be paid to system preparation, charge states, and solvation effects. Flare, our structure-based drug design platform, solves these issues, in particular through careful protein preparation and the use of a modified dielectric function based on the work of Mehler (E. L. Mehler, The Lorentz-Debye-Sack theory and dielectric screening of electrostatic effects in proteins and nucleic acids, in Molecular Electrostatic Potentials: Concepts and Applications, Theoretical and Computational Chemistry Vol. 3, 1996).

The resulting interaction potentials (PIPs, or protein interaction potentials, to distinguish them from ligand MIPs) are extremely useful for analyzing a protein active site and determining what ligand properties might need to be altered to achieve optimum binding – see for example Comparing ligand and protein electrostatics of Btk inhibitors.


Ligand 4L6 superimposed on the protein interaction potentials of 4Z3V. Top-left: ‘dry’ active site, not including crystallographic water molecules. Top-right: ‘wet’ active site including stable water molecules. Bottom: Ligand fields for 4L6. Protein interaction potentials shown at isolevel = 3; ligand fields shown at isolevel = 2.

Field similarity for ligand-based design

Good-quality electrostatic potentials on a molecule can be computed based on an advanced representation of its underlying charge structure. The next step is being able to compare two conformations in terms of their electrostatic similarity. You can do this just by comparing the field points of the two molecules, which leads to a pharmacophore-like technique. However, a better solution is to take account of the full electrostatic potential.


The full electrostatic potential contains more information than just the fact that the nitrogen is an acceptor.

Similarity needs to be computed in terms of the underlying potentials, not just in terms of the field points. Although this is computationally difficult, Cresset’s solution is both elegant and effective. The fields of the two molecules are compared, but only at the places where one of the conformations has a field point. This limits the number of field computations while ensuring that the field is evaluated wherever at least one of the conformations indicates that it is important (i.e., at a field point). The full algorithm has been published (Cheeseright et al JCIM 2006).
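As a rough sketch of this sampling idea, the Python below evaluates one molecule's field only at the other molecule's field points. The published algorithm should be consulted for the real definition. The names `make_field` and `directional_score`, the toy charge-over-distance field, and the choice to weight each sample by the field point's own signed value are all assumptions made for illustration.

```python
import math

def make_field(sources):
    """Return a toy field function: a sum of charge/distance terms.

    `sources` is a list of ((x, y, z), charge) tuples.
    """
    def field(point):
        return sum(charge / math.dist(position, point) for position, charge in sources)
    return field

def directional_score(field_points_a, field_b):
    """Score A into B by sampling B's field only at A's field points.

    `field_points_a` is a list of ((x, y, z), value) tuples, where `value` is
    the signed field strength of A at that extremum. Weighting each sample by
    `value` is an assumption, made so that matching signs raise the score.
    """
    return sum(value * field_b(position) for position, value in field_points_a)

# Hypothetical data: molecule B is a single positive source at the origin,
# molecule A has two positive field points.
field_b = make_field([((0.0, 0.0, 0.0), 1.0)])
field_points_a = [((2.0, 0.0, 0.0), 0.5), ((0.0, 4.0, 0.0), 1.0)]
print(directional_score(field_points_a, field_b))
```

The cost of one score is one field evaluation per field point, rather than one per grid cell.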

A score for conformation A into conformation B is computed by determining the field potential for B at the places where the field points for A lie. The overall score can be made symmetric by also computing the converse B-into-A score and averaging the two. However, it is not enough to score a particular alignment: you need to be able to locate the optimal alignment. This is a difficult global optimisation problem. Cresset’s solution is to generate a set of initial alignments by computing colored clique matches between the sets of field points on the two conformations: a clique match is a set of field points on each conformation that match in terms of field point type and in terms of all of the inter-field-point distances (to within a distance tolerance). Each clique match determines an alignment (by least-squares fitting of the matching field points in 3D), and the alignments are then scored according to the field similarity algorithm described above. The top-scoring alignment is then taken as the ‘correct’ alignment for those two conformations.
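The consistency condition that defines a clique match can be sketched as a simple check. The code below only tests whether a proposed set of matched field points agrees in type and in all pairwise distances. It does not enumerate cliques or perform the least-squares fit. The function name, the data layout and the 0.5 Å tolerance are hypothetical.

```python
import math
from itertools import combinations

def is_consistent_match(pairs, tolerance=0.5):
    """Check whether matched field points could belong to one clique match.

    `pairs` is a list of (point_a, point_b) tuples, where each point is
    (kind, (x, y, z)). A match is consistent when every pair has the same
    kind and every inter-point distance agrees to within `tolerance`.
    """
    if any(a[0] != b[0] for a, b in pairs):
        return False
    for (a1, b1), (a2, b2) in combinations(pairs, 2):
        dist_a = math.dist(a1[1], a2[1])
        dist_b = math.dist(b1[1], b2[1])
        if abs(dist_a - dist_b) > tolerance:
            return False
    return True

# Hypothetical field points: conformation B is conformation A shifted by 10 along x.
points_a = [("positive", (0.0, 0.0, 0.0)), ("negative", (3.0, 0.0, 0.0)), ("negative", (0.0, 4.0, 0.0))]
points_b = [(kind, (x + 10.0, y, z)) for kind, (x, y, z) in points_a]
print(is_consistent_match(list(zip(points_a, points_b))))
```

Because only distances are compared, the check does not depend on how either conformation is positioned in space, which is what allows the match to be found before any alignment exists.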

In many cases, of course, it is not known which conformation the molecules should be in. If comparing two known bioactive conformations, for example from protein crystal structures, then the field similarity algorithm can be applied directly. It is more often the case that one of the molecules has an unknown conformation. The best example is virtual screening: the search uses a defined 3D conformation of the query molecule, but it is not known a priori which conformation of each molecule being searched is going to be relevant. In this case a conformation search is performed, generating a set of conformers that represent the available conformational space of the molecule. Each of these is aligned to the query, and the best-scoring alignment is taken as the overall score of the molecule.

In some circumstances it is necessary to compare molecules without knowing the bioactive conformation of any of them. In this situation conformer populations are computed for both molecules, and each conformation of the first is compared to each conformation of the second. This is the procedure that is used in our FieldTemplater technology for pharmacophore elucidation.
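Both situations reduce to taking the best score over a set of conformer pairs, which can be sketched as follows. The `score` argument stands in for the align-and-score step and is a hypothetical callable. The conformers here are plain numbers used only to exercise the loop.

```python
def best_alignment_score(query_conformers, candidate_conformers, score):
    """Return (best_score, query_index, candidate_index) over all conformer pairs.

    With a single query conformer this is the virtual screening case; with a
    population on both sides it is the all-against-all comparison.
    """
    best = None
    for i, query in enumerate(query_conformers):
        for j, candidate in enumerate(candidate_conformers):
            value = score(query, candidate)
            if best is None or value > best[0]:
                best = (value, i, j)
    if best is None:
        raise ValueError("both conformer lists must be non-empty")
    return best

# Toy stand-in for the similarity: conformers are numbers, closer is better.
def toy_score(a, b):
    return -abs(a - b)

print(best_alignment_score([1.0], [0.0, 0.9, 3.0], toy_score))
print(best_alignment_score([1.0, 5.0], [0.0, 4.9], toy_score))
```

The all-against-all case costs the product of the two population sizes in scoring calls, which is why the speed of a single comparison matters.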

The field similarity algorithm is fast. Comparing a query conformation to a set of 100 conformers of a molecule takes 1-2 seconds on a single CPU core. It has proved very effective for virtual screening. Assessment of the performance of Cresset’s similarity algorithm as embodied in our Blaze virtual screening software shows that it performs significantly better than docking (Cheeseright et al JCIM 2008). The algorithm can be enhanced by combining it with a shape similarity calculation (Grant et al.): the overall similarity is a weighted combination of the field similarity and the shape similarity. In most cases an equal weight is used (50% shape, 50% fields), but this is customizable by the user for particular circumstances.
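The weighted combination can be written as a short function. The name `combined_similarity` and its parameter names are illustrative and do not correspond to any product API. The default reflects the equal weighting mentioned above.

```python
def combined_similarity(field_similarity, shape_similarity, field_weight=0.5):
    """Weighted combination of field and shape similarity.

    `field_weight` is the fraction given to the field term; the shape term
    receives the remainder, so the default is 50% fields and 50% shape.
    """
    if not 0.0 <= field_weight <= 1.0:
        raise ValueError("field_weight must lie between 0 and 1")
    return field_weight * field_similarity + (1.0 - field_weight) * shape_similarity

print(combined_similarity(0.8, 0.6))                    # equal weighting
print(combined_similarity(0.8, 0.6, field_weight=1.0))  # fields only
```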

Further enhancements of the field similarity calculation include the ability to add field constraints, pharmacophore constraints and excluded volumes. Field constraints are used to mark a particular region of field as being of higher importance than the rest, pharmacophore constraints require particular types of atoms to be close to each other, while excluded volumes enable the use of protein structure information to constrain alignments to lie within the available space. All of these can significantly improve the accuracy of alignment and scoring.