Core science: Technology

Overview

Cresset technology centers on the application of the XED force field to the design of new small molecule bioactive compounds. Unlike traditional molecular mechanics, the XED approach uses a more complex description of atoms that places charge away from atomic centers, enabling a more detailed description of electrostatics and excellent reproduction of intermolecular interactions. Developed by Dr Andy Vinter and improved by Cresset, the XED force field correctly models substituent effects on aromatics, charge density changes in complex aromatics and the intermolecular interactions of small molecules, water and proteins. Read more about the XED force field.

The XED force field calculates excellent electrostatics

Electrostatics is the most important factor in molecular recognition, although shape and hydrophobicity also contribute. Cresset’s approach describes the electrostatic environment around a ligand or protein as a molecular interaction potential (MIP) or field. The MIP describes all of the energetically important interactions that a ligand can make with a protein. By distilling the MIP down to its local extrema, we are able to compute a 3D electrostatic similarity between two ligands, which enables us to transform the fundamental chemistry of a ligand whilst maintaining its binding to a protein target. Read more about fields.

Electrostatic similarity calculations inform small molecule discovery

Cresset has a patented method to compare the molecular interaction potential for two molecules and compute a similarity. We use this field similarity to:

Read more about field similarity.

The XED force field

Cresset’s main focus is the description of molecules in terms of electrostatics. For this to be effective, the electrostatic model needs to be accurate. Quantum mechanics can give very accurate electrostatic potentials, but is still too slow in most cases. As a result, we need an accurate method of computing electrostatic potentials in a molecular mechanics context.

Most standard force fields use the atom-centred charge (ACC) approximation: the electrostatics of the molecule are approximated by a set of point partial charges placed on the nuclei. Many methods are available to compute these partial charges (Gasteiger-Hückel, AM1-BCC, etc), but the underlying model is one point charge per atom. This method can work well for the gross long-distance electrostatic potential (e.g., dipole moments), but performs poorly when describing the electrostatic potential near the molecular surface. This is because atoms are not charged spheres: they have lone pairs, pi orbitals, sigma holes and so forth. In addition, atoms and molecules are polarizable and change their electrostatic behavior in response to external electric fields. The ACC model covers none of these effects. Newer force fields such as AMOEBA solve this problem by placing explicit multipoles and polarization functions on the atoms, which does give a much more realistic electrostatic potential. These force fields perform well on proteins but have issues with parameter transferability which makes them largely unsuitable for ligand modeling.

XED model for benzene and for acetone, showing the additional charge points.
 

The XED force field was the first major effort to solve these electrostatic problems, and did so not by placing explicit multipoles on atoms but by placing additional monopoles around them. The technique was originally introduced by Hunter and Sanders, JACS 1989, to model aromatic-aromatic interactions, and was extended into a full general-purpose force field by Cresset’s founder Andy Vinter, JCAMD, 1994.

The additional monopole points, or XEDs (eXtended Electron Distributions), are treated within the force field as atoms with zero van der Waals radii. They are not placed in a rigid geometry with respect to their parent atom. Instead, they come with bond stretching and angle bending potentials and can move under the influence of external (and intramolecular) electrostatic potentials, allowing the direct modeling of polarizability. The more complex internal electrostatic model allows for complex intramolecular electrostatic/orbital interactions such as the anomeric effect to be modeled without the introduction of specific torsional parameters: the anomeric effect falls naturally out of the electrostatic model and does not need to be added post hoc. The XED force field has been demonstrated to provide quantitatively superior results for the energetics of aromatic-aromatic interactions.
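As a rough illustration of why off-centre monopoles matter, the Python sketch below compares a single atom-centred charge with a toy XED-like arrangement that carries the same net charge. The charge values, distances and geometry are made-up illustrative numbers, not XED parameters, and the extra points are held rigid here, whereas in the XED force field they are free to move.

```python
import math

def potential(charges, point):
    """Coulomb potential (arbitrary units) at `point` from (charge, position) pairs."""
    return sum(q / math.dist(pos, point) for q, pos in charges)

# Atom-centred charge (ACC) model: one point charge on the nucleus.
acc_model = [(-0.5, (0.0, 0.0, 0.0))]

# Toy XED-like model (hypothetical values): a positive nucleus plus two
# off-centre negative points standing in for lone pairs. Net charge is
# still -0.5, the same as the ACC model.
offset = 0.3                 # distance of each extra point from the nucleus
angle = math.radians(60.0)   # half-angle between the two extra points
xed_like_model = [
    (+0.5, (0.0, 0.0, 0.0)),
    (-0.5, (offset * math.cos(angle), offset * math.sin(angle), 0.0)),
    (-0.5, (offset * math.cos(angle), -offset * math.sin(angle), 0.0)),
]

# Probe the potential at the same distance on opposite sides of the atom.
front = (2.0, 0.0, 0.0)   # side the extra points lean towards
back = (-2.0, 0.0, 0.0)   # opposite side

acc_front, acc_back = potential(acc_model, front), potential(acc_model, back)
xed_front, xed_back = potential(xed_like_model, front), potential(xed_like_model, back)

print(f"ACC      front={acc_front:.4f} back={acc_back:.4f}")
print(f"XED-like front={xed_front:.4f} back={xed_back:.4f}")
```

The single-charge model gives the same potential on both sides of the atom, while the toy multi-point model is more negative on the side where the extra points sit. This is the kind of near-surface anisotropy that one charge per atom cannot represent.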

Cresset’s academic collaborators have used the XED model to study and predict specific intermolecular interactions. For example Professor Chris Hunter studied the dimerization of a series of di-aryl amides (left) using NMR and computational chemistry. He found that the XED model accurately predicted the experimentally observed association constants (right).

The superior modeling of aromatic interactions was used to design aromatic ‘zippers’ for linking collagen mimic fibres (Cejas, M. A. et al. (2008) Thrombogenic Collagen-Mimetic Peptides: Self-Assembly of Triple Helix-Based Fibrils Driven by Hydrophobic Interactions. Proc. Natl. Acad. Sci. 105 (25), 8513–8518) – the XED force field provides both qualitatively and quantitatively correct results for the interaction of phenyl and pentafluorophenyl groups.

The XED force field has undergone numerous improvements in the last 20 years. Unlike most other force fields, the XED force field is parameterised where possible against experimental data (microwave conformation energies, small molecule crystal structures etc) rather than relying purely on ab initio calculations. The latest version of the XED force field is XED 3, released in 2012. Among many other enhancements, XED 3 offers an improved treatment of nitrogen. Rather than having to assign separate types for trigonal and tetrahedral nitrogen, the XED 3 force field determines on the fly the degree of pyramidalisation that is appropriate in any given molecular environment, allowing for a continuum from completely flat N to completely pyramidal N. In addition, XED 3 has an improved description of halogens, correctly describing the ‘sigma hole’ in the heavier halogens and giving good results for halogen bonding. Cresset continue to develop and improve the force field on an ongoing basis.

Calculating fields to assess molecular interactions

Cresset’s fundamental 3D ligand similarity technology compares molecules in terms of their molecular electrostatic interaction potentials (MIPs), or ‘fields’. The MIP of a molecule is a scalar field where the value at each point in space is the interaction energy of a charged probe atom (with the van der Waals parameters of oxygen) with the molecule. MIPs are calculated using the XED force field. As the interaction energies are poorly defined inside the molecule, we set the field value to zero wherever the van der Waals interaction energy is positive and larger than the absolute value of the electrostatic energy.
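The masking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Coulomb term is in arbitrary units, the van der Waals term uses a generic 12-6 Lennard-Jones form, the probe radius and well depth are placeholder values rather than XED parameters, and the field value is taken to be the sum of the two terms. The function name and the atom tuple layout are hypothetical.

```python
import math

def field_value(atoms, point, probe_charge=1.0, probe_radius=1.6, epsilon=0.1):
    """Toy field value at `point` for a charged probe.

    atoms: list of (charge, vdw_radius, (x, y, z)) tuples.
    Returns 0.0 where the van der Waals energy is positive and larger than
    the absolute electrostatic energy (i.e. effectively inside the molecule).
    """
    e_elec = 0.0
    e_vdw = 0.0
    for charge, radius, pos in atoms:
        r = math.dist(pos, point)
        if r < 1e-6:
            return 0.0  # probe sits on top of an atom: treat as inside
        e_elec += probe_charge * charge / r
        ratio = (radius + probe_radius) / r
        e_vdw += epsilon * (ratio ** 12 - 2.0 * ratio ** 6)  # assumed 12-6 form
    if e_vdw > 0.0 and e_vdw > abs(e_elec):
        return 0.0  # steric clash dominates: field is not meaningful here
    return e_elec + e_vdw

# A single negatively charged atom at the origin (illustrative values only).
toy_atoms = [(-0.5, 1.7, (0.0, 0.0, 0.0))]
inside = field_value(toy_atoms, (1.0, 0.0, 0.0))   # well within the vdW surface
outside = field_value(toy_atoms, (4.0, 0.0, 0.0))  # outside the vdW surface
print(inside, outside)
```

In this toy case the point 1 unit from the atom is masked to zero, while the point 4 units away returns a negative (attractive) value for the positive probe.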

Dealing with a full 3D scalar potential is computationally difficult. You can sample the values on a grid, but you then have issues with gauge variance, grid spacing and so forth, leading to irreproducibility. Instead, Cresset ask “Where are the maxima/minima of the fields?”. Each such extremum is termed a ‘field point’, and the set of field points is uniquely defined for any given molecular conformation. The field points are usually displayed as colored spheres, where the visual extent of each field point is determined by the magnitude of the field: stronger fields get larger spheres. This allows you to see at a glance where the molecule can make a locally maximal electrostatic interaction with another molecule. The full definitions of the fields that we use and the algorithm that is used to compute the field point positions are detailed in the paper ‘Molecular Field Extrema as Descriptors of Biological Activity: Definition and Validation’.
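To make the idea of locating extrema without a grid concrete, here is a minimal Python sketch that walks downhill from several seed positions and merges seeds that converge to the same place. It is not the published field point algorithm: the fixed-step gradient descent, the finite-difference gradient, the seed list and the double-well test function are all assumptions chosen for simplicity.

```python
import math

def gradient(field, p, h=1e-4):
    """Central-difference gradient of a scalar field at point p."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((field(tuple(hi)) - field(tuple(lo))) / (2.0 * h))
    return tuple(grad)

def find_minimum(field, start, step=0.1, tol=1e-6, max_iter=10000):
    """Fixed-step steepest descent from `start` to a nearby local minimum."""
    p = tuple(start)
    for _ in range(max_iter):
        g = gradient(field, p)
        if math.sqrt(sum(c * c for c in g)) < tol:
            break
        p = tuple(x - step * c for x, c in zip(p, g))
    return p

def field_minima(field, seeds, merge_distance=0.05):
    """Optimize every seed and keep one representative per distinct minimum."""
    minima = []
    for seed in seeds:
        p = find_minimum(field, seed)
        if all(math.dist(p, q) > merge_distance for q in minima):
            minima.append(p)
    return minima

def double_well(p):
    """Toy scalar field with two minima, at (-1, 0, 0) and (+1, 0, 0)."""
    x, y, z = p
    return (x * x - 1.0) ** 2 + y * y + z * z

seeds = [(0.5, 0.3, 0.0), (-0.5, 0.3, 0.0), (0.7, -0.2, 0.1)]
minima = field_minima(double_well, seeds)
print(minima)
```

Maxima can be found with the same routine by negating the field. A fixed step size is fragile on real potentials, so a production implementation would likely use a more robust optimizer.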

 


Molecular field extrema applied to Sildenafil extracted from PDB code 1UDT.

Cresset’s fields have been validated as part of the development of the XED force field, and have also been extensively compared to experimental data from small molecule crystal structures. The distribution of H-bond donors and acceptors around a functional group is a good proxy for the interaction potential, and we ensure that the field point patterns obtained are consistent with this information. Fields are also particularly useful for describing the properties of aromatic systems (building on the XED force field’s excellent description of these): the field surface around an aromatic ring holds a wealth of information about how electron rich/poor it is, how its charge density is arranged, and how strong a π-stacking interaction it could make. A few example rings are shown below.


Isostar plot of oxazole, pyridine, fluorobenzene.

Whenever you are describing molecules in terms of electrostatics, it is critical to handle formal charge states correctly. Cresset have a complex rule-based system for assigning formal charges which, while not a complete pKa estimator, will correctly assign the protonation state for the vast majority of drug-like molecules at pH 7. However, just assigning the formal charge state is not enough. Solvation is much more important for ions than for neutral molecules, so additional effort needs to be made to account for that. Our full algorithm for handling formal charges in small molecules is detailed in the blog post How to calculate the electrostatic environment around charged molecules.

Field similarity for ligand based design

We can compute good-quality electrostatic potentials on a molecule based on an advanced representation of its underlying charge structure. The next step is being able to compare two conformations in terms of their electrostatic similarity. You can do this just by comparing the field points of the two molecules, which leads to a pharmacophore-like technique. However, a better solution is to take account of the full electrostatic potential.


The full electrostatic potential contains more information than just the fact that the nitrogen is an acceptor.

We need to compute similarity in terms of the underlying potentials, not just in terms of the field points. This is computationally difficult, and Cresset’s solution is both elegant and effective. We compare the fields of the two molecules, but do so only at the places where one of the conformations has a field point. That keeps the number of field computations limited, but ensures that the field is computed only at places where at least one of the conformations suggested that the field was important (i.e. at a field point). The full algorithm has been published (Cheeseright et al JCIM 2006).


We compute a score for conformation A into conformation B by determining the field potential for B at the places where the field points for A lie. The overall score can be made symmetric by also computing the converse B-into-A score and averaging the two.
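The A-into-B scoring and its symmetric average can be sketched as follows. The source states only that B's field is evaluated at A's field points and that the two directional scores are averaged. Weighting each sample by the field point's own value, the function names and the Gaussian toy fields are assumptions made for illustration, and the published algorithm may normalize differently.

```python
import math

def directional_score(field_points_a, field_b):
    """Score A into B: sample B's field at each of A's field points.

    field_points_a: list of ((x, y, z), value) pairs for conformation A.
    field_b: callable mapping a position to B's field value there.
    Each sample is weighted by A's field point value (an assumption).
    """
    return sum(value * field_b(pos) for pos, value in field_points_a)

def symmetric_score(points_a, field_a, points_b, field_b):
    """Average of the A-into-B and B-into-A directional scores."""
    return 0.5 * (directional_score(points_a, field_b)
                  + directional_score(points_b, field_a))

def gaussian_field(centre, depth=-1.0):
    """Toy field: a single negative well centred at `centre`."""
    return lambda p: depth * math.exp(-math.dist(p, centre) ** 2)

# Two toy 'conformations', each with one negative field point.
field_a = gaussian_field((0.0, 0.0, 0.0))
field_b = gaussian_field((1.0, 0.0, 0.0))
points_a = [((0.0, 0.0, 0.0), -1.0)]
points_b = [((1.0, 0.0, 0.0), -1.0)]

score_ab = symmetric_score(points_a, field_a, points_b, field_b)
score_aa = symmetric_score(points_a, field_a, points_a, field_a)
print(score_ab, score_aa)
```

With these toy fields a conformation scores highest against itself, and the score falls off as the two wells move apart, which is the behavior an alignment search needs.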
However, it is not enough to score a particular alignment: you need to be able to locate the optimal alignment. This is a difficult global optimisation problem. Cresset’s solution is to generate a set of initial alignments by computing colored clique matches between the sets of field points on the two conformations: a clique match is a set of field points on each conformation that match in terms of field point type and in terms of all of the inter-field-point distances (to within a distance tolerance). Each clique match determines an alignment (by least-squares fitting of the matching field points in 3D), and the alignments are then scored according to the field similarity algorithm described above. The top-scoring alignment is then taken as the ‘correct’ alignment for those two conformations.
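A brute-force version of the clique matching step might look like the Python sketch below. It enumerates fixed-size sets of same-type field point pairs whose internal distances agree within a tolerance. The clique size, the tolerance, the type labels and the exhaustive enumeration are assumptions for illustration. A real implementation would use a proper clique detection algorithm and would follow each match with the least-squares fit, which is not shown here.

```python
import math
from itertools import combinations

def clique_matches(points_a, points_b, size=3, tolerance=0.5):
    """Find sets of type-matched field point pairs with consistent distances.

    points_a, points_b: lists of (type_label, (x, y, z)).
    Returns a list of matches, each a tuple of (index_in_a, index_in_b) pairs.
    """
    # Candidate correspondences: field points of the same type ('color').
    pairs = [(i, j)
             for i, (type_a, _) in enumerate(points_a)
             for j, (type_b, _) in enumerate(points_b)
             if type_a == type_b]

    def compatible(p, q):
        (i1, j1), (i2, j2) = p, q
        if i1 == i2 or j1 == j2:
            return False  # a field point may be used only once per match
        dist_a = math.dist(points_a[i1][1], points_a[i2][1])
        dist_b = math.dist(points_b[j1][1], points_b[j2][1])
        return abs(dist_a - dist_b) <= tolerance

    return [combo for combo in combinations(pairs, size)
            if all(compatible(p, q) for p, q in combinations(combo, 2))]

# Conformation B is A shifted by 10 units in x, plus one unrelated field point.
fps_a = [("negative", (0.0, 0.0, 0.0)),
         ("positive", (3.0, 0.0, 0.0)),
         ("hydrophobic", (0.0, 4.0, 0.0))]
fps_b = [("negative", (10.0, 0.0, 0.0)),
         ("positive", (13.0, 0.0, 0.0)),
         ("hydrophobic", (10.0, 4.0, 0.0)),
         ("negative", (20.0, 20.0, 20.0))]

matches = clique_matches(fps_a, fps_b)
print(matches)
```

Each returned match gives the point correspondences from which an initial alignment could be fitted and then scored. Note that a distance-only test cannot distinguish a set of points from its mirror image, so a fuller treatment would need to check for that.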

In many cases, of course, we do not know which conformation the molecules should be in. If comparing two known bioactive conformations, for example from protein crystal structures, then the field similarity algorithm can be applied directly. It is more often the case that one of the molecules has an unknown conformation: the best example is virtual screening where we are searching with a defined 3D conformation of the query molecule, but we do not know a priori which conformation of the molecules that we are searching is going to be relevant. In this case we perform a conformation search, generating a set of conformers that represent the available configuration space of the molecule. We align each of these to the query and take the best-scoring alignment as the overall score of the molecule.

In some circumstances we need to compare molecules without knowing the bioactive conformation of any of them. In this situation we have to compute conformer populations of both molecules and compare each conformation of the first to each conformation of the second. This is the procedure that is used in our FieldTemplater technology for pharmacophore elucidation.
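The two search modes described above (one fixed query against a conformer set, and all conformers against all conformers) reduce to taking a maximum over alignment scores. The sketch below assumes a caller-supplied `score` function that returns the best alignment score for a pair of conformations. The function names and the numeric stand-in conformers are hypothetical.

```python
def best_against_query(query, conformers, score):
    """Fixed query conformation: return (best score, index of best conformer)."""
    return max((score(query, conf), i) for i, conf in enumerate(conformers))

def best_over_all_pairs(conformers_a, conformers_b, score):
    """No known bioactive conformation: compare every pair of conformers.

    Returns (best score, index in conformers_a, index in conformers_b).
    """
    best = None
    for i, conf_a in enumerate(conformers_a):
        for j, conf_b in enumerate(conformers_b):
            s = score(conf_a, conf_b)
            if best is None or s > best[0]:
                best = (s, i, j)
    return best

# Stand-in data: 'conformers' are plain numbers and similarity is the
# negative distance between them, purely to exercise the search logic.
def toy_score(a, b):
    return -abs(a - b)

single = best_against_query(5.0, [1.0, 4.5, 9.0], toy_score)
pairwise = best_over_all_pairs([0.0, 5.0], [1.0, 4.5, 9.0], toy_score)
print(single, pairwise)
```

The all-pairs search costs the product of the two conformer counts in score evaluations, which is why the speed of the underlying similarity calculation matters.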

The field similarity algorithm is fast. Comparing a query conformation to a set of 100 conformers of a molecule takes 1-2 seconds on a single CPU core. It has proved very effective for virtual screening. Assessment of the performance of Cresset’s similarity algorithm as embodied in our Blaze virtual screening software shows that it performs significantly better than docking (Cheeseright et al JCIM 2008). The algorithm can be enhanced by combining it with a shape similarity calculation (ref Grant and Pickup paper): the overall similarity is a weighted combination of the field similarity and the shape similarity. In most cases an equal weight is used (50% shape, 50% fields), but this is customizable by the user for particular circumstances.
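The weighted combination of field and shape similarity is a simple linear blend. A minimal Python sketch, assuming both similarities are already on a common scale and using a hypothetical function name:

```python
def combined_similarity(field_sim, shape_sim, field_weight=0.5):
    """Blend field and shape similarity; the default is 50% fields, 50% shape."""
    if not 0.0 <= field_weight <= 1.0:
        raise ValueError("field_weight must be between 0 and 1")
    return field_weight * field_sim + (1.0 - field_weight) * shape_sim

equal_mix = combined_similarity(0.8, 0.6)
fields_only = combined_similarity(0.8, 0.6, field_weight=1.0)
print(equal_mix, fields_only)
```

Raising `field_weight` favors electrostatic agreement and lowering it favors shape overlap, which mirrors the user-adjustable weighting described above.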

Further enhancements of the field similarity calculation include the ability to add field constraints and excluded volumes. Field constraints are used to mark a particular region of field as being of higher importance than the rest, while excluded volumes enable the use of protein structure information to constrain alignments to lie within the available space. Both of these can significantly improve the accuracy of alignment and scoring.

About our science

Cresset software is based on robust science that delivers results you can rely on.

Our goal is to help you discover, design and optimize the best possible molecules for your project. We believe that computational methods are an excellent way of arriving at a better understanding of the properties and behaviors of chemical structures and proteins.

Excellent science is at the heart of everything we do, and we always prioritize scientific rigor over ease of use and faster runtimes. Our technology makes extensive use of the XED force field developed by Dr Andy Vinter, the company’s founder and Chairman, to describe a more detailed electrostatic environment around atoms.

Our patented algorithms use this detailed view of electrostatics together with shape to describe the key ligand-protein interactions that underpin biological activity. We continue to work on this pioneering science, crafting ever improving computer models of molecular interactions.