Investigating the SAR of XIAP ligands with Electrostatic Complementarity maps and scores


Electrostatic Complementarity™ maps implemented in Flare™,1 Cresset’s structure-based design application, were used to investigate the protein-ligand electrostatic interactions and the Structure-Activity Relationship (SAR) of a small set of inhibitors of the X-linked IAP (XIAP)-caspase protein-protein interaction. A good correlation was also obtained between XIAP-BIR3 affinity and the Electrostatic Complementarity scores for the same data set.


Inhibitor of apoptosis proteins (IAPs) are key regulators of antiapoptotic and pro-survival signaling pathways.2-4 Their deregulation occurs in various cancers and is associated with tumor growth, resistance to treatment and poor prognosis. This makes them an attractive target for anticancer drug discovery.5-7 The best characterized IAP, X-linked IAP (XIAP), exerts its antiapoptotic activity by binding and inactivating caspases 3, 7, and 9 via its BIR domains. Disruption of the protein-protein interaction (PPI) between XIAP-BIR domains and caspases with small molecules is a promising strategy to inhibit XIAP. However, drugging PPIs can be particularly challenging due to their unusual binding interfaces which, unlike classical binding sites, are generally flat and large.8

A recent paper from Astex9 reports that the XIAP-BIR3 activity of the small dataset of antagonists in Table 1 is increased by the introduction of electron-withdrawing substituents on the indoline ring, and shows a nice correlation between the XIAP-BIR3 pIC50 and Hammett’s σp.

In this case study, we used the Electrostatic Complementarity maps available in Flare to investigate the protein-ligand electrostatic interactions and the SAR of the molecules in Table 1. Electrostatic Complementarity scores calculated with Flare were used to quantitatively model XIAP-BIR3 pIC50.

Table 1. XIAP-BIR3 affinity of C-6 substituted indolines.9


Protein preparation

The 5C7A ligand-protein complex was downloaded from the Protein Data Bank into Flare and prepared using the Build Model10 tool from BioMolTech,11 to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes and assign optimal protonation states to the protein structure. Any truncated protein chain was capped as part of protein preparation. The binding site was visually inspected to check the protonation states of the ligand and amino acid side chains, and water orientations were re-optimized wherever the water hydrogen bonding network was suboptimal. For the electrostatic complementarity calculations, we kept only those water molecules in or close to the binding site that make at least 2 hydrogen bonding contacts to the protein, or at least 1 hydrogen bond to both the ligand and the protein. As many of the modeled binding modes (e.g., compounds 9, 11, 15, 16) clash with the flexible side chain of Lys297 (Figure 1), the side chain atoms were minimized with the XED force field12 for each ligand. The resulting receptors were used to compute the electrostatic complementarity of the respective compounds.
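The water-retention rule used in the preparation step can be written as a small filter. This is only a sketch of the rule as stated in the text, not Flare functionality: the `Water` record, its field names and the example values are hypothetical, and the hydrogen bond counts are assumed to have been determined beforehand.

```python
from dataclasses import dataclass


@dataclass
class Water:
    # Hypothetical record; the hydrogen bond counts are assumed to be known already.
    name: str
    protein_hbonds: int
    ligand_hbonds: int


def keep_water(water):
    """Apply the retention rule described in the text to one water molecule."""
    # Assumed reading of the rule: a bridging water has at least one
    # hydrogen bond to the ligand and at least one to the protein.
    bridges = water.ligand_hbonds >= 1 and water.protein_hbonds >= 1
    return water.protein_hbonds >= 2 or bridges


def select_waters(waters):
    """Return only the waters that satisfy the retention rule."""
    return [w for w in waters if keep_water(w)]


# Illustrative input only; these are not waters from PDB 5C7A.
example = [
    Water("W1", protein_hbonds=2, ligand_hbonds=0),  # kept: two protein contacts
    Water("W2", protein_hbonds=1, ligand_hbonds=1),  # kept: bridges ligand and protein
    Water("W3", protein_hbonds=1, ligand_hbonds=0),  # dropped
    Water("W4", protein_hbonds=0, ligand_hbonds=2),  # dropped: no protein contact
]
kept_names = [w.name for w in select_waters(example)]
```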

Figure 1. The PDB: 5C7A ligand-protein complex.

Data set construction

The compounds in Table 1 were drawn using the molecule editor in Flare, starting from the crystal structure of the ligand in PDB:5C7A (compound 7 in Table 1). The 11 compounds were then aligned in Forge13 to the 5C7A ligand, using a Maximum Common Substructure alignment to minimize the conformational noise in the common indoline-piperazine scaffold.

Electrostatic Complementarity surfaces and scores

Electrostatic Complementarity maps and scoring functions are an extension of Flare’s Protein Interaction Potentials based on Cresset’s polarizable XED force field. In contrast to classical force fields that rely on atom-centered charges, XED enables description of anisotropic charge distribution around atoms which is usually only possible with ab initio approaches. Polarization effects and description of atomic charge anisotropy are especially useful for computing electrostatic properties of aromatic or unsaturated hydrocarbons, sp2 hybridized oxygen atoms, sp or sp2 hybridized nitrogen atoms, and aromatic halogens (sigma hole of Cl, Br, and I).14-16

To calculate the Electrostatic Complementarity map for a ligand towards a protein of interest, the solvent-accessible surface is first placed over the ligand. A calculation of electrostatic potentials due to the ligand and the protein is then carried out at each vertex on the surface.

These potentials are then scaled, added together, and normalized to yield the Electrostatic Complementarity score. Perfect electrostatic complementarity means that at each vertex point the ligand electrostatic potential value is paired with a protein electrostatic potential value of the same magnitude and opposite sign. Regions of the ligand surface where there is electrostatic complementarity with the protein are colored green, while regions where there is an electrostatic clash are colored red. A more detailed description of the electrostatic potential and complementarity methodology will be presented elsewhere.17

The Electrostatic Complementarity scores quantify the ligand-protein electrostatic complementarity with three different metrics suitable for diverse protein-ligand scenarios.

The first computed score (‘Complementarity’) is the normalized surface integral of the complementarity score over the surface of the ligand (effectively the average value of that score over the surface of the ligand).

The other two scores (‘Complementarity r’ and ‘Complementarity rho’) are the Pearson’s correlation coefficient and the Spearman rank correlation coefficient, respectively, which are computed on the raw ligand and protein electrostatic potentials sampled on the surface vertices.

All three measures range from 1 (perfect complementarity) to -1 (perfect clash) but have different characteristics. The Complementarity score includes some compensation for desolvation effects, and so may be more robust when these are significant. The Pearson and Spearman correlation coefficients can provide a better indication of ligand activity in some cases, but are more susceptible to noise (r more than rho). The Spearman’s rho number is more robust against background electric fields, which may be useful if the computed protein electric potential is being biased by a large net charge on the protein.
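To make the two correlation-based scores more concrete, the sketch below computes Pearson and Spearman coefficients on ligand and protein potentials sampled at the same surface vertices. It is an illustration under stated assumptions rather than Flare's implementation: the function names are hypothetical, the potentials are made-up numbers, and the sign of the correlation is flipped on the assumption that this is how opposite potentials come to score +1 (perfect complementarity).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def complementarity_r(ligand_esp, protein_esp):
    # Assumption: negate so that equal-and-opposite potentials give +1.
    return -pearson(ligand_esp, protein_esp)


def complementarity_rho(ligand_esp, protein_esp):
    # Spearman correlation is the Pearson correlation of the ranks.
    return -pearson(ranks(ligand_esp), ranks(protein_esp))


# Made-up potentials at four surface vertices (arbitrary units).
ligand = [0.5, -1.0, 2.0, -0.3]
mirror = [-v for v in ligand]         # equal magnitude, opposite sign
skewed = [-(v ** 3) for v in ligand]  # opposite sign, non-linear magnitude

r_mirror = complementarity_r(ligand, mirror)
rho_skewed = complementarity_rho(ligand, skewed)
r_skewed = complementarity_r(ligand, skewed)
```

In this toy case the mirrored potentials score 1 on both measures, while the skewed potentials keep a rank-based score of 1 but a lower Pearson-based score, which illustrates how the two coefficients can differ on the same data.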

The calculation is fast and predictive: scoring a hundred ligands normally takes less than a couple of minutes on an average laptop and gives important insights into protein-ligand electrostatics, which typically correlate with compound activity.

Mapping the electrostatics of the XIAP active site

The Electrostatic Complementarity map of compound 7 in the XIAP active site (PDB: 5C7A, Figure 2 – left) shows a strong electrostatic clash (red) in the region above the indoline ring. This is caused by an area of negative electrostatic potential in the protein’s active site, generated by the backbone carbonyl of Gly306 and the phenolic oxygen of Tyr324 (Figure 2 – middle), clashing with the negative electrostatic field associated with the indoline ring (Figure 2 – right). A less pronounced electrostatic clash can be seen between the positive electrostatic field of the protonated side chain of Lys297 (Figure 2 – middle) and the positive electrostatic field of the sigma hydrogens of the indoline ring (Figure 2 – right).

According to this map (and in agreement with the reported correlation9), electron-withdrawing substituents which make the indoline ring less electron-rich are expected to increase XIAP binding. Substituents associated with a more negative (or less positive) electrostatic field, favoring the interaction with the protonated side chain of Lys297, should also be beneficial.

Figure 2. Left: Electrostatic Complementarity map for the PDB:5C7A ligand (green: good complementarity; red: electrostatic clash). Middle: protein electrostatic potential map for PDB:5C7A (red: positive; cyan: negative). Right: ligand fields for the ligand in PDB:5C7A (red: positive; cyan: negative).

Electrostatic Complementarity and XIAP SAR

Figure 3 shows the Electrostatic Complementarity maps for the compounds in Table 1, shown in order of increasing XIAP-BIR3 activity from left to right.

A clear trend can be observed as we move from the electron-donating substituents (-NH2, -OMe) to the electron-withdrawing substituents (-F, -Cl, -SO2Me). The latter make the indoline ring less electron-rich, reducing the clash with the negative electrostatic potential of the XIAP active site.

Figure 3. Electrostatic complementarity maps for some of the ligands in Table 1 (green: good complementarity; red: electrostatic clash).

The substituents for the three most potent compounds are also associated with a negative ligand field of their own (Figure 4), favoring the interaction with the protonated side chain of Lys297, in line with our initial hypothesis.

Figure 4. Negative ligand fields (cyan) for compounds 17, 15 and 16.

These qualitative observations are confirmed by the nice correlation (r2 = 0.671) between XIAP-BIR3 pIC50 and the ‘Complementarity rho’ score shown in Figure 5.

Figure 5. Plot of XIAP-BIR3 pIC50 versus Complementarity rho.

Electrostatic Complementarity scores and MW

We monitored the correlation between MW and XIAP-BIR3 affinity/Complementarity rho to verify whether the Electrostatic Complementarity scores provide information which goes beyond the use of simple physico-chemical descriptors for drug design.

The correlation between MW and XIAP-BIR3 pIC50 (r2 = 0.613, Figure 6 – left) could point towards a space-filling effect as the simplest explanation of the changes in XIAP affinity in this data set.

However, the low correlation between Complementarity rho and MW (Figure 6 – right) confirms that the Electrostatic Complementarity scores are size-independent.

Using the Electrostatic Complementarity scores for quantitative SAR modeling, therefore, generates trends completely independent from size effects.

Furthermore, Electrostatic Complementarity maps provide visual insight into ligand-protein binding and SAR which cannot be derived from traditional, simple physico-chemical descriptors such as MW and Hammett’s σp, thus providing invaluable information for drug design.

Figure 6. Left: Plot of XIAP-BIR3 pIC50 versus MW. Right: Plot of Complementarity rho versus MW.


Application of Electrostatic Complementarity to a reported XIAP-BIR3 data set showed that our method can detect and quantify electrostatic differences in XIAP ligands that cause changes in bioactivity. Electrostatic Complementarity scores and maps in Flare V2, based on Cresset’s polarizable XED force field, provide rapid activity prediction with visual feedback on new molecule designs. They provide useful information for understanding ligand binding and SAR and can be used for rapid ranking of new molecule designs.

References and Links

  2. Salvesen, G. S. et al., Nat. Rev. Mol. Cell Biol. 2002, 3 (6), 401-10
  3. Gyrd-Hansen et al., Nat. Rev. Cancer 2010, 10 (8), 561-74
  4. Silke, J. et al., Cold Spring Harbor Perspect. Biol. 2013, 5 (2), a008730
  5. I et al., Clin. Cancer Res. 2004, 10 (11), 3737-3744
  6. Mizutani, Y. et al., Int. J. Oncol. 2007, 30 (4), 919-925
  7. Fulda, S. et al., Nat. Rev. Drug Discovery 2012, 11 (2), 109-124
  8. Arkin, M. R. et al., Chem. Biol. 2014, 21 (9), 1102-1114
  9. Chessari, G. et al., J. Med. Chem. 2015, 58 (16), 6574-6588
  10. V. Stroganov et al., Proteins 2011, 79 (9), 2693-2710
  14. Vinter, J. G., J. Comput.-Aided Mol. Des. 1994, 8 (6), 653-668
  15. Vinter, J. G., J. Comput.-Aided Mol. Des. 1996, 10 (5), 417-426
  16. Chessari, G. et al., Chem. Eur. J. 2002, 8 (13), 2860-2867
  17. Bauer, M. R. & Mackey, M. D. et al., manuscript in preparation

Water stability is key to designing novel patentable chemistry

An analysis of the water stability and positions in a ligand-protein complex informed the design of novel ligands for a customer target. This work led to new active chemistry that the customer went on to patent.

A Cresset Discovery Services customer had identified a novel target with a natural ligand and were looking for new chemistry that would be active at the target site. Our scientists carried out an initial project to learn more about the protein-ligand system. The Cresset field approach, used to analyse the structure and interactions, gave the customer valuable insights into the active features of the ligand.

The customer used this information to develop analogous synthetic compounds and example molecules. They asked us to work with them again to computationally align the example molecules and prioritize them for synthesis.

We carried out an initial alignment and then modeled the system in detail. It appeared that part of the molecule that was important for the interaction was not making any contact with the protein.

The PDB had some crystal structures of related proteins, but not of the target of interest. We studied the available protein data to learn as much as possible about the binding pocket, paying particular attention to the positions and stability of the water molecules. This led to us putting forward the hypothesis that an important part of the ligand interaction included the stabilization of water.

Based on this hypothesis we prioritized the molecules that bridged the observed gap between the natural ligand and the target while also stabilizing the free waters.

Water analysis was carried out by manually superimposing multiple crystal structures, viewing the crystallographic waters that clustered together, and mapping on their temperature factors. This process allowed us to determine the importance of each water molecule in the solvation sphere around the ligand and protein pocket. With the advent of the new 3D-RISM method in Flare, a similar computational workflow is now available that is far more efficient for this type of analysis. It is a more systematic approach that calculates the position and stability of all water molecules around a proposed ligand in a binding pocket. Moreover, because it does this without the need for any crystallographic water data, it is more useful as well as more convenient. Ultimately, these data can be used to assess or compare ligands in terms of how well they might stabilize essential water.
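The manual analysis described above (superimpose structures, group the crystallographic waters that cluster together, then look at their temperature factors) can be sketched in a few lines. This is an illustrative outline only, not the procedure used in the project and not the 3D-RISM method: the `Water` record, the greedy clustering and the 1.5 Å cutoff are all assumptions, and the coordinates are assumed to be in a common frame after superposition.

```python
import math
from dataclasses import dataclass


@dataclass
class Water:
    structure: str   # hypothetical label of the source crystal structure
    xyz: tuple       # oxygen coordinates after superposition, in Å
    b_factor: float  # crystallographic temperature factor


def centroid(cluster):
    """Mean position of the waters in one cluster."""
    n = len(cluster)
    return tuple(sum(w.xyz[i] for w in cluster) / n for i in range(3))


def cluster_waters(waters, cutoff=1.5):
    """Greedy clustering: a water joins the first cluster whose centroid
    lies within `cutoff` Å, otherwise it starts a new cluster."""
    clusters = []
    for w in waters:
        for c in clusters:
            if math.dist(w.xyz, centroid(c)) <= cutoff:
                c.append(w)
                break
        else:
            clusters.append([w])
    return clusters


def summarize(clusters, n_structures):
    """Per cluster: centroid, fraction of structures containing it, mean B-factor.
    Conserved (high occupancy) and well-ordered (low B-factor) sites come first."""
    rows = []
    for c in clusters:
        occupancy = len({w.structure for w in c}) / n_structures
        mean_b = sum(w.b_factor for w in c) / len(c)
        rows.append((centroid(c), occupancy, mean_b))
    return sorted(rows, key=lambda row: (-row[1], row[2]))


# Made-up waters from three superimposed structures A, B and C.
waters = [
    Water("A", (0.0, 0.0, 0.0), 12.0),
    Water("A", (5.0, 5.0, 5.0), 40.0),
    Water("B", (0.3, 0.0, 0.0), 15.0),
    Water("C", (0.0, 0.4, 0.0), 18.0),
]
summary = summarize(cluster_waters(waters), n_structures=3)
```

In this toy input the site near the origin appears in all three structures with a low mean temperature factor, so it is listed first as the most conserved and best-ordered water position.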

Based on our equivalent ‘hands-on’ analysis, we worked with the customer to choose the best candidates for synthesis. These newly-designed ligands resulted in new active chemistry for the customer that was valuable enough for them to patent.

The position and energetics of water molecules in and around the active site are of crucial importance when designing novel ligands. Knowing which water molecules are energetically favorable can give valuable insights into the best positions for ligand molecules. 3D-RISM analysis is one of the methods available in Flare for structure-based drug design.

Homology modeling and ligand electrostatics play key role in elucidating binding mode and molecular interaction of new class of antifungal drugs

Last month F2G published a paper in PNAS [1] describing F901318, the leading representative of a novel class of antifungal drug. Dr Martin Slater, Director of Cresset Discovery Services, is a co-author on the paper. He describes how modeling work carried out by Cresset Discovery Services was critical to predicting the binding mode of the inhibitor and important interacting amino acid residues. F901318 is currently in clinical development for the treatment of invasive aspergillosis.

There is an important medical need for new antifungal agents with novel mechanisms of action to treat the increasing number of patients with life-threatening systemic fungal disease and to overcome the growing problem of resistance to current therapies.

F2G are a UK-based antifungal drug discovery and development company who have identified F901318 as a leading representative of the orotomides, a novel class of antifungal drug. Their identification of dihydroorotate dehydrogenase (DHODH) as the mechanism by which F901318 inhibits and kills Aspergillus fumigatus has been a major breakthrough differentiating F901318 from other systemic antifungal agents.

From hit to lead with medicinal chemistry

F2G had a large amount of proprietary cellular activity data developed over time against their antifungal screening platform. After an initial hit-finding campaign, significant progress had been made using classical medicinal chemistry approaches.

F2G were keen to inform and assist the development process by gaining a molecular-level understanding of the target protein-ligand system, and approached Cresset Discovery Services for help in elucidating its molecular interactions.

A detailed molecular understanding with modeling

Cresset’s unique approach of defining the electrostatics around the active chemotype made it possible to identify the precise nature of the various sites on the active molecules. In conjunction with sequence analysis across the wider DHODH family, Cresset scientists were able to match these subtle ligand features to the patterns of residues that were likely to be key.

Subsequent homology and ligand-protein interaction modeling of Aspergillus fumigatus DHODH using the XED force field identified a predicted binding mode of the inhibitor and important interacting amino acid residues.

We combined a detailed ligand-centric approach using Forge with protein modeling using a prototype of the new Cresset protein tool to arrive at a binding hypothesis consistent with the selectivity profile. The modeling process is fully reported in the paper [1].

Testing in silico hypotheses in vitro

With a binding hypothesis in place, F2G initiated a number of lab experiments to check the predictions, for example using site-directed mutagenesis.

Most satisfyingly, the lab results supported our predictions.

F901318 is currently in late Phase 1 clinical trials, offering hope that the antifungal armamentarium can be expanded to include a class of agent with a mechanism of action distinct from currently marketed antifungals.

Cresset’s consulting work with F2G provided valuable insight into the predicted interaction pattern of the main chemical series with the Aspergillus DHODH target protein. As with many research projects, any level of understanding achieved is often a prelude to even deeper questions, and there are many remaining to be answered for this unique system. Cresset continues to work closely with F2G, providing software and services to support them in their ongoing projects.










Dr Martin Slater

Director, Cresset Discovery Services

Build and cluster diverse 3D libraries

Cresset Discovery Services (CDS) worked with BioBlocks to analyze their fragment library to maximize coverage of 3D chemical space. As part of the project, we developed an innovative clustering method that made it possible to assess the 3D similarity across their virtual database of over 1.5 million fragments.

The goal of the project was to help BioBlocks build the maximum 3D diversity into a fragment library of manageable size from a starting pool of over a million compounds. Existing techniques would have required an infeasible amount of computing power, so CDS developed an entirely novel rapid clustering method especially for the project. The solution was still extremely computationally challenging, but we were able to use our expertise in distributing calculations to the cloud to deliver the results that BioBlocks needed on time and within budget.

“Working with Cresset has been a positive experience from start to finish,” said Warren Wade, VP of Chemistry at BioBlocks. “Because our fragments are designed to be new chemical matter, they challenged the limits of existing structural descriptions. Cresset worked closely with us to overcome these limits and produce a high value compound set”.

The final result was a 3D fragment library that contains a significant number of compounds with novel core structures that are now viable candidates for fragment screening. BioBlocks envisions this Comprehensive Fragment Library to be a drug discovery tool available only to collaborators who will be able to leverage this new chemical space for their lead discovery programs. Hits from the library are entry points to BioBlocks’ collaborative medicinal chemistry processes, developed to increase the probability of generating commercially viable leads.

3D similarity-based clustering workflow

Read more about this project: Large scale compound clustering in 3D.

Contact Cresset Discovery Services to find out more about how we can help you design large scale libraries for your project.

Engaging with Cresset Discovery Services

Cresset Discovery Services (CDS) offers bespoke in silico services for small molecule discovery. We do a lot of work in drug discovery and optimization for the pharmaceutical industry but we also work extensively in agrochemicals, flavors, fragrances – in fact, in any industry that involves work with small organic molecules.

This post explains the process that we go through when customers work with Cresset Discovery Services, from the first contact to the final deliverables.


At the enquiry stage we talk with customers about their requirements in general terms to get an idea of whether we will be able to help them. The answer is usually yes, but we will certainly let you know if we think that our approach would not be the best match for your project.

These initial discussions will involve members of both our sales team and the scientific team. Everything at the enquiry stage is free, but the discussions will not be at a great depth since confidential details cannot yet be shared.


Once both sides have agreed to proceed, we exchange confidentiality agreements and can then get down to the details. The customer will share their confidential data and CDS will prepare a detailed proposal of the work they will carry out.

This stage will involve a detailed meeting to gather the data and another to present the proposal. The proposal will include full details of pricing and milestones. If the work is a collaboration, then all partners will be involved at this stage.


Close collaboration is key to any successful project. Depending on the size and complexity of the project, there may be several long meetings at the start of the project. These could involve many members of the customer team. The goal is to focus on the project and to scope out exactly what needs to be achieved.

Work then moves to the details – for example, what to do, with which molecules and which conformations. This could involve conversations several times a week until everything is in place to run the study.

Frequent reviews take place throughout the project between the customer and CDS. Each customer has a personal point of contact who remains consistent throughout the project.

At each stage of the project there will be several conversations to make sure that the customer is getting exactly what they wanted. These will be tied in to agreed milestone reviews and deliverables.

Project deliverables are likely to be available through the project, not only at the end. No matter when they are delivered, the approach remains the same: we make sure that the customer gets the maximum value out of the results.

For example, typical results for a large screening project with multiple compounds may be between 10,000 and 20,000 hits. But CDS will make sure that the customer gets more than a list from the project. We will always ensure that the customer fully understands and can interpret the results in the context of the project in order to get the best out of them.


No project is complete without a project review of what went well and what could go better. As part of this process we agree the next steps, which could range from a follow-on project, to advice on the next research steps.

Many of our customers remain customers for the long term. In fact, when we do lose a services customer it’s usually because they have decided to buy our software and hire a computational chemist to work full time. This case study describes how we helped one customer to hire and train computational chemists. Even then, customers still come back to us for projects if they need the extra resource.


Contact us today to start the process of working with CDS.


What can the cloud offer computational chemistry?

The latest edition of Innovations in Pharmaceutical Technology (IPT) includes the article Sky’s the Limit by Tim Cheeseright and Katriona Scoffin of Cresset outlining some of the key benefits of the cloud for computational chemistry.

They point out that, “computational chemistry methods all involve a trade-off between accuracy and computational resources”. Cloud computing makes it easy to access computing power on a flexible basis, translating to “better results faster and cheaper”. Other benefits include, “flexible access to computing resources meaning users only pay for what they need” and “easy to use web interfaces that remove the need for local installation”.

Issues of security around using the cloud are also discussed, notably that, “cloud computing is an infrastructure and, in that sense, the security is as good as the product that is built upon it.”

The article also includes a recent example of how Cresset used Blaze Cloud to cluster a large database to create a diverse compound library.

Displacing crystallographic water molecules with Spark


Cresset’s Spark1 software for bioisosteric replacement was used to carry out a water displacement experiment starting from the X-ray crystal structure of a selective inhibitor of Bruton’s tyrosine kinase2. The use of databases derived from available reagents ensured that the results could be tethered to molecules that were readily synthetically accessible. The availability of a sufficiently diverse source of reagents was crucial in demonstrating the feasibility of this approach.


Bruton’s tyrosine kinase (Btk) is a member of the Tec family of non-receptor tyrosine kinases. Recent literature findings2 indicate that Btk inhibition could be an attractive approach for the treatment of autoimmune diseases such as rheumatoid arthritis, a progressive autoimmune disease characterized by swelling and erosion of the joints3.

A fragment-based drug design approach was recently2 applied to the discovery of potent, non-covalent Btk inhibitors with Lck selectivity (Lck: lymphocyte-specific protein tyrosine kinase, a target playing a key role in T-cell activation).

Among the most interesting hits identified with this approach, compound 2 (Table 1) was selected for further optimization. Position 8 of the cinnoline ring of fragment 2 was explored using the Suzuki−Miyaura4 synthetic methodology, starting from a series of monocyclic boronic acids/esters. This initial SAR exploration led to the discovery of compound 8 (Table 1), which shows improved potency and selectivity with respect to fragment 2.

The published X-ray crystal structure of compound 8 in the active site of Btk (PDB 4ZLZ) shows a water-mediated hydrogen bond from the pyridyl nitrogen to the P-loop backbone residues Phe413 and Gly414 of Btk2 (Figure 1 – left). The replacement of 4-methylpyridin-3-yl in compound 8 with small bicyclic heterocycles displacing the water molecule and making direct H-bond interactions with the P-loop led to the discovery of compounds 10 and 11 (Table 1), with a 10-fold improved potency towards Btk.

The 3D structure of compound 8 and the bridging water molecule were used as the starting point for this Spark case study. The aim of this experiment is to verify whether our methodology is able to displace the bridging water molecule and correctly identify the same alternative indazole fragments.

Table 1. SAR exploration of fragment hit 2

Spark reagent databases: accessing available chemical diversity

Spark’s approach to scaffold hopping and R-group replacement uses Cresset’s field-based technology5,6 to identify viable replacements for a selected portion of a reference compound using a series of fragments. In this case study we chose to use standard reagent databases7 supplied by Cresset which are based on the available chemicals directory. This gives the opportunity to rapidly search all R-groups that could be introduced at a selected position. In addition, an optional Database Generator module enables the creation of fragment databases that are derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment.


The published X-ray crystal structure of compound 8 bound in the active site of Btk (PDB 4ZLZ) was downloaded into Forge8. The structure of the ligand was minimized and then combined with the water molecule mediating the H-bond interaction with the P-loop backbone residues of Btk to make a single molecule entry. The two 3D structures were merged using the ‘combine selected pair into single molecule’ feature available in Forge. The single entry thus created (see Figure 1 – right) was used as the Starter molecule for the Spark experiment (Figure 2 – left).

In this water displacement experiment, we want the Spark search to be driven mainly by the electrostatic fields, rather than by the usual combination of fields and shape.

For this reason a constraint was added to the negative and positive field points of the water molecule using the Spark Field Constraints Editor (Figure 2 – right). This introduced a score penalty for those results that did not match the constrained field points.

Furthermore, the ‘Normal’ conditions for scoring the Spark search results were fine-tuned to 90% Field and 10% shape, using the Btk protein as a ‘hard’ excluded volume, to constrain the size of the potential replacement fragments.
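The effect of weighting the search towards fields and penalizing unmatched constraints can be illustrated with a toy scoring function. This is not Spark's scoring function, whose details are not given here: the linear combination, the fixed penalty per unmatched constrained field point, the function name and the example values are all assumptions made for illustration.

```python
def combined_score(field_sim, shape_sim, unmatched_constraints,
                   field_weight=0.9, penalty=0.1):
    """Weighted field/shape similarity minus a penalty per unmatched
    constrained field point (hypothetical functional form)."""
    shape_weight = 1.0 - field_weight
    base = field_weight * field_sim + shape_weight * shape_sim
    return max(0.0, base - penalty * unmatched_constraints)


# Made-up candidates: (name, field similarity, shape similarity, unmatched constraints).
candidates = [
    ("fragment_a", 0.80, 0.60, 0),
    ("fragment_b", 0.85, 0.90, 2),
    ("fragment_c", 0.70, 0.95, 0),
]
ranked = sorted(candidates,
                key=lambda c: combined_score(c[1], c[2], c[3]),
                reverse=True)
ranked_names = [c[0] for c in ranked]
```

With a 90:10 weighting, a candidate with good field similarity and all constraints matched outranks one with better shape similarity that misses the constrained field points.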

Figure 1. Left: X-ray crystal structure of compound 8 in the active site of Btk making a water mediated hydrogen bond with the P-loop backbone. Right: 3D structures and field points of compound 8 and of the bridging water molecule combined into a single entry.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

The gradient cutoff for minimization was set to 0.200 kcal/mol/Å. At the same time, the automatic constraint on fragment size was removed so that the results of the search were not overly biased by the size of the starter molecule.

Finally, to focus the experiment on small bicyclic heterocycles, monocyclic fragments were filtered out from the list of potential results using an appropriate SMARTS filter.

Two runs of Spark were carried out using the above conditions. The initial experiment was run on a database of 775 boronic acids to closely replicate the chemistry used in the original publication2, 4.

Figure 2. Left: the combined 3D structures of compound 8 and the bridging water molecule used as a starter molecule in the Spark experiment. Right: constraints associated to the field points of the water molecule.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

In the second experiment, the ZINC7 database of commercial aromatic halides (41K fragments) was also searched to explore a larger chemical diversity, starting from the assumption that the appropriate boronic acid/ester could be obtained from any interesting commercial aryl halide at the cost of an additional synthetic effort.


The top scoring compound from the initial search (boronic acids only) is compound 10 (Table 1). As can be seen in Fig. 3 – right, this compound superimposes very well with the starter molecule and matches the constrained field points in a satisfactory manner. However, compound 11, which would presumably superimpose even better with the conformation of the ortho-methyl-pyridin-3-yl group of compound 8, was not found in this search, due to the limited chemical diversity of the database searched.

In the second Spark search, which was run on a much larger collection of reagents (boronic acids and aryl halides), compound 11 (Fig. 3 – center and Fig. 4) is the top scoring result, while compound 10 ranks 4th in the list (Fig. 4).

The original paper2 also reports the indole-substituted compound 9 (Table 1), which is quite similar in 2D structure to the much more potent indazole compounds 10 and 11. This fragment is available in both of the databases searched, but is not retrieved by Spark. The indole fragment cannot in fact match the constrained negative field point of the bridging water molecule, as shown in Fig. 5, where compound 9 is superimposed on the starter molecule in Forge. The lack of this relevant interaction explains the much lower potency of compound 9, with a Btk IC50 = 850 nM (Table 1).
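To put the potency gap on a logarithmic scale, the Btk IC50 values quoted for compounds 9, 10 and 11 can be converted to pIC50 (the negative base-10 logarithm of the IC50 in mol/L). The following is a minimal sketch of that arithmetic; the function name is an illustrative choice.

```python
import math

def pic50_from_nm(ic50_nm: float) -> float:
    """pIC50 = -log10(IC50 in mol/L); for an IC50 in nM this is 9 - log10(IC50)."""
    return 9.0 - math.log10(ic50_nm)

# Btk IC50 values (nM) quoted in the text and in the Figure 3 caption.
btk_ic50_nm = {"compound 9": 850.0, "compound 10": 12.0, "compound 11": 4.0}

for name, ic50 in btk_ic50_nm.items():
    print(f"{name}: IC50 = {ic50:g} nM, pIC50 = {pic50_from_nm(ic50):.2f}")

# Potency ratio between the indole (9) and the best indazole (11).
fold = btk_ic50_nm["compound 9"] / btk_ic50_nm["compound 11"]
print(f"compound 11 is about {fold:.0f}-fold more potent than compound 9")
```

On this scale compound 9 sits a little over two log units below compound 11.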

Figure 4 shows a tile view of the 16 top scoring results from the second Spark experiment. Several different flavors of the indazole fragment carrying different substitution patterns are represented in this list. Alternative bicyclic fragments are also proposed, which may provide useful ideas for a further exploration of this target.

Figure 3. Left: electrostatics of starter molecule. Center: compound 11 (Btk IC50 = 4.0 nM). Right: compound 10 (Btk IC50 = 12 nM)
Color coding of fields/field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

Figure 4. Tile view of the top scoring Spark results for the second experiment.

Figure 5. Compound 9 (right) superimposed on the starter molecule of the Spark experiment (left).


In this case study Spark successfully managed to displace the crystallographic water molecule bridging the interaction between compound 8 and the P-loop of Btk, replacing it with small, synthetically accessible bicyclic heterocycles.

Availability of appropriate sources of chemical diversity is still a key factor in determining the success of any bioisosteric replacement experiment.

For this reason, the creation of fragment databases derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment, is highly recommended.

References and links

2. Smith, C. R. et al., J. Med. Chem. 2015, 58, 5437−5444.
3. Firestein, G. S., Nature 2003, 423 (6937), 356−361.
4. Miyaura, N., Suzuki, A. et al., J. Am. Chem. Soc. 1989, 111 (1), 314−321.
5. J. Chem. Inf. Model. 2006, 46, 665−676.
7. Spark fragment databases come from commercial compounds, ChEMBL, ZINC and VEHICLe.

Identifying bioisosteres of the benzazepine scaffold

Drug discovery projects continuously explore novel and diverse structures with the objective of optimizing existing leads, improving IP position, or identifying new leads by switching scaffolds completely. The identification of novel chemotypes can be particularly difficult for those targets where the crystallographic information is scarce or unavailable (for example GPCRs, ion channels and novel targets). In this case study, working from just a 2D fragment of a known active D3 antagonist, we show how Spark was able to quickly identify a variety of alternative scaffolds, some of which have proven D3 activity.

Figure 2. Results for the first run of Spark searches. Lime green: SB-414796; cyan: known D3 scaffolds; magenta: other Spark bioisosteres.


Elucidating the bioactive conformation of CCR5 Chemokine Receptor inhibitors

There are still many projects which do not have a relevant protein-ligand crystal structure to drive compound design. These include projects targeting GPCRs and ion channels, as well as those working with phenotypic or whole-organism screens. In such cases, field pharmacophore modeling as implemented in FieldTemplater can help to decipher how active compounds interact with a common protein target, and which parts of those molecules are involved in binding, in the absence of any protein information.

FieldTemplater generates a series of conformations that the ligands might adopt at physiological conditions. It analyzes these conformations to find sets that show a high molecular field similarity (and hence have similar shape/binding properties). Where all the ligands with a common activity align well, it is very likely that this is the bioactive conformation.
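The idea of searching for one conformation per ligand such that the chosen set is mutually similar can be illustrated with a toy example. The sketch below is not FieldTemplater's algorithm: each conformer is reduced to a single made-up descriptor value, the similarity function is a placeholder for a real field similarity, and the search is a brute-force enumeration. All names and numbers are assumptions for illustration.

```python
from itertools import combinations, product

# Hypothetical data: one toy descriptor value per conformer of each ligand.
conformers = {
    "ligand A": {"a1": 0.10, "a2": 0.80},
    "ligand B": {"b1": 0.75, "b2": 0.30},
    "ligand C": {"c1": 0.85, "c2": 0.50},
}

def similarity(x: float, y: float) -> float:
    """Placeholder similarity in [0, 1]; a real tool would compare molecular fields."""
    return 1.0 - abs(x - y)

def best_template(conformers: dict):
    """Pick one conformer per ligand maximizing the summed pairwise similarity."""
    ligands = list(conformers)
    best_score, best_choice = float("-inf"), None
    # Enumerate every combination of one conformer per ligand.
    for choice in product(*(conformers[lig] for lig in ligands)):
        values = [conformers[lig][conf] for lig, conf in zip(ligands, choice)]
        score = sum(similarity(p, q) for p, q in combinations(values, 2))
        if score > best_score:
            best_score, best_choice = score, choice
    return best_choice, best_score

choice, score = best_template(conformers)
print(choice, round(score, 2))
```

Exhaustive enumeration is only workable for tiny examples like this one, since the number of combinations grows multiplicatively with the number of conformers per ligand.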

The case study Elucidating the bioactive conformation of CCR5 Chemokine Receptor inhibitors shows how FieldTemplater, working from just a few 2D structures of known active CCR5 Chemokine Receptor inhibitors, was able to correctly reproduce the bioactive conformation of the CCR5 receptor inhibitor Maraviroc as derived from the 4MBS PDB crystal structure, without making use of the X-ray information about the binding mode of this ligand. Additionally, FieldTemplater indicates the relative alignments and likely bioactive conformations of 3 further CCR5 inhibitors enabling the transfer of SAR between series. The case study gives full experimental details and results and is used in a web clip to show the power of the Cresset Engine Broker to accelerate computationally intensive experiments.

Dr Giovanna Tedesco, Product Manager

Deciphering complex aromatic SAR

The substitution of aromatic groups provides a unique tool to moderate the potency and physicochemical properties of drug-like molecules. However, the huge variety of substitutions that are possible can give rise to SAR that is almost impossible to understand, with small changes resulting in large shifts in potency. In these circumstances, understanding the causes of an observed activity cliff is critical to progressing the project aims. This is an area where we at Cresset have always felt that using molecular interaction fields gives you a head start, as you can model the electrostatic and shape properties of the molecule accurately. The release of the Activity Miner module for Forge and Torch significantly improves this process by automatically detecting activity cliffs in the SAR. Below we present a case study on a small set of changes around a set of reported DPP-IV inhibitors and show how the Activity Miner interface helps find the root causes of the changes in activity.

A set of DPP-IV inhibitors related to the ligands from PDB codes 2QOE and 2P8S was extracted from BindingDB together with IC50 values for enzyme inhibition. Using Forge, PDB 2QOE was downloaded and split into reference ligand and protein. The ligand from PDB code 2P8S was downloaded as a fixed conformation, aligned to the 2QOE reference using the default ‘normal’ settings, and then added as an additional reference molecule. The remaining 31 compounds in the dataset were aligned to these references using the ‘Substructure’ method, with the maximum score against any reference being used to choose the alignment. The resulting alignments are shown below.
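The selection rule in the last step, keeping for each compound the alignment with the highest score against any reference, amounts to a simple maximum over references. A minimal sketch follows; the compound names and scores are hypothetical, and Forge performs this choice internally.

```python
def choose_alignment(scores_by_reference: dict):
    """Return the reference giving the highest alignment score, and that score."""
    reference = max(scores_by_reference, key=scores_by_reference.get)
    return reference, scores_by_reference[reference]

# Hypothetical alignment scores of two compounds against the two references.
scores = {
    "compound X": {"2QOE ligand": 0.71, "2P8S ligand": 0.78},
    "compound Y": {"2QOE ligand": 0.83, "2P8S ligand": 0.69},
}

chosen = {name: choose_alignment(by_ref) for name, by_ref in scores.items()}
for name, (reference, score) in chosen.items():
    print(f"{name}: aligned to {reference} (score {score:.2f})")
```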

The aligned dataset was transferred to the Activity Miner module to study the SAR around the terminal phenyl substituent. The activity view focused on the most active compound (shown below) highlights that the SAR around this substituent is complicated, with many small changes resulting in significantly worse IC50 values. The activity view presents a central (focus) molecule, with the molecules most similar to it displayed in a wheel around it. The size of each segment represents the distance between the two molecules, and the segment is colored by the disparity between the pair. Strongly colored segments represent changes that result in disproportionately large changes in activity (red indicates worse activity, green better).
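The text does not define disparity numerically. A common way to express "activity change relative to structural change" is the activity difference divided by the distance between the pair, with distance taken as 1 minus similarity; the sketch below assumes that definition. The similarity values are hypothetical, and only the pIC50 values are taken from the SAR table in this case study.

```python
def disparity(focus_activity: float, neighbor_activity: float, similarity: float) -> float:
    """Signed activity change per unit distance (assumed: distance = 1 - similarity)."""
    distance = 1.0 - similarity
    if distance <= 0.0:
        raise ValueError("disparity is undefined for identical molecules")
    return (neighbor_activity - focus_activity) / distance

focus_pic50 = 8.2  # most active compound in the SAR table

# (label, pIC50, hypothetical similarity to the focus molecule)
neighbors = [
    ("neighbor A", 7.1, 0.95),
    ("neighbor B", 7.1, 0.80),
    ("neighbor C", 6.1, 0.90),
]

ranked = sorted(
    ((name, disparity(focus_pic50, pic50, sim)) for name, pic50, sim in neighbors),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, value in ranked:
    direction = "worse" if value < 0 else "better"
    print(f"{name}: disparity {value:.1f} ({direction} than focus)")
```

Under this definition the same drop in pIC50 gives a larger disparity when the two molecules are more similar, which is what makes a small structural change with a large activity change stand out as a cliff.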

It is interesting to contrast the activity view above with a classic SAR table:

Row  Phenyl substitution  Activity (pIC50)
1    2,4,5-triF           8.2
2    2-Cl-4,5-diF         7.1
3    3,4-diF              6.9
4    2,4,6-triF           7.1
5    2,5-diF              7.6
6    3,4-diCl             5.8
7    3-F                  6.9
8    2,3,5-triF           6.1
9    4-F                  6.6
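The row-to-row comparisons discussed in the hypotheses that follow can be made explicit by computing the change in pIC50 for each pair and the corresponding fold change in IC50 (a difference of one pIC50 unit is a tenfold change). This is a small sketch of that arithmetic using the table values; the variable and function names are illustrative.

```python
# pIC50 values by table row, taken from the SAR table above.
pic50 = {1: 8.2, 2: 7.1, 3: 6.9, 4: 7.1, 5: 7.6, 6: 5.8, 7: 6.9, 8: 6.1, 9: 6.6}

# Row pairs compared in the text (reference row, modified row).
pairs = [(1, 2), (1, 3), (1, 5), (1, 8), (1, 4), (3, 6)]

def delta_pic50(ref_row: int, new_row: int) -> float:
    """Change in pIC50 on going from the reference row to the modified row."""
    return pic50[new_row] - pic50[ref_row]

def fold_change(ref_row: int, new_row: int) -> float:
    """Fold change in IC50 corresponding to the pIC50 difference."""
    return 10 ** abs(delta_pic50(ref_row, new_row))

for ref_row, new_row in pairs:
    d = delta_pic50(ref_row, new_row)
    print(f"row {ref_row} -> row {new_row}: delta pIC50 = {d:+.1f}, "
          f"about {fold_change(ref_row, new_row):.0f}-fold in IC50")
```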

Clearly the SAR around the phenyl substituent is critical to activity, but it is very difficult to decipher from the table alone. However, by combining Activity Miner, field differences and the protein crystal structure we can arrive at some well-supported hypotheses. (Note that all pictures below show field differences, not absolute fields: regions where one molecule is more positive (red) or more negative (blue) than the other.)

1. The 2- substituent should have a negative field

The change of F to Cl in the 2- position (compare row 1 to row 2) is a slight increase in size but also introduces a small positive field at the end of the chlorine atom. It is interesting to note that the phenyl ring is slightly less electron-poor when changing to chlorine (Cl is a better π-donor than F). Taken together with the change of 2-F to 2-H (row 1 to row 3), there is a strong suggestion that this substituent should present a negative “end”. This is consistent with the protein crystal structure, which indicates interactions with an arginine side chain and the NH2 of an asparagine side chain.

Comparing row 1 to row 2 (top) and row 1 to row 3 (bottom) shows the less active molecules (right) are more positive at the end of the ortho substituent.

2. The 4 position prefers negativity at the end

Removing the 4-F from row 1 gives row 5. Moving the fluorine atom in this position round the ring by one position gives row 8. In both cases the activity is reduced by the change. The smaller change in activity when going from F→H suggests that introducing a negative region in the 3 position is additionally unfavorable. Neither of these hypotheses is obvious from the protein crystal structure, where both the 3 and 4 positions interact with a number of residues of various types.


Comparing row 1 to row 5 (top) and row 1 to row 8 (bottom) shows the less active molecules (right) are more positive at the end of the para substituent.

3. The 5 position must be negative at the end

All the changes that remove the negativity from the end of the 5 position result in significant drops in activity, whilst those that retain the negativity, even in the absence of other favorable interactions, retain some activity. For example, row 4 has both the 2- and 4-fluoro atoms but has a pIC50 of only 7.1. The reason for this becomes evident on examination of the protein crystal structure: this atom points directly at the edges of the indole of tryptophan-659 and the phenyl of tyrosine-670 (numbers from PDB 2QOE).

Comparing row 1 to row 4 (top) shows the less active molecules (right) are more positive at the end of the 5-substituent. Bottom shows the interaction of this substituent with the protein.

4. The electron density of the phenyl substituent is important

This hypothesis is harder to establish as it comes from many observations. The most obvious is the change from row 3 to row 6, where there is a drop in activity from pIC50 6.9 to 5.8. This could be due to the increased size of the chlorine atoms, but equally likely is the change in the electronic properties of the phenyl ring, where highly electron-poor rings have higher activity. This change is also observed where any of the fluorines of row 1 are deleted or where any atom is switched from fluorine to chlorine. Again, the protein crystal structure helps to validate this hypothesis, as the catalytic serine together with a couple of tyrosine residues point their respective alcohol oxygen atoms at the face of this ring.

Comparing row 3 to row 6 (top) shows the less active molecules (right) are more electron rich. Bottom shows the interaction of this phenyl ring with alcohols from the protein.


Many of our hypotheses could have been guessed at from studying the crystal structure of the 2,4,5-trifluorophenyl analogue in detail. However, the use of the field difference mode in Activity Miner brings the interactions into sharp focus and helps us rationalize the observations that we have. Subtle effects such as the difference between electron-rich and electron-poor aromatic rings are clearly visualized, explaining difficult and complex SAR in a way that is easy to interpret.

Our hypotheses can now be used in the design of new ligands with better IP or physicochemical properties with each design being validated against the regions of positive or negative field that we conclude to be important. Equally we could look for new ideas for this section of the molecule by using Spark together with the new reagent databases to suggest compounds (that we could make today!) that would retain the activity we have in this series while driving us into new regions of chemical space.