Forge V10.5 release delivers new functionality for molecule alignment, and more ….

V10.5 of Forge™, the powerful computational chemistry suite for understanding structure-activity relationships (SAR) and design, is now available. This release introduces significant enhancements to molecule alignment, plus the new Conformation Explorer, to visualize and inspect conformational populations. Also included are a large number of GUI styling and usability improvements.

Improved molecule alignment

Molecule alignment is the core experiment in Forge. It is key to developing robust qualitative or quantitative SAR models, building FieldTemplater pharmacophore hypotheses, understanding the design of new compounds and small-scale virtual screening experiments (for larger-scale virtual screening use Blaze). V10.5 enables fine-tuning of alignment results by introducing appropriate constraints, an optimized substructure alignment algorithm, and new similarity scoring options.

Field and pharmacophore constraints

Field and pharmacophore constraints bias the alignment algorithm by introducing a penalty which down-scores results that do not satisfy the constraint. This provides you with a mechanism for ensuring that the results that you get from your alignment experiment fit with the known SAR or with your expectations.

With field constraints, you can specify that a particular type of field must be present in the aligned molecule. For example, you may want to constrain a positive field where you want an interaction, but this can be matched by both H-bond donors and other electropositive features such as the aromatic hydrogens in the example below.

V10.5 introduces the new pharmacophore constraints, which ensure that your desired pharmacophore features (e.g., Donor H, Acceptor, Cation, Anion) are matched by an atom of a similar type in the alignment results. A pharmacophore constraint can be used when you are certain that a particular interaction requires transfer of electrons (as in H-bonding or metal binding) in addition to the electrostatic character of the interaction.

Pharmacophore constraints introduce a tighter constraint on the alignment than a field constraint. Where field constraints allow matches across chemical features, pharmacophore constraints are limited to matching specific functional groups (e.g., specific donor-acceptor interactions): alignments that do not place a suitable atom on top of or close to the constrained atom cause a penalty to be applied to the score. However, pharmacophore constraints in Forge V10.5 go beyond traditional H-bond donor/acceptor definitions to include, for example, covalent centres and metal binding motifs, giving the ability to ensure that key warheads always align in the correct positions.
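The penalty mechanism described above can be pictured with a small Python sketch. It is illustrative only: the function name, the distance threshold and the penalty size are assumptions made for this example, not Forge's actual scoring scheme.

```python
import math

def constrained_score(raw_score, constrained_xyz, candidate_atoms,
                      max_distance=1.5, penalty=0.2):
    """Return raw_score, reduced by `penalty` if the constraint is not met.

    candidate_atoms holds the (x, y, z) positions of atoms in the aligned
    molecule whose type matches the constraint (for example, all donor
    hydrogens). max_distance and penalty are hypothetical values.
    """
    satisfied = any(
        math.dist(constrained_xyz, atom) <= max_distance
        for atom in candidate_atoms
    )
    return raw_score if satisfied else raw_score - penalty
```

An alignment that places a matching atom within the threshold keeps its score, while one that does not is down-scored rather than rejected outright, which is why constrained and unconstrained results can still be ranked together.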

While field and pharmacophore constraints are a powerful way of fine-tuning alignment results, we recommend that they are used sparingly, as they introduce a bias into your Forge experiment. For example, a pharmacophore constraint on the indazole NH of the PDB 4Z3V ligand in Figure 1 – left would not have matched the aromatic hydrogens of the active ligand in Figure 1 – right.


Figure 1. Left: Ligand from PDB: 4Z3V with pharmacophore and field constraints. Right: Active BTK ligand which satisfies both constraints.

Improved alignment and scoring

Enhancements to alignment and scoring, accessed from the advanced options panel, include:

  • Option to require full ring matches, and to bias the alignment towards a specific substructure specified by a SMARTS pattern, in the maximum common substructure alignment algorithm
  • New functionality to weight specific fields independently when scoring
  • New similarity metrics to provide alternate scoring methods for the alignments
  • New widget for adding field and pharmacophore constraints.
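To make the ideas of independent field weighting and alternative similarity metrics more concrete, here is a minimal Python sketch. The formulas are the generic Dice and Tanimoto forms applied to overlap values, combined as a weighted mean; the function names and the choice of formulas are assumptions, since the metrics Forge actually offers are not specified here.

```python
def dice(overlap_ab, self_a, self_b):
    """Dice-style similarity from a cross overlap and two self overlaps."""
    return 2.0 * overlap_ab / (self_a + self_b)

def tanimoto(overlap_ab, self_a, self_b):
    """Tanimoto-style similarity from the same three overlap values."""
    return overlap_ab / (self_a + self_b - overlap_ab)

def weighted_similarity(field_scores, weights):
    """Weighted mean of per-field similarities (dicts keyed by field name)."""
    total_weight = sum(weights[name] for name in field_scores)
    if total_weight == 0:
        raise ValueError("at least one field weight must be non-zero")
    return sum(field_scores[n] * weights[n] for n in field_scores) / total_weight
```

For example, doubling the weight of the positive field relative to the negative and hydrophobic fields pulls the combined score towards the positive-field similarity.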

New Conformation Explorer

Molecular conformations are central to Forge. The conformation hunter does a good job of generating a diverse range of energetically accessible conformations. V10.5 gives you the opportunity to more easily inspect the conformations generated for your molecules, enabling you to interact with and edit the populations.

In the new Conformation Explorer, you can inspect a set of conformations with respect to energies, measured distances/angles/torsions, as well as calculate the CSD torsion frequency for each rotatable bond to assess the feasibility of the generated conformations.

Conformations are listed in order of increasing relative conformational energy. Unrealistic conformations or those which are not deemed interesting can be selected and removed from the conformation population for that molecule. Preferred conformations can be promoted to the reference role in Forge with the click of a button.

CSD torsion frequencies can be calculated for all rotatable bonds. These are based on the Torsion Library which contains hundreds of rules for small molecule conformations derived from the Cambridge Structural Database (CSD) and curated by molecular design experts. CSD torsion frequencies are useful to highlight cases where the torsion angle in a calculated conformation is not one that is frequently observed in the CSD, and accordingly is a possible cause for concern.

Distances, angles and torsions can be measured for each conformation and those values can be used for filtering or generating a histogram plot.
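For readers who want to reproduce such measurements outside the GUI, the following Python sketch computes a distance and a torsion angle from Cartesian coordinates. It is a generic geometric calculation written for this post, not Forge code; the sign of the returned torsion depends on the convention chosen.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(p, q):
    """Distance between two atoms, in the units of the coordinates."""
    return math.dist(p, q)

def torsion_degrees(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)          # unit vector along central bond
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

Values computed this way for every conformation can then be filtered or binned into a histogram.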

Conformation energies can also be plotted in an interactive histogram plot. In Figure 2, the column or bucket with the blue highlight reflects the current conformations shown in the 3D view; the grey columns or buckets reflect conformations which do not pass the set of filters.

Conformations can be filtered by energy, CSD torsion frequency and calculated distances, angles, torsions. Smart coloring includes coloring by energy and by CSD torsion frequency.
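The ordering, energy filtering and histogram bucketing described in this section can be sketched in a few lines of Python. The data layout (a list of name and energy pairs), the function names and the bucket width are assumptions made for illustration.

```python
def rank_and_filter(conformers, max_relative_energy):
    """Sort (name, energy) pairs by relative energy and apply an energy filter.

    Energies are made relative to the lowest-energy conformation; only
    conformations within max_relative_energy of it are returned.
    """
    lowest = min(energy for _, energy in conformers)
    ranked = sorted(
        ((name, energy - lowest) for name, energy in conformers),
        key=lambda item: item[1],
    )
    return [item for item in ranked if item[1] <= max_relative_energy]

def energy_histogram(relative_energies, bucket_width):
    """Count conformations per energy bucket (bucket index -> count)."""
    counts = {}
    for energy in relative_energies:
        bucket = int(energy // bucket_width)
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts
```

The same pattern extends to the other filters: any per-conformation value, such as a measured torsion, can replace the energy in the filter or the histogram.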


Figure 2. The Conformation Explorer in Forge. Rotatable bonds are colored and labelled by CSD torsion frequency.

Other new features and improvements

This V10.5 release also includes a variety of additional new functionalities and improvements to the Forge interface, including:

  • Enhanced Molecule Editor with a more intuitive layout, featuring a radial plot that is updated as changes are made to a molecule and the new ‘Save a copy’ button to store your molecule directly into the project without leaving the editor
  • New support for touch screen displays
  • Enhanced stereo view functionality with improved accessibility
  • New functionality to export Activity Atlas™ models as surfaces from the GUI
  • New Forge surface command-line binary to export Cresset field surfaces (positive, negative, hydrophobic and vdW)
  • New functionality to sort disparity matrices in Activity Miner™ by Forge project tags, enabling easier identification of molecules of interest
  • New capability to export molecules by drag-and-drop to the Windows desktop (Windows only)
  • New capability to annotate and re-name Storyboard scenes
  • New tagging of project molecules from the 3D window and according to cluster membership, as calculated in Activity Miner
  • New ‘Send to Flare’ functionality
  • Improved grid view function
  • Improved display of protein ribbons, offering a choice of different ribbon styles and the capability to show ribbons for the active site only
  • Improved look and feel of the GUI with re-designed toolbars and updated and clearer icons for a more modern and sleek interface.

Upgrade to Forge V10.5

Upgrade at your earliest convenience to try the new Conformation Explorer and pharmacophore constraints in Forge, together with the many new and improved features in this release.

Evaluate Forge

If you are not currently a Forge customer, download a free evaluation.

Sneak peek at Forge V10.5

New versions (V10.5) of Forge™ and Torch™ are due out next month. This release offers new science and functionality and plenty of improvements that significantly enhance both applications. Below is a sneak peek at some of the new functionality in Forge.

Pharmacophore constraints in alignment

In this release of Forge we have included new options to constrain the alignments using specific pharmacophoric features. As in Blaze, constraints (e.g., DonorH, Acceptor, Cation, Anion, covalent center) can be added to reference molecules and must be matched in the alignment or a penalty will be applied to the score. Pharmacophore constraints will be useful in those cases (such as specific kinase targets or metal chelators) where explicit interactions dominate the alignments.

Alignment

Molecule alignment is significantly improved in V10.5. New and enhanced functionality includes:

  • Improved substructure alignment algorithm
  • New capability to specify the substructure you wish to match by writing a SMARTS pattern
  • New alternative similarity metrics
  • New individual field similarity weighting
  • Improved field and pharmacophore constraints editor, to define field and pharmacophore constraints and add specified field points in the desired position in 3D.

The result of all these improvements will be significantly improved generation of alignments that match your expectations without manual interference.

Conformation explorer

Molecular conformations are central to what we do. We think that our conformation hunter does a good job of generating a diverse range of energetically accessible conformations. However, we wanted to give you the opportunity to more easily explore the conformations of your molecules, enabling you to interact with and edit the populations.

The conformation explorer is a new tool in Forge for visualizing and analyzing conformation analysis results. Within the conformation explorer you can:

  • Visualize all the conformations created for each molecule in your Forge project
  • Delete unwanted conformations
  • Calculate and plot distances, angles and torsions
  • Calculate the CSD torsion frequency for all rotatable bonds
  • Filter conformations by energy, CSD torsion frequency and calculated distances, angles, torsions
  • Smart coloring of conformations includes coloring by energy and by CSD torsion frequency.

 


Figure 1: The conformation explorer in Forge. Rotatable bonds are colored and labelled by CSD torsion frequency.

 

Contact us to register for a free evaluation of Forge V10.5.

Comparing ligand and protein electrostatics of Btk inhibitors

Abstract

Protein interaction potentials implemented in Flare,1 Cresset’s structure-based design software, were used to calculate a detailed map of the electrostatic character of the protein active site of Bruton’s tyrosine kinase2 (Btk). The interaction potential maps were compared to those of selected Btk ligands to get a detailed understanding of ligand binding and SAR. 3D-RISM analysis in Flare was applied to investigate the stability of the crystallographic water molecules populating the Btk active site.

Introduction

Bruton’s tyrosine kinase is a member of the Tec family of non-receptor tyrosine kinases. Recent literature findings2 indicate that Btk inhibition could be an attractive approach for the treatment of autoimmune diseases such as rheumatoid arthritis, a progressive autoimmune disease characterized by swelling and erosion of the joints.

The published X-ray crystal structure PDB:4ZLZ shows that the 4RV ligand interacts with the active site of Btk (Figure 1 – left) by making H-bond interactions with Glu475 and Met477 in the hinge region. The pyridyl ring is involved in a cation-pi interaction with Lys430, with the pyridyl nitrogen making a water-mediated interaction to the P-loop residues Phe413 and Gly414. The replacement of 4-methylpyridin-3-yl with small bicyclic heterocycles like indazole in 4L6 (PDB:4Z3V, Figure 1 – right), displacing the water molecule and making direct H-bond interactions with the P-loop, led to the discovery of ligands with improved potency towards Btk such as compounds 4L6, 1 and 2 (see Table 1).3


Figure 1. Left: X-ray crystal structure of 4RV (PDB:4ZLZ) in the active site of Btk making a water mediated hydrogen bond with the P-loop backbone. Right: X-ray crystal structure of 4L6 (PDB:4Z3V) making direct H-bond interactions with the P-loop backbone.

In this case study, we used the protein interaction potentials and the 3D-RISM method available in Flare to investigate the electrostatics of the active site of Btk and the stability of the crystallographic water molecules. This information was then used to understand the SAR of the molecules in Table 1.

Method

The 4ZLZ and 4Z3V ligand-protein complexes were downloaded from the Protein Data Bank into Flare, and carefully prepared using the Build Model4 tool from BioMolTech,5 to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes and assign optimal protonation states to the protein structures. Any truncated protein chains were capped as part of protein preparation.

The protein sequences were aligned in Flare using the COBALT6 multiple alignment tool and subsequently superimposed by means of a least squares fit of equivalent Cα atoms.

Protein minimization

The active site of the prepared 4ZLZ and 4Z3V ligand-protein complexes was minimized in Flare using the XED force field7 and Normal conditions (gradient cutoff: 0.200 kcal/mol/Å, 2,000 maximum iterations). The ligand structures were included in the minimization of the active site.
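The two stopping criteria quoted above (a gradient cutoff and a maximum number of iterations) work together as in the following Python sketch. It uses plain steepest descent on a toy energy function; the minimizer, the step size and the use of the gradient norm as the convergence test are assumptions for illustration and do not describe how Flare or the XED force field are implemented.

```python
import math

def minimize(gradient, start, gradient_cutoff=0.200, max_iterations=2000,
             step=0.01):
    """Steepest descent that stops on a gradient cutoff or an iteration cap.

    gradient: function returning the gradient (kcal/mol/Å) at the given
    coordinates. Returns (coordinates, iterations_used, converged).
    """
    coords = list(start)
    for iteration in range(max_iterations):
        g = gradient(coords)
        norm = math.sqrt(sum(component * component for component in g))
        if norm < gradient_cutoff:
            return coords, iteration, True       # converged
        coords = [x - step * gi for x, gi in zip(coords, g)]
    return coords, max_iterations, False          # hit the iteration cap

# Toy harmonic "energy" E = 5 * (x^2 + y^2), so the gradient is 10 * coords
def toy_gradient(coords):
    return [10.0 * x for x in coords]

final_coords, iterations_used, converged = minimize(toy_gradient, [3.0, -4.0])
```

A tighter gradient cutoff gives a more fully relaxed structure at the cost of more iterations; the iteration cap guarantees the run terminates even if the cutoff is never reached.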

3D-RISM analysis

The Reference Interaction Site Model (RISM) is a modern approach to solvation based on the Molecular Ornstein-Zernike equation.8 3D-RISM has seen increasing use as a method to investigate the location and stability of water molecules in a protein.

Conceptually, 3D-RISM is equivalent to running an infinite-time molecular dynamics simulation on the solvent (keeping the solute fixed), and then extracting the density of solvent particles. The output of a 3D-RISM calculation consists of a grid of particle densities, one each for oxygen and hydrogen atoms. A thermodynamic analysis then assigns a ΔG value to each position on the grid, representing the ‘happiness’ of a putative water molecule at that position of the grid relative to bulk water.

3D-RISM calculations in Flare use Cresset’s XED force field, which offers the advantage of incorporating both electronic anisotropy and a certain degree of polarizability, and accordingly improves the effectiveness of the method.

A 3D-RISM analysis was carried out on 4ZLZ and 4Z3V to investigate the stability of crystallographic water molecules surrounding the 4RV and 4L6 ligands bound to the active site of Btk.

The following conditions were used:

  • XED force field and charge method
  • 4Å grid spacing
  • 14Å grid external border width
  • Convergence tolerance: 10⁻⁸
  • Maximum number of iterations: 10,000
  • Total formal charge handling: neutralize with counterions.

Protein interaction potentials

Protein interaction potentials are an extension of Cresset molecular interaction potentials to proteins. Both are calculated using the XED force field. The approach is similar in principle to the calculation of ligand fields: the protein’s active site is flooded with probe atoms, and interaction potentials are calculated at each point. This method makes use of a distance-dependent dielectric function based on the work of Mehler,9 to better cope with the large number of charged groups in protein structures.
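The effect of a distance-dependent dielectric can be illustrated with a short Python sketch of a screened Coulomb potential felt by a +1 probe. The sigmoidal form and parameter values below are those widely quoted for the Mehler-Solmajer dielectric function; whether Flare uses this exact function or these values is an assumption, and the point-charge model is a deliberate simplification of the XED description.

```python
import math

# Sigmoidal distance-dependent dielectric (assumed Mehler-Solmajer form)
A = -8.5525
EPS_BULK = 78.4                 # dielectric constant of bulk water
B = EPS_BULK - A
LAMBDA = 0.003627
K = 7.7839
COULOMB_CONSTANT = 332.06       # kcal*Å/(mol*e^2)

def dielectric(r):
    """Effective dielectric at separation r (Å): low at short range, ~78 far away."""
    return A + B / (1.0 + K * math.exp(-LAMBDA * B * r))

def probe_potential(charges, point):
    """Screened electrostatic potential (kcal/mol) for a +1 probe at `point`.

    charges: list of (charge, (x, y, z)) tuples standing in for protein atoms.
    """
    total = 0.0
    for q, position in charges:
        r = math.dist(position, point)
        total += COULOMB_CONSTANT * q / (dielectric(r) * r)
    return total
```

Because the dielectric rises with distance, distant charged groups are damped far more strongly than nearby ones, which keeps the many charged residues of a protein from swamping the local detail of the active site.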

All the ligands in Table 1 belong to the same series as 4L6, so for this case study protein interaction potentials were only calculated and displayed for the active site of 4Z3V.

Ligand fields

To obtain a sensible pose for the ligands in Table 1, the corresponding 2D structures were docked into the ‘dry’ (i.e., not including crystallographic water molecules) active site of 4Z3V using the Lead Finder10 method implemented in Flare.

Cresset’s ligand fields were then calculated and compared to the 4Z3V protein interaction potentials, to investigate the SAR for the ligand series.

Results

3D-RISM analysis on 4ZLZ

At the end of a 3D-RISM run, a 3D-RISM water molecule chain is added to the protein structure. The water molecules in this chain occupy regions of high water density as predicted by 3D-RISM, and are colored according to the calculated ΔG for the whole water molecule, averaged over all orientations.

‘Happy’ water molecules (associated with a calculated negative ΔG) are colored in shades of green: these are water molecules which 3D-RISM predicts to be more stable in the protein than in bulk water, and hence more difficult to displace with a ligand.

‘Unhappy’ water molecules (associated with a calculated positive ΔG) are colored in shades of red: these are waters that are less stable relative to bulk water and hence more easily displaced by a ligand.
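The sign convention described above can be summarized in a few lines of Python. The neutral band around zero and its width are assumptions added for this sketch, as is the use of kcal/mol; Flare's actual color scale is continuous.

```python
def classify_water(delta_g, neutral_band=0.5):
    """Label a 3D-RISM water by its calculated ΔG relative to bulk water.

    delta_g is assumed to be in kcal/mol; neutral_band is a hypothetical
    width for waters that are neither clearly stable nor clearly unstable.
    """
    if delta_g < -neutral_band:
        return "happy"      # shades of green: harder to displace
    if delta_g > neutral_band:
        return "unhappy"    # shades of red: easier to displace
    return "neutral"
```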

Figure 2 shows the results of the 3D-RISM calculation on 4ZLZ. The oxygen density surface (Figure 2 – left) clearly shows a region of localized water near the nitrogen of the pyridine, and the 3D-RISM localization algorithm (Figure 2 – right) suggests that a water molecule should exist in exactly the spot where it is seen in the crystal structure. The thermodynamic analysis indicates that this water molecule is neither particularly ‘happy’ nor particularly ‘unhappy’. This is consistent with the fact that this water molecule is displaceable (as proven by 4L6 and the other compounds in Table 1), but also indicates that the displacing group needs to have the correct electrostatics and shape to avoid losing affinity.

3D-RISM analysis on 4Z3V

The oxygen density surface for 4Z3V is shown in Figure 3 – left. The 3D-RISM localization algorithm correctly identifies the position of the majority of crystallographic water molecules surrounding the 4L6 ligand bound to the Btk active site: many of these water molecules are predicted to be ‘happy’. Accordingly, a selected subset of the stable water molecules was included in the calculation of protein interaction potentials for 4Z3V, as they were considered to be an integral part of the protein active site with respect to ligand binding.


Figure 2: 3D-RISM results on 4ZLZ. Left: oxygen isodensity surface at ρ=5. Right: localized 3D-RISM waters, colored by ΔG.


Figure 3: 3D-RISM results on 4Z3V. Left: oxygen isodensity surface at ρ=5. Right: localized 3D-RISM waters, colored by ΔG.

Protein interaction potentials for 4Z3V

As shown in Figure 4, the protein interaction potentials of both the ‘dry’ (not including crystallographic water molecules) and ‘wet’ (including stable crystallographic water molecules lining the active site) active site of 4Z3V match the 4L6 ligand fields in a satisfactory manner.

In particular:

  • the electron-rich cinnoline ring sits in a region of positive interaction potential in the middle of the 4Z3V active site;
  • the 5,6 hydrogens of the cinnoline ring sit near an area of negative interaction potential corresponding to the carbonyl of Leu408;
  • the carbonyl and the NH2 of 3-carboxamide sit respectively within and nearby an area of positive and negative interaction potential corresponding to the backbone NH of Met477 and the backbone carbonyl of Glu475 in the hinge region of Btk, with which they form H-bonds;
  • the 4-amino group on the cinnoline ring also sits nearby an area of negative interaction potential, corresponding to the carbonyls of Met477 and Leu408;
  • the electron-rich 5-membered ring of indazole sits in an area of positive interaction potential corresponding to the protonated side chain of Lys430 (not shown) and the backbone NH of Phe413, with the NH-group pointing towards a negative area corresponding to the backbone carbonyl of Gly414 with which it forms an H-bond.

The inclusion of stable water molecules in the calculation of protein interaction potentials confirms this scenario. In this case though, the region of positive protein interaction potential in the middle of the 4Z3V active site is much larger and embraces most of the cinnoline-indazole ring system. This is indeed fully consistent with the negative ligand field surrounding the cinnoline-indazole ring system (Figure 4 – bottom).

Also, the 4-amino group on the cinnoline ring sits in an area of negative interaction potential which nicely matches the positive ligand field corresponding to this group.


Figure 4: 4L6 superimposed to the protein interaction potentials of 4Z3V. Top-left: ‘dry’ active site, not including crystallographic water molecules. Top-right: ‘wet’ active site including stable water molecules. Bottom: Ligand fields for 4L6. Protein interaction potentials shown at isolevel = 3; ligand fields shown at isolevel = 2.

SAR of Btk inhibitors

A comparison of ligand fields with the protein interaction potentials for the active site of Btk provides some useful insight into the SAR of compounds in Table 1.

Compound 1

Compound 1 (pIC50 8.7) is one of the two most potent compounds in this data series,3 carrying a -OMe side chain on the indazole ring and a fluorine in position 5 of the cinnoline ring. The binding mode of 1 (Figure 5) is similar to that of 4L6. The compound makes H-bond interactions with Glu475 and Met477 in the hinge region, a cation-pi interaction with Lys430 (not shown), and H-bond interactions with the backbone of P-loop residues Phe413 and Gly414.

The fluorine group sits in a relatively large pocket close to a water molecule which it possibly displaces. The CH3 of the OMe group sits in an area of negative interaction potential.


Figure 5: Left: compound 1 (pIC50 = 8.7) superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3. Right: ligand fields for compound 1 at isolevel = 2.

Compound 2

Compound 2 is also one of the most active compounds in the data series.3 Quite interestingly though, the NH on the indazole does not make an H-bond with Gly414, as it is turned on the other side, possibly making an H-bond interaction with a nearby water molecule.


Figure 6: Compound 2 (pIC50 = 8.7) superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3.

Compounds 3 and 4

The good activity (pIC50=8.4) of compound 3 confirms that an H-bond donor on the bicyclic system is not an essential feature for a Btk ligand to reach good levels of activity. Quite interestingly, compound 4 (pIC50=7.7) is structurally very similar to 3, but significantly less active. The comparison of the ligand fields for these two compounds with the protein interaction potentials of the active site of 4Z3V provides a possible explanation, as shown in Figure 7. While for both compounds (Figure 7 – middle column) the negative ligand field shows a good complementarity with the positive interaction potential of the backbone NH of Phe413, the positive ligand field of 4 (Figure 7 – right column) does not match the negative interaction potential generated by the backbone carbonyl of Gly414.

For both compounds, the methyl group in position 7 of the cinnoline ring plays the same role as the methyl on the indazole ring of 4L6 in ensuring that the ligands achieve the correct conformation in the active site.
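To put the potency gap between compounds 3 and 4 on a linear scale, the pIC50 values can be converted as in the Python sketch below, using the standard definition pIC50 = -log10(IC50 in mol/L). The function names are chosen for this example.

```python
def pic50_to_ic50_nm(pic50):
    """Convert pIC50 (negative log10 of the molar IC50) to IC50 in nM."""
    return 10 ** (9 - pic50)

def fold_difference(pic50_a, pic50_b):
    """How many times more potent compound a is than compound b."""
    return 10 ** (pic50_a - pic50_b)

ic50_compound_3 = pic50_to_ic50_nm(8.4)   # about 4 nM
ic50_compound_4 = pic50_to_ic50_nm(7.7)   # about 20 nM
fold = fold_difference(8.4, 7.7)          # about 5-fold
```

A difference of 0.7 log units therefore corresponds to roughly a five-fold loss of potency for compound 4.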


Figure 7: Compounds 3 and 4 superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3. Ligand fields shown at isolevel = 4.
Middle: positive interaction potentials superimposed to negative ligand fields.
Right: negative interaction potentials superimposed to positive ligand fields.

Conclusions

Protein interaction potentials and ligand fields, as implemented in Flare, are a powerful way of understanding the electrostatics of ligand-protein interactions. The inclusion of stable water molecules following a 3D-RISM analysis dramatically improves the precision of the method for the characterization of protein active sites. The information gained from protein interaction potentials can be used to inform ligand design, compare related proteins to identify selectivity opportunities, and understand SAR trends and ligand binding from the protein’s perspective.

References and links

1. http://www.cresset-group.com/products/flare/
2. C.R. Smith et al., J. Med. Chem. 2015, 58, 5437−5444
3. US patent 2015/0038510
4. V. Stroganov et al., Proteins 2011, 79(9), 2693-2710
5. https://www.biomoltech.com/
6. https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi
7. J.G. Vinter, J. Comput.-Aided Mol. Des. 1994, 8, 653-668
8. R. Skyner et al., Phys. Chem. Chem. Phys. 2015, 17(9), 6174
9. E. L. Mehler, The Lorentz-Debye-Sack theory and dielectric screening of electrostatic effects in proteins and nucleic acids, in Molecular Electrostatic Potentials: Concepts and Applications, Theoretical and Computational Chemistry Vol. 3, 1996
10. O. V. Stroganov et al., J. Chem. Inf. Model. 2008, 48(12), 2371-2385

What can Torch do for you that TorchLite can’t?

Abstract

TorchLite is the powerful freeware 3D molecule viewer, editor and design tool from Cresset. However, there are situations in which modeling with TorchLite is simply not enough and you need to access the full power of Torch. This blog post highlights some of the features which make Torch a powerful molecular design tool for medicinal and synthetic chemists.

Introduction

You can see several interesting applications of TorchLite in our case studies and web clips. With TorchLite, you can view the results of ligand-based or structure-based virtual screening, understand the shape and electrostatic character of active molecules and design new molecules to match their pattern. But what are the differences between TorchLite and its big brother Torch? When should you start using Torch?

In this blog, I highlight some of the additional features available in Torch, but not in TorchLite, with examples of their application.

SAR analysis in TorchLite

The web clip Visualizing field changes to understand SAR shows how to quickly investigate the SAR of a small dataset of NaV1.7 inhibitors using TorchLite. Structures were manually sketched using the built-in 3D molecule editor, quickly minimized and saved in the Molecules table, and NaV1.7 activity data were manually entered. This works nicely for this small dataset; however, for larger compound sets manual editing and data entry are slow and open to human error. Also, manual editing and minimization in TorchLite cannot replace a full exploration of the conformational space of compounds, which ensures that diverse, low energy conformations are considered in the SAR analysis. Finally, while alignment is straightforward for the simple changes carried out in the web clip, a robust method for sensibly aligning the compounds is required when more complex structural changes are made.

This is the most important difference between the two packages: conformational exploration and alignment can be carried out in Torch (and Forge), but not in TorchLite.

SAR analysis in Torch

In Torch, molecules are aligned to one or more reference molecules using fixed conformations, which can be imported into Torch or calculated on the fly by the application.

Suitable reference molecules are highly active molecules, preferably in their bioactive (protein bound) conformation. This is usually either experimentally observed (when crystallographic information is available), or derived from a docking experiment or pharmacophore modeling (these methods are also available in Lead Finder and Field Templater, respectively).

Using a ‘Normal’ alignment, the conformation ensemble for each molecule in the data set is aligned to the reference molecule in two stages. In the first stage the field points around a molecule are used to generate an initial alignment. In the second stage the initial alignment is optimized to get the best possible similarity score. In this stage, it is possible for Torch to use an excluded volume, typically derived from the protein crystal structure, that defines a region of space around the reference molecule that acts as a constraint on the alignments.

Torch offers an additional method for automated molecular alignment. Using the Maximum Common Substructure (MCS) approach, each ligand is initially fitted to the reference molecule using a common-substructure algorithm and then additional groups are fitted using the best match of field points and shape. This substructure alignment can be regarded as a ligand-centric view of the match to the reference, whereas the use of the field points alone is akin to a protein-centric view of the alignment.

Each method has its advantages:

  • Field points give an unbiased view of alignment with a score that can be used in, for example, virtual screening
  • The substructure approach highlights the differences between molecules that lie in the same series making them easier to interpret, particularly when using ligand-centric computational techniques such as the activity cliff analyses in Activity Miner and Activity Atlas, as in the example below.

Using alignment in SAR studies

In the case study Activity Atlas analysis of sodium channel antagonists. Part I: SAR of the right-hand side phenyl ring, a dataset of 62 pyrrolopyrimidine NaV1.7 antagonists was downloaded from ChEMBL, conformationally explored in Forge and aligned by MCS to the chosen reference compound.


Figure 1. The reference compound used to align the NaV1.7 data set.

The SAR of the data set was then analyzed using Activity Atlas, a probabilistic method of analyzing the SAR of a set of aligned compounds as a function of their electrostatic, hydrophobic and shape properties, available in Forge.

A simpler workflow can be implemented in Torch to quickly and effectively explore the SAR on the right-hand side phenyl ring (Figure 1) using Activity Miner, an optional module of Torch (included in Forge).

The ‘Substructure’ filter in Torch was used to select a subset of 17 compounds from the original data set which have the same scaffold and left-hand side substituent as Cmpd 1, but vary on the right-hand side phenyl, following the workflow shown in Figure 2.


Figure 2. Filter by substructure in Torch.

The lowest energy conformation of Cmpd 1 (one of the most active compounds in the data set) was then chosen as a reference structure, following an ‘accurate but slow’ (Max number of conformations: 200; RMS cut-off for duplicate conformers: 0.5; Gradient cut-off for conformer minimization: 0.1 kcal/mol; Energy window: 3 kcal/mol) conformation hunt within Torch. This was used to align the 17 compounds by Maximum Common Substructure, using again an ‘accurate but slow’ set-up for the conformation hunt.
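The roles of the ‘accurate but slow’ settings can be illustrated with a simplified Python sketch of how a conformer population might be pruned. The class and function names are hypothetical, the RMS cut-off is assumed to be in Å, and the RMSD is computed without superimposing the conformers first, so this shows the idea rather than the algorithm Torch uses.

```python
import math
from dataclasses import dataclass

@dataclass
class HuntSettings:
    max_conformations: int = 200
    rms_cutoff: float = 0.5       # duplicate threshold (Å assumed)
    gradient_cutoff: float = 0.1  # minimization cut-off as quoted; unused here
    energy_window: float = 3.0    # kcal/mol above the lowest-energy conformer

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered coordinate lists (no superposition)."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def prune(conformers, settings):
    """Keep unique, low-energy conformers from a list of (energy, coords)."""
    ordered = sorted(conformers, key=lambda conformer: conformer[0])
    lowest = ordered[0][0]
    kept = []
    for energy, coords in ordered:
        if energy - lowest > settings.energy_window:
            break                                  # outside the energy window
        if all(rmsd(coords, other) >= settings.rms_cutoff for _, other in kept):
            kept.append((energy, coords))          # not a duplicate
        if len(kept) == settings.max_conformations:
            break
    return kept
```

Widening the energy window or lowering the RMS cut-off keeps more conformers and so gives a more thorough, but slower, exploration.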

The SAR of the right-hand substituted compounds can then be explored using the activity view maps calculated and displayed by Activity Miner.

The activity view shows a focus compound surrounded by its nearest neighbors according to the chosen similarity metric (Figure 3). In this view the height of each wedge corresponds to the ‘distance’ between the pair: a smaller wedge reflects very similar compounds.


Figure 3. Activity view map for NaV1.7 pIC50, showing the detailed SAR of the phenyl ring.

The color of the wedge reflects the direction the activity is going: red means the activity is decreasing; green means the activity is increasing between the pair.

The shading echoes the disparity, which relates to how steep the activity cliff is. The result is a focused view of the SAR around a chosen compound.
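One common way to quantify how steep an activity cliff is, sketched below in Python, divides the activity difference between a pair of compounds by their distance, with distance taken as one minus similarity. Treating this as the disparity, and the function names used, are assumptions for illustration rather than a statement of how Activity Miner computes it.

```python
def disparity(focus_activity, neighbor_activity, similarity):
    """Activity change per unit of distance between a pair of compounds.

    similarity is expected in the range 0 to 1; distance = 1 - similarity.
    """
    distance = 1.0 - similarity
    if distance <= 0.0:
        raise ValueError("identical compounds have no defined disparity")
    return (neighbor_activity - focus_activity) / distance

def wedge_color(focus_activity, neighbor_activity):
    """Green if activity increases from focus to neighbor, red if it decreases."""
    return "green" if neighbor_activity > focus_activity else "red"
```

Two very similar compounds with a large difference in pIC50 therefore give a high disparity, which is the signature of an activity cliff.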

Figure 3 also shows the activity view around the unsubstituted phenyl (pIC50 6.6). This view clearly shows that para substitution is always detrimental for NaV1.7 activity; ortho substitution is beneficial, especially with a small halogen like fluorine; and meta substitution is also generally beneficial. Ortho, ortho substitution, by contrast, is less tolerated.

Design of new molecules using Torch

One of the major advantages of field based alignment is that it is agnostic to the chemical series that is being aligned. This can be used to aid in the design of new compounds in Torch by aligning diverse actives to a common reference and then transferring key functional groups across series. In this example, I use the crystal structure of HDT, a potent Cyclin-Dependent Kinase inhibitor, bound to CDK2 (PDB code 1OIT) to modify the design of an oxime based inhibitor.

As can be seen in Figure 4, HDT interacts with the hinge region of the active site of CDK2 by making two H-bond interactions with the backbone carbonyl and NH of Leu83, and an H-bond interaction with Lys33. The sulphonamide group also makes H-bond interactions with Asp86 (not shown).


Figure 4. HDT bound to the CDK2 active site.
In this design experiment, more potent CDK2 inhibitors are designed starting from the 2D structure of compound CK3 (Figure 5), a smaller and less potent CDK2 inhibitor with a Ki of 2200 nM, using the interactions of HDT as a guide.
The 2D structure of CK3 (drawn with a favorite drawing package) was imported into Torch by copy/paste. CK3 was then aligned to HDT using an ‘accurate but slow’ conformation hunt followed by a ‘Normal’ (field based) alignment.


Figure 5. Structure of CK3, an inhibitor of CDK2 (Ki 2200 nM).
Figure 6 shows the results of the alignment experiments. CK3 (grey) is nicely superimposed on HDT (pink), and it is straightforward to see which changes should be made to increase CDK2 potency: replacing the formamidine moiety with a phenyl ring, possibly decorated with a sulphonamide or other H-bond acceptor group in the para position.


Figure 6. CK3 (grey) aligned to HDT (pink).
This change can be easily done in the molecule editor available in Torch, using the reference structure as a guide. As changes are made in the editor, the similarity score (Figure 7) is updated on the fly by clicking on the ‘Minimize’ and ‘Optimize Alignment’ buttons. Once the editing is completed, clicking the ‘Align’ button in the molecule editor will prompt Torch to carry out a full conformation hunt and field alignment on the new design.


Figure 7. The Molecule Editor in Torch.
The structure of CK6, an analogue of CK3 with a CDK2 Ki of 70 nM, aligned to HDT in Torch is shown in Figure 8 (left). The superimposed crystal structures of CK6 and HDT, as in PDB entries 1PXN and 1OIT respectively, are shown in Figure 8 (right). The alignment in Torch almost perfectly matches the crystallographic alignment of these two ligands in the CDK2 active site.


Figure 8. Left: CK6 (grey) aligned to HDT (pink) using Torch. Right: superimposed crystal structures of CK6 (grey) and HDT (pink) as in PDB entries 1PXN and 1OIT.

Multi-Parameter Scoring

Multi-Parameter Scoring in Torch helps medicinal and synthetic chemists assess the overall physico-chemical profile of the compounds of interest using colors and radial plots. As can be seen in Figure 9, columns in Torch are colored according to a profile set up in the Torch preferences. Properties perfectly matching the desired profile are colored green, those with an acceptable value yellow, and those with an unacceptable value red.

The profile can be tailored to the specific project needs in the Radial Plot Properties window. In this window, a weight can also be assigned to each property based on its importance in the ideal project profile. The score and fit to the project profile for each molecule are then summarized in the radial plot.

The radial plot is based on the idea that molecule properties that are ‘perfect’ should be displayed at the center of the radial plot. Thus, a molecule with perfect or near perfect properties should have a radial plot with a small encapsulated area (shown in green). Conversely, poor properties would be plotted at the edge of the radial plot such that a molecule with sub-ideal properties would have a radial plot with a large enclosed area (this can be reversed using the Radial Plot Preferences).

In Figure 9, you can see the column coloring for the CDK2 project. Comparing the column coloring of CK3 and CK6, most properties have values matching the ideal property profile. CDK2 Ki has significantly improved from CK3 to CK6, while lipophilicity (SlogP) is less good in CK6. CK3+phenyl (Figure 9, Molecules table) is slightly less active than CK6 and its lipophilicity is high with respect to the other two compounds: another good reason for including a hydrophilic H-bond acceptor in the para position of the phenyl ring.

The radial plot properties are combined into a single score that represents the overall fit of a molecule to the ideal project profile. Radial plots can be sorted and filtered based on this score, making it easier to select the best candidates for your projects.


Figure 9. Multi-parameter scoring in Torch.

Conclusion

This blog highlights some of the additional features in Torch, the powerful molecular design tool for medicinal and synthetic chemists.

Additional functionality available in Torch includes the capability to:

  • run virtual screening of up to 500 molecules
  • use Activity Atlas and 2D/3D-QSAR models built with Forge
  • create interactive multi-series scatter plots and histograms of biological or physical properties
  • import calculated and/or measured physical properties and data from an external web service through a REST interface.

Contact us to benefit from this functionality and try the full power of Torch.

November release of Spark reagent databases now available

The November release of the Spark reagent databases derived from eMolecules is now available.

As announced in the October newsletter, Spark users can now benefit from monthly releases of reagent databases derived from eMolecules’ building blocks collection. The rolling updates are intended to provide the very best availability information on the reagents that you wish to employ.

The updated databases can be downloaded now through the Spark Database update widget (instructions on the installing Spark databases page) or using a command line utility (such as wget, please contact us for details).

Activity Atlas analysis of sodium channel antagonists. Part I: SAR of the right-hand side phenyl ring

Abstract

Activity Atlas1 is a component of Forge2, Cresset’s powerful workbench for ligand design and SAR analysis. Activity Atlas models summarize the SAR for a series into a visual 3D model that informs design decisions and helps prioritize molecules for synthesis. In this case study, Activity Atlas’ activity cliff summary maps were used to analyze the SAR of a small series of Nav1.7 sodium channel antagonists. The objective was to investigate and understand the electrostatic, hydrophobic and shape features underlying receptor activity in a case where crystallographic information is not available for the target protein.

Introduction

Structural information is becoming commonly available even for those targets, such as GPCRs and ion channels, which until recently were considered difficult to crystallize.

For novel targets and new chemical series, however, X-ray data may still be difficult to obtain. Quite frequently the information available to project chemists during the early stages of a discovery project is so scarce as to severely hamper the applicability of traditional structure- and ligand-based computational approaches such as docking and pharmacophore modeling.

A method capable of quickly identifying and deciphering the most relevant features underlying protein-ligand interaction, starting from a very limited amount of Structure-Activity Relationship (SAR) data and no structural information, would be of invaluable help during the early stages of drug discovery projects.

We introduced Activity Atlas, a probabilistic method of analyzing the SAR of a set of aligned compounds as a function of their electrostatic, hydrophobic and shape properties. The method uses a Bayesian approach to take a global view of the data in a qualitative manner. Results are displayed using Forge visualization capabilities to gain a better understanding of the features which underlie the SAR of your set of compounds.

In this case study, the activity cliff summary method in Activity Atlas was used to analyze the SAR of the right-hand side region (RHS, Figure 1) of a small data set of published3 pyrrolopyrimidine antagonists of voltage-gated sodium ion channel Nav1.7, a major regulator of human pain and an attractive target for the development of new and effective pain therapeutics.

The objective is to prove the usefulness of Activity Atlas maps in understanding the electrostatic, hydrophobic and shape features underlying biological activity in those cases where structural information about ligand-target interaction is not easily accessible.

The detailed SAR analysis of the other regions of the pyrrolopyrimidine antagonists will be presented in a future case study.

The data set

A small data set of 62 pyrrolopyrimidine Nav1.7 antagonists (Figure 1) originally published by Chakka et al.3 was downloaded from ChEMBL.4

 Figure 1. The reference compound used to align the data set.
 

Nav1.7 pIC50 values for this data set span a range of just over 3 log units, from 4.4 to 7.7, with an even distribution shown in Figure 2. The data set includes five very weakly active compounds whose activity was reported in the original paper as % inhibition only. These compounds were assigned a Nav1.7 pIC50 = 4.4 in the Forge project.
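Because pIC50 is a logarithmic scale, a difference between two pIC50 values corresponds to a power-of-ten ratio between the underlying IC50 values. The short sketch below (the function names are illustrative and not part of Forge) converts the two ends of the range quoted above into a potency ratio.

```python
def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 value (-log10 of the molar IC50) to an IC50 in nM."""
    return 10 ** (9 - pic50)


def fold_difference(pic50_high: float, pic50_low: float) -> float:
    """Ratio of IC50 values implied by two pIC50 values."""
    return 10 ** (pic50_high - pic50_low)


if __name__ == "__main__":
    # Ends of the pIC50 range reported for the data set
    print(f"IC50 at pIC50 7.7: {pic50_to_ic50_nm(7.7):.1f} nM")
    print(f"IC50 at pIC50 4.4: {pic50_to_ic50_nm(4.4):.0f} nM")
    print(f"Potency ratio: {fold_difference(7.7, 4.4):.0f}-fold")
```

A span of 3.3 log units therefore corresponds to a difference in potency of roughly three orders of magnitude.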

Figure 2. Distribution of Nav1.7 pIC50 values for the 62 pyrrolopyrimidine antagonists.

Conformation hunt and alignment of compounds

The alignment workflow shown in Figure 3 was applied to align the 62 compounds in the data set.


Figure 3. The alignment workflow used in this case study.

Cmpd1 (Figure 1, pIC50 7.7) was chosen as the reference compound, and its conformational space explored using a ‘very accurate but slow’ conformation hunt within Forge:

  • Max number of conformations: 1,000
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.
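To illustrate how settings of this kind interact, here is a minimal sketch of a conformer selection step. It is not the Forge implementation: the function names, the data layout, the assumption that the RMS cut-off is expressed in Å, and the assumption that conformers are already superimposed are all hypothetical simplifications, and the gradient cut-off (which controls the minimization itself) is not modeled.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two conformers given as equal-length lists of (x, y, z).

    Assumes the two conformers are already superimposed (a simplification).
    """
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def select_conformers(conformers, max_confs=200, rms_cutoff=0.5, energy_window=3.0):
    """Keep unique low-energy conformers.

    conformers: list of (energy_kcal_per_mol, coords) tuples.
    """
    if not conformers:
        return []
    ranked = sorted(conformers, key=lambda c: c[0])
    lowest_energy = ranked[0][0]
    kept = []
    for energy, coords in ranked:
        # Discard everything above the energy window
        if energy - lowest_energy > energy_window:
            break
        # Discard conformers that duplicate one already kept
        if all(rmsd(coords, other) >= rms_cutoff for _, other in kept):
            kept.append((energy, coords))
        # Stop once the maximum number of conformations is reached
        if len(kept) == max_confs:
            break
    return kept
```

With these defaults, a conformer 5 kcal/mol above the minimum is rejected by the energy window, and a conformer within 0.5 of a lower-energy one is treated as a duplicate.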

The use of a 3D similarity metric in Activity Atlas requires (as with 3D-QSAR) the generation of alignments for all compounds and is sensitive to misalignment and alignment noise. The choice of a sensible conformation for the reference structure may be critical in those cases where no experimental information is available about the bioactive conformation of the ligands.

The low energy conformations of Cmpd1 were accordingly visually inspected, to select a small number of low energy conformers representative of its conformational space. These conformers were used as the reference structure in separate Forge projects to develop alternative alignments for the training set and generate distinct Activity Atlas models. These models were then checked for consistency, and further validated by exploring the detailed SAR of the data set with Activity Miner5, a module within Forge and Torch6 providing rapid navigation of complex SAR.

The training set compounds were aligned to each low energy conformation of Cmpd1 by Maximum Common Substructure using an ‘accurate but slow’ set-up for the conformation hunt:

  • Max number of conformations: 200
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.

Activity Atlas models

Activity Atlas models are calculated following a probabilistic approach which takes into account the probability that a molecule is correctly aligned.

This is done by associating a weight with each alignment based on its similarity score. Alignments with a similarity higher than a certain upper threshold (which can either be automatically calculated by Forge, or manually defined by the user) are fully trusted. Alignments with a similarity lower than the low similarity threshold are not trusted and are discarded. Linear scaling is applied to calculate a weight for alignments which have an intermediate similarity score.
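The weighting scheme described above can be sketched as a simple piecewise-linear function. This is an illustration only: the function name and the threshold values used in the example are hypothetical, and the way Forge derives its automatic thresholds is not shown.

```python
def alignment_weight(similarity: float, low: float, high: float) -> float:
    """Weight given to an alignment from its similarity score.

    similarity >= high        -> 1.0 (fully trusted)
    similarity <= low         -> 0.0 (not trusted, discarded)
    low < similarity < high   -> linear scaling between 0 and 1
    """
    if high <= low:
        raise ValueError("high threshold must be greater than low threshold")
    if similarity >= high:
        return 1.0
    if similarity <= low:
        return 0.0
    return (similarity - low) / (high - low)


if __name__ == "__main__":
    # Hypothetical thresholds chosen for illustration only
    for sim in (0.40, 0.60, 0.80):
        print(sim, alignment_weight(sim, low=0.5, high=0.7))
```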

Each run of Activity Atlas performs three types of analysis: average of actives, activity cliff summary and regions explored analysis.
In this case study, the activity cliff summary analysis was used to explore the SAR of the 62 pyrrolopyrimidine Nav1.7 antagonists. This analysis helps you pinpoint the critical regions of SAR, providing a visual 3D summary of the activity cliffs for the data set derived from the Activity Miner module. The method is described in detail in the ‘Generating Activity Atlas models’ section of the Forge manual.

Figure 4. Activity cliff summary maps for the RHS phenyl ring, derived by aligning the 62 compounds in the data set to three representative low energy conformations of Cmpd1.

Results

The results of the activity cliff summary analysis for the phenyl ring on the RHS starting from representative low energy conformations of Cmpd1 are shown as 3D maps in Figure 4.

All the models give consistent results, and provide clear indications about the electrostatic, hydrophobic and shape features underlying Nav1.7 activity, as explained in detail in Figure 5.

Small halogens in the ortho and meta positions of the phenyl ring on the RHS of the molecule improve Nav1.7 activity, as shown by the negative electrostatic field (in cyan in Figure 5) and the associated areas of favorable shape (green areas). Areas of favorable/unfavorable hydrophobic interaction are not shown for clarity as they overlap largely with those of favorable and unfavorable shape.

Substituents which generate a more positive (or less negative) electrostatic field (in red in Figure 5) in the para and second meta positions are beneficial for activity.

Figure 5. Activity cliff summary map for Nav1.7 pIC50, showing the effect of different decoration patterns on the phenyl ring on the RHS of the compounds.

Steric bulk in the para position (magenta areas) instead is detrimental for Nav1.7 activity.

Finally, electron-withdrawing substituents which generate a more positive (or less negative, in red in Figure 5) electrostatic field below the plane of the ring are also beneficial for Nav1.7 activity.

These general trends were investigated in more detail by means of activity view maps calculated and displayed using Activity Miner.

The activity view shows a focus compound surrounded by its nearest neighbors according to the chosen similarity metric (Figure 6). In this view the height of each wedge corresponds to the ‘distance’ between the pair: a smaller wedge reflects very similar compounds.

The color of the wedge reflects the direction the activity is going: red means the activity is decreasing; green means the activity is increasing between the pair. The shading echoes the disparity, which relates to how steep the activity cliff is. The result is a focused view of the SAR around a particular compound.
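As a rough illustration of how the steepness of an activity cliff can be quantified, the sketch below assumes that disparity is the change in activity divided by the ‘distance’ between the pair, with distance taken as 1 minus the similarity. This formula and the similarity value used in the example are assumptions made for illustration; the pIC50 values are those of the o-F and p-F compounds discussed in this case study.

```python
def disparity(activity_focus: float, activity_neighbor: float, similarity: float) -> float:
    """Activity change per unit of distance between a pair of compounds.

    Assumes distance = 1 - similarity (an assumption, see the text above).
    A negative value means the activity decreases on going from the focus
    compound to its neighbor.
    """
    distance = 1.0 - similarity
    if distance <= 0:
        raise ValueError("identical compounds have no defined disparity")
    return (activity_neighbor - activity_focus) / distance


if __name__ == "__main__":
    # o-F (pIC50 7.7) as focus compound, p-F (pIC50 5.1) as neighbor;
    # the similarity of 0.9 is a hypothetical value.
    print(disparity(7.7, 5.1, similarity=0.9))
```

A large change in activity between two very similar compounds gives a disparity of large magnitude, which is what makes the pair an activity cliff.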

Figure 6 shows the activity view around o-F-phenyl (pIC50 7.7), one of the most potent compounds in the data set.

Starting from the o-Cl substituent and going clockwise, it can be seen that replacing the o-F substituent in the focus compound (pIC50 7.7) with o-Cl (pIC50 7.3) or o-Me (pIC50 7.4) has a very slight detrimental effect on activity, as these substituents are associated with a less negative electrostatic field.

Introducing a second F in the meta position does not impact activity (pIC50 7.7), while the unsubstituted phenyl ring is much less active (pIC50 6.6), as it lacks the favorable small halogens in the ortho and meta positions.

Replacing m-F with m-Cl again does not impact activity.

The introduction of a second o-F substituent (pIC50 6.8) instead causes a drop in activity, an effect not highlighted by the activity cliff summary, as only one example of ortho, ortho disubstitution is available in the data set.

Replacing o-F with o-CF3 (pIC50 7.1) causes a modest drop in activity, which is difficult to explain in terms of electronic effects: this compound is possibly an outlier to the general trend shown by the activity cliff summary maps.

Removal of the o-F substituent causes a drop in activity, as can be seen for m-F (pIC50 6.9), m-Me (pIC50 7.1) and m-Cl (pIC50 7.1).

Finally, the lack of small halogens in the ortho and meta positions, together with the introduction of unfavorable steric bulk in the para position, causes the dramatic drop in activity in p-F (pIC50 5.1): this substituent is also associated with a more negative electrostatic field.

Figure 6. Activity view map for Nav1.7 pIC50, showing the detailed SAR of the phenyl ring.

Conclusion

In this case study, Activity Atlas and Activity Miner were successfully applied to decipher the SAR of the right-hand side phenyl ring of a series of voltage-gated sodium ion channel Nav1.7 antagonists, starting from a very limited amount of SAR data and no available crystallographic information about the bioactive conformation.

The activity cliff summary in Activity Atlas was used to get an overview of the SAR landscape, focusing on the prevalent SAR signals.

Activity Miner was used to drill down into the Activity Atlas maps to understand subtle molecule-to-molecule structure-activity changes and identify potential outliers.

The two methods used in combination were able to quickly identify and decipher the most relevant features underlying protein-ligand interaction.

The information derived from this analysis can be of invaluable help for drug discovery projects to inform design decisions and help prioritize molecules for synthesis.

Using the Spark reagent databases to identify bioisosteric R-group replacements

Giovanna Tedesco
Cresset, New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK

Abstract

The reagent databases1 available with Cresset’s Spark2 software for bioisosteric replacement were used to identify alternative decorations for a series of triazolopyridazine and 8-fluorotriazolopyridine selective inhibitors of the c-Met Kinase. The use of databases derived from available reagents ensured that the results could be tethered to molecules that were readily synthetically accessible.

Introduction

The overexpression of c-Met and/or hepatocyte growth factor (HGF), the amplification of the MET gene, and mutations in the c-Met kinase domain can activate signaling pathways that contribute to cancer progression by enabling tumor cell proliferation, survival, invasion, and metastasis.3,4 For these reasons, there has been significant interest in the discovery of small molecule c-Met inhibitors for the treatment of cancer. In particular, researchers at Amgen have recently published potent, selective, ATP-competitive and orally bioavailable small molecule inhibitors of c-Met belonging to the chemical classes of triazolopyridazine3 and 8-F-triazolopyridine.4

The published X-ray crystal structure of compound 43 (an early representative of the triazolopyridazine series, see Table 1) bound to c-Met (PDB 3CD8) shows that this molecule adopts a ‘U-shaped’ binding mode in the active site (Figure 1). A direct hydrogen bond is formed between the backbone NH of Met1160 (linker) and the quinoline nitrogen. A second hydrogen bonding interaction can be observed between N1 of the inhibitor and the backbone NH of Asp1222. The triazolopyridazine core makes a π-stacking interaction with Tyr1230. Finally, the aromatic C-H in position 7 makes an electrostatic interaction with the carbonyl of Arg1208.

Based on this experimental information, researchers at Amgen speculated that modifications of the C-6 phenyl group on the triazolopyridazine core would modulate the π-stacking interaction with Tyr1230 allowing for increased potency, and started a chemical exploration based on the synthesis of C-6 aryl and heteroaromatic analogues.3

The same strategy was applied to the exploration of 8-fluorotriazolopyridine compounds.4

The 3D structure of compound 43 was used as the starting point for this case study, where Spark was used in combination with the Cresset supplied reagent databases which are based on eMolecules building blocks.5 The aim of this experiment is to verify whether our methodology could have facilitated the chemical exploration work at Amgen, correctly identifying, among the results of a single Spark run, the most active C-6 monocyclic heterocycles published in refs. 3, 4.

Figure 1. X-ray crystal structure of compound 43 in the active site of c-Met (PDB 3CD8).

Table 1. SAR of triazolopyridazine and 8-fluorotriazolopyridine compounds against c-Met.

 


a) Inhibition of c-Met kinase activity

b) Inhibition of HGF-mediated c-Met phosphorylation in PC3 cells

Method

The published X-ray crystal structure of compound 43 bound in the active site of c-Met (PDB 3CD8) was downloaded into Forge.6 The structure of the ligand was minimized and used as the Starter molecule for the Spark experiment (Figure 2 – left). The ‘Accurate but slow’ conditions for scoring the Spark search results were fine-tuned by setting the gradient cutoff for minimization to 0.200 kcal/mol/Å, and by setting a constraint on the positive field point mapping the interaction of compound 43 with Arg1208 in the c-Met kinase (Figure 2 – right). This introduced a score penalty for those results that did not match the constrained field point. Finally, to focus the experiment on small monocyclic heterocycles, bicyclic fragments and substituted phenyl fragments were filtered out during the search with an appropriate SMARTS filter set in the ‘Advanced Filters’ panel options (see Figure 3).

The experiment was run on a database of 9.5K aromatic boronic acids derived from eMolecules building blocks (Figure 3) to closely replicate the chemistry used in the original publications.3,4

Figure 2. Left: starter molecule used in the Spark experiment. Right: constraint associated to the positive field point mapping the interaction of compound 43 with Arg1208 in c-Met.

Figure 3. eMolecules reagent databases (left) and Advanced Filters options (right).

Results

As can be seen in Figure 4, the initial Spark experiment was able to identify the large majority of the monocyclic heterocycles used to explore the C-6 position of c-Met Kinase inhibitors published in ref. 3 (Table 1). In particular, 3-thienyl (10k), 2-thienyl (10j), 5-isothiazolyl (a close analogue of 3-methyl-isothiazol-5-yl used for compound 10m), 4-methyl-2-thienyl (compound 10l), were correctly identified among the 15 top ranking Spark results.

The Spark experiment was also able to correctly identify C-6 heterocycles used in subsequent iterations of the project to explore the 8-fluorotriazolopyridine scaffold (Table 1). However, while 2-pyridyl (compound 10a), 4-thiazolyl (10d) and 2-methyl-5-thiazolyl (10c) rank reasonably high in the list of results, 1-methyl-4-pyrazolyl (10e) and 3-methyl-5-isoxazolyl (10b) are correctly retrieved, but with a lower rank.

This is somewhat disappointing; however, compound 43 is approximately 3-10 times less potent in terms of c-Met enzyme activity, and 20 times less potent in the cellular assay, than the most active heterocyclic compounds published in ref. 3 (10m and 10l). The Spark search was therefore repeated using 10m (which has a better pharmacokinetic profile than 10l3) as the starter molecule, to verify whether any improvement in the ranking of these two substituents could be achieved by starting from a more active compound, which is expected to fit the electrostatic and steric requirements of the c-Met binding site even better. The 3D conformation used for 10m was obtained by means of a field/shape alignment with the X-ray structure of compound 43 carried out within Forge.

The results of this second experiment are summarized in Figure 5. The ranking of 1-methyl-4-pyrazolyl was significantly improved, while no improvement was observed for 3-methyl-5-isoxazolyl.

A final Spark search carried out with compound 10m as a starter molecule on an expanded set of reagent databases (boronic acids and aromatic halides), suggested some interesting alternative small heterocycles which could have been tried, shown in Figure 6.

Figure 4. R-groups associated with known active inhibitors of c-Met found by Spark.

Figure 5. Ranking of 1-methyl-4-pyrazolyl and 3-methyl-5-isoxazolyl using 10m as the starter molecule.

Figure 6. Novel potential replacement fragments identified by Spark.

Strain and torsion frequency analysis

As reported in refs. 3,4, it was hypothesized by the Amgen authors that co-planarity would enhance potency towards c-Met, presumably due to an optimal configuration for π-stacking with Tyr1230. In evaluating the results of a Spark experiment for this target it is therefore important to ensure that potential replacement fragments can adopt a realistic planar conformation.

Two types of analysis are available in Spark to monitor the above. The first is a calculation of the strain of the newly formed bond between the potential replacement fragment and the scaffold. The strain is calculated by performing a 30 degree torsion scan for that bond in the result molecule, and calculating the energy difference between the torsion chosen by Spark in the result molecule and the lowest energy torsion found during the scan. Values lower than 2 are largely insignificant.
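The strain calculation described above can be sketched as follows. This is a minimal illustration rather than the Spark implementation: `torsion_strain` and `toy_energy` are hypothetical names, the scan is assumed to use 30 degree steps, and the toy potential simply stands in for a real force field energy evaluation.

```python
import math


def torsion_strain(energy, chosen_angle: float, step: int = 30) -> float:
    """Energy of the chosen torsion relative to the best torsion in the scan.

    energy: callable returning the energy for a torsion angle in degrees.
    """
    scan_energies = [energy(angle) for angle in range(0, 360, step)]
    return energy(chosen_angle) - min(scan_energies)


def toy_energy(angle_deg: float) -> float:
    """Hypothetical two-fold torsional potential with minima at the planar
    arrangements (0 and 180 degrees) and a barrier of 4 energy units."""
    return 2.0 * (1.0 - math.cos(math.radians(2.0 * angle_deg)))


if __name__ == "__main__":
    for angle in (0, 20, 90):
        print(angle, round(torsion_strain(toy_energy, angle), 2))
```

In this toy example a torsion 20 degrees away from planarity carries a strain well below 2, while a perpendicular arrangement does not.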

Additionally, the Torsion Library7-9 method is used to assess the torsion associated with the newly formed bond, as well as the torsions associated with all rotatable bonds within the bioisostere fragment. The method is based on an analysis of the Cambridge Structural Database10 (CSD), and reports the frequency with which a specific torsion is experimentally observed. Torsions associated with a low frequency are a possible cause for concern and should be further investigated.

As can be seen in Figure 4, all the fragments identified by the Spark experiment and reported in refs. 3,4 can adopt the required planar conformation, with no significant strain associated to the newly formed bond: torsional frequencies for this bond range from ‘medium’ to ‘high’, and should accordingly be realistic based on the experimental data in the CSD.

Figure 6 shows the strain and the torsional frequency for the potential novel decorations identified by Spark. In this case, there are no concerns associated with the conformations chosen by Spark.

Availability of reagents

Whenever the new eMolecules reagent databases are used in a Spark experiment, availability information is displayed in the results table (see Figures 4 and 6). This information is important for planning laboratory activity taking into account realistic delivery timelines. For example, for three of the fragments shown in Figure 6, shipment is to be expected within 1-5 days from order. Delivery times for 5-pyrimidinyl and 5-oxazolinyl boronic acids are longer: the former can be shipped within 4 weeks from order, while the latter needs to be synthesized and this may take up to 12 weeks.

Searching for the reagents of interest in the eMolecules site enables a check of real-time availability information.

Conclusions

In this case study, a Spark R-group replacement experiment successfully identified the majority of active monocyclic heterocycles used by Amgen in the discovery of new potent triazolopyridazine and 8-fluorotriazolopyridine inhibitors of c-Met kinase.

The results suggest that working in successive rounds of optimization, choosing for each Spark experiment the starter molecule with the best activity profile, is an excellent strategy to rapidly identify the R-groups associated with the highest activity or optimal overall profile.

Access to reagent availability information plays an important role in deciding which fragments should be included in each round of optimization. Reagents with short delivery times should be preferred during the initial stages of the project to facilitate quick SAR information gathering, which will enable a more informed choice of fragments to explore in the successive rounds of lead optimization.

References and Links

1 http://www.cresset-group.com/products/spark/current-spark-databases/

2 http://www.cresset-group.com/products/spark

3 Albrecht, B. K., et al., J. Med. Chem., 2008, 51, 2879–2882

4 Peterson, E. A., et al., J. Med. Chem., 2015, 58, 2417–2430

5 https://www.emolecules.com/info/building-blocks

6 http://www.cresset-group.com/products/forge/

7 Torsion Library method, jointly developed by the University of Hamburg Center for Bioinformatics, Hamburg, Germany and F. Hoffmann-La Roche Ltd., Basel, Switzerland

8 Schärfer, C., et al., J. Med. Chem., 2013, 56, 2016–2028

9 Guba, W., et al., J. Chem. Inf. Model., 2016, 56 (1), 1–5

10 http://www.ccdc.cam.ac.uk/

Spark V10.4 released

A new version of Spark, our scaffold hopping and bioisostere replacement tool, is now released. V10.4 includes many new or improved features and gives access to new and updated chemical diversity.

The development of our applications is guided by our customers and this release is bursting with new features and science that you have asked for. A few of these are described below; however, I suggest that you use the software for yourself to discover the other features and see them in action.

Highlights

  • New Cresset reagent databases derived from eMolecules’ building blocks, replacing previous reagents based on ZINC, include availability information for every result
  • New analysis of the conformation of every result using the Torsion Library method of Guba et al. that is based on an analysis of the Cambridge Structural Database (CSD)
  • New configurable connection to external REST service for properties that enables you to add your own data and properties to the Spark experiment
  • Improved Radial Plots to support enhanced multi-parameter optimization.

New Cresset reagent databases derived from eMolecules’ building blocks

The new Spark reagent databases are derived from eMolecules’ building blocks and replace the previous reagents based on ZINC. These new databases enable Spark users to select the most promising results from their experiment with confidence that the corresponding reagents will be commercially available from reliable suppliers and access up-to-date availability information.

The chemically intuitive rules for R-group database creation have been refined to improve the accuracy of the chemistry incorporated into the new reagent databases. Over 20 different reagent databases are provided by Cresset using the updated rules, which can be easily modified to suit your preferences. If you think something is missing then let us know and we can add it to the list in minutes.

Customers with a database generator license can use our rules to process their own available reagents, giving rapid suggestions for the next set of compounds to be made using the reagents currently in your lab.

Figure 1: Make the most of the chemical diversity available from eMolecules to define the next move for your projects.

New Results table columns, enabling the analysis of the frequency of torsions and of attachment point type

New Results table columns are available within Spark V10.4, to facilitate the analysis of the quality of the results obtained from your Spark experiment and the assessment of chemical feasibility.

The ‘TorsFreq Frag’ and ‘TorsFreq’ column values are computed by analysing the frequency of torsions, as recorded in the CSD, using the Torsion Library method jointly developed by the University of Hamburg (Center for Bioinformatics) and F. Hoffmann-La Roche Ltd. The analysis is carried out for all dihedrals associated with rotatable bonds within the bioisosteric replacement and for each new bond formed in the result molecule. Torsions associated with a low frequency are a possible cause for concern and should be further investigated.

Spark has always enabled you to restrict your search to fragments that link through a specific atom type. This feature enables you to search for bioisosteres that would work with your synthetic scheme. In this release we have added the ‘Attachment Point Type’ into the main result sheet to enable you to perform a wider search and then focus on the results that are of interest to you. This facilitates the assessment of chemical feasibility, enabling you to focus on those results which match the chemical strategy you have in mind for your project.

New external REST service for properties

One of the features most requested by customers is the ability to include corporate or externally computed data for any compound in the Results tables. Spark V10.4 can connect to an external web service through a REST interface and import the properties and data computed or retrieved by that service as additional columns in the Results tables. Using the new service you can bring in external predictions for new designs or simply use the corporate algorithm for calculating logP. Once imported, the properties can be used in the Radial Plot, Tiles View and for coloring molecules and table cells, enabling you to monitor the overall property profile of the results of your Spark experiment.
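As a rough illustration of the idea, the sketch below merges properties returned by an external service into result rows as extra columns. The JSON shape, the `smiles` key and the `corp_logP` property name are all hypothetical; the actual request and response format expected by Spark is not described here, and no network call is made.

```python
import json

def merge_external_properties(rows, payload, key="smiles"):
    """Add externally computed properties to result rows as new columns.

    rows    -- list of dicts, one per result molecule (hypothetical table model)
    payload -- JSON text returned by the external service; the assumed shape is
               {"results": [{"smiles": ..., "properties": {name: value}}]}
    """
    by_key = {item[key]: item.get("properties", {})
              for item in json.loads(payload)["results"]}
    merged = []
    for row in rows:
        new_row = dict(row)                       # leave the input rows untouched
        new_row.update(by_key.get(row[key], {}))  # molecules unknown to the service gain no columns
        merged.append(new_row)
    return merged

# Hypothetical example data
rows = [{"smiles": "c1ccccc1", "score": 0.81}, {"smiles": "CCO", "score": 0.64}]
payload = '{"results": [{"smiles": "c1ccccc1", "properties": {"corp_logP": 2.1}}]}'
table = merge_external_properties(rows, payload)
```

Keying the merge on a structure identifier keeps the external columns aligned with the right result even when the service returns rows in a different order or omits some molecules.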

Multi-parameter optimization in Spark

Radial plots were introduced in Spark V10.3 to provide a graphical representation of numerical data. These initial radial plots created a simple picture of how a molecule fits the physicochemical profile of a project, with the idea that each parameter is within an ideal range, an unacceptable range or somewhere in between. In this release the representation is enhanced with the option to combine all the scores in the radial plot into a single number, scaled between zero and 1, that represents how well the result molecule fits your project profile (Figure 2).

Figure 2: Enhanced radial plot.

Thus Spark results with a radial plot score of 1 fit the project profile perfectly, while those with a score of zero lie outside the desirable property space in all aspects. Since not all properties are equally important, Spark lets you apply a weighting factor to each property (Figure 3). The weight scales the contribution of that property to the final score. This is useful when you want to focus on one property more than another: for example, you may be prepared to accept a non-ideal value for MW if the logP and TPSA are within the ideal range, or you may want a visual representation of a property without having it count towards the score. External properties and data computed or retrieved from the external REST service can also be included in the radial plot.

Figure 3: The configuration of the radial plot now includes a weight to apply to each property in combining the properties into a single score.
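The weighted scoring described above can be sketched as follows. This is a minimal illustration, not Spark's implementation: it assumes each property scores 1 inside its ideal range, 0 outside its acceptable range and varies linearly in between, and that the per-property scores are combined as a weighted arithmetic mean. The ranges and weights in the example profile are invented for illustration.

```python
def desirability(value, ideal, acceptable):
    """1.0 inside the ideal range, 0.0 outside the acceptable range, linear in between."""
    lo_ok, hi_ok = acceptable
    lo_id, hi_id = ideal
    if lo_id <= value <= hi_id:
        return 1.0
    if value <= lo_ok or value >= hi_ok:
        return 0.0
    if value < lo_id:
        return (value - lo_ok) / (lo_id - lo_ok)
    return (hi_ok - value) / (hi_ok - hi_id)

def profile_score(values, profile):
    """Weighted mean of per-property desirabilities (assumed combination rule).

    A weight of 0 keeps a property in the profile without letting it affect the score.
    """
    total_weight = sum(p["weight"] for p in profile.values())
    if total_weight == 0:
        raise ValueError("at least one property needs a non-zero weight")
    weighted = sum(p["weight"] * desirability(values[name], p["ideal"], p["acceptable"])
                   for name, p in profile.items())
    return weighted / total_weight

# Hypothetical project profile: MW counts half as much as logP and TPSA
profile = {
    "MW":   {"ideal": (200, 400), "acceptable": (150, 500), "weight": 0.5},
    "logP": {"ideal": (1, 3),     "acceptable": (0, 5),     "weight": 1.0},
    "TPSA": {"ideal": (40, 90),   "acceptable": (20, 140),  "weight": 1.0},
}
score = profile_score({"MW": 450, "logP": 2.0, "TPSA": 60}, profile)
```

In this example the molecule has ideal logP and TPSA and a MW halfway between the ideal and unacceptable limits, so the down-weighted MW only reduces the overall score slightly.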

Try Spark V10.4

This release represents a significant improvement in the usability and flexibility of the leading bioisostere application. We encourage you to upgrade your version of Spark at your earliest convenience.

If you are not currently a Spark customer, please download a free evaluation.

Contact us if you have queries relating to this release.

Converting patent data into 3D maps of SAR

Giovanna Tedesco
Cresset, New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK

Abstract

Activity Atlas1 is a novel, qualitative method available in Forge2, Cresset’s powerful workbench for ligand design and SAR analysis. Activity Atlas is particularly useful to condense large data tables into a single picture, summarizing structure-activity data into highly visual 3D maps that inform the design and optimization of new compounds. In this case study, Activity Atlas was used to analyze the Structure-Activity Relationships (SAR) of a large data set of Orexin 2 receptor ligands taken from the US patent literature, with the objective to quickly investigate and understand the electrostatic, hydrophobic and shape features underlying the receptor activity of a recently published scaffold.

Introduction

Whenever a new research project is initiated, or transferred across teams, familiarization with the prior art for the project must be completed in the shortest possible time so as to avoid wasting resources investing in directions already explored in the past. Equally, new patent publications on a project of current interest can inform optimization decisions on an in house series.

Historical information, both in-house and published, is often available in electronic format. However, exploring the known SAR for a target can be a tedious and time-consuming exercise for the project team because of the volume of data involved.

Activity Atlas is a probabilistic method of analyzing the SAR of a set of aligned compounds as a function of their electrostatic, hydrophobic and shape properties. The method uses a Bayesian approach to take a global view of the data in a qualitative manner. Results are displayed using Forge’s visualization capabilities to gain a better understanding of the features which underlie the SAR of your compounds.

In this case study, the activity cliff summary method in Activity Atlas was used to find the critical SAR regions of a large data set of published Orexin 2 Receptor ligands. The objective was to understand the electrostatic, hydrophobic and shape features underlying receptor activity and to demonstrate the applicability of this method to the SAR analysis of large data sets.

Crystal structure of Suvorexant bound to the human Orexin 2 receptor

The Orexin system is composed of two widely expressed G-protein coupled receptors: Orexin 1 and Orexin 2 receptors (OX1R and OX2R, respectively), which respond to two peptide agonists (orexin-A and orexin-B) in the central nervous system to regulate sleep and other behavioral functions in humans3. The structure of Suvorexant4 (potent therapeutic inhibitor of the Orexin system) bound to human OX2R was recently solved5 at 2.5Å resolution. The X-ray structure (Figure 1) reveals how Suvorexant binds to OX2R adopting a pi-stacked horseshoe-like conformation deep in the orthosteric pocket, stabilizing a network of extracellular salt bridges and blocking transmembrane helix motions necessary for activation. Most of the ligand contacts involve van der Waals interactions or aromatic packing. Suvorexant’s tertiary amide carbonyl forms a strong hydrogen bond with Asn324 (Figure 1) and only a few other direct polar interactions with the OX2R binding site. Several water-mediated hydrogen bonds form bridges between Suvorexant and polar amino acids such as Asn324 and His350.

Figure 1. Crystal structure of Suvorexant bound to the human Orexin 2 receptor.

Data set

A large data set of approximately 400 compounds with available OX2R data (expressed as nM Ki) gathered from the ‘US patent’ data source was downloaded from BindingDB.6 These records cover patent information published between 2013 and 2014 by Janssen7 and Merck8.
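The activities are downloaded as Ki values in nM but discussed below as pKi. A minimal sketch of the standard conversion, together with the fold difference in Ki spanned by a pKi range, is shown here; the function names are illustrative only.

```python
import math

def pki_from_ki_nm(ki_nm):
    """pKi = -log10(Ki in mol/L); with Ki expressed in nM this is 9 - log10(Ki)."""
    if ki_nm <= 0:
        raise ValueError("Ki must be positive")
    return 9.0 - math.log10(ki_nm)

def fold_range(pki_low, pki_high):
    """Fold difference in Ki between two pKi values (one pKi unit is a 10-fold change)."""
    return 10 ** (pki_high - pki_low)
```

For example, a Ki of 1 nM corresponds to a pKi of 9, and a data set spanning pKi 5 to 9 covers a 10,000-fold range in Ki.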

The set is composed of two main chemical series: for both, the most potent compound was selected as a reference structure for that series (Table 1).

Conformation hunt and alignment of compounds

The two reference compounds in Table 1 were aligned to the published X-ray crystal structure of Suvorexant (PDB 4S0V) by field-based alignment within Forge following a ‘very accurate but slow’ conformation hunt:

  • Max number of conformations: 1000
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.

The 400 compounds in the data set were then aligned to the appropriate reference structure by maximum common substructure alignment following an ‘accurate but slow’ set-up for the conformation hunt:

  • Max number of conformations: 200
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.
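The two conformation hunt set-ups above differ only in the maximum number of conformations. The sketch below records them as plain settings objects and shows how an energy window and a conformer cap might be applied to a list of conformer energies. The class and function names are hypothetical, the values are copied as listed in the text, and duplicate removal by RMS is deliberately left out because it needs 3D coordinates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfHuntSettings:
    max_conformations: int
    rms_cutoff: float        # duplicate-conformer RMS cut-off, as listed in the text
    gradient_cutoff: float   # minimization gradient cut-off, as listed in the text
    energy_window: float     # kcal/mol above the lowest-energy conformer

VERY_ACCURATE_BUT_SLOW = ConfHuntSettings(1000, 0.5, 0.1, 3.0)
ACCURATE_BUT_SLOW = ConfHuntSettings(200, 0.5, 0.1, 3.0)

def select_conformers(energies, settings):
    """Indices of conformers inside the energy window, lowest energy first, capped at the maximum count."""
    if not energies:
        return []
    floor = min(energies)
    kept = [i for i, e in enumerate(energies) if e - floor <= settings.energy_window]
    kept.sort(key=lambda i: energies[i])
    return kept[:settings.max_conformations]

# Hypothetical conformer energies in kcal/mol
picked = select_conformers([12.4, 10.0, 13.5, 10.9], ACCURATE_BUT_SLOW)
```

Here the conformer at 13.5 kcal/mol lies 3.5 kcal/mol above the minimum and is discarded, while the other three are returned in order of increasing energy.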

‘Permissive’ maximum common substructure matching rules (in which substructure matches ignore element but take into account hybridization, so that for example cyclohexane matches morpholine but not benzene) were used for the alignment.
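The effect of the permissive rule can be illustrated with a toy comparison of six-membered rings. This is not a maximum common substructure search and not Forge's algorithm: atoms are reduced to hypothetical (element, hybridization) pairs, and two rings are said to match if their atoms agree position by position under some rotation or reflection.

```python
def atoms_match(a, b, permissive=True):
    """Atoms are (element, hybridization) pairs; permissive matching ignores the element."""
    if permissive:
        return a[1] == b[1]
    return a == b

def rings_match(ring_a, ring_b, permissive=True):
    """True if two rings of equal size match atom by atom under some rotation or reflection."""
    n = len(ring_a)
    if n != len(ring_b):
        return False
    for candidate in (ring_b, ring_b[::-1]):
        for shift in range(n):
            rotated = candidate[shift:] + candidate[:shift]
            if all(atoms_match(x, y, permissive) for x, y in zip(ring_a, rotated)):
                return True
    return False

cyclohexane = [("C", "sp3")] * 6
morpholine = [("O", "sp3"), ("C", "sp3"), ("C", "sp3"),
              ("N", "sp3"), ("C", "sp3"), ("C", "sp3")]
benzene = [("C", "sp2")] * 6
```

With permissive matching, cyclohexane matches morpholine because every ring atom is sp3, but it does not match benzene, whose atoms are sp2. With element-aware matching (`permissive=False`) cyclohexane and morpholine no longer match.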

Table 1. The data sets and reference structures included in this case study.

The use of a 3D similarity metric in Activity Atlas requires (as with 3D-QSAR) the generation of alignments for all compounds and can be sensitive to misalignment and alignment noise. For this reason, visual inspection of the alignments is recommended, to check that no anomalies are present and to improve sub-optimal alignments by manual intervention where needed. Accordingly, the alignments of some compounds were manually adjusted to increase consistency across the whole data set.

Activity Atlas models

Activity Atlas models are calculated following a probabilistic approach which takes into account the probability that a molecule is correctly aligned, rather than assuming that the top scoring or the selected preferred alignment is the correct alignment.

This is done by associating a weight with each alignment based on its similarity score. Alignments with similarity higher than a certain threshold (which can either be automatically calculated by Forge, or manually defined by the user) are fully trusted. Alignments with similarity lower than the low similarity threshold are not trusted and discarded. Linear scaling is applied to associate a proper weight to alignments which have an intermediate similarity score.

Likewise, a weight is also associated with each molecule based on its activity. Molecules whose activity is higher than a certain threshold (which again can either be automatically calculated by Forge, or manually defined by the user) are considered fully active. Molecules whose activity is lower than the low activity threshold are considered inactive. Molecules with intermediate activity are considered only partially active.
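The two weighting schemes can be sketched with a single piecewise-linear ramp. The threshold values below are placeholders chosen for illustration, not Forge defaults, and the use of linear scaling for partially active molecules is an assumption (the text only states linear scaling explicitly for alignment similarity).

```python
def ramp(value, low, high):
    """0 at or below `low`, 1 at or above `high`, linear in between."""
    if high <= low:
        raise ValueError("high threshold must exceed low threshold")
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

def alignment_weight(similarity, low=0.5, high=0.7):
    """Trust placed in an alignment, based on its similarity score (illustrative thresholds)."""
    return ramp(similarity, low, high)

def activity_weight(pki, low=6.0, high=8.0):
    """Degree to which a molecule counts as active (illustrative thresholds, assumed linear)."""
    return ramp(pki, low, high)
```

With these placeholder thresholds, an alignment with similarity 0.6 receives half weight, and a molecule with pKi 7.0 is treated as half active.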

Activity Atlas calculates and displays as 3D visualizations the:

  • ‘Activity cliff summary’: what do the activity cliffs tell us about the SAR?
  • ‘Average of actives’: what do active molecules have in common?
  • ‘Regions explored’: where have I been? For a new molecule, would making it increase our understanding?

The regions explored analysis also calculates a novelty score for each molecule.

In this case study, the activity cliff summary analysis was applied to both series with the objective of interpreting and understanding their SAR.

Results

Figure 2 shows the two reference compounds (grey) in Table 1 superimposed to the crystal structure of Suvorexant (pink) bound to the OX2 receptor. Both the Janssen and the Merck compounds superimpose very well with Suvorexant, with the tertiary amide carbonyl pointing towards Asn324 (which makes a hydrogen bond interaction with the corresponding carbonyl in Suvorexant). This may indicate a common binding mode for the two series of compounds.

Figure 2. Reference compounds for each data set (grey) superimposed to the crystal structure of Suvorexant (pink) bound to the OX2 receptor (PDB code 4S0V).
This hypothesis will be further explored with the activity cliff summary analysis of Activity Atlas.

Activity cliff summary analysis of the Janssen data set

The Janssen data set consists of 377 compounds spanning an OX2R pKi range from 5 to 8.5.

Figure 3. Activity cliff summary 3D maps for the Janssen data set, superimposed to the most active compound (OX2R pKi 8.5).

SAR of left side phenyl ring

The activity cliff summary analysis for this data set indicates that OX2R activity is increased by having a hydrophobic substituent on the ortho position of the left side phenyl ring, as shown by the favorable (green) areas in Figure 3.

The preferred decorations on this ring are characterized by a stronger positive (red) field at the edge of the ortho substituent, as well as by a stronger negative (cyan) field wrapping the meta and ortho substituents above and below the plane of the molecule.

As shown in Figure 4, large differences in the electrostatic fields surrounding the left side of the molecule are associated with dramatic changes in OX2R pKi.

Accordingly, the choice of an appropriate heterocycle and of electronegative groups such as small halogens for the decoration of the left side phenyl ring, which help create the right pattern of positive and negative electrostatic fields around the molecule, is crucial for modulating OX2R activity.

Figure 4. Electrostatic field differences for compound 362 compared to compound 614. Color coding: red, more positive fields; cyan, more negative fields.

SAR of right side aromatic ring

Steric bulk and hydrophobicity on the para position of the pyrimidine ring on the right side of the molecule are also beneficial for OX2R, as shown by the green areas in Figure 3.

Also in this case the choice of an appropriate decoration can have a dramatic impact on pKi, as shown in Figure 5 below.

Figure 5. Steric bulk and hydrophobicity (gold field) in para position of the pyrimidine ring are beneficial for OX2R activity.

Activity cliff summary analysis of the Merck data set

Figure 6. Activity cliff summary 3D maps for the Merck data set.

This data set comprises 34 compounds spanning four orders of magnitude in OX2R activity (pKi from 5 to 9). The activity cliff summary analysis maps, shown in Figure 6, are quite different from those of the Janssen set.

This is surprising, given the excellent fit to Suvorexant of both the Janssen and the Merck reference structures (Figure 2), which seems to indicate a common binding mode for the two series.

SAR of the left side phenyl ring

As for the SAR of the left side phenyl ring, the favorable positive and negative field areas associated with the ortho and meta substituents in the Janssen data set no longer appear, even though steric bulk/hydrophobicity in this position are still favorable (green areas).

There is instead a strong signal related to the para position. Bulky substituents in para are detrimental for OX2R activity (magenta areas), while those associated with a stronger positive (or weaker negative) field, shown in red, are beneficial for activity.

For example, as shown in Figure 7 below, compound 397 (para-Cl) is more active than 389, carrying a bulkier OEt which enters into the area of unfavorable (magenta) shape.

Figure 7. Steric bulk in the para position is detrimental for OX2R activity. Color coding: Magenta, unfavorable shape.

Compound 397 (para-Cl) is again more active than compound 387 (Figure 8), decorated with a CN which is only slightly bulkier than Cl, but associated to a more negative field.


Figure 8. A more negative field in the para position is detrimental for OX2R activity. Color coding: red, more positive fields; cyan, more negative fields.
However, a more in-depth analysis of the Merck data set shows that only three different heterocyclic substituents have been explored in the ortho position of the left-side phenyl ring, namely pyridine, triazole and oxadiazole, with the latter tried in only one compound. As can be seen from the electrostatic field difference maps in Figure 9A, pyridine and triazole generate very similar electrostatic fields and, not surprisingly, are associated with similar OX2R pKi values. The same substituents were also explored in the Janssen set, where they were likewise not associated with any significant change in activity. Pyrimidine and oxadiazole instead generate slightly more different electrostatic fields (Figure 9B), and this is reflected in a more marked difference in OX2R pKi in the Merck series.


Figure 9A. Electrostatic field differences for compound 404 and compound 390.
Figure 9B. Electrostatic field differences for compound 407 and compound 408.
Color coding: red, more positive fields; cyan more negative fields.
It is possible that a positive field area is beneficial for activity for both series, but that in the Merck set SAR this effect cannot be appreciated for lack of sufficient chemical exploration.

SAR of the right side pyridine

Going back to the activity cliff summary maps in Figure 6, the lack of SAR on the right side pyridine is to be expected, as this substituent does not vary across the data set.

SAR of the tertiary amide carbonyl

The favorable negative field area associated with the tertiary amide carbonyl of the Merck compounds is consistent with the fact that in Suvorexant bound to OX2R this carbonyl makes an H-bond interaction with Asn324. Assuming a similar binding mode for the Merck compounds, decorations on the left side of the molecule which strengthen the negative field associated with the C=O should be beneficial for activity.

Conclusions

Activity Atlas is a new method for summarizing the SAR for a series into a visual 3D model that can be used to inform new molecule design. In this case study, Activity Atlas proved to be an invaluable tool for quickly summarizing, analyzing and understanding the SAR of a large collection of compounds gathered from US patent information.

In particular, the Activity Atlas activity cliff summaries highlighted, in a quick and highly visual manner, both the commonalities across the series and the dissimilarities in SAR potentially related to minor changes in the binding mode.
Both represent important information for deciding future directions for the project: the commonalities potentially highlight areas of chemical exploration so far unexploited, while the dissimilarities may offer a way of further refining the overall profile of the chemical series of interest.

References and links

1. http://www.cresset-group.com/activity-atlas/
2. http://www.cresset-group.com/products/forge/
3. Li, J., et al., Br. J. Pharmacol. 171, 332-350 (2014)
4. Winrow, C.J., et al., Br. J. Pharmacol. 171, 283-293 (2014)
5. Yin, J., et al., Nature 519, 247-250 (2015)
6. https://www.bindingdb.org
7. US Patent 8,653,263 B2
8. US Patent 2013/0102619 A1

Displacing crystallographic water molecules with Spark

Abstract

Cresset’s Spark1 software for bioisosteric replacement was used to carry out a water displacement experiment starting from the X-ray crystal structure of a selective inhibitor of Bruton’s tyrosine kinase2. The use of databases derived from available reagents ensured that the results could be tethered to molecules that were readily synthetically accessible. The availability of a sufficiently diverse source of reagents was crucial in demonstrating the feasibility of this approach.

Introduction

Bruton’s tyrosine kinase (Btk) is a member of the Tec family of non-receptor tyrosine kinases. Recent literature findings2 indicate that Btk inhibition could be an attractive approach for the treatment of autoimmune diseases such as rheumatoid arthritis, a progressive autoimmune disease characterized by swelling and erosion of the joints3.

A fragment-based drug design approach was recently2 applied to the discovery of potent, non-covalent Btk inhibitors with Lck selectivity (Lymphocyte-specific protein tyrosine kinase, a target playing a key role in T-cell activation).

Among the most interesting hits identified with this approach, compound 2 (Table 1) was selected for further optimization. Position 8 of the cinnoline ring of fragment 2 was explored using the Suzuki−Miyaura4 synthetic methodology, starting from a series of monocyclic boronic acids/esters. This initial SAR exploration led to the discovery of compound 8 (Table 1), which shows improved potency and selectivity with respect to fragment 2.

The published X-ray crystal structure of compound 8 in the active site of Btk (PDB 4ZLZ) shows a water-mediated hydrogen bond from the pyridyl nitrogen to the P-loop backbone residues Phe413 and Gly414 of Btk2 (Figure 1 – left). The replacement of 4-methylpyridin-3-yl in compound 8 with small bicyclic heterocycles displacing the water molecule and making direct H-bond interactions with the P-loop led to the discovery of compounds 10 and 11 (Table 1), with a 10-fold improved potency towards Btk.

The 3D structure of compound 8 and the bridging water molecule were used as the starting point for this Spark case study. The aim of this experiment is to verify whether our methodology is able to displace the bridging water molecule and correctly identify the same alternative indazole fragments.

Table 1. SAR exploration of fragment hit 2

Spark reagent databases: accessing available chemical diversity

Spark’s approach to scaffold hopping and R-group replacement uses Cresset’s field-based technology5 6 to identify viable replacements for a selected portion of a reference compound using a series of fragments. In this case study we chose to use standard reagent databases7 supplied by Cresset which are based on the available chemicals directory. This gives the opportunity to rapidly search all R-groups that could be introduced at a selected position. However, an optional Database Generator module enables the creation of fragment databases that are derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment.

Method

The published X-ray crystal structure of compound 8 bound in the active site of Btk (PDB 4ZLZ) was downloaded into Forge8. The structure of the ligand was minimized and then combined with the water molecule mediating the H-bond interaction with the P-loop backbone residues of Btk to make a single molecule entry. The two 3D structures were merged using the ‘combine selected pair into single molecule’ feature available in Forge. The single entry thus created (see Figure 1 – right) was used as the Starter molecule for the Spark experiment (Figure 2 – left).

In this water displacement experiment, we want the Spark search to be driven mainly by the electrostatic fields, rather than by the usual combination of fields and shape.

For this reason a constraint was added to the negative and positive field points of the water molecule using the Spark Field Constraints Editor (Figure 2 – right). This introduced a score penalty for those results that did not match the constrained field points.

Furthermore, the ‘Normal’ conditions for scoring the Spark search results were fine-tuned to 90% Field and 10% shape, using the Btk protein as a ‘hard’ excluded volume, to constrain the size of the potential replacement fragments.
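To make the scoring set-up concrete, the sketch below blends a field similarity and a shape similarity with a 90/10 weighting and then applies a penalty for each unmatched field constraint. It is only an illustration under stated assumptions: the penalty size, the treatment of a ‘hard’ excluded volume as outright rejection, and all function and parameter names are hypothetical and do not describe Spark's internal scoring.

```python
def combined_score(field_sim, shape_sim, field_weight=0.9,
                   unmatched_constraints=0, penalty_per_constraint=0.05,
                   clashes_excluded_volume=False):
    """Blend field and shape similarity, then apply hypothetical penalties."""
    if not 0.0 <= field_weight <= 1.0:
        raise ValueError("field_weight must lie between 0 and 1")
    if clashes_excluded_volume:
        return 0.0  # assumption: a clash with the 'hard' excluded volume rejects the result
    score = field_weight * field_sim + (1.0 - field_weight) * shape_sim
    score -= penalty_per_constraint * unmatched_constraints  # assumed fixed penalty per constraint
    return max(score, 0.0)
```

With a field similarity of 0.8 and a shape similarity of 0.6, the 90/10 blend gives 0.78; under the assumed penalty, missing both constrained field points of the water molecule would lower this to 0.68.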


Figure 1. Left: X-ray crystal structure of compound 8 in the active site of Btk making a water mediated hydrogen bond with the P-loop backbone. Right: 3D structures and field points of compound 8 and of the bridging water molecule combined into a single entry.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

The gradient cutoff for minimization was set to 0.200 kcal/mol/Å. At the same time, the automatic constraint on fragment size was removed to ensure that the results of the search were not too biased by the size of the starter molecule.

Finally, to focus the experiment on small bicyclic heterocycles, monocyclic fragments were filtered out from the list of potential results using an appropriate SMARTS filter.
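The SMARTS pattern used for this filter is not given. As a stand-in, the sketch below removes monocyclic fragments by counting independent rings directly from the bond graph (bonds minus atoms plus connected components). This is a different technique from a SMARTS filter, shown only to illustrate the intent; the fragment encoding and function names are hypothetical.

```python
def ring_count(n_atoms, bonds):
    """Number of independent rings: bonds - atoms + connected components."""
    parent = list(range(n_atoms))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        parent[find(a)] = find(b)
    components = len({find(i) for i in range(n_atoms)})
    return len(bonds) - n_atoms + components

def keep_bicyclic_or_larger(fragments):
    """Names of fragments with at least two rings; fragments map name -> (n_atoms, bonds)."""
    return [name for name, (n_atoms, bonds) in fragments.items()
            if ring_count(n_atoms, bonds) >= 2]

# Heavy-atom topologies only: a six-membered ring, and a 6,5-fused bicycle sharing bond 4-5
six_ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
pyridine = (6, six_ring)
indazole = (9, six_ring + [(4, 6), (6, 7), (7, 8), (8, 5)])
kept = keep_bicyclic_or_larger({"pyridine": pyridine, "indazole": indazole})
```

A topology-only count like this cannot distinguish fused bicycles from two separate rings joined by a bond, which is one reason a substructure pattern is the more precise tool in practice.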

Two runs of Spark were carried out using the above conditions. The initial experiment was run on a database of 775 boronic acids to closely replicate the chemistry used in the original publication2, 4.

Figure 2. Left: the combined 3D structures of compound 8 and the bridging water molecule used as a starter molecule in the Spark experiment. Right: constraints associated to the field points of the water molecule.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

In the second experiment, the ZINC7 database of commercial aromatic halides (41K fragments) was also searched to explore a larger chemical diversity, starting from the assumption that the appropriate boronic acid/ester could be obtained from any interesting commercial aryl halide at the cost of an additional synthetic effort.

Results

The top scoring compound from the initial search (boronic acids only) is compound 10 (Table 1). As can be seen in Fig. 3 – right, this compound superimposes very well with the starter molecule and matches the constrained field points in a satisfactory manner. However, compound 11, which would presumably superimpose even better with the conformation of the ortho-methyl-pyridin-3-yl group of compound 8, was not found in this search, due to the limited chemical diversity of the database searched.

In the second Spark search, which was run on a much larger collection of reagents (boronic acids and aryl halides), compound 11 (Fig. 3 – center and Fig. 4) is the top scoring result, while compound 10 ranks 4th in the list (Fig. 4).

The original paper2 also reports the indole-substituted compound 9 (Table 1), quite similar in terms of 2D structure to the much more potent indazole compounds 10 and 11. This fragment is available in both the databases searched, but is not retrieved by Spark. The indole fragment in fact cannot match the constrained negative field point of the bridging water molecule, as shown in Fig. 5, where compound 9 is shown superimposed to the starter molecule in Forge. The lack of this relevant interaction explains the much lower potency of compound 9, with a Btk IC50 = 850nM (Table 1).

Figure 4 shows a tile view of the 16 top scoring results from the second Spark experiment. Several different flavors of the indazole fragment carrying different substitution patterns are represented in this list. Alternative bicyclic fragments are also proposed, which may provide useful ideas for a further exploration of this target.

Figure 3. Left: electrostatics of starter molecule. Center: compound 11 (Btk IC50 = 4.0 nM). Right: compound 10 (Btk IC50 = 12 nM)
Color coding of fields/field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.


Figure 4. Tile view of the top scoring Spark results for the second experiment.


Figure 5. Compound 9 (right) superimposed to the starter molecule of the Spark experiment (left).

Conclusions

In this case study Spark successfully managed to displace the crystallographic water molecule bridging the interaction between compound 8 and the P-loop of Btk, replacing it with small, synthetically accessible bicyclic heterocycles.

Availability of appropriate sources of chemical diversity is still a key factor in determining the success of any bioisosteric replacement experiment.

For this reason, the creation of fragment databases derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment, is highly recommended.

References and links

1. http://www.cresset-group.com/products/spark/
2. Smith, C. R. et al., J. Med. Chem. 2015, 58, 5437−5444
3. Firestein, G. S., Nature 2003, 423 (6937), 356−361
4. Miyaura, N., Suzuki, A. et al., J. Am. Chem. Soc. 1989, 111 (1), 314−321
5. J. Chem. Inf. Model., 2006, 46, 665-676.
6. http://www.cresset-group.com/science/field-technology/
7. Spark fragment databases come from commercial compounds, ChEMBL, ZINC and VEHICLe.
8. http://www.cresset-group.com/products/forge/