New release of Spark databases

A new release of the Spark™ fragment and reagent databases is now available for download, to accompany the release of Spark 10.5. These are designed to provide you with an excellent source of new bioisosteres, whilst also ensuring that the results of your Spark experiment are tethered to molecules which are readily synthetically accessible.

Fragment Databases

In this release we have made significant additions to the sources of our fragment databases. The new Spark ‘Commercial’ databases (replacing the previous ZINC fragments) combine ZINC15 with the eMolecules Screening Compounds and include significant new chemical diversity.

The Spark ‘ChEMBL’ databases have also been updated and are based on release 23 of ChEMBL.

In all cases, the compounds in the entire source collection were filtered to remove potentially toxic or reactive fragments. They were then fragmented, and the frequency with which each fragment appeared in the original source database was annotated. The fragments were then sorted by frequency and labelled according to the number of bonds that were broken to obtain the fragment, as shown in the table below.

Spark Category | Database | Total number of fragments (to nearest 1000) | Frequency
Commercial | VeryCommon | 64,000 | Fragments which appear in more than 650 molecules
Commercial | Common | 137,000 | Fragments which appear in 140-649 molecules
Commercial | LessCommon | 256,000 | Fragments which appear in 35-139 molecules
Commercial | Rare | 401,000 | Fragments which appear in 12-34 molecules
Commercial | VeryRare | 675,000 | Fragments which appear in 5-11 molecules
Commercial | ExtremelyRare | 749,000 | Fragments which appear in 3-4 molecules
ChEMBL | Common | 306,000 | Fragments which appear in more than 6 molecules
ChEMBL | Rare | 506,000 | Fragments which appear in 2-6 molecules
ChEMBL | Very rare | 570,000 | Fragments which appear in a single molecule
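The frequency bands in the table can be expressed as a simple lookup. The sketch below is illustrative only: the function name `commercial_category` is hypothetical, the thresholds are copied from the ‘Commercial’ rows above, and the handling of a fragment that appears in exactly 650 molecules (which the table does not specify) is an assumption.

```python
from bisect import bisect_right

# Inclusive lower bounds for each 'Commercial' category, copied from the table.
# Assumption: exactly 650 molecules counts as VeryCommon, because the table
# leaves that boundary case unspecified.
COMMERCIAL_BINS = [
    (3, "ExtremelyRare"),
    (5, "VeryRare"),
    (12, "Rare"),
    (35, "LessCommon"),
    (140, "Common"),
    (650, "VeryCommon"),
]


def commercial_category(molecule_count):
    """Return the category for a fragment seen in `molecule_count` molecules.

    Returns None for counts below 3, which the table does not cover.
    """
    lower_bounds = [bound for bound, _ in COMMERCIAL_BINS]
    index = bisect_right(lower_bounds, molecule_count)
    return COMMERCIAL_BINS[index - 1][1] if index > 0 else None


if __name__ == "__main__":
    for count in (2, 4, 11, 34, 139, 649, 5000):
        print(count, commercial_category(count))
```

Keeping the bounds in a single sorted list means the bands can be adjusted in one place if a future database release uses different cut-offs.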

 

Overall, the new Spark databases include over 3 million fragments which can be used to identify novel bioisosteres for your project. Figure 1 plots the number of fragments in each database per connection point count.


Figure 1: Count of fragments in Spark ‘Commercial’ (from ZINC15 and eMolecules’ Screening Compounds) and ‘ChEMBL’ (from ChEMBL23) databases split by the number of connection points of each fragment.

An analysis of the numbers of fragments in common between the ‘Commercial’ and ‘ChEMBL’ Spark databases (expressed as percent overlap with respect to ChEMBL) reveals that the databases overall show an excellent level of complementarity.

% overlap with ChEMBL | Very Common | Common | Less Common | Rare | Very Rare | Extremely Rare
ChEMBL common | 16% | 18% | 15% | 10% | 8% | 4%
ChEMBL rare | 1% | 5% | 9% | 10% | 10% | 7%
ChEMBL very rare | 0% | 2% | 4% | 6% | 7% | 5%

Not surprisingly, the most common fragments in each database overlap significantly. However, the majority of ‘rare’ fragments appear to be unique to each database, showing that the original ZINC plus eMolecules’ Screening Compounds and the ChEMBL collections occupy quite distinct parts of chemical space.
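To make the overlap figures concrete, here is a minimal sketch of how such a percentage could be computed. It assumes that each fragment is represented by a canonical string and that ‘with respect to ChEMBL’ means the ChEMBL fragment count is the denominator. The function name and the toy fragment strings are hypothetical and are not taken from the Spark databases.

```python
def percent_overlap(reference, other):
    """Percentage of fragments in `reference` that also occur in `other`."""
    reference, other = set(reference), set(other)
    if not reference:
        return 0.0
    return 100.0 * len(reference & other) / len(reference)


if __name__ == "__main__":
    # Toy identifiers standing in for canonical fragment strings (hypothetical).
    chembl_common = {"frag_a", "frag_b", "frag_c", "frag_d"}
    commercial_very_common = {"frag_a", "frag_x", "frag_y"}
    print(percent_overlap(chembl_common, commercial_very_common))
```

Because the denominator is the reference set, the measure is not symmetric: swapping the two arguments generally gives a different percentage.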

Reagent databases

Monthly updates of the Spark reagent databases, derived from the eMolecules building blocks using an enhanced set of rules for chemical transformation, will continue with this release. The February edition includes over 500,000 reagents with up-to-date availability information, to make it easy for you to move from the results of a Spark experiment to ordering the reagents you require to turn these results into reality.

The number of fragments in each reagent database is plotted in Figure 2.


Figure 2: Number of fragments in the Spark eMolecules reagent databases.
Each fragment in the eMolecules database is linked back to both the eMolecules ID for the source reagent and its availability. The advanced filtering capabilities in Spark (Figure 3) make it very easy to choose the optimal set of reagents for your experiment based on the Spark similarity score, preferred chemistry (as encoded by the reagent database which generated the result), availability information and overall physico-chemical profile of the result molecules.


Figure 3: Spark reagent results include availability information from eMolecules.
The eMolecules IDs for the favorite reagents can be easily exported from Spark and used to purchase the compounds from the eMolecules building blocks database, as shown in the web clip How to use the eMolecules reagents databases in Spark.

Create your own database

Spark fragment and reagent databases provide an excellent source of new bioisosteres. However, if you have access to significant proprietary chemistry or to specialized reagents, or simply want to consider only fragments from reagents that you have in stock, then the creation of custom databases will add value to your Spark experiments.

The Spark Database Generator is a user-friendly interface within Spark that lets you easily create custom databases.


Figure 4: The Spark Database Generator.

Conclusion

This release of the fragment databases significantly increases the chemical diversity available to Spark users, while the monthly updates of the reagent databases ensure that the results of your Spark experiment are tethered to molecules which are readily synthetically accessible.

We are confident that these new and updated Spark fragment and reagent databases, combined with databases from your corporate collections generated with the Spark Database Generator, will provide an even better range of bioisosteres for your project.

Please contact us to update to the latest databases, to request access to the Spark Database Generator, or to find out how Spark can impact your project.

Spark V10.5 release offers improved usability and flexibility, and new science

I am delighted to announce the release of a new version of Spark™, our scaffold hopping and bioisostere replacement tool. V10.5 focuses on advanced workflows and improved database management, and also includes new science and many new and improved features.

The most interesting new features are presented below, and I encourage you to try this new release for yourself to see them in action.

Highlights

  • New wizards to support ligand growing and linking, macrocyclization and water replacement experiments
  • Enhanced Spark database update functionality
  • New pharmacophore constraints
  • Enhancements in search algorithm and advanced options.

New Spark wizards

The new Spark wizards will help you set up advanced bioisostere replacement experiments in a user-friendly and scientifically robust manner.


Figure 1. The new Spark project wizard.
The ‘Ligand Growing Experiment’ wizard (Figure 2) can be used to grow a starter molecule into new space, guided by existing ligands mapping a different region of the same active site. This was possible in previous versions of Spark (see case study Using Spark Reagent Databases to Find the Next Move) but the new wizard makes the workflow easier and more accessible.

The ‘Water Replacement’ wizard (Figure 2) can be used to search for a group which will displace a crystallographic water molecule near your ligand. Again the new wizard significantly improves the workflow for this popular Spark experiment that we have detailed previously (see case study Displacing crystallographic water molecules with Spark).


Figure 2. Results of ligand growing and water replacement Spark experiments.
The ‘Join Two Ligands’ (to find a linker that joins two ligands sitting in the same active site) and the ‘Macrocyclization’ (to cyclize a molecule by joining two atoms with a linker) wizards are new workflows, which we have been testing internally in the last few years. Full case studies for these workflows are in preparation, but you can see examples of the results you can obtain in Figure 3.


Figure 3. Results of joining two ligands and macrocyclization Spark experiments.
In developing the wizards, a number of additional features to support these advanced experiments have been fine-tuned. These include:

  • New ‘Ligand Growing’ and ‘Ligand Joining / Macrocyclization’ calculation methods to support ligand growing, ligand joining, macrocyclization and water replacement experiments
  • A starter molecule sitting within a protein’s active site can now be downloaded directly from the RCSB
  • Hydrogen atoms can now be replaced in the starter molecule
  • New ‘merge’ functionality to merge molecules in the wizards (when appropriate) and in the Manage/Edit References dialog
  • The starter molecule region selector now includes both 2D and 3D display.

These advanced bioisostere replacement experiments work well because of Spark’s product-centric approach. In Spark, result molecules are compared to the starter molecule, and a similarity score is calculated, only after the new molecules have been formed and minimized and their fields and field points re-created. Combined with the power of the Cresset XED force field, this approach avoids the limitations of scoring fragments in isolation and accounts for neighboring group effects, including the electronic influence of the replaced moiety on the rest of the molecule and vice versa. This enables Spark to work with a higher level of precision.

Enhanced Spark database update functionality

Spark V10.5 offers considerable improvements to the Spark database update functionality, to make this process more user friendly and efficient.

The Spark search dialog now alerts you when an updated version of the databases is available, before you start a search. Databases which need updating are marked by a green icon (Figure 4), and if these databases are selected for the Spark search, an ‘Update’ link appears at the bottom of the window, taking you directly to the Spark Database Updater. You will find this particularly beneficial if you use the eMolecules reagent databases, where our monthly release schedule provides you with the latest availability information.


Figure 4. The Spark search dialog now alerts users when an updated version of the databases is available. If the databases which need updating are selected for the search, a link appears at the bottom of the window which takes you directly to the Spark Database Updater.
The Spark Database Updater has also been improved. The databases are now categorized as ‘Cresset’ and ‘Cresset reagents’, to make it easy to locate those you wish to update (Figure 5). Furthermore, you can now download or update all the displayed databases in one go by clicking the ‘Install or Update Displayed Databases’ button at the bottom of the window.


Figure 5. The Spark Database Updater now shows different categories of Spark databases and includes a button to install or update all the displayed databases in one go.
Finally, a new ‘sparkdbupdate’ binary is available to enable you to update all or selected Spark databases from the command line.

What’s new in Spark searches

A significant number of improvements and new features to the Spark searches have been made in this release.

Field and pharmacophore constraints

Field and pharmacophore constraints can be used to bias the Spark search towards results which fit the known SAR or your expectations, by introducing a penalty which down-scores results that do not satisfy the constraint.
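As a rough illustration of how a penalty can down-score results, the sketch below subtracts a fixed amount from the similarity score for each unsatisfied constraint and then re-ranks the results. The functional form, the penalty size and all names are assumptions made for this example; the actual penalty used by Spark is not described here.

```python
def penalized_score(similarity, constraints_satisfied, penalty_per_violation=0.1):
    """Down-score a result by a fixed amount per unsatisfied constraint.

    `constraints_satisfied` is a list of booleans, one per constraint.
    The linear penalty is a hypothetical choice for illustration.
    """
    violations = sum(1 for ok in constraints_satisfied if not ok)
    return max(0.0, similarity - penalty_per_violation * violations)


def rank_results(results):
    """Sort (name, similarity, constraints_satisfied) tuples, best score first."""
    return sorted(
        results,
        key=lambda item: penalized_score(item[1], item[2]),
        reverse=True,
    )


if __name__ == "__main__":
    results = [
        ("result_A", 0.82, [False]),  # higher raw score, misses the constraint
        ("result_B", 0.78, [True]),   # lower raw score, satisfies the constraint
    ]
    for name, similarity, satisfied in rank_results(results):
        print(name, round(penalized_score(similarity, satisfied), 2))
```

In this toy case the result that satisfies the constraint overtakes the one with the higher raw similarity, which is the kind of bias a constraint is intended to introduce.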

Field constraints enable you to specify that a particular type of field must be present in the Spark result. For example, you may want to constrain a positive field where you want an interaction. Note that such a constraint can be matched both by H-bond donors and by other electropositive features, such as the aromatic hydrogens of the compound in Figure 6 – right.

Pharmacophore constraints, new in Spark V10.5, ensure that the desired pharmacophore features are matched by an atom of a similar type in the Spark results. In Figure 6, a pharmacophore constraint was introduced to ensure that all Spark results contain a H-bond acceptor.

While field and pharmacophoric constraints are a powerful way of fine tuning Spark results, we recommend that they are used sparingly, as they introduce a bias in your experiment. For example, introducing a pharmacophore constraint on the indazole NH of the PDB 4Z3V ligand in Figure 6 – left would not have matched the aromatic hydrogens of the active ligand in Figure 6 – right.


Figure 6. Left: Ligand from PDB: 4Z3V with pharmacophore and field constraints. Right: Active BTK ligand which satisfies both constraints.
Read more about field and pharmacophore constraints in the Forge V10.5 and Blaze V10.3 release announcements.

Enhancements to the Spark search algorithm

Spark V10.5 also includes enhancements to the search algorithm and associated advanced options, for example:

  • New functionality to weigh specific fields independently when scoring
  • New similarity metrics to provide alternate scoring methods for the alignments
  • New widget for adding field and pharmacophore constraints
  • New ‘Flexibility’ filter which can be applied (together with SlogP and TPSA filters) to the whole molecule when performing a Spark search, to limit the results to the desired physico-chemical space.
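To illustrate what weighting fields independently can mean in practice, the sketch below combines per-field similarity values into a single score using a weighted average. The combination rule, the field names (borrowed from the positive, negative, hydrophobic and vdW field types) and the function name are assumptions for illustration, not a description of Spark's scoring function.

```python
def weighted_field_similarity(per_field, weights=None):
    """Weighted average of per-field similarities.

    `per_field` maps a field name to a similarity in [0, 1].
    `weights` maps a field name to a non-negative weight (default 1.0 each).
    """
    weights = weights or {}
    total_weight = sum(weights.get(name, 1.0) for name in per_field)
    if total_weight == 0:
        raise ValueError("at least one field must have a non-zero weight")
    weighted_sum = sum(weights.get(name, 1.0) * sim for name, sim in per_field.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    similarities = {"positive": 0.8, "negative": 0.6, "hydrophobic": 0.4, "vdw": 0.2}
    print(weighted_field_similarity(similarities))                     # equal weights
    print(weighted_field_similarity(similarities, {"positive": 2.0}))  # emphasize positive field
```

Doubling the weight of one field pulls the combined score towards that field's similarity, so results that reproduce it well rise in the ranking.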

Other new features and improvements

This Spark V10.5 release also includes a variety of additional new functionalities and improvements to the interface (Figure 7). These include:

  • New ‘Send to Flare™’ functionality to send either all results, favorite results or selected results to Flare, including as appropriate the starter and reference molecule(s) and the protein
  • New Storyboard window, to capture scenes recording all details from the 3D window that can be easily recalled when needed, including capability to annotate and rename scenes
  • New stereo view functionality
  • New support for touch screen displays and HiDPI
  • New Flexibility column in Spark Results tables
  • Improved performance of Spark database generation
  • Improved ‘View Parent Structure for Selected Result’ functionality now including both substructure and identity search
  • Improved ‘Grid’ button functionality
  • Improved display of protein ribbons, offering a choice of different ribbon styles and capability to show ribbons for the active site only
  • Improved look and feel of the GUI with re-designed toolbars and updated and clearer icons for a more modern and sleek interface.


Figure 7. The Spark V10.5 GUI.

Try Spark V10.5

This release represents a significant improvement in the usability and flexibility of the leading bioisostere application. I encourage you to upgrade your version of Spark at your earliest convenience, and to download the keyboard shortcut guides for Spark V10.5 and Spark V10.5 Molecule Editor.

If you are not currently a Spark customer, please request a free evaluation.

Contact us if you have queries relating to this release.

Forge V10.5 release delivers new functionality for molecule alignment, and more…

V10.5 of Forge™, the powerful computational chemistry suite for understanding structure activity relationship (SAR) and design, is now available. This release introduces significant enhancements to molecule alignment, plus the new Conformation Explorer, to visualize and inspect conformational populations. Also included are a large number of GUI styling and usability improvements.

Improved molecule alignment

Molecule alignment is the core experiment in Forge. It is key to developing robust qualitative or quantitative SAR models, building FieldTemplater pharmacophore hypotheses, understanding the design of new compounds, and running small scale virtual screening experiments (for larger scale virtual screening use Blaze). V10.5 enables fine-tuning of alignment results by introducing appropriate constraints, an optimized substructure alignment algorithm, and new similarity scoring options.

Field and pharmacophore constraints

Field and pharmacophore constraints bias the alignment algorithm by introducing a penalty which down-scores results that do not satisfy the constraint. This provides you with a mechanism for ensuring that the results that you get from your alignment experiment fit with the known SAR or with your expectations.

With field constraints, you can specify that a particular type of field must be present in the aligned molecule. For example, you may want to constrain a positive field where you want an interaction. Note that such a constraint can be matched both by H-bond donors and by other electropositive features, such as the aromatic hydrogens in the example below.

V10.5 introduces the new pharmacophore constraints, which ensure that your desired pharmacophore features (e.g., Donor H, Acceptor, Cation, Anion) are matched by an atom of a similar type in the alignment results. A pharmacophore constraint can be used when you are certain that a particular interaction requires transfer of electrons (as in H-bonding or metal binding) in addition to the electrostatic character of the interaction.

Pharmacophore constraints introduce a tighter constraint on the alignment than a field constraint. Where field constraints allow matches across chemical features, pharmacophore constraints are limited to matching specific functional groups (e.g., specific donor-acceptor interactions): alignments that do not place a suitable atom on top or close to the constrained atom cause a penalty to be applied to the score. However, pharmacophore constraints in Forge V10.5 go beyond traditional H-bond donor/acceptor definitions to include, for example, covalent centres and metal binding motifs giving the ability to ensure that key warheads always align in the correct positions.
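The idea of requiring ‘a suitable atom on top or close to the constrained atom’ can be sketched as a type-and-distance check. In the example below, the `Feature` class, the feature labels, the 1.0 Å tolerance and the function name are all hypothetical choices for illustration; Forge's own matching rules and tolerances are not given here.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    kind: str    # e.g. "DonorH", "Acceptor", "Cation", "Anion"
    xyz: tuple   # (x, y, z) coordinates in Angstrom


def constraint_satisfied(constraint, aligned_features, tolerance=1.0):
    """True if an aligned feature of the same kind lies within `tolerance`
    Angstrom of the constrained feature (tolerance value is an assumption)."""
    return any(
        feature.kind == constraint.kind
        and math.dist(feature.xyz, constraint.xyz) <= tolerance
        for feature in aligned_features
    )


if __name__ == "__main__":
    donor_constraint = Feature("DonorH", (0.0, 0.0, 0.0))
    aligned = [
        Feature("Acceptor", (0.0, 0.0, 0.2)),  # close, but the wrong type
        Feature("DonorH", (0.0, 0.5, 0.0)),    # right type, within tolerance
    ]
    print(constraint_satisfied(donor_constraint, aligned))
```

A field constraint, by contrast, would accept any feature that produces the required field at that position, which is why it matches a wider range of chemistry.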

While field and pharmacophoric constraints are a powerful way of fine tuning alignment results, we recommend that they are used sparingly, as they introduce a bias in your Forge experiment. For example, introducing a pharmacophore constraint on the indazole NH of the PDB 4Z3V ligand in Figure 1 – left would not have matched the aromatic hydrogens of the active ligand in Figure 1 – right.


Figure 1. Left: Ligand from PDB: 4Z3V with pharmacophore and field constraints. Right: Active BTK ligand which satisfies both constraints.

Improved alignment and scoring

Enhancements to alignment and scoring, accessed from the advanced options panel, include:

  • Option to require full ring matches, and to bias the alignment towards a specific substructure specified by a SMARTS pattern, in the maximum common substructure alignment algorithm
  • New functionality to weigh specific fields independently when scoring
  • New similarity metrics to provide alternate scoring methods for the alignments
  • New widget for adding field and pharmacophore constraints.
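The release notes do not name the new similarity metrics, so the following sketch simply shows how two metrics widely used in molecular similarity work, Tanimoto and Dice, give different scores from the same overlap values. Here `overlap_ab` is the overlap between molecules A and B, and `self_a` and `self_b` are their self-overlaps. The choice of metrics and all names are assumptions for illustration only.

```python
def tanimoto(overlap_ab, self_a, self_b):
    """Tanimoto similarity from a cross-overlap and two self-overlaps."""
    return overlap_ab / (self_a + self_b - overlap_ab)


def dice(overlap_ab, self_a, self_b):
    """Dice similarity from a cross-overlap and two self-overlaps."""
    return 2.0 * overlap_ab / (self_a + self_b)


if __name__ == "__main__":
    # Toy overlap values (arbitrary units), not taken from any real alignment.
    overlap_ab, self_a, self_b = 5.0, 10.0, 10.0
    print("Tanimoto:", tanimoto(overlap_ab, self_a, self_b))
    print("Dice:    ", dice(overlap_ab, self_a, self_b))
```

Both metrics equal 1 for identical molecules, but Dice is more forgiving of partial overlap, so switching metric can change how alignments are ranked.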

New Conformation Explorer

Molecular conformations are central to Forge. The conformation hunter does a good job of generating a diverse range of energetically accessible conformations. V10.5 gives you the opportunity to more easily inspect the conformations generated for your molecules, enabling you to interact with and edit the populations.

In the new Conformation Explorer, you can inspect a set of conformations with respect to energies, measured distances/angles/torsions, as well as calculate the CSD torsion frequency for each rotatable bond to assess the feasibility of the generated conformations.

Conformations are listed in order of increasing relative conformational energy. Unrealistic conformations or those which are not deemed interesting can be selected and removed from the conformation population for that molecule. Preferred conformations can be promoted to the reference role in Forge with the click of a button.

CSD torsion frequencies can be calculated for all rotatable bonds. These are based on the Torsion Library which contains hundreds of rules for small molecule conformations derived from the Cambridge Structural Database (CSD) and curated by molecular design experts. CSD torsion frequencies are useful to highlight cases where the torsion angle in a calculated conformation is not one that is frequently observed in the CSD, and accordingly is a possible cause for concern.

Distances, angles and torsions can be measured for each conformation and those values can be used for filtering or generating a histogram plot.
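For readers who want to reproduce such measurements outside Forge, the sketch below computes a distance, a bond angle and a torsion from Cartesian coordinates using standard vector formulas. The function names are hypothetical and the code is independent of Forge's own implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(p, q):
    """Distance between two points."""
    return math.dist(p, q)


def angle(p, q, r):
    """Angle p-q-r in degrees, with q at the vertex."""
    v1, v2 = _sub(p, q), _sub(r, q)
    cos_angle = _dot(v1, v2) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def torsion(p0, p1, p2, p3):
    """Signed torsion p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.hypot(*b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points arranged in a trans (anti) geometry.
    print(torsion((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))
```

Values computed this way for every conformation can then be used for filtering or binned into a histogram.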

Conformation energies can also be plotted in an interactive histogram plot. In Figure 2, the column or bucket with the blue highlight reflects the current conformations shown in the 3D view; the grey columns or buckets reflect conformations which do not pass the set of filters.

Conformations can be filtered by energy, CSD torsion frequency and calculated distances, angles, torsions. Smart coloring includes coloring by energy and by CSD torsion frequency.
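The ordering, filtering and histogram steps described above can be sketched in a few lines. The `Conformation` class, the energy values, the 3 kcal/mol window and the bin width below are hypothetical; they only illustrate the kind of bookkeeping involved and do not reflect how the Conformation Explorer is implemented.

```python
from dataclasses import dataclass


@dataclass
class Conformation:
    name: str
    energy: float  # absolute conformational energy, kcal/mol


def by_relative_energy(conformations):
    """Return (conformation, relative_energy) pairs, lowest energy first."""
    lowest = min(c.energy for c in conformations)
    pairs = [(c, c.energy - lowest) for c in conformations]
    return sorted(pairs, key=lambda pair: pair[1])


def filter_by_energy_window(conformations, max_relative_energy):
    """Keep conformations within `max_relative_energy` of the lowest one."""
    return [c for c, rel in by_relative_energy(conformations)
            if rel <= max_relative_energy]


def energy_histogram(relative_energies, bin_width):
    """Count values per bin; bin i covers [i * bin_width, (i + 1) * bin_width)."""
    counts = {}
    for value in relative_energies:
        index = int(value // bin_width)
        counts[index] = counts.get(index, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    population = [Conformation("conf_a", 12.0),
                  Conformation("conf_b", 10.0),
                  Conformation("conf_c", 15.5)]
    kept = filter_by_energy_window(population, max_relative_energy=3.0)
    print([c.name for c in kept])
    print(energy_histogram([rel for _, rel in by_relative_energy(population)], 2.0))
```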


Figure 2. The Conformation Explorer in Forge. Rotatable bonds are colored and labelled by CSD torsion frequency.

Other new features and improvements

This V10.5 release also includes a variety of additional new functionalities and improvements to the Forge interface, including:

  • Enhanced Molecule Editor with a more intuitive layout, featuring a radial plot that is updated as changes are made to a molecule and the new ‘Save a copy’ button to store your molecule directly into the project without leaving the editor
  • New support for touch screen displays
  • Enhanced stereo view functionality with improved accessibility
  • New functionality to export Activity Atlas™ models as surfaces from the GUI
  • New Forge surface command-line binary to export Cresset field surfaces (positive, negative, hydrophobic and vdW)
  • New functionality to sort disparity matrixes in Activity Miner™ by Forge project tags, enabling easier identification of molecules of interest
  • New capability to export molecules by drag-and-drop to the Windows desktop (Windows only)
  • New capability to annotate and re-name Storyboard scenes
  • New tagging of project molecules from the 3D window and according to cluster membership, as calculated in Activity Miner
  • New ‘Send to Flare’ functionality
  • Improved grid view function
  • Improved display of protein ribbons, offering a choice of different ribbon styles and the capability to show ribbons for the active site only
  • Improved look and feel of the GUI with re-designed toolbars and updated and clearer icons for a more modern and sleek interface.

Upgrade to Forge V10.5

Upgrade at your earliest convenience to try the new Conformation Explorer and pharmacophore constraints in Forge, together with the many new and improved features in this release.

Evaluate Forge

If you are not currently a Forge customer, download a free evaluation.

Sneak peek at Forge V10.5

New versions (V10.5) of Forge™ and Torch™ are due out next month. This release offers new science and functionality and plenty of improvements that significantly enhance both applications. Below is a sneak peek at some of the new functionality in Forge.

Pharmacophore constraints in alignment

In this release of Forge we have included the new options to constrain the alignments using specific pharmacophoric features. As in Blaze, constraints (e.g., DonorH, Acceptor, Cation, Anion, covalent center) can be added to reference molecules and must be matched in the alignment or a penalty will be applied to the score. Pharmacophore constraints will be useful in those cases (such as specific kinase targets or metal chelators) where explicit interactions dominate the alignments.

Alignment

Molecule alignment is significantly improved in V10.5. New and enhanced functionality includes:

  • Improved substructure alignment algorithm
  • New capability to specify the substructure you wish to match by writing a SMARTS pattern
  • New alternative similarity metrics
  • New individual field similarity weighting
  • Improved field and pharmacophore constraints editor, to define field and pharmacophore constraints and add specified field points in the desired position in 3D.

The result of all these improvements will be significantly improved generation of alignments that match your expectations without manual interference.

Conformation explorer

Molecular conformations are central to what we do. We think that our conformation hunter does a good job of generating a diverse range of energetically accessible conformations. However, we wanted to give you the opportunity to more easily explore the conformations of your molecules, enabling you to interact with and edit the populations.

The conformation explorer is a new tool in Forge for visualizing and analyzing conformation analysis results. Within the conformation explorer you can:

  • Visualize all the conformations created for each molecule in your Forge project
  • Delete unwanted conformations
  • Calculate and plot distances, angles and torsions
  • Calculate the CSD torsion frequency for all rotatable bonds
  • Filter conformations by energy, CSD torsion frequency and calculated distances, angles, torsions
  • Smart coloring of conformations includes coloring by energy and by CSD torsion frequency.

 


Figure 1: The conformation explorer in Forge. Rotatable bonds are colored and labelled by CSD torsion frequency.

 

Contact us to register for a free evaluation of Forge V10.5.

Comparing ligand and protein electrostatics of Btk inhibitors

Abstract

Protein interaction potentials implemented in Flare,1 Cresset’s structure-based design software, were used to calculate a detailed map of the electrostatic character of the protein active site of Bruton’s tyrosine kinase2 (Btk). The interaction potential maps were compared to those of selected Btk ligands to get a detailed understanding of ligand binding and SAR. 3D-RISM analysis in Flare was applied to investigate the stability of the crystallographic water molecules populating the Btk active site.

Introduction

Bruton’s tyrosine kinase is a member of the Tec family of non-receptor tyrosine kinases. Recent literature findings2 indicate that Btk inhibition could be an attractive approach for the treatment of autoimmune diseases such as rheumatoid arthritis, a progressive autoimmune disease characterized by swelling and erosion of the joints.

The published X-ray crystal structure PDB:4ZLZ shows that the 4RV ligand interacts with the active site of Btk (Figure 1 – left) by making H-bond interactions with Glu475 and Met477 in the hinge region. The pyridyl ring is involved in a cation-pi interaction with Lys430, with the pyridyl nitrogen making a water-mediated interaction to the P-loop residues Phe413 and Gly414. The replacement of 4-methylpyridin-3-yl with small bicyclic heterocycles like indazole in 4L6 (PDB:4Z3V, Figure 1 – right), displacing the water molecule and making direct H-bond interactions with the P-loop, led to the discovery of ligands with improved potency towards Btk such as compounds 4L6, 1 and 2 (see Table 1).3


Figure 1. Left: X-ray crystal structure of 4RV (PDB:4ZLZ) in the active site of Btk making a water mediated hydrogen bond with the P-loop backbone. Right: X-ray crystal structure of 4L6 (PDB:4Z3V) making direct H-bond interactions with the P-loop backbone.

In this case study, we used the protein interaction potentials and the 3D-RISM method available in Flare to investigate the electrostatics of the active site of Btk and the stability of the crystallographic water molecules. This information was then used to understand the SAR of the molecules in Table 1.

Method

The 4ZLZ and 4Z3V ligand-protein complexes were downloaded from the Protein Data Bank into Flare, and carefully prepared using the Build Model4 tool from BioMolTech,5 to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes and assign optimal protonation states to the protein structures. Any truncated protein chains were capped as part of protein preparation.

The protein sequences were aligned in Flare using the COBALT6 multiple alignment tool and subsequently superimposed by means of a least squares fit of equivalent Cα carbon atoms.
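A least squares fit of equivalent atoms is commonly carried out with the Kabsch algorithm. The sketch below shows that technique using NumPy (assumed to be installed) on toy coordinates. It is offered as a general illustration: the case study does not state which algorithm Flare uses, and the function name is hypothetical.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least squares superposition of `mobile` onto `target`.

    Both inputs are (N, 3) arrays of equivalent atom coordinates.
    Returns the fitted copy of `mobile` and the resulting RMSD.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    p = mobile - mobile_centroid
    q = target - target_centroid

    # Covariance matrix and its singular value decomposition.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Correct for a possible reflection so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    fitted = (rotation @ p.T).T + target_centroid
    rmsd = float(np.sqrt(((fitted - target) ** 2).sum(axis=1).mean()))
    return fitted, rmsd


if __name__ == "__main__":
    # Toy coordinates: the target is the mobile set rotated 90 degrees about z
    # and then translated.
    mobile = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    rot_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
    target = (rot_z @ mobile.T).T + np.array([1.0, 2.0, 3.0])
    _, rmsd = kabsch_superpose(mobile, target)
    print(f"RMSD after fit: {rmsd:.6f}")
```

The sequence alignment step matters because it defines which Cα atoms are treated as equivalent, and therefore which rows of the two coordinate arrays are paired.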

Protein minimization

The active site of the prepared 4ZLZ and 4Z3V ligand-protein complexes was minimized in Flare using the XED force field7 and Normal conditions (gradient cutoff: 0.200 kcal/mol/Å, 2,000 maximum iterations). The ligand structures were included in the minimization of the active site.
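The two stopping criteria quoted above, a gradient cutoff and a maximum number of iterations, can be illustrated with a toy minimizer. In the sketch below, plain steepest descent on a one-dimensional harmonic energy stands in for the real force field minimization. The optimizer, the step size, the way the gradient norm is measured and all names are assumptions, since the case study does not describe Flare's minimizer in that detail.

```python
import math


def minimize(gradient, start, gradient_cutoff=0.2, max_iterations=2000, step=0.001):
    """Steepest descent that stops when the gradient norm falls to the cutoff
    or when the iteration limit is reached.

    Returns (coordinates, iterations_used, converged).
    """
    x = list(start)
    for iteration in range(max_iterations):
        g = gradient(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm <= gradient_cutoff:
            return x, iteration, True
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, max_iterations, False


def harmonic_gradient(x, force_constant=100.0, minimum=1.5):
    """Gradient of the toy energy E = 0.5 * k * (x - x0)^2 (hypothetical values)."""
    return [force_constant * (x[0] - minimum)]


if __name__ == "__main__":
    coords, iterations, converged = minimize(harmonic_gradient, [3.0])
    print(coords, iterations, converged)
```

The iteration limit acts as a safety net: a run that reaches it without meeting the gradient cutoff is reported as not converged rather than silently accepted.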

3D-RISM analysis

The Reference Interaction Site Model (RISM) is a modern approach to solvation based on the Molecular Ornstein-Zernike equation.8 3D-RISM has seen increasing use as a method to investigate the location and stability of water molecules in a protein.

Conceptually, 3D-RISM is equivalent to running an infinite-time molecular dynamics simulation on the solvent (keeping the solute fixed), and then extracting the density of solvent particles. The output of a 3D-RISM calculation consists of grids containing particle densities, one for oxygen and one for hydrogen atoms. A thermodynamic analysis then assigns a ΔG value to each position on the grid, representing the ‘happiness’ of a putative water molecule at that position of the grid relative to bulk water.

3D-RISM calculations in Flare use Cresset’s XED force field, which offers the advantage of incorporating both electronic anisotropy and a certain degree of polarizability, and accordingly improves the effectiveness of the method.

A 3D-RISM analysis was carried out on 4ZLZ and 4Z3V to investigate the stability of crystallographic water molecules surrounding the 4RV and 4L6 ligands bound to the active site of Btk.

The following conditions were used:

  • XED force field and charge method
  • 4Å grid spacing
  • 14Å grid external border width
  • Convergence tolerance: 10⁻⁸
  • Maximum number of iterations: 10,000
  • Total formal charge handling: neutralize with counterions.

Protein interaction potentials

Protein interaction potentials are an extension of Cresset molecular interaction potentials to proteins. Both are calculated using the XED force field. The approach is similar in principle to the calculation of ligand fields: the protein’s active site is flooded with probe atoms, and interaction potentials are calculated at each point. This method makes use of a distance-dependent dielectric function based on the work of Mehler,9 to better cope with the large number of charged groups in protein structures.
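To show why a distance-dependent dielectric helps with charged groups, the sketch below evaluates a Coulomb interaction screened by a sigmoidal dielectric function of the kind associated with Mehler and Solmajer. The functional form and the parameter values are those commonly quoted in the literature for that model and are used here as an assumption. The exact function and parameters used in Flare are not given in this case study, and the function names are hypothetical.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), standard conversion factor


def sigmoidal_dielectric(r, eps_bulk=78.4, a=-8.5525, lam=0.003627, k=7.7839):
    """Sigmoidal distance-dependent dielectric (commonly quoted parameters;
    assumed here, not confirmed for Flare). `r` is in Angstrom."""
    b = eps_bulk - a
    return a + b / (1.0 + k * math.exp(-lam * b * r))


def screened_coulomb(q1, q2, r):
    """Coulomb energy (kcal/mol) between charges q1 and q2 (in e) at distance r."""
    return COULOMB_CONSTANT * q1 * q2 / (sigmoidal_dielectric(r) * r)


if __name__ == "__main__":
    for r in (2.0, 4.0, 8.0, 16.0):
        print(f"r = {r:5.1f} A  eps = {sigmoidal_dielectric(r):6.2f}  "
              f"E(+1, -1) = {screened_coulomb(1.0, -1.0, r):8.3f} kcal/mol")
```

The effective dielectric is small at short range and approaches the bulk water value at long range, so distant charged groups contribute far less than they would with a constant low dielectric.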

All the ligands in Table 1 belong to the same series as 4L6, so for this case study protein interaction potentials were only calculated and displayed for the active site of 4Z3V.

Ligand fields

To obtain a sensible pose for the ligands in Table 1, the corresponding 2D structures were docked into the ‘dry’ (i.e., not including crystallographic water molecules) active site of 4Z3V using the Lead Finder10 method implemented in Flare.

Cresset’s ligand fields were then calculated and compared to the 4Z3V protein interaction potentials, to investigate the SAR for the ligand series.

Results

3D-RISM analysis on 4ZLZ

At the end of a 3D-RISM run, a 3D-RISM water molecule chain is added to the protein structure. The water molecules in this chain occupy regions of high water density as predicted by 3D-RISM, and are colored according to the calculated ΔG for the whole water molecule, averaged over all orientations.

‘Happy’ water molecules (associated with a calculated negative ΔG) are colored in shades of green: these are water molecules which 3D-RISM predicts to be more stable in the protein than in bulk water, and hence more difficult to displace with a ligand.

‘Unhappy’ water molecules (associated with a calculated positive ΔG) are colored in shades of red: these are waters that are less stable relative to bulk water and hence more easily displaced by a ligand.
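The sign convention described above can be summarized in a small helper. In the sketch below, the function name, the optional neutral band and its label are hypothetical additions for illustration; only the mapping of negative ΔG to ‘happy’ (green) and positive ΔG to ‘unhappy’ (red) comes from the text.

```python
def classify_water(delta_g, neutral_band=0.0):
    """Classify a 3D-RISM water by its calculated delta G relative to bulk water.

    Returns (label, color). Waters with |delta_g| <= neutral_band are labelled
    'neutral' with no color assigned (the neutral band is an assumption).
    """
    if delta_g < -neutral_band:
        return "happy", "green"    # more stable than bulk, harder to displace
    if delta_g > neutral_band:
        return "unhappy", "red"    # less stable than bulk, easier to displace
    return "neutral", None


if __name__ == "__main__":
    # Toy delta G values, not taken from the Btk calculations.
    for delta_g in (-1.2, 0.1, 0.8):
        print(delta_g, classify_water(delta_g, neutral_band=0.5))
```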

Figure 2 shows the results of the 3D-RISM calculation on 4ZLZ. The oxygen density surface (Figure 2 – left) clearly shows a region of localized water near the nitrogen of the pyridine, and the 3D-RISM localization algorithm (Figure 2 – right) suggests that a water molecule should exist in exactly the spot where it is seen in the crystal structure. The thermodynamic analysis indicates that this water molecule is neither particularly ‘happy’ nor particularly ‘unhappy’. This is consistent with the fact that this water molecule is displaceable (as proven by 4L6 and the other compounds in Table 1), but also indicates that the displacing group needs to have the correct electrostatics and shape to avoid losing affinity.

3D-RISM analysis on 4Z3V

The oxygen density surface for 4Z3V is shown in Figure 3 – left. The 3D-RISM localization algorithm correctly identifies the position of the majority of crystallographic water molecules surrounding the 4L6 ligand bound to the Btk active site: many of these water molecules are predicted to be ‘happy’. Accordingly, a selected subset of the stable water molecules was included in the calculation of protein interaction potentials for 4Z3V, as they were considered to be an integral part of the protein active site with respect to ligand binding.


Figure 2: 3D-RISM results on 4ZLZ. Left: oxygen isodensity surface at ρ=5. Right: localized 3D-RISM waters, colored by ΔG.


Figure 3: 3D-RISM results on 4Z3V. Left: oxygen isodensity surface at ρ=5. Right: localized 3D-RISM waters, colored by ΔG.

Protein interaction potentials for 4Z3V

As shown in Figure 4, the protein interaction potentials of both the ‘dry’ (not including crystallographic water molecules) and ‘wet’ (including stable crystallographic water molecules lining the active site) active site of 4Z3V match the 4L6 ligand fields in a satisfactory manner.

In particular:

  • the electron-rich cinnoline ring sits in a region of positive interaction potential in the middle of the 4Z3V active site;
  • the 5,6 hydrogens of the cinnoline ring sit near an area of negative interaction potential corresponding to the carbonyl of Leu408;
  • the carbonyl and the NH2 of 3-carboxamide sit respectively within and nearby an area of positive and negative interaction potential corresponding to the backbone NH of Met477 and the backbone carbonyl of Glu475 in the hinge region of Btk, with which they form H-bonds;
  • the 4-amino group on the cinnoline ring also sits nearby an area of negative interaction potential, corresponding to the carbonyls of Met477 and Leu408;
  • the electron-rich 5-membered ring of indazole sits in an area of positive interaction potential corresponding to the protonated side chain of Lys430 (not shown) and the backbone NH of Phe413, with the NH-group pointing towards a negative area corresponding to the backbone carbonyl of Gly414 with which it forms an H-bond.

The inclusion of stable water molecules in the calculation of protein interaction potentials confirms this scenario. In this case though, the region of positive protein interaction potential in the middle of the 4Z3V active site is much larger and embraces most of the cinnoline-indazole ring system. This is indeed fully consistent with the negative ligand field surrounding the cinnoline-indazole ring system (Figure 4 – bottom).

Also, the 4-amino group on the cinnoline ring sits in an area of negative interaction potential which nicely matches the positive ligand field corresponding to this group.


Figure 4: 4L6 superimposed to the protein interaction potentials of 4Z3V. Top-left: ‘dry’ active site, not including crystallographic water molecules. Top-right: ‘wet’ active site including stable water molecules. Bottom: Ligand fields for 4L6. Protein interaction potentials shown at isolevel = 3; ligand fields shown at isolevel = 2.

SAR of Btk inhibitors

A comparison of ligand fields with the protein interaction potentials for the active site of Btk provides some useful insight into the SAR of compounds in Table 1.

Compound 1

Compound 1 (pIC50 8.7) is one of the two most potent compounds in this data series,3 carrying a -OMe side chain on the indazole ring and a fluorine in position 5 of the cinnoline ring. The binding mode of 1 (Figure 5) is similar to that of 4L6. The compound makes H-bond interactions with Glu475 and Met477 in the hinge region, a cation-pi interaction with Lys430 (not shown), and H-bond interactions with the backbone of P-loop residues Phe413 and Gly414.

The fluorine group sits in a relatively large pocket close to a water molecule which it possibly displaces. The CH3 of the OMe group sits in an area of negative interaction potential.


Figure 5: Left: compound 1 (pIC50 = 8.7) superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3. Right: ligand fields for compound 1 at isolevel = 2.

Compound 2

Compound 2 is also one of the most active compounds in the data series.3 Quite interestingly though, the NH on the indazole does not make an H-bond with Gly414, as it is turned towards the other side, possibly making an H-bond interaction with a nearby water molecule.


Figure 6: Compound 2 (pIC50 = 8.7) superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3.

Compounds 3 and 4

The good activity (pIC50=8.4) of compound 3 confirms that an H-bond donor on the bicyclic system is not an essential feature for a Btk ligand to reach good levels of activity. Quite interestingly, compound 4 (pIC50=7.7) is structurally very similar to 3, but significantly less active. The comparison of the ligand fields for these two compounds with the protein interaction potentials of the active site of 4Z3V provides a possible explanation, as shown in Figure 7. While for both compounds (Figure 7 – middle column) the negative ligand field shows a good complementarity with the positive interaction potential of the backbone NH of Phe413, the positive ligand field of 4 (Figure 7 – right column) does not match the negative interaction potential generated by the backbone carbonyl of Gly414.

For both compounds, the methyl group in position 7 of the cinnoline ring plays the same role as the methyl on the indazole ring of 4L6 in ensuring that the ligands achieve the correct conformation in the active site.


Figure 7: Compounds 3 and 4 superimposed to the protein interaction potentials for the active site of 4Z3V at isolevel = 3. Ligand fields shown at isolevel = 4.
Middle: positive interaction potentials superimposed to negative ligand fields.
Right: negative interaction potentials superimposed to positive ligand fields.

Conclusions

Protein interaction potentials and ligand fields, as implemented in Flare, are a powerful way of understanding the electrostatics of ligand-protein interactions. The inclusion of stable water molecules following a 3D-RISM analysis dramatically improves the precision of the method for the characterization of protein active sites. The information gained from protein interaction potentials can be used to inform ligand design, compare related proteins to identify selectivity opportunities, and understand SAR trends and ligand binding from the protein’s perspective.

References and links

1. http://www.cresset-group.com/products/flare/
2. C.R. Smith et al., J. Med. Chem. 2015, 58, 5437−5444
3. US patent 2015/0038510
4. V. Stroganov et al., Proteins 2011, 79(9), 2693-2710
5. https://www.biomoltech.com/
6. https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi
7. J.G. Vinter, J. Comput.-Aided Mol. Des. 1994, 8, 653-668
8. R. Skyner et al., Phys. Chem. Chem. Phys. 2015, 17(9), 6174
9. E. L. Mehler, The Lorentz-Debye-Sack theory and dielectric screening of electrostatic effects in proteins and nucleic acids, in Molecular Electrostatic Potentials: Concepts and Applications, Theoretical and Computational Chemistry Vol. 3, 1996
10. O. V. Stroganov et al., J. Chem. Inf. Model. 2008, 48(12), 2371-2385

What can Torch do for you that TorchLite can’t?

Abstract

TorchLite is the powerful freeware 3D molecule viewer, editor and design tool from Cresset. However, there are situations in which modeling with TorchLite is simply not enough and you need to access the full power of Torch. This blog post highlights some of the features which make Torch a powerful molecular design tool for medicinal and synthetic chemists.

Introduction

You can see several interesting applications of TorchLite in our case studies and web clips. With TorchLite, you can view the results of ligand-based or structure-based virtual screening, understand the shape and electrostatic character of active molecules and design new molecules to match their pattern. But what are the differences between TorchLite and its big brother Torch? When should you start using Torch?

In this blog, I highlight some of the additional features available in Torch, but not in TorchLite, with examples of their application.

SAR analysis in TorchLite

The web clip Visualizing field changes to understand SAR shows how to quickly investigate the SAR of a small dataset of NaV1.7 inhibitors using TorchLite. Structures were manually sketched using the built-in 3D molecule editor, quickly minimized and saved in the Molecules table, and NaV1.7 activity data were entered manually. This works nicely for this small dataset; however, for larger compound sets manual editing and data entry are slow and open to human error. Also, manual editing and minimization in TorchLite cannot replace a full exploration of the conformational space of compounds, which ensures that diverse, low energy conformations are considered in the SAR analysis. Finally, while alignment is straightforward for the simple changes carried out in the web clip, a robust method for sensibly aligning the compounds is required when more complex structural changes are made.

This is the most important difference between the two packages: conformational exploration and alignment can be carried out in Torch (and Forge), but not in TorchLite.

SAR analysis in Torch

In Torch, molecules are aligned to one or more reference molecules using fixed conformations, which can be imported into Torch or calculated on the fly by the application.

Suitable reference molecules are highly active molecules, preferably in their bioactive (protein bound) conformation. This is usually either experimentally observed (when crystallographic information is available), or derived from a docking experiment or pharmacophore modeling (these methods are also available in Lead Finder and Field Templater, respectively).

Using a ‘Normal’ alignment, the conformation ensemble for each molecule in the data set is aligned to the reference molecule in two stages. In the first stage the field points around a molecule are used to generate an initial alignment. In the second stage the initial alignment is optimized to get the best possible similarity score. In this stage, it is possible for Torch to use an excluded volume, typically derived from the protein crystal structure, that defines a region of space around the reference molecule that acts as a constraint on the alignments.

Torch offers an additional method for automated molecular alignment. Using the Maximum Common Substructure (MCS) approach, each ligand is initially fitted to the reference molecule using a common-substructure algorithm, and then additional groups are fitted using the best match of field points and shape. This substructure alignment can be regarded as a ligand-centric view of the match to the reference, whereas the use of the field points alone is akin to a protein-centric view of the alignment.

Each method has its advantages:

  • Field points give an unbiased view of alignment with a score that can be used in, for example, virtual screening
  • The substructure approach highlights the differences between molecules that lie in the same series making them easier to interpret, particularly when using ligand-centric computational techniques such as the activity cliff analyses in Activity Miner and Activity Atlas, as in the example below.

Using alignment in SAR studies

In the case study Activity Atlas analysis of sodium channel antagonists. Part I: SAR of the right-hand side phenyl ring, a dataset of 62 pyrrolopyrimidine NaV1.7 antagonists was downloaded from ChEMBL, conformationally explored in Forge and aligned by MCS to the chosen reference compound.


Figure 1. The reference compound used to align the NaV1.7 data set.

The SAR of the data set was then analyzed using Activity Atlas, a probabilistic method of analyzing the SAR of a set of aligned compounds as a function of their electrostatic, hydrophobic and shape properties, available in Forge.

A simpler workflow can be implemented in Torch to quickly and effectively explore the SAR on the right-hand side phenyl ring (Figure 1) using Activity Miner, an optional module of Torch (included in Forge).

The ‘Substructure’ filter in Torch was used to select a subset of 17 compounds from the original data set which have the same scaffold and left-hand side substituent as Cmpd 1, but vary on the right-hand side phenyl, following the workflow shown in Figure 2.


Figure 2. Filter by substructure in Torch.

The lowest energy conformation of Cmpd 1 (one of the most active compounds in the data set) was then chosen as a reference structure, following an ‘accurate but slow’ (Max number of conformations: 200; RMS cut-off for duplicate conformers: 0.5; Gradient cut-off for conformer minimization: 0.1 kcal/mol; Energy window: 3 kcal/mol) conformation hunt within Torch. This was used to align the 17 compounds by Maximum Common Substructure, again using an ‘accurate but slow’ set-up for the conformation hunt.

The SAR of the right-hand substituted compounds can then be explored using the activity view maps calculated and displayed by Activity Miner.

The activity view shows a focus compound surrounded by its nearest neighbors according to the chosen similarity metric (Figure 3). In this view the height of each wedge corresponds to the ‘distance’ between the pair: a smaller wedge reflects very similar compounds.


Figure 3. Activity view map for Nav1.7 pIC50, showing the detailed SAR of the phenyl ring.

The color of the wedge reflects the direction the activity is going: red means the activity is decreasing; green means the activity is increasing between the pair.

The shading echoes the disparity, which relates to how steep the activity cliff is. The result is a focused view of the SAR around a chosen compound.

Figure 3 shows the activity view around the unsubstituted phenyl (pIC50 6.6). This view clearly shows that para substitution is always detrimental for NaV1.7 activity, that ortho substitution is beneficial, especially with a small halogen like fluorine, and that meta substitution is also in general beneficial. Ortho, ortho disubstitution, instead, is less well tolerated.

Design of new molecules using Torch

One of the major advantages of field based alignment is that it is agnostic to the chemical series that is being aligned. This can be used to aid in the design of new compounds in Torch by aligning diverse actives to a common reference and then transferring key functional groups across series. In this example, I use the crystal structure of HDT, a potent Cyclin-Dependent Kinase inhibitor, bound to CDK2 (PDB code 1OIT) to modify the design of an oxime based inhibitor.

As can be seen in Figure 4, HDT interacts with the hinge region of the active site of CDK2 by making two H-bond interactions with the backbone carbonyl and NH of Leu 83, and an H-bond interaction with Lys 33. The sulphonamide group also makes H-bond interactions with Asp86 (not shown).


Figure 4. HDT bound to the CDK2 active site.

In this design experiment, more potent CDK2 inhibitors are designed starting from the 2D structure of compound CK3 (Figure 5), a smaller and less potent CDK2 inhibitor with a Ki of 2200 nM, using the interactions of HDT as a guide.

The 2D structure of CK3 (drawn with a favorite drawing package) was imported into Torch by copy/paste. CK3 was then aligned to HDT using an accurate but slow conformation hunt followed by a ‘Normal’ (field based) alignment.


Figure 5. Structure of CK3, an inhibitor of CDK2 (Ki 2200 nM).

Figure 6 shows the results of the alignment experiment. CK3 (grey) is nicely superimposed to HDT (pink) and it is straightforward to see which change should be made to increase CDK2 potency: replacing the formamidine moiety with a phenyl ring, possibly decorated with a sulphonamide or other H-bond acceptor group in the para position.


Figure 6. CK3 (grey) aligned to HDT (pink).

This change can be easily done in the molecule editor available in Torch, using the reference structure as a guide. As changes are made in the editor, the similarity score (Figure 7) is updated on the fly by clicking on the ‘Minimize’ and ‘Optimize Alignment’ buttons. Once the editing is completed, clicking the ‘Align’ button in the molecule editor will prompt Torch to carry out a full conformation hunt and field alignment on the new design.


Figure 7. The Molecule Editor in Torch.

The structure of CK6, an analogue of CK3 with a CDK2 Ki of 70 nM, aligned to HDT in Torch is shown in Figure 8 (left). The superimposed crystal structures of CK6 and HDT, as in PDB entries 1PXN and 1OIT respectively, are shown in Figure 8 (right). The alignment in Torch almost perfectly matches the crystallographic alignment of these two ligands in the CDK2 active site.


Figure 8. Left: CK6 (grey) aligned to HDT (pink) using Torch. Right: superimposed crystal structures of CK6 (grey) and HDT (pink) as in PDB entries 1PXN and 1OIT.

Multi-Parameter Scoring

Multi-Parameter Scoring in Torch helps medicinal and synthetic chemists assess the overall physico-chemical profile of the compounds of interest using colors and radial plots. As can be seen in Figure 9, columns in Torch are colored according to a profile set up in the Torch preferences. Properties perfectly matching the desired profile are colored in green, those with an acceptable value in yellow, and those with an unacceptable value in red.

The profile can be tailored to the specific project needs in the Radial Plot Properties window. In this window, a weight can be also associated to each property based on its importance in the ideal project profile. The score and fit to the project profile for each molecule is then summarized in the radial plot.

The radial plot is based on the idea that molecule properties that are ‘perfect’ should be displayed at the center of the radial plot. Thus, a molecule with perfect or near perfect properties should have a radial plot with a small encapsulated area (shown in green). Conversely, poor properties would be plotted at the edge of the radial plot such that a molecule with sub-ideal properties would have a radial plot with a large enclosed area (this can be reversed using the Radial Plot Preferences).

In Figure 9, you can see the column coloring for the CDK2 project. Comparing the column coloring of CK3 and CK6, most properties have values matching the ideal property profile. CDK2 Ki has significantly improved from CK3 to CK6, while lipophilicity (SlogP) is less good in CK6. CK3+phenyl (Figure 9, Molecules table) is slightly less active than CK6 and its lipophilicity is high with respect to the other two compounds: another good reason for including a hydrophilic H-bond acceptor in the para position of the phenyl ring.

The radial plot properties are combined into a single score that represents the overall fit of molecule to the ideal project profile. Radial plots can be sorted and filtered based on this score, making it easier to select the best candidates for your projects.
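As a rough illustration of how a weighted profile can be collapsed into a single number, the sketch below colors each property green, yellow or red against a target range and then takes a weighted average. Torch's actual scoring formula is not described here, so the numeric values attached to each color, the `PropertyProfile` structure and the convention that a higher score means a better fit are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical profile entry: an ideal range, a wider acceptable range
# and a weight reflecting the importance of the property to the project.
PropertyProfile = namedtuple("PropertyProfile", ["ideal", "acceptable", "weight"])

# Assumed numeric value for each color; not taken from Torch.
COLOUR_SCORE = {"green": 1.0, "yellow": 0.5, "red": 0.0}


def colour(value, profile):
    """Return the column color for one property value."""
    low, high = profile.ideal
    if low <= value <= high:
        return "green"
    low, high = profile.acceptable
    if low <= value <= high:
        return "yellow"
    return "red"


def profile_score(values, profiles):
    """Weighted average of the per-property color scores (0 to 1)."""
    total_weight = sum(p.weight for p in profiles.values())
    if total_weight <= 0:
        raise ValueError("at least one property needs a positive weight")
    weighted = sum(
        COLOUR_SCORE[colour(values[name], p)] * p.weight
        for name, p in profiles.items()
    )
    return weighted / total_weight
```

Sorting a table of molecules on `profile_score` then mimics, in a simplified way, sorting and filtering on the combined score.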


Figure 9. Multi-parameter scoring in Torch.

Conclusion

This blog highlights some of the additional features in Torch, the powerful molecular design tool for medicinal and synthetic chemists.

Additional functionality available in Torch includes the capability to:

  • run virtual screening of up to 500 molecules
  • use Activity Atlas and 2D/3D-QSAR models built with Forge
  • create interactive multi-series scatter plots and histograms of biological or physical properties
  • import calculated and/or measured physical properties and data from an external web service through a REST interface.

Contact us to benefit from this functionality and try the full power of Torch.

November release of Spark reagent databases now available

The November release of the Spark reagent databases derived from eMolecules is now available.

As announced in the October newsletter, Spark users can now benefit from monthly releases of reagent databases derived from eMolecules’ building blocks collection. The rolling updates are intended to provide the very best availability information on the reagents that you wish to employ.

The updated databases can be downloaded now through the Spark Database update widget (instructions on the installing Spark databases page) or using a command line utility (such as wget; please contact us for details).

Activity Atlas analysis of sodium channel antagonists. Part I: SAR of the right-hand side phenyl ring

Abstract

Activity Atlas1 is a component of Forge2, Cresset’s powerful workbench for ligand design and SAR analysis. Activity Atlas models summarize the SAR for a series into a visual 3D model that informs design decisions and helps prioritize molecules for synthesis. In this case study, Activity Atlas’ activity cliff summary maps were used to analyze the SAR of a small series of Nav1.7 sodium channel antagonists. The objective was to investigate and understand the electrostatic, hydrophobic and shape features underlying receptor activity in a case where crystallographic information is not available for the target protein.

Introduction

Structural information is becoming commonly available even for those targets, such as GPCRs and ion channels, which until recently were considered difficult to crystallize.

For novel targets and new chemical series, however, X-ray data may still be difficult to obtain. Quite frequently the information available to project chemists during the early stages of a discovery project can be so scarce as to severely hamper the applicability of traditional structure- and ligand-based computational approaches such as docking and pharmacophore modeling.

A method capable of quickly identifying and deciphering the most relevant features underlying protein-ligand interaction, starting from a very limited amount of Structure-Activity Relationship (SAR) data and no structural information, would be of invaluable help during the early stages of drug discovery projects.

We introduced Activity Atlas, a probabilistic method of analyzing the SAR of a set of aligned compounds as a function of their electrostatic, hydrophobic and shape properties. The method uses a Bayesian approach to take a global view of the data in a qualitative manner. Results are displayed using Forge visualization capabilities to gain a better understanding of the features which underlie the SAR of your set of compounds.

In this case study, the activity cliff summary method in Activity Atlas was used to analyze the SAR of the right-hand side region (RHS, Figure 1) of a small data set of published3 pyrrolopyrimidine antagonists of voltage-gated sodium ion channel Nav1.7, a major regulator of human pain and an attractive target for the development of new and effective pain therapeutics.

The objective is to prove the usefulness of Activity Atlas maps in understanding the electrostatic, hydrophobic and shape features underlying biological activity in those cases where structural information about ligand-target interaction is not easily accessible.

The detailed SAR analysis of the other regions of the pyrrolopyrimidine antagonists will be presented in a future case study.

The data set

A small data set of 62 pyrrolopyrimidine Nav1.7 antagonists (Figure 1) originally published by Chakka et al.3 was downloaded from ChEMBL.4

 Figure 1. The reference compound used to align the data set.
 

Nav1.7 pIC50 values for this data set span a range of more than 3 log units, from 4.4 to 7.7, with the even distribution shown in Figure 2. The data set includes five very weakly active compounds whose activity was reported in the original paper as % inhibition only. These compounds were assigned a Nav1.7 pIC50 = 4.4 in the Forge project.
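Because pIC50 is a logarithmic quantity, a difference in pIC50 translates into a power of ten in potency. The short sketch below uses the standard definition pIC50 = -log10(IC50 in mol/L) to make that arithmetic explicit; the function names are chosen for the example only.

```python
def pic50_to_ic50_nm(pic50):
    """Convert pIC50 to IC50 in nM, using pIC50 = -log10(IC50 in mol/L)."""
    return 10 ** (9.0 - pic50)


def fold_difference(pic50_a, pic50_b):
    """Fold difference in potency between two pIC50 values."""
    return 10 ** abs(pic50_a - pic50_b)
```

For the two ends of this data set, `fold_difference(7.7, 4.4)` evaluates to roughly 2,000, and `pic50_to_ic50_nm(7.7)` to roughly 20 nM.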

Figure 2. Distribution of Nav1.7 pIC50 values for the 62 pyrrolopyrimidine antagonists.

Conformation hunt and alignment of compounds

The alignment workflow shown in Figure 3 was applied to align the 62 compounds in the data set.


Figure 3. The alignment workflow used in this case study.

Cmpd1 (Figure 1, pIC50 7.7) was chosen as the reference compound, and its conformational space explored using a ‘very accurate but slow’ conformation hunt within Forge:

  • Max number of conformations: 1,000
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.

The use of a 3D similarity metric in Activity Atlas requires (as with 3D-QSAR) the generation of alignments for all compounds and is sensitive to misalignment and alignment noise. The choice of a sensible conformation for the reference structure may be critical in those cases where no experimental information is available about the bioactive conformation of the ligands.

The low energy conformations of Cmpd1 were accordingly visually inspected, to select a small number of low energy conformers representative of its conformational space. These conformers were used as the reference structure in separate Forge projects to develop alternative alignments for the training set and generate distinct Activity Atlas models. These models were then checked for consistency, and further validated by exploring the detailed SAR of the data set with Activity Miner5, a module within Forge and Torch6 providing rapid navigation of complex SAR.

The training set compounds were aligned to each low energy conformation of Cmpd1 by Maximum Common Substructure using an ‘accurate but slow’ set-up for the conformation hunt:

  • Max number of conformations: 200
  • RMS cut-off for duplicate conformers: 0.5
  • Gradient cut-off for conformer minimization: 0.1 kcal/mol
  • Energy window: 3 kcal/mol.
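The two conformation hunt set-ups quoted in this case study differ only in the maximum number of conformations. The sketch below records them as plain settings and shows one plausible reading of the energy window: keep only conformers within the window of the lowest energy found, up to the maximum count. It is a simplified illustration, not Forge's algorithm; the names are hypothetical, duplicate removal by RMS and gradient-based minimization are not modeled, and the units are copied as quoted in the text.

```python
from collections import namedtuple

HuntSettings = namedtuple(
    "HuntSettings",
    ["max_conformations", "rms_cutoff", "gradient_cutoff", "energy_window"],
)

# Values as quoted in the text for the two set-ups.
VERY_ACCURATE_BUT_SLOW = HuntSettings(1000, 0.5, 0.1, 3.0)
ACCURATE_BUT_SLOW = HuntSettings(200, 0.5, 0.1, 3.0)


def apply_energy_window(energies, settings):
    """Keep conformer energies (kcal/mol) within the energy window of the
    lowest-energy conformer, sorted and capped at max_conformations."""
    if not energies:
        return []
    lowest = min(energies)
    kept = sorted(e for e in energies if e - lowest <= settings.energy_window)
    return kept[: settings.max_conformations]
```

With a 3 kcal/mol window, a conformer 3.5 kcal/mol above the minimum is discarded while one 2.5 kcal/mol above it is retained.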

Activity Atlas models

Activity Atlas models are calculated following a probabilistic approach which takes into account the probability that a molecule is correctly aligned.

This is done by associating a weight with each alignment based on its similarity score. Alignments with similarity higher than a high similarity threshold (which can either be automatically calculated by Forge, or manually defined by the user) are fully trusted. Alignments with similarity lower than the low similarity threshold are not trusted and are discarded. Linear scaling is applied to calculate a weight for alignments which have an intermediate similarity score.
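The weighting scheme described above amounts to a piecewise linear function of the similarity score. The sketch below is a minimal Python rendering of that description; the function name and the example thresholds are hypothetical, and Forge's own threshold calculation is not reproduced.

```python
def alignment_weight(similarity, low_threshold, high_threshold):
    """Weight of an alignment as a function of its similarity score.

    >= high_threshold : fully trusted (weight 1.0)
    <= low_threshold  : not trusted, discarded (weight 0.0)
    in between        : linear scaling from 0.0 to 1.0
    """
    if high_threshold <= low_threshold:
        raise ValueError("high_threshold must be greater than low_threshold")
    if similarity >= high_threshold:
        return 1.0
    if similarity <= low_threshold:
        return 0.0
    return (similarity - low_threshold) / (high_threshold - low_threshold)
```

With illustrative thresholds of 0.5 and 0.7, an alignment scoring 0.6 would receive a weight of about 0.5.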

Each run of Activity Atlas performs three types of analysis: average of actives, activity cliff summary and regions explored analysis.
In this case study, the activity cliff summary analysis was used to explore the SAR of the 62 pyrrolopyrimidine Nav1.7 antagonists. This analysis helps you pinpoint the critical regions of SAR, providing a visual 3D summary of the activity cliffs for the data set derived from the Activity Miner module. The method is described in detail in the ‘Generating Activity Atlas models’ section of the Forge manual.

Figure 4. Activity cliff summary maps for the RHS phenyl ring, derived by aligning the 62 compounds in the data set to three representative low energy conformations of Cmpd1.

Results

The results of the activity cliff summary analysis for the phenyl ring on the RHS starting from representative low energy conformations of Cmpd1 are shown as 3D maps in Figure 4.

All the models give consistent results, and provide clear indications about the electrostatic, hydrophobic and shape features underlying Nav1.7 activity, as explained in detail in Figure 5.

Small halogens in the ortho and meta positions of the phenyl ring on the RHS of the molecule improve Nav1.7 activity, as shown by the negative electrostatic field (in cyan in Figure 5) and the associated areas of favorable shape (green areas). Areas of favorable/unfavorable hydrophobic interaction are not shown for clarity as they overlap largely with those of favorable and unfavorable shape.

Substituents which generate a more positive (or less negative) electrostatic field (in red in Figure 5) in the para and second meta positions are beneficial for activity.


Figure 5. Activity cliff summary map for Nav1.7 pIC50, showing the effect of different decoration patterns on the phenyl ring on the RHS of the compounds.

Steric bulk in the para position (magenta areas) instead is detrimental for Nav1.7 activity.

Finally, electron-withdrawing substituents which generate a more positive (or less negative, in red in Figure 5) electrostatic field below the plane of the ring are also beneficial for Nav1.7 activity.

These general trends were investigated in more detail by means of activity view maps calculated and displayed using Activity Miner.

The activity view shows a focus compound surrounded by its nearest neighbors according to the chosen similarity metric (Figure 6). In this view the height of each wedge corresponds to the ‘distance’ between the pair: a smaller wedge reflects very similar compounds.

The color of the wedge reflects the direction the activity is going: red means the activity is decreasing; green means the activity is increasing between the pair. The shading echoes the disparity, which relates to how steep the activity cliff is. The result is a focused view of the SAR around a particular compound.
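The quantities behind each wedge can be sketched as follows. The exact formula used by Activity Miner is not given in this case study, so the sketch assumes a commonly used definition: the distance between a pair is one minus their similarity, and the disparity is the change in activity divided by that distance. The function names and the similarity value in the usage note are hypothetical.

```python
def disparity(focus_activity, neighbour_activity, similarity):
    """Assumed disparity: change in activity per unit of 'distance',
    where distance = 1 - similarity (similarity on a 0 to 1 scale)."""
    distance = 1.0 - similarity
    if distance <= 0.0:
        raise ValueError("identical compounds have no defined disparity")
    return (neighbour_activity - focus_activity) / distance


def wedge_colour(activity_change):
    """Red when activity decreases from the focus compound, green when it increases."""
    if activity_change < 0:
        return "red"
    if activity_change > 0:
        return "green"
    return "none"
```

For a focus compound with pIC50 7.7 and a neighbor with pIC50 5.1 at an illustrative similarity of 0.9, the disparity is about -26 and the wedge would be red: a large drop in activity over a small structural distance, that is, a steep activity cliff.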

Figure 6 shows the activity view around o-F-phenyl (pIC50 7.7), one of the most potent compounds in the data set.

Starting from the o-Cl substituent and going clockwise, it can be seen that replacing the o-F substituent in the focus compound (pIC50 7.7) with o-Cl (pIC50 7.3) or o-Me (pIC50 7.4) has a very slight detrimental effect on activity, as these substituents are associated with a less negative electrostatic field.

Introducing a second F in the meta position does not impact activity (pIC50 7.7), while the unsubstituted phenyl ring is much less active (pIC50 6.6), as it lacks the favorable small halogens in the ortho and meta positions.

Replacing m-F with m-Cl again does not impact activity.

The introduction of a second o-F substituent (pIC50 6.8) instead causes a drop in activity, an effect not highlighted by the activity cliff summary, as only one example of ortho, ortho disubstitution is available in the data set.

Replacing o-F with o-CF3 (pIC50 7.1) causes a modest drop in activity, which is difficult to explain in terms of electronic effects: this compound is possibly an outlier to the general trend shown by the activity cliff summary maps.

Removal of the o-F substituent causes a drop in activity, as can be seen for m-F (pIC50 6.9), m-Me (pIC50 7.1) and m-Cl (pIC50 7.1).

Finally, the lack of small halogens in the ortho and meta positions, together with the introduction of unfavorable steric bulk in the para position, causes the dramatic drop in activity in p-F (pIC50 5.1): this substituent is also associated with a more negative electrostatic field.

Figure 6. Activity view map for Nav1.7 pIC50, showing the detailed SAR of the phenyl ring.

Conclusion

In this case study, Activity Atlas and Activity Miner were successfully applied to decipher the SAR of the right-hand side phenyl ring of a series of voltage-gated sodium ion channel Nav1.7 antagonists, starting from a very limited amount of SAR data and no available crystallographic information about the bioactive conformation.

The activity cliffs summary in Activity Atlas was used to get an overview of the SAR landscape, focusing on the prevalent SAR signals.

Activity Miner was used to drill down into the Activity Atlas maps to understand subtle molecule-to-molecule structure-activity changes and identify potential outliers.

The two methods used in combination were able to quickly identify and decipher the most relevant features underlying protein-ligand interaction.

The information derived from this analysis can be of invaluable help for drug discovery projects to inform design decisions and help prioritize molecules for synthesis.

Using the Spark reagent databases to identify bioisosteric R-group replacements

Giovanna Tedesco
Cresset, New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK

Abstract

The reagent databases1 available with Cresset’s Spark2 software for bioisosteric replacement were used to identify alternative decorations for a series of triazolopyridazine and 8-fluorotriazolopyridine selective inhibitors of the c-Met kinase. The use of databases derived from available reagents ensured that the results could be tethered to molecules that were readily synthetically accessible.

Introduction

The overexpression of c-Met and/or hepatocyte growth factor (HGF), the amplification of the MET gene, and mutations in the c-Met kinase domain can activate signaling pathways that contribute to cancer progression by enabling tumor cell proliferation, survival, invasion, and metastasis.3,4 For these reasons, there has been significant interest in the discovery of small molecule c-Met inhibitors for the treatment of cancer. In particular, researchers at Amgen have recently published potent, selective, ATP-competitive and orally bioavailable small molecule inhibitors of c-Met belonging to the chemical classes of triazolopyridazine3 and 8-F-triazolopyridine.4

The published X-ray crystal structure of compound 43 (an early representative of the triazolopyridazine series, see Table 1) bound to c-Met (PDB 3CD8) shows that this molecule adopts a ‘U-shaped’ binding mode in the active site (Figure 1). A direct hydrogen bond is formed between the backbone NH of Met1160 (linker) and the quinoline nitrogen. A second hydrogen bonding interaction can be observed between N1 of the inhibitor and the backbone NH of Asp1222. The triazolopyridazine core makes a π-stacking interaction with Tyr1230. Finally, the aromatic C-H in position 7 makes an electrostatic interaction with the carbonyl of Arg1208.

Based on this experimental information, researchers at Amgen speculated that modifications of the C-6 phenyl group on the triazolopyridazine core would modulate the π-stacking interaction with Tyr1230 allowing for increased potency, and started a chemical exploration based on the synthesis of C-6 aryl and heteroaromatic analogues.3

The same strategy was applied to the exploration of 8-fluorotriazolopyridine compounds.4

The 3D structure of compound 43 was used as the starting point for this case study, where Spark was used in combination with the Cresset-supplied reagent databases, which are based on eMolecules building blocks.5 The aim of this experiment was to verify whether our methodology could have facilitated the chemical exploration work at Amgen by correctly identifying, among the results of a single Spark run, the most active C-6 monocyclic heterocycles published in refs. 3, 4.

Figure 1. X-ray crystal structure of compound 43 in the active site of c-Met (PDB 3CD8).

Table 1. SAR of triazolopyridazine and 8-fluorotriazolopyridine compounds against c-Met.

 

a) Inhibition of c-Met kinase activity

b) Inhibition of HGF-mediated c-Met phosphorylation in PC3 cells

Method

The published X-ray crystal structure of compound 43 bound in the active site of c-Met (PDB 3CD8) was downloaded into Forge.6 The structure of the ligand was minimized and used as the Starter molecule for the Spark experiment (Figure 2 – left). The ‘Accurate but slow’ conditions for scoring the Spark search results were fine-tuned by setting the gradient cutoff for minimization to 0.200 kcal/mol/Å, and by setting a constraint on the positive field point mapping the interaction of compound 43 with Arg1208 in the c-Met kinase (Figure 2 – right). This introduced a score penalty for those results that did not match the constrained field point. Finally, to focus the experiment on small monocyclic heterocycles, bicyclic fragments and substituted phenyl fragments were filtered out during the search with an appropriate SMARTS filter set in the ‘Advanced Filters’ panel (see Figure 3).

The experiment was run on a database of 9.5K aromatic boronic acids derived from eMolecules building blocks (Figure 3) to closely replicate the chemistry used in the original publications.3,4

Figure 2. Left: starter molecule used in the Spark experiment. Right: constraint associated to the positive field point mapping the interaction of compound 43 with Arg1208 in c-Met.

Figure 3. eMolecules reagent databases (left) and Advanced Filters options (right).

Results

As can be seen in Figure 4, the initial Spark experiment was able to identify the large majority of the monocyclic heterocycles used to explore the C-6 position of the c-Met kinase inhibitors published in ref. 3 (Table 1). In particular, 3-thienyl (10k), 2-thienyl (10j), 5-isothiazolyl (a close analogue of the 3-methyl-isothiazol-5-yl used for compound 10m) and 4-methyl-2-thienyl (10l) were correctly identified among the 15 top-ranking Spark results.

The Spark experiment was also able to correctly identify C-6 heterocycles used in subsequent iterations of the project to explore the 8-fluorotriazolopyridine scaffold (Table 1). However, while 2-pyridyl (compound 10a), 4-thiazolyl (10d) and 2-methyl-5-thiazolyl (10c) rank reasonably high in the list of results, 1-methyl-4-pyrazolyl (10e) and 3-methyl-5-isoxazolyl (10b) are correctly retrieved, but with a lower rank.

This is disappointing; however, compound 43 is approximately 3-10 times less potent in terms of c-Met enzyme activity, and 20 times less potent in the cellular assay, than the most active heterocyclic compounds published in ref. 3 (10m and 10l). The Spark search was then repeated using 10m (which has a better pharmacokinetic profile than 10l, ref. 3) as the starter molecule, to verify whether any improvement in the ranking of these two substituents could be achieved by starting from a more active compound, which is expected to fit the electrostatic and steric requirements of the c-Met binding site even better. The 3D conformation used for 10m was obtained by means of a field/shape alignment with the X-ray structure of compound 43 carried out within Forge.

The results of this second experiment are summarized in Figure 5. The ranking of 1-methyl-4-pyrazolyl was significantly improved, while no improvement was observed for 3-methyl-5-isoxazolyl.

A final Spark search, carried out with compound 10m as the starter molecule on an expanded set of reagent databases (boronic acids and aromatic halides), suggested some interesting alternative small heterocycles that could have been tried (Figure 6).

Figure 4. R-groups associated with known active inhibitors of c-Met found by Spark.

Figure 5. Ranking of 1-methyl-4-pyrazolyl and 3-methyl-5-isoxazolyl using 10m as the starter molecule.

Figure 6. Novel potential replacement fragments identified by Spark.

Strain and torsion frequency analysis

As reported in refs. 3,4, it was hypothesized by the Amgen authors that co-planarity would enhance potency towards c-Met, presumably due to an optimal configuration for π-stacking with Tyr1230. In evaluating the results of a Spark experiment for this target it is therefore important to ensure that potential replacement fragments can adopt a realistic planar conformation.

Two types of analysis are available in Spark to monitor the above. The first is a calculation of the strain of the newly formed bond between the potential replacement fragment and the scaffold. The strain is calculated by performing a 30-degree torsion scan for that bond in the result molecule, and calculating the energy difference between the torsion chosen by Spark in the result molecule and the lowest energy torsion found during the scan. Values lower than 2 are largely insignificant.
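The strain calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Spark's implementation: the scan is assumed to sample the torsion in 30-degree increments over a full turn, and `toy_biaryl_energy` is a made-up twofold cosine potential (arbitrary energy units) standing in for a real force field.

```python
import math
from typing import Callable


def torsion_strain(
    energy: Callable[[float], float], chosen_deg: float, step_deg: int = 30
) -> float:
    """Energy of the chosen torsion minus the lowest energy found in the scan."""
    # Assumption: the scan covers 0..330 degrees in step_deg increments.
    lowest = min(energy(float(angle)) for angle in range(0, 360, step_deg))
    return energy(chosen_deg) - lowest


def toy_biaryl_energy(angle_deg: float, barrier: float = 4.0) -> float:
    """Hypothetical potential: minima at 0/180 degrees, maximum at 90 degrees."""
    return 0.5 * barrier * (1.0 - math.cos(2.0 * math.radians(angle_deg)))


# A result molecule whose new bond sits 20 degrees away from planarity
strain = torsion_strain(toy_biaryl_energy, chosen_deg=20.0)
print(f"strain = {strain:.2f} (values below 2 are treated as insignificant)")
```

With this toy potential a 20-degree twist costs well under the threshold of 2, whereas a perpendicular arrangement would incur the full barrier.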

Additionally, the Torsion Library7-9 method is used to assess the torsion associated with the newly formed bond, as well as the torsions associated with all rotatable bonds within the bioisostere fragment. The method is based on an analysis of the Cambridge Structural Database10 (CSD), and reports the frequency with which a specific torsion is experimentally observed. Torsions associated with a low frequency are a possible cause for concern and should be further investigated.
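The idea of scoring a torsion by how often it is observed experimentally can be illustrated with a toy calculation. The sketch below is not the Torsion Library method itself: the observed angles are invented, and the tolerance window and the thresholds separating ‘low’, ‘medium’ and ‘high’ are hypothetical values chosen only to show the principle.

```python
from typing import Sequence


def angular_diff(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def torsion_frequency(
    observed_deg: Sequence[float], query_deg: float, tol_deg: float = 15.0
) -> float:
    """Fraction of observed torsions within tol_deg of the query torsion."""
    hits = sum(1 for a in observed_deg if angular_diff(a, query_deg) <= tol_deg)
    return hits / len(observed_deg)


def frequency_label(freq: float, low: float = 0.05, high: float = 0.25) -> str:
    """Map a frequency to a label; the thresholds are hypothetical."""
    if freq < low:
        return "low"
    return "high" if freq >= high else "medium"


# Invented torsion observations for one bond type (mostly near-planar)
observed = [0, 5, -3, 178, 182, 175, 10, 90, 2, -178]

for query in (0.0, 90.0, 45.0):
    freq = torsion_frequency(observed, query)
    print(f"torsion {query:5.1f} deg: frequency {freq:.2f} -> {frequency_label(freq)}")
```

A planar query matches many of the invented observations and is labelled ‘high’, while a 45-degree twist matches none and would be flagged for closer inspection.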

As can be seen in Figure 4, all the fragments identified by the Spark experiment and reported in refs. 3,4 can adopt the required planar conformation, with no significant strain associated with the newly formed bond: torsional frequencies for this bond range from ‘medium’ to ‘high’, and the conformations should accordingly be realistic based on the experimental data in the CSD.

Figure 6 shows the strain and the torsional frequency for the potential novel decorations identified by Spark. In this case, there are no concerns associated with the conformations chosen by Spark.

Availability of reagents

Whenever the new eMolecules reagent databases are used in a Spark experiment, availability information is displayed in the results table (see Figures 4 and 6). This information is important for planning laboratory activity taking into account realistic delivery timelines. For example, for three of the fragments shown in Figure 6, shipment is to be expected within 1-5 days from order. Delivery times for 5-pyrimidinyl and 5-oxazolinyl boronic acids are longer: the former can be shipped within 4 weeks from order, while the latter needs to be synthesized and this may take up to 12 weeks.

Searching for the reagents of interest in the eMolecules site enables a check of real-time availability information.

Conclusions

In this case study, a Spark R-group replacement experiment successfully identified the majority of active monocyclic heterocycles used by Amgen in the discovery of new potent triazolopyridazine and 8-fluorotriazolopyridine inhibitors of c-Met kinase.

The results suggest that working in successive rounds of optimization, choosing for each Spark experiment the starter molecule with the best activity profile, is an excellent strategy to rapidly identify the R-groups associated with the highest activity or optimal overall profile.

Access to reagent availability information plays an important role in deciding which fragments should be included in each round of optimization. Reagents with short delivery times should be preferred during the initial stages of the project to facilitate quick SAR information gathering, which will enable a more informed choice of fragments to explore in the successive rounds of lead optimization.

References and Links

1 http://www.cresset-group.com/products/spark/current-spark-databases/

2 http://www.cresset-group.com/products/spark

3 Albrecht, B. K., et al., J. Med. Chem. 2008, 51, 2879–2882

4 Peterson, E. A., et al., J. Med. Chem. 2015, 58, 2417–2430

5 https://www.emolecules.com/info/building-blocks

6 http://www.cresset-group.com/products/forge/

7 Torsion Library method, jointly developed by the University of Hamburg Center for Bioinformatics, Hamburg, Germany and F. Hoffmann-La Roche Ltd., Basel, Switzerland

8 Schärfer, C., et al., J. Med. Chem. 2013, 56, 2016–2028

9 Guba, W., et al., J. Chem. Inf. Model. 2016, 56, 1–5

10 http://www.ccdc.cam.ac.uk/

Spark V10.4 released

A new version of Spark, our scaffold hopping and bioisostere replacement tool, is now released. V10.4 includes many new or improved features and gives access to new and updated chemical diversity.

The development of our applications is guided by our customers and this release is bursting with new features and science that you have asked for. A few of these are described below; however, I suggest that you try the software for yourself to discover the other features and see them in action.

Highlights

  • New Cresset reagent databases derived from eMolecules’ building blocks, replacing previous reagents based on ZINC, include availability information for every result
  • New analysis of the conformation of every result using the Torsion Library method of Guba et al. that is based on an analysis of the Cambridge Structural Database (CSD)
  • New configurable connection to external REST service for properties that enables you to add your own data and properties to the Spark experiment
  • Improved Radial Plots to support enhanced multi-parameter optimization.

New Cresset reagent databases derived from eMolecules’ building blocks

The new Spark reagent databases are derived from eMolecules’ building blocks and replace the previous reagents based on ZINC. These new databases enable Spark users to select the most promising results from their experiment with confidence that the corresponding reagents will be commercially available from reliable suppliers, and to access up-to-date availability information.

The chemically intuitive rules for R-group database creation have been refined to improve the accuracy of the chemistry incorporated into the new reagent databases. Over 20 different reagent databases are provided by Cresset using the updated rules, which can be easily modified to suit your preferences. If you think something is missing then let us know and we can add it to the list in minutes.

Customers with a database generator license can use our rules to process their own available reagents, giving rapid suggestions for the next set of compounds to be made using the reagents currently in their lab.

Figure 1: Make the most of the chemical diversity available from eMolecules to define the next move for your projects.

New Results table columns, enabling the analysis of the frequency of torsions and of attachment point type

New Results table columns are available within Spark V10.4, to facilitate the analysis of the quality of the results obtained from your Spark experiment and the assessment of chemical feasibility.

The ‘TorsFreq Frag’ and ‘TorsFreq’ column values are computed by analysing the frequency of torsions, as recorded in the CSD, using the Torsion Library method jointly developed by the University of Hamburg (Center for Bioinformatics) and F. Hoffmann-La Roche Ltd. The analysis is carried out for all dihedrals associated with rotatable bonds within the bioisosteric replacement and for each new bond formed in the result molecule. Torsions associated with a low frequency are a possible cause for concern and should be further investigated.

Spark has always enabled you to restrict your search to fragments that link through a specific atom type. This feature enables you to search for bioisosteres that would work with your synthetic scheme. In this release we have added the ‘Attachment Point Type’ into the main result sheet to enable you to perform a wider search and then focus on the results that are of interest to you. This facilitates the assessment of chemical feasibility, enabling you to focus on those results which match the chemical strategy you have in mind for your project.

New external REST service for properties

One of the features most requested by customers is the ability to include corporate or externally-computed data for any compound in the Results tables. Spark V10.4 can connect to an external web service, through a REST interface, to import external properties and data computed or retrieved by such web services as additional columns in the Results tables. Using the new service you can bring in external predictions for new designs or simply use the corporate algorithm for calculating logP. Once imported, the properties can be used in the Radial Plot and Tiles View, and for coloring molecules and table cells, enabling you to monitor the overall property profile of the results of your Spark experiment.

Multi-parameter optimization in Spark

Radial plots were introduced in Spark V10.3 to provide a graphical representation of numerical data. These initial radial plots created a simple picture of how a molecule fits the physicochemical profile of a project, based on the idea that each parameter is within an ideal range, an unacceptable range, or somewhere in between. In this release the representation is enhanced by the option to combine all the scores in the radial plot into a single number, scaled between 0 and 1, that represents how well the result molecule fits your project profile (Figure 2).

Figure 2: Enhanced radial plot.

Thus Spark results with a radial plot score of 1 fit the project profile perfectly, while those with a score of 0 lie outside the desirable property space in all aspects. Since not all properties are equally important, Spark lets you apply a weighting factor to each property (Figure 3). The weight scales that property's contribution to the final score. This is useful when you want to focus on one property more than another (for example, you are prepared to accept a non-ideal value for MW if the logP and TPSA are within the ideal range), or when you want a visual representation of a property without having it count towards the score. External properties and data computed or retrieved from the external REST service can also be included in the radial plot.

Figure 3: The configuration of the radial plot now includes a weight to apply to each property in combining the properties into a single score.
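One plausible way to combine weighted property scores into a single number between 0 and 1 is sketched below. The exact formula used by Spark is not given here, so the weighted mean, the linear ramp between the ideal and unacceptable ranges, and all of the range values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PropertyProfile:
    hard_min: float   # at or below this value the property scores 0
    ideal_min: float  # lower edge of the ideal range (score 1)
    ideal_max: float  # upper edge of the ideal range (score 1)
    hard_max: float   # at or above this value the property scores 0
    weight: float = 1.0


def property_score(value: float, p: PropertyProfile) -> float:
    """1 inside the ideal range, 0 in the unacceptable range, linear in between."""
    if p.ideal_min <= value <= p.ideal_max:
        return 1.0
    if value <= p.hard_min or value >= p.hard_max:
        return 0.0
    if value < p.ideal_min:
        return (value - p.hard_min) / (p.ideal_min - p.hard_min)
    return (p.hard_max - value) / (p.hard_max - p.ideal_max)


def radial_score(values: Dict[str, float], profile: Dict[str, PropertyProfile]) -> float:
    """Weighted mean of the per-property scores (assumed combination rule)."""
    total_weight = sum(p.weight for p in profile.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(p.weight * property_score(values[n], p) for n, p in profile.items())
    return weighted / total_weight


# Hypothetical project profile; the ranges are illustrative, not recommendations
profile = {
    "MW": PropertyProfile(200, 300, 450, 550),
    "logP": PropertyProfile(-1, 1, 3, 5),
    "TPSA": PropertyProfile(20, 40, 90, 140),
}
result = {"MW": 500.0, "logP": 2.0, "TPSA": 70.0}

equal_weights = radial_score(result, profile)
profile["MW"].weight = 0.0  # still plotted, but no longer counts towards the score
mw_ignored = radial_score(result, profile)
print(f"equal weights: {equal_weights:.2f}, MW weight 0: {mw_ignored:.2f}")
```

Setting a weight to zero mirrors the case described above in which a property is shown on the plot but does not contribute to the score.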

Try Spark V10.4

This release represents a significant improvement in the usability and flexibility of the leading bioisostere application. We encourage you to upgrade your version of Spark at your earliest convenience.

If you are not currently a Spark customer, please download a free evaluation.

Contact us if you have queries relating to this release.