Forge V10.6: Choose the molecules to make, and understand why you should make them

I am delighted to announce the availability of Forge™ V10.6, our powerful computational chemistry suite for understanding structure-activity relationship (SAR) and new molecule design. The focus of this release is on new and improved methods to generate robust Quantitative Structure-Activity Relationship (QSAR) models with strong predictive ability.

Choose the molecules to make next

Project chemists generally know which molecules they can make with a reasonably good chance of them being active. They often have too many clever ideas and are looking for ways of filtering and prioritizing lists of tangible compounds, arrays and small libraries.

Having a predictive QSAR model is a terrific way of doing this – you send your molecules into the model and get immediate feedback on whether making a compound is a good or bad idea.

However, getting a robust, predictive QSAR model is not always straightforward, and this is still a pain point for many of our users. You need a training data set of reasonable size, good activity data (e.g., pKi, pIC50) spanning a sufficiently large range, good descriptors and good modeling algorithms.

While we can’t help with the need for a training data set of reasonable size and activity spread, we can help with the rest.

The new Machine Learning (ML) methods in Forge, namely Support Vector Machines (SVM), Relevance Vector Machines (RVM) and Random Forests (RF), significantly expand the range of available QSAR model building options beyond the previous Field QSAR and k-Nearest Neighbors (kNN) regression options (Figure 1). Having access to a panel of well-known, robust statistical tools gives you more opportunities to build a predictive model that is useful in project work.


Figure 1. The new Machine Learning methods significantly expand the range of QSAR model building options in Forge V10.6.

What about the descriptors?

Forge 3D electrostatic (based on Cresset’s XED force field) and volume descriptors are relevant for molecular recognition, and accordingly work very well for modeling activity and selectivity. These are used by Field QSAR and the new ML methods, while kNN can use either 3D electrostatic/shape or 2D fingerprint similarity.

New methods in action in a practical example

For this experiment, I have re-used an aligned data set of Orexin 2 Receptor ligands from the US patent literature,1 which I previously presented in a case study on Activity Atlas™, a method in Forge for qualitatively summarizing the SAR for a series into a visual 3D model.

I split the 377 Orexin 2 ligands into two subsets: a training set of 302 compounds which I used to build the QSAR models, and a test set of 75 molecules which were used solely to assess their predictive ability.
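As a rough illustration of this kind of hold-out split, here is a minimal Python sketch. The text does not say how the 75 test compounds were selected, so the random selection, the seed and the compound identifiers below are all assumptions made for the example.

```python
import random

def split_train_test(compound_ids, n_test, seed=0):
    """Randomly hold out n_test compounds; the rest form the training set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(compound_ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical identifiers standing in for the 377 Orexin 2 ligands.
compound_ids = [f"cpd_{i:03d}" for i in range(1, 378)]
train_ids, test_ids = split_train_test(compound_ids, n_test=75)
print(len(train_ids), len(test_ids))  # 302 75
```

The important property is that the two subsets do not overlap, so the test set plays no part in model building.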

Figure 2 shows the results obtained with Field QSAR and the ML methods in generating predictive models for OX2R pKi. Field QSAR, kNN and RF models were built using default conditions; for SVM and RVM, Forge suggested a fine tuning of the model building conditions as the training set is large.


Figure 2. Performance of Field QSAR and ML methods on the Orexin 2 data set. Training set = 302 molecules used to build the models. Test set = 75 additional molecules used solely to assess the predictive activity of the models.

‘r2 Training Set’ is used to check the ability of each model to fit the data in the training set. It ranges from 1 (perfect fit) to 0 (no fit). From Figure 2, I can see that all models (except kNN in this case) give excellent results in fitting. However, this is hardly surprising as ML methods are well known for their ability to fit data of any type.

A more realistic check of the quality of the model comes from ‘r2 Training Set CV’. In cross-validation (CV), some of the compounds in the training set are temporarily excluded from the model and the remaining compounds are used to build a model which is then used to predict the activity of the excluded compounds. Not surprisingly, ‘r2 Training Set CV’ is always lower than ‘r2 Training Set’, but the results for Field QSAR, RF, RVM and especially SVM are still good (kNN does not calculate this statistic).
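The difference between a fitted r2 and a cross-validated r2 can be sketched in a few lines of Python. This is not how Forge computes its statistics: the k-fold scheme, the one-descriptor least-squares line used as a stand-in model, and the synthetic data are all assumptions chosen to keep the example self-contained.

```python
import random

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for a real QSAR model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def cross_validated_r2(xs, ys, k=5, seed=0):
    """Predict every compound from a model built without its own fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a * xs[i] + b   # prediction for an excluded compound
    return r_squared(ys, preds)

# Synthetic data only: one made-up descriptor and a noisy linear "activity".
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(60)]
ys = [0.5 * x + 5.0 + rng.gauss(0.0, 0.5) for x in xs]

a, b = fit_line(xs, ys)
r2_fit = r_squared(ys, [a * x + b for x in xs])
r2_cv = cross_validated_r2(xs, ys, k=5)
print(f"r2 fit = {r2_fit:.3f}, r2 CV = {r2_cv:.3f}")
```

With this definition the cross-validated value can even become negative for a poor model, which is why it is a more demanding check than the fitted value.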

Finally, ‘r2 Test Set’ gives an idea as realistic as possible of the performance of the model in real project work, as the model is asked to predict the activity of compounds it has never seen before. Most methods give reasonably good results, with SVM clearly outperforming the other methods with a more than respectable r2 test set = 0.59.

In a real project, I would not hesitate to choose SVM for filtering and prioritizing my list of ‘to-make’ compounds, with confidence that this is the best predictive power I can get for this specific data set.

What about kNN? It didn’t work very well on this data set; does it mean that it is not a good method? Not really. kNN is a robust, well known method particularly useful when working with multiple compound series, or with biological data which are derived from different sources. The fact that it didn’t work particularly well in this case does not exclude good performance in other projects.

This is the whole point of having several model building methods available: you can choose the one which gives best performance in your specific project.

If you think it must have been boring to calculate all these models separately, then I have good news: you don’t really have to. The default option in Forge is to automatically run all the ML models and pick the best one for you (Figure 3).


Figure 3. The Automatic model building option in Forge runs all the available ML methods and picks the best model for the output.

Understand why you should make the molecules you have chosen

A significant part of a project chemist’s work is to design the next generation of active molecules. To achieve this, you need to understand which features make some compounds active, and which undermine the activity of others. In other words, you need to interpret the model.

Unfortunately, ML algorithms won’t help you here: they are complicated equations which cannot be easily translated back to 3D in terms of ligand-protein interactions.

Luckily, Forge provides you with two additional tools: Field QSAR 3D views and the Activity Cliffs Summary in Activity Atlas.

Field QSAR, when successful, gives you the best of both worlds, i.e., predictions and interpretation.

Activity Atlas is qualitative only (no predictions) and is great for understanding the SAR for your data using activity cliffs analysis, especially when the SAR landscape is jagged.

Activity Atlas in V10.6 includes a new Activity Cliffs Summary algorithm which generates more detailed SAR maps that rely less on individual compounds, which is especially useful for small and medium-sized data sets.

In Figure 4, you can see the Field QSAR maps compared to the new Activity Cliffs Summary maps for the Orexin 2 data set.


Figure 4. Top: Field QSAR electrostatic (left) and steric (right) coefficients.  Bottom: Activity Cliffs Summary of Electrostatics (left) and Activity Cliffs Summary of Shape (right). Color coding: red = more positive electrostatic favors activity; blue = more negative electrostatic favors activity; green = favorable steric bulk; magenta = unfavorable steric clash.

Both types of maps clearly and consistently indicate where more positive (red) or negative (blue) electrostatics favors activity, and where steric bulk is favorable (green) or forbidden (magenta), providing invaluable indications for ligand design.

I don’t have ‘top quality’ data, but I still need a model

Sometimes the data you have are not as clean as you would like for the purposes of QSAR modeling. You may have % of inhibition data rather than pIC50s or pKis; data generated with different assays; or simply data which are qualitative in nature.

The new ML methods in Forge will work just as well to build classification models for sorting new molecules into existing categories (e.g., active/inactive). Forge will also provide appropriate visual tools (such as the confusion matrix, Figure 5) and classification performance metrics (Precision, Recall, Informedness) to assess the performance of the model and decide if it is good enough to be used in project work.
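For a two-class model, the metrics named above can be derived from the four cells of the confusion matrix. The sketch below uses the standard definitions (informedness taken as recall + specificity - 1); the labels are invented for illustration, and the text does not describe how Forge handles more than two categories.

```python
def confusion_counts(observed, predicted, positive="active"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == positive:
            if obs == positive:
                tp += 1
            else:
                fp += 1
        else:
            if obs == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics computed from confusion matrix counts."""
    recall = tp / (tp + fn)            # fraction of actives that were found
    specificity = tn / (tn + fp)       # fraction of inactives correctly rejected
    return {
        "precision": tp / (tp + fp),   # fraction of predicted actives that are active
        "recall": recall,
        "informedness": recall + specificity - 1.0,
    }

# Hypothetical observed and predicted classes for eight compounds.
observed = ["active"] * 4 + ["inactive"] * 4
predicted = ["active", "active", "active", "inactive",
             "active", "inactive", "inactive", "inactive"]

counts = confusion_counts(observed, predicted)
metrics = classification_metrics(*counts)
print(counts)   # (3, 1, 3, 1)
print(metrics)  # {'precision': 0.75, 'recall': 0.75, 'informedness': 0.5}
```

An informedness of 0 corresponds to random guessing and 1 to perfect classification, which makes it a convenient single number when the classes are unbalanced.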


Figure 5. Confusion matrix and useful statistics for an Orexin 2 classification model.

Improved graphics and GUI

In Forge V10.6 you will experience strong performance, great pictures and new smooth transitions between storyboard scenes thanks to a new graphic engine that generates enhanced 3D objects (Figure 6).


Figure 6. The new graphic engine in Forge V10.6 generates great pictures.

This release also includes many other GUI and usability improvements, including:

  • An improved QSAR Model widget including relevant information and plots for the regression and classification models, PCA component plots, notes, and a ‘pop-up’ button to visually compare different models (Figure 7)
  • An improved interface for handling categorical data in support of classification models
  • Improved Blaze™ results window, showing an enrichment plot and statistics for each Blaze refinement level
  • New function to automatically assign selected molecules to roles, based on their Murcko scaffold
  • New function to run clustering from the main Forge GUI, specifying the desired similarity metric and threshold
  • New option to use all the available local CPUs, relaxing the 16-CPUs limitation of previous Forge releases
  • More responsive GUI for large projects, with improved performance on common operations such as applying filters, calculating and interacting with custom plots, and exporting data
  • Faster, more robust and less memory-consuming calculation of Activity Miner™ and Activity Atlas large similarity matrices
  • Improved 2D display of molecules
  • Improved Activity Miner GUI
  • Improved plots now showing a regression line for selected molecules
  • Improved structural filters now including pre-defined filters for Ring, Aromatic Ring, Non-ring atom, Chiral atom, H-bond donor and H-bond acceptor
  • Improved Filters window, now including a green/red toggle to control whether each filter is enabled or disabled.

Figure 7. Compare different QSAR models with the new ‘pop-up’ button in the QSAR Model widget.

Stay tuned for more

Sign up for our newsletter to receive product release announcements, request your free evaluation or contact us to learn more about how Forge can help advance your project.

  1. US patent number 8,653,263B2

Sneak peek at Forge V10.6: Model building focus and much more

While the development team is busy giving the finishing touches to Forge V10.6, let’s have a quick look at what is new in this release.

Improved predictions through new models

Forge users told us that the development of QSAR models with strong predictive ability was still a pain point for their projects. Not surprisingly, this is what made us focus on model building in this release.

Forge V10.6 comes with a full panel of well-known and robust Machine Learning (ML) methods (Support Vector Machines, Relevance Vector Machines, Random Forests, kNN classification) which complement those available in previous versions (Field QSAR and kNN regression).

These ML methods can be used to build both regression and classification models, and this is reflected in a QSAR Model widget completely re-designed to provide relevant visualizations and statistics for both model types (Figure 1). While each regression and classification model can be built individually, there is an option in Forge to automatically run all the ML models and pick the best one for you.


Figure 1. Left: Observed vs. Predicted Activity graph for an SVM regression model. Right: Confusion matrix and statistics for an SVM classification model.

Generating qualitative models on small datasets

Activity Atlas is a qualitative method for summarizing the SAR for a series into visual 3D maps that can be used to inform new molecule design. Forge V10.6 includes a new Activity Cliff Summary method which generates more detailed SAR maps by slightly reducing the importance of the strongest activity cliffs.

You may want to use the new flavor of the method for understanding the SAR of small to medium size data sets, as this will provide a finer level of detail. For larger data sets (e.g., for quickly understanding patent SAR information), the original algorithm will help you focus on the prevalent SAR signals.

More responsive GUI for larger projects

Working with large projects (more than 1,000 molecules with multiple alignments and QSAR models) will be much more efficient in Forge V10.6. You will see improvements in the performance of common operations such as applying filters, calculating and interacting with custom plots, and exporting data. The calculation of the large similarity matrices in Activity Miner and Activity Atlas will also be faster, more robust and use less memory.

Furthermore, there is now an option to set up Forge to use all the available local CPUs, if appropriate, as we have relaxed the 16-CPU limitation of the previous release of the software.


Figure 2. Forge running on multiple local CPUs.

Improved interface to Blaze for virtual screening

The improved Blaze results window now shows an enrichment plot and statistics for each Blaze refinement level.


Figure 3. Improved interface to Blaze in Forge.

Stay tuned for more

Subscribe to our newsletter to receive the product release announcement, or contact us to learn more about Forge.

Flare Viewer: Free access to Flare for structure-based design

We are pleased to announce the introduction of Flare Viewer, a free licensing option of Flare, our structure-based design application. With Flare Viewer you can easily visualize and analyze your protein-ligand complexes, use our proprietary electrostatics to design new ligands, and communicate your ideas with high quality graphics and pictures.

Focus on ligands

Read in protein-ligand complexes by opening a file in a local or remote disk location, downloading multiple entries from the Protein Data Bank, or by drag-and-drop from your desktop if you are a Windows user. Ligands can be moved into the dedicated ligands table by drag-and-drop, with each ligand keeping the association with the protein it belongs to. Here they can be easily organized into custom groups, to keep your project tidy.

The dedicated ligand table and interactive menu give easy access to all ligand actions: for example, sorting on any column, controlling visibility, tagging, and filtering on structure, tags, and numerical and text columns. A physico-chemical profile is calculated for every ligand and summarized in a fully customizable radial plot and multi-parametric score to help you design and select the ligands with the best fit to your ideal project profile.


Figure 1: The ligand-centric organization of Flare gives easy access to all ligand actions.

Explore ligand-protein interactions

Flare calculates and displays a variety of ligand-protein interactions. These include H-bonds, steric clashes, aromatic-aromatic and cation-pi interactions and more, with water-mediated and intra-molecular interactions available as an option.

Each ligand can be displayed with its associated protein in grid mode making comparisons between ligands or proteins straightforward.


Figure 2: Each ligand can be displayed with its associated protein, making it easy to compare the interactions of different ligands.

Iterative molecular design meets ligand electrostatics

Understanding ligand electrostatics is key in the design of improved ligands. In Flare, electrostatic interaction potentials calculated with the Cresset XED force field can be visualized as ligand fields or by mapping the electrostatic potential onto the ligand’s molecular surface.


Figure 3: Ligand electrostatics can be shown as ligand fields (left) and by mapping the electrostatic potential of the ligand on its surface (right). Color coding: cyan = negative electrostatics; red = positive electrostatics.

Designing new ligands in Flare gives you immediate feedback on electrostatic changes in the context of the protein active site. In the molecule editor, the ligand or a selected part of the ligand can be minimized ensuring bonds, angles and torsions have low energy values.


Figure 4: The Molecule Editor.

Compare multiple proteins

Multiple protein structures can be imported in the same project and displayed in the same frame of reference using the sequence alignment and superimposition functions in Flare. You can choose the protein to superimpose to, whether all proteins are to move and if all residues or selected residues are superimposed. The protein structure can be optimized by flipping flexible residues or changing tautomeric and charge states for relevant residues.

Once opened, the proteins will sit in a dedicated table where all their components (chains, ligands, crystallographic waters and cofactors) are clearly visible, enabling a rapid inspection of specific chains or residues.

Protein surfaces can be displayed and colored by solid color, atom, secondary structure and hydrophobicity, and saved in a dedicated protein surface window.


Figure 5: Comparing multiple protein-ligand complexes is made easy by working in grid mode, showing ribbons and applying different protein surface styles.

Important scenes can be captured and annotated in the Storyboard to be recalled when needed. Images can be easily copied and exported, with many options to configure the image or file size.

A dedicated extended atom picking widget enables complex queries and gives you full control on what is selected and displayed in the 3D window.

Protein viewer with an intuitive GUI

The ribbon menu structure of Flare makes it easy to identify the commands and controls you are looking for, as all actions are always visible and organized in a logical structure.


Figure 6: All actions are always visible in the Flare ribbon menu.

Upgrade to the Flare Python API

Upgrading Flare Viewer to include the Flare Python API will enable you to create your own workflows, automate common tasks, add custom controls and context menus, and access Python modules such as the RDKit cheminformatics toolkit, NumPy, SciPy, and Matplotlib. We also provide a collection of featured Python extensions that extend the existing Flare functionality.

Discover Flare Viewer

See the features of Flare Viewer, and download your free 1 year license.

Bespoke free licensing options for academic users are also available; see the announcement.

Flare for Academics

We believe that the lively academic environment is an amazing source of new scientific ideas, algorithms and computational methods. Flare for Academics is a free* licensing option of Flare, our structure-based design software, which has specifically been designed for academic users.

Flare for Academics is a user-friendly environment where academic users can easily develop and test their ideas and methods, or plug-in the most interesting open-source algorithms. It extends on the functionality of Flare Viewer to provide an excellent platform for drug discovery, with a focus on ligand design and electrostatics.

Discover the power of the Python API

The Flare Python API gives academic researchers the opportunity to make their science more accessible through integration into a user-friendly environment.

An environment to build upon and create great science

You will benefit from a robust, commercial standard SBDD environment that enables focus on science by utilizing methods such as protein preparation, protein minimization and multi-core docking. Access is also given to the RDKit cheminformatics toolkit, NumPy, SciPy, and Matplotlib, which are all integral to Flare. Beyond these, virtually any other Python module can be pip-installed, making Flare infinitely extendable. An ever-growing collection of featured Python extensions that enhance the existing Flare functionality is also provided, including plotting, protein mutation, and custom workflows (see also the new Jupyter Notebook integration).


Figure 1. The ‘Extensions’ tab in Flare 2.0.

Low-level access to the graphical user interface and internal processes

The Flare Python API not only provides an environment to develop your own algorithms but also a way to deploy them across a wider user base. The API provides access to all elements of the Flare interface through addition of user-defined controls and context menus.

For example, you may add custom controls into an existing Flare ribbon, or create a new Flare ribbon for Python scripts you frequently use. Custom controls in Flare can be created as small or large buttons, spin boxes, custom sliders, or complex dialogues with signals and call-back functions (Figure 2).


Figure 2. Some types of custom controls which can be added to a Flare ribbon.

Automate and distribute Flare calculations

Whenever you need to carry out a completely automated task, for example the overnight preparation of a panel of proteins followed by docking of several ligand series, the most convenient option is to write a Python script that runs outside the Flare GUI. It can then be distributed on a cluster via a queueing system for maximum performance. The pyflare binary is a Python interpreter giving you access to Flare functions using either custom developed or Cresset released scripts.

Upgrade Flare with the Jupyter QtConsole

The native GUI of Flare embeds the Python Console and Python Interpreter widgets. The Python Console is the simplest option to run one-line commands. With the Python Interpreter you can handle slightly more complex scripts: for example, you can load a script, interactively edit it inside Flare and then save your modifications. Both the Python Console and the Python Interpreter have a multi-tab interface that makes it possible to work on multiple Python snippets at the same time.

Python enthusiasts can easily upgrade Flare with the Jupyter QtConsole for access to all the Jupyter features, e.g.: TAB completion, auto-indentation, syntax highlighting, context help, inline graphics, and more. Using this widget, you can type Python commands, examine molecules and draw plots, all in the same window.

Upgrade Flare with the Jupyter Notebook

The Flare Python Notebook is an instance of the Jupyter Notebook embedded into the Flare GUI. It has direct access to the Flare GUI objects and methods, offers an even richer interface and enables editing and running individual code cells.


Figure 3. The Python Qt-Console (left) and Python Notebook (right) in Flare.

Not just a viewer

Flare for Academics is not just a viewer, but a complete, user friendly platform for iterative molecule design in drug discovery.

Multiple protein structures can be easily imported in the Flare project and displayed in the same frame of reference using the sequence alignment and superimposition functions.

Flare’s protein preparation will enable you to optimize your protein-ligand structures by adding hydrogen atoms, optimizing hydrogen bonds, removing atomic clashes and assigning optimal protonation states. Further optimization of the protein active site can be achieved by protein minimization based on the XED force field, and by manually flipping flexible residues or changing tautomeric and charge states for relevant residues.


Figure 4. Flare for Academics is a user-friendly platform for iterative molecule design in drug discovery.

 
Smart visualization of protein-ligand complexes in grid mode facilitates the comparison between ligands or proteins. The display of a variety of non-bonded ligand-protein interactions makes it easy to understand the different binding modes for your ligands.

The ligand-centric structure of Flare includes a dedicated ligand table and interactive menu giving easy access to all ligand actions, such as sorting on any column, controlling visibility, tagging, filtering on structure, tags, and numerical and text columns, and grouping ligands in custom-created roles. In the ligand table, each molecule is associated with calculated physico-chemical properties, a radial plot and a multi-parametric score to help you design and select the ligands that best match the ideal project profile. Ligand electrostatic interaction potentials calculated with the XED force field can be visualized in the 3D window and in the molecule editor, and used to inform ligand design.

Multi-core docking experiments can be run to predict the 3D structure of flexible ligands in the active site of your protein. Docking in Flare uses Lead Finder™ to provide excellent pose prediction and detailed feedback on new molecule designs.

Discover Flare for Academics

See the features of Flare for Academics, and apply for your 1 year license.

* In most countries; contact us to see if you are eligible for a free license.

Which macrocycle should I try first? Picking the best linkers with Flare™ and Spark™

At Cresset, we enjoy seeing our products work in synergy. By combining the most recent scientific methods and workflows we deliver solutions to address molecule design challenges. In this post, we use the new Electrostatic Complementarity™ (EC) maps and scores in Flare to help the post-processing of a Spark macrocyclization experiment.

Using Electrostatic Complementarity in Flare to post-process the Spark results

In the case study Using Spark to design macrocycle BRD4 inhibitors, we used Spark, our bioisostere replacement and scaffold hopping tool, to design macrocyclization strategies for non-macrocyclic, pyridone BRD4 inhibitors and evaluate results against experimental data reported by Wang et al [1]. The results showed that Spark successfully reproduced the experimental data.

In a real drug discovery project where no retrospective data is available, it would be useful to have criteria, based on existing knowledge of the system under study, to guide further post-processing of the Spark results. Here we show how to use Spark in synergy with the EC maps and scores in Flare, our structure-based design tool, to pick the most promising candidates for synthesis.

Electrostatic interactions are essential for molecular recognition and are also key contributors to the binding free energy ΔG of protein-ligand complexes. Assessing the electrostatic match between ligands and binding pockets provides important insights into why ligands bind and what can be changed to improve binding.

The 100 top scoring results from the BRD4 Spark experiment were opened in Flare using the ‘Send to Flare’ functionality in Spark, which also transfers the related starter molecule (compound 1 in Figure 1) and excluded volume protein (5UEY). The protein was prepared in Flare, removing the water molecules that do not make clear interactions with both the ligand and protein. EC scores and maps were then calculated for compound 1 and the experimentally validated macrocycle 2 reported by Wang et al. towards the same 5UEY protein, as shown in Figure 1. As expected, the EC maps for both compounds show good complementarity to the protein and a very similar EC R score of 0.52/0.53 (Pearson’s r correlation coefficient). Spark linkers showing similar (or better) maps/score should provide interesting ideas for synthesis.


Figure 1: EC maps and scores for compound 1 and macrocycle 2, calculated towards protein 5UEY. Color coding: green = good complementarity; red = electrostatic clash.
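Since the EC R score is described as a Pearson's r correlation coefficient, the idea can be illustrated with a small Python sketch. The details of how Flare samples the potentials are not given here, so the following are assumptions: both potentials are evaluated at the same set of surface points, and the protein potential is negated so that a perfectly complementary pair scores +1. The numerical values are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def complementarity_r(ligand_esp, protein_esp):
    """Correlate the ligand potential with the sign-inverted protein potential.

    Assumption: complementarity means positive ligand regions face negative
    protein regions, so the protein values are negated before correlating.
    """
    return pearson_r(ligand_esp, [-p for p in protein_esp])

# Hypothetical potentials (arbitrary units) at four shared surface points.
ligand_esp = [1.0, -2.0, 0.5, 3.0]
protein_esp = [-1.0, 2.0, -0.5, -3.0]   # exact mirror image of the ligand
score = complementarity_r(ligand_esp, protein_esp)
print(round(score, 6))  # 1.0
```

A score near +1 would indicate a good electrostatic match under these assumptions, values near 0 no relationship, and negative values a clash.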

Picking the winners

Figure 2 shows a couple of the most interesting linkers in terms of EC score.


Figure 2: EC maps and scores (top panel) for two ‘matching’ Spark linkers, calculated towards protein 5UEY. Color coding: green = good complementarity; red = electrostatic clash. The bottom panel shows electrostatic potential maps for the same Spark results. Color coding: cyan = negative electrostatic; red = positive electrostatic.

In the first example (Figure 2 – left), the π-system in the double bond linker complements the positive electrostatic field at the NH proton of His437 better than compound 1 or a fully saturated linker of similar length, as in macrocycle 2.

Another interesting example of good electrostatic match is the mercaptoethanol linker (Figure 2 – right). The negative electrostatic field of the thioether group is also in close proximity to the polarized NH of His437.

For both compounds, the increase in EC towards the protein is due to the introduction of a more negative ligand electrostatic in the region near His437, as shown by the electrostatic potential maps for both linkers (Figure 2 – bottom).

Discarding the losers

In contrast, an analysis of the EC maps for two of the linkers with the lowest EC scores (Figure 3) immediately highlights the reasons why these should be down-prioritized.


Figure 3. EC maps and scores (top panel) for two ‘clashing’ Spark linkers, calculated towards protein 5UEY. Color coding: green = good complementarity; red = electrostatic clash. The bottom panel shows electrostatic potential maps for the same Spark results. Color coding: cyan = negative electrostatic; red = positive electrostatic.

These linkers expose an area of negative interaction potential towards the carbonyl of Asn443, resulting in a strong electrostatic clash.

Conclusion

Are you surprised that a few linkers with low EC ended up among the top 100 scoring Spark results? Don’t forget that Spark works on ligand similarity. In macrocyclization (and fragment linking) experiments we are stretching the method to explore regions in space where ‘no ligand has gone before’.

In such cases, adding protein information is clearly highly beneficial to help post-processing. EC maps in Flare are an intuitive visual method for rationalizing the choice of the best ideas to progress, while EC scores provide a rapid way of scoring and filtering the 500 Spark results in just a few minutes.

To try Spark or Flare on your projects, request your free evaluation.

  1. Wang, L.; McDaniel, K. F.; Kati, W. M. Fragment-Based, Structure-Enabled Discovery of Novel Pyridones and Pyridone Macrocycles as Potent Bromodomain and Extra-Terminal Domain (BET) Family Bromodomain Inhibitors. J. Med. Chem. 2017, 60 (9), 3828–3850.

Investigating the SAR of XIAP ligands with Electrostatic Complementarity maps and scores

Abstract

Electrostatic Complementarity™ maps implemented in Flare™,1 Cresset’s structure-based design application, were used to investigate the protein-ligand electrostatic interactions and the Structure-Activity Relationship (SAR) of a small set of inhibitors of the X-linked IAP (XIAP)-caspase protein-protein interaction. A good correlation was also obtained between XIAP-BIR3 affinity and the Electrostatic Complementarity scores for the same data set.

Introduction

Inhibitor of apoptosis proteins (IAPs) are key regulators of antiapoptotic and pro-survival signaling pathways.2-4 Their deregulation occurs in various cancers and is associated with tumor growth, resistance to treatment and poor prognosis. This makes them an attractive target for anticancer drug discovery.5-7 The best characterized IAP, X-linked IAP (XIAP), exerts its antiapoptotic activity by binding and inactivating caspases 3, 7, and 9 via its BIR domains. Disruption of the protein-protein interaction (PPI) between XIAP-BIR domains and caspases via small molecules is a promising strategy to inhibit XIAP. However, drugging PPIs can be particularly challenging due to their unusual binding interfaces, which, unlike classical binding sites, are generally flat and large.8

A recent paper from Astex9 reports that the XIAP-BIR3 activity of the small dataset of antagonists in Table 1 is increased by the introduction of electron-withdrawing substituents on the indoline ring, and shows a nice correlation between the XIAP-BIR3 pIC50 and Hammett’s σp.

In this case study, we used the Electrostatic Complementarity maps available in Flare to investigate the protein-ligand electrostatic interactions and the SAR of the molecules in Table 1. Electrostatic Complementarity scores calculated with Flare were used to quantitatively model XIAP-BIR3 pIC50.

Table 1. XIAP-BIR3 affinity of C-6 substituted indolines.9

Method

Protein preparation

The 5C7A ligand-protein complex was downloaded from the Protein Data Bank into Flare and prepared using the Build Model10 tool from BioMolTech11 to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes and assign optimal protonation states to the protein structure. Any truncated protein chain was capped as part of protein preparation. The binding site was visually inspected to check the protonation states of ligands and amino acid side chains, and the orientations of water molecules in suboptimal hydrogen bonding networks were re-optimized. For the electrostatic complementarity calculations, we kept only those water molecules in and close to the binding site that make at least 2 hydrogen bonding contacts to the protein, or at least 1 hydrogen bond to both the ligand and the protein. As many of the modeled binding modes (e.g., compounds 9, 11, 15, 16) clash with the flexible side chain of Lys297 (Figure 1), the side chain atoms were minimized with the XED force field12 for each ligand. The resulting receptors were used to compute the electrostatic complementarity of the respective compounds.


Figure 1. The PDB: 5C7A ligand-protein complex.

Data set construction

The compounds in Table 1 were drawn using the molecule editor in Flare, starting from the crystal structure of the ligand in PDB:5C7A (compound 7 in Table 1). The 11 compounds were then aligned in Forge13 to the 5C7A ligand, using a Maximum Common Substructure alignment to minimize the conformational noise in the common indoline-piperazine scaffold.

Electrostatic Complementarity surfaces and scores

Electrostatic Complementarity maps and scoring functions are an extension of Flare’s Protein Interaction Potentials based on Cresset’s polarizable XED force field. In contrast to classical force fields that rely on atom-centered charges, XED enables description of anisotropic charge distribution around atoms which is usually only possible with ab initio approaches. Polarization effects and description of atomic charge anisotropy are especially useful for computing electrostatic properties of aromatic or unsaturated hydrocarbons, sp2 hybridized oxygen atoms, sp or sp2 hybridized nitrogen atoms, and aromatic halogens (sigma hole of Cl, Br, and I).14-16

To calculate the Electrostatic Complementarity map for a ligand towards a protein of interest, the solvent-accessible surface is first placed over the ligand. A calculation of electrostatic potentials due to the ligand and the protein is then carried out at each vertex on the surface.

These potentials are then scaled, added together, and normalized to yield the Electrostatic Complementarity score. Perfect electrostatic complementarity means that at each vertex point the ligand electrostatic potential value is paired with a protein electrostatic potential value of the same magnitude and opposite sign. Regions of the ligand surface where there is electrostatic complementarity with the protein are colored green, while the regions where there is an electrostatic clash are colored red. A more detailed description of the electrostatic potential and complementarity methodology will be presented elsewhere.17

The Electrostatic Complementarity scores quantify the ligand-protein electrostatic complementarity with three different metrics suitable for diverse protein-ligand scenarios.

The first computed score (‘Complementarity’) is the normalized surface integral of the complementarity score over the surface of the ligand (effectively the average value of that score over the surface of the ligand).

The other two scores (‘Complementarity r’ and ‘Complementarity rho’) are the Pearson’s correlation coefficient and the Spearman rank correlation coefficient, respectively, which are computed on the raw ligand and protein electrostatic potentials sampled on the surface vertices.

All three measures range from 1 (perfect complementarity) to -1 (perfect clash) but have different characteristics. The Complementarity score includes some compensation for desolvation effects, and so may be more robust when these are significant. The Pearson and Spearman correlation coefficients can provide a better indication of ligand activity in some cases, but are more susceptible to noise (r more than rho). The Spearman’s rho number is more robust against background electric fields, which may be useful if the computed protein electric potential is being biased by a large net charge on the protein.
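The two correlation-based scores can be illustrated with a minimal Python sketch. This is not the Flare implementation: the function names are hypothetical, the potentials are plain lists of values sampled at the surface vertices, and we assume the correlation is negated so that +1 corresponds to perfect complementarity (opposite potentials) and -1 to a perfect clash, as described above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def complementarity_r(ligand_esp, protein_esp):
    """Negated Pearson correlation (assumed sign convention)."""
    return -pearson(ligand_esp, protein_esp)


def complementarity_rho(ligand_esp, protein_esp):
    """Negated Spearman correlation: Pearson on the ranks (assumed sign convention)."""
    return -pearson(ranks(ligand_esp), ranks(protein_esp))


# Illustrative potentials only, not values computed by Flare.
ligand = [0.5, -1.0, 2.0, -0.3]
mirror = [-v for v in ligand]           # equal magnitude, opposite sign
monotonic = [-(v ** 3) for v in ligand]  # opposite sign, non-linear relationship
```

With these inputs, `complementarity_r(ligand, mirror)` evaluates to 1 within rounding, while for the `monotonic` potentials the rank-based score stays at 1 and the Pearson-based score drops below 1. This illustrates why the two coefficients can respond differently to the same surface data.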

The calculation is fast and predictive: scoring a hundred ligands normally takes less than a couple of minutes on an average laptop and gives important insights into protein-ligand electrostatics, which typically correlate with compound activity.

Mapping the electrostatics of the XIAP active site

The Electrostatic Complementarity map of compound 7 in the XIAP active site (PDB: 5C7A, Figure 2 – left) shows a strong electrostatic clash (red) in the region above the indoline ring. This is caused by an area of negative electrostatic potential in the protein’s active site, generated by the backbone carbonyl of Gly306 and the phenolic oxygen of Tyr324 (Figure 2 – middle), clashing with the negative electrostatic field associated with the indoline ring (Figure 2 – right). A less pronounced electrostatic clash can be seen between the positive electrostatic field of the protonated side chain of Lys297 (Figure 2 – middle) and the positive electrostatic field of the sigma hydrogens of the indoline ring (Figure 2 – right).

According to this map (and in agreement with the reported correlation9), electron-withdrawing substituents which make the indoline ring less electron-rich are expected to increase XIAP binding. Substituents associated with a more negative (or less positive) electrostatic field, favoring the interaction with the protonated side chain of Lys297, should also be beneficial.


Figure 2. Left: Electrostatic Complementarity map for the PDB:5C7A ligand (green: good complementarity; red: electrostatic clash). Middle: protein electrostatic potential map for PDB:5C7A (red: positive; cyan: negative). Right: ligand fields for the ligand in PDB:5C7A (red: positive; cyan: negative).

Electrostatic Complementarity and XIAP SAR

Figure 3 shows the Electrostatic Complementarity maps for the compounds in Table 1, arranged in order of increasing XIAP-BIR3 activity from left to right.

A clear trend can be observed as we move from the electron-donating substituents (-NH2, -OMe) to the electron-withdrawing substituents (-F, -Cl, -SO2Me). The latter make the indoline ring less electron-rich, reducing the clash with the negative electrostatic potential of the XIAP active site.


Figure 3. Electrostatic complementarity maps for some of the ligands in Table 1 (green: good complementarity; red: electrostatic clash).

The substituents for the three most potent compounds are also associated with a negative ligand field of their own (Figure 4), favoring the interaction with the protonated side chain of Lys297, in line with our initial hypothesis.


Figure 4. Negative ligand fields (cyan) for compounds 17, 15 and 16.

These qualitative observations are confirmed by the nice correlation (r2 = 0.671) between XIAP-BIR3 pIC50 and the ‘Complementarity rho’ score shown in Figure 5.


Figure 5. Plot of XIAP-BIR3 pIC50 versus Complementarity rho.

Electrostatic Complementarity scores and MW

We monitored the correlation between MW and XIAP-BIR3 affinity/Complementarity rho to verify whether the Electrostatic Complementarity scores provide information which goes beyond the use of simple physico-chemical descriptors for drug design.

The correlation between MW and XIAP-BIR3 pIC50 (r2 = 0.613, Figure 6 – left) would possibly point towards a space filling effect as the simplest explanation of the changes in XIAP affinity in this data set.

However, the low correlation between Complementarity rho and MW (Figure 6 – right) confirms that the Electrostatic Complementarity scores are size independent.

Using the Electrostatic Complementarity scores for quantitative SAR modeling, therefore, generates trends completely independent from size effects.

Furthermore, Electrostatic Complementarity maps provide visual insight into ligand-protein binding and SAR which cannot be derived from traditional, simple physico-chemical descriptors such as MW and Hammett’s σp, thus providing invaluable information for drug design.
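The size check described in this section can be sketched in a few lines of Python. This is only an illustration of the reasoning: `r_squared` and `size_check` are hypothetical helper names, and the numbers are made-up placeholders rather than the Table 1 data.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def size_check(activity, score, mw):
    """Compare how activity and score each correlate with molecular weight."""
    return {
        "activity_vs_score": r_squared(activity, score),
        "activity_vs_mw": r_squared(activity, mw),
        "score_vs_mw": r_squared(score, mw),
    }


# Made-up placeholder values, NOT the Table 1 data.
activity = [5.0, 5.5, 6.1, 6.4, 7.0]   # e.g. pIC50
score = [0.10, 0.18, 0.22, 0.30, 0.35]  # e.g. Complementarity rho
mw = [300.0, 310.0, 305.0, 330.0, 325.0]
report = size_check(activity, score, mw)
```

A low `score_vs_mw` value alongside a high `activity_vs_score` value is the pattern that would suggest the score captures something other than molecular size.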


Figure 6. Left: Plot of XIAP-BIR3 pIC50 versus MW. Right: Plot of Complementarity rho versus MW.

Conclusions

Application of Electrostatic Complementarity to a reported XIAP-BIR3 data set showed that our method can detect and quantify electrostatic differences in XIAP ligands that cause changes in bioactivity. Electrostatic Complementarity scores and maps in Flare V2, based on Cresset’s polarizable XED force field, provide rapid activity prediction with visual feedback on new molecule designs. They provide useful information for understanding ligand binding and SAR and can be used for rapid ranking of new molecule designs.

References and Links

  1. https://www.cresset-group.com/flare
  2. Salvesen, G. S. et al., Nat. Rev. Mol. Cell Biol. 2002, 3 (6), 401–10
  3. Gyrd-Hansen, M. et al., Nat. Rev. Cancer 2010, 10 (8), 561–74
  4. Silke, J. et al., Cold Spring Harbor Perspect. Biol. 2013, 5 (2), a008730
  5. Tamm, I. et al., Clin. Cancer Res. 2004, 10 (11), 3737–3744
  6. Mizutani, Y. et al., Int. J. Oncol. 2007, 30 (4), 919–925
  7. Fulda, S. et al., Nat. Rev. Drug Discovery 2012, 11 (2), 109–124
  8. Arkin, M. R. et al., Chem. Biol. 2014, 21 (9), 1102–1114
  9. Chessari, G. et al., J. Med. Chem. 2015, 58 (16), 6574–6588
  10. Stroganov, O. V. et al., Proteins 2011, 79 (9), 2693–2710
  11. https://www.biomoltech.com/
  12. https://www.cresset-group.com/science/field-technology/calculating-field-patterns/
  13. https://www.cresset-group.com/products/forge/
  14. Vinter, J. G., J. Comput. Aided Mol. Des. 1994, 8 (6), 653–668
  15. Vinter, J. G., J. Comput. Aided Mol. Des. 1996, 10 (5), 417–426
  16. Chessari, G. et al., Chem. Eur. J. 2002, 8 (13), 2860–2867
  17. Bauer, M. R. & Mackey, M. D. et al., manuscript in preparation

Outstanding new 3D graphics in Spark 10.5.5

A new patch level release of Spark™, our scaffold hopping and bioisostere replacement application, is now available for download by all Spark users. Spark 10.5.5 includes considerable improvements to the look and feel, rendering and performance of the graphics of the 3D window.


Figure 1. Improved 3D graphics in Spark 10.5.5.

Spark 10.5.5 also includes a small number of additional improvements and bug fixes:

  • Improved support for the configuration of proxy servers
  • Improved Spark start up times when using databases sitting in a remote location on a slow connection
  • Improved support for high-dpi displays
  • Fixed issue which caused the effect of the application of pharmacophore constraints to be overestimated in some circumstances
  • Fixed issue on macOS which prevented dock windows that had been moved outside the main Spark interface from being docked back in the desired position
  • Fixed rare issue in the wizard where, on some occasions, the desired hydrogen atom could not be picked for replacement.

Download Spark 10.5.5

To ensure you benefit from the improved 3D graphics, and other improvements and bug fixes, keep an eye out for an email with download links and upgrade Spark at your earliest convenience.

If you are not currently a Spark customer, please request a free evaluation.

Contact us if you have queries.

Launch of Cresset Python extensions for Flare

We have launched three new repositories for Python scripts that extend the functionality of Flare™, our structure-based design platform. These new repositories are available to all Flare users free of charge.

Last month, Paolo Tosco explained the advantages and opportunities offered by the Flare Python API to computational chemists and developers.

But what if you are not familiar with Python scripting, and you just want to use one of the scripts developed by us, which we showed at the Cresset User Group Meeting 2018? Or if you would like to run Flare tasks from the command line? Or maybe you know Python well, but could benefit from some scripting examples, just to get yourself started with the Flare Python API. These new repositories provide the solutions. They are:

  • Flare Python extensions – Cresset written scripts that extend the functionality of Flare
  • Flare Python pyflare – Command line scripts that use pyflare to create command line workflows
  • Flare Python developers – Example scripts that can be used by developers as templates to write their own extensions.

Below I will discuss the different types of scripts and show you some interesting examples of additional functions you can add to Flare through scripting.

Flare API extensions: Use the power of Python within the Flare GUI

Download the Flare API extensions.

What do we mean by ‘extensions’? These are a collection of Python scripts which add powerful new functionality to Flare. After installing them, a new ribbon tab called ‘Extensions’ will be added to the Flare GUI, containing buttons to access this new functionality.


Figure 1. The ‘Extensions’ tab in Flare 2.0.

As you hover with the mouse over each of the buttons, a tooltip will appear providing a short explanation of the extension’s function.

For example, the ‘Ramachandran Plot’ extension will show the Ramachandran plot for the protein of interest.


Figure 2. Ramachandran plot for PDB:5C7A calculated with the ‘Ramachandran Plot’ extension.

Another nice snippet of extension functionality is the ‘Show in RCSB’ addition to the Proteins table context menu.


Figure 3. Choosing the ‘Show in RCSB’ extension from the context menu opens the PDB:5C7A entry in the RCSB.

If you are used to the highly interactive environment of Jupyter® notebooks, then you should definitely install the ‘Python QtConsole’ extension, which adds a Jupyter QtConsole dock to the built-in Python Interpreter and Python Console docks. The Python QtConsole provides all the nifty Jupyter features, e.g., TAB completion, auto-indentation, syntax highlighting, context help, inline graphics, and more.


Figure 4. The highly interactive environment provided by the Python QtConsole.

Finally, there is a whole group of Cresset extensions dedicated to making Flare communicate with other Cresset products. For example, choosing the ‘Align’ extension will enable you to run a Forge alignment for the ligands in your Flare project, without leaving Flare. You will need a Forge license for this to work; click here if you wish to request a free evaluation of Forge.


Figure 5. The ‘Align’ extension.

pyflare scripts: run Flare from the command line

These scripts allow all the main Flare functions to be accessed through the pyflare command line Python interpreter.


Figure 6. Running a Flare Python script outside Flare using the pyflare interpreter.

This is useful when you need to carry out a completely automated task, for example an overnight preparation of a panel of proteins followed by docking of several ligand series, distributing it on a cluster via a queueing system for maximum performance.

Download pyflare scripts to:

  • Dock ligands to a protein using the Lead Finder™ algorithm
  • Prepare your protein
  • Calculate Electrostatic Complementarity™ scores
  • Minimize the protein active site
  • Run a 3D-RISM analysis
  • Run a WaterSwap analysis
  • Calculate and export protein field surfaces.

Scripting examples for developers

This GitLab repository contains a few interesting scripting examples to help Python developers get started with writing their own extensions and scripts with the Flare API.

Give it a try

These examples can be downloaded for free from GitLab by all Flare customers by clicking on the links above and following the download instructions.

If you have questions about the use of Python extensions for Flare, feel free to contact Cresset support.

Request a free evaluation of Flare.

Flare™ V2 released: Introducing the new science and functionality of Cresset’s structure-based design application

Version 2 of Flare™, our application for fresh insights into structure-based design, is now available. I will briefly introduce the new science and functionality included in this version, which will be presented in full at the Cresset User Group Meeting on June 21-22.

Predicting activity using Electrostatic Complementarity™: You asked, we listened

Since the initial release of Flare, you repeatedly asked us to develop a smart way of quantifying the complementarity of ligand vs. protein electrostatics and suggested that this would be a rapid method for prioritizing new molecule designs.

Your requests have led to the introduction of Electrostatic Complementarity (EC) scores and maps in Flare V2, based on Cresset’s polarizable XED force field. These provide rapid activity prediction with visual feedback on new molecule designs, and prove invaluable for understanding ligand binding and structure-activity relationships, and for ranking new molecule designs.

Figure 1. Left: the less active XIAP analog is less complementary (red region) to the PDB: 5C7D binding site than the more active XIAP analog in the centre (green region).  Right: the Electrostatic Complementarity score is highly correlated with the experimental activity of XIAP analogs.

Electrostatic Complementarity scores quantify the ligand-protein electrostatic complementarity with three different metrics suitable for diverse protein-ligand scenarios. The calculation is fast and predictive: scoring a hundred ligands normally takes less than a couple of minutes on an average laptop and gives good correlation with activity in the majority of cases.

Electrostatic Complementarity maps are based on a calculation of electrostatic potentials for the ligand and the protein on the surface of the ligand. These potentials are then added together, normalized and scaled. Regions of the ligand surface where there is perfect electrostatic complementarity with the protein are colored green, while the regions where there is a perfect electrostatic clash are colored red.

Enhanced protein surface coloring

Faster and improved surface generation code, and new protein surface coloring options to give you more insights into protein-ligand interactions and support molecule design, are also included in this release.

Figure 2. Left: protein electrostatic potential map for PDB: 5HLW (red: positive; cyan: negative). Middle: ligand fields for the ligand in PDB: 5HLW (red: positive; cyan: negative). Right: electrostatic complementarity map for PDB: 5HLW (green: perfect complementarity; red: perfect clash).

Coloring the protein surface according to the Wimley-White [1] residue hydrophobicity value is an excellent way of visualizing hydrophobic areas of the protein active site.


Figure 3. The surface of the binding site of PDB: 5HLW is colored by Wimley-White residue hydrophobicity from yellow (hydrophobic) to blue (hydrophilic).

Ensemble docking

Rapidly and easily dock your ligands in a single experiment to multiple protein conformations using ensemble docking. Results are saved as a list of docked poses for the ligands included in the study, each associated with a specific protein conformation.

Figure 4. Results of an ensemble docking experiment in Flare. Each pose is associated with a specific protein conformation, making browsing of results easy.

Enhanced ligand design functionality

Significant improvements have been made to the ligand functionality in Flare V2.

Radial plots and Multi-Parameter Scoring

Radial plots are useful to gain immediate visual feedback about how each ligand matches the ideal physico-chemical profile for your project. To support Multi-Parameter Scoring, radial plot properties are weighted and combined into a single score that represents the fit of the ligand to the ideal physico-chemical profile. The radial plot score can then be used to filter or sort the ligands.
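As a rough illustration of how weighted properties can be combined into a single score, the following Python sketch maps each property to a 0 to 1 desirability and takes a weighted mean. The scoring function, the property names, the ranges and the weights are all assumptions made for this example; they are not the formula or the defaults used by Flare.

```python
def desirability(value, low, high, tolerance):
    """1.0 inside [low, high], falling linearly to 0.0 at `tolerance` outside the range."""
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / tolerance)


def profile_score(ligand, profile):
    """Weighted mean of the per-property desirabilities (hypothetical scoring scheme)."""
    total_weight = sum(p["weight"] for p in profile.values())
    weighted = sum(
        p["weight"] * desirability(ligand[name], p["low"], p["high"], p["tolerance"])
        for name, p in profile.items()
    )
    return weighted / total_weight


# Hypothetical ideal profile: ranges, tolerances and weights are illustrative only.
profile = {
    "MW": {"low": 300.0, "high": 450.0, "tolerance": 100.0, "weight": 1.0},
    "SlogP": {"low": 1.0, "high": 3.0, "tolerance": 2.0, "weight": 2.0},
}

# Made-up ligands used only to exercise the functions.
ligands = {
    "A": {"MW": 400.0, "SlogP": 2.0},
    "B": {"MW": 500.0, "SlogP": 4.0},
    "C": {"MW": 600.0, "SlogP": 6.0},
}

# Sort ligand names from best to worst fit to the profile.
ranked = sorted(ligands, key=lambda name: profile_score(ligands[name], profile), reverse=True)
```

Sorting on the combined score, as in the last line, mirrors the filter and sort use of the radial plot score described above.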

Figure 5. Radial plots and radial plot scores for ligands showing a good (left), middling (middle) and poor (right) match to the ideal physico-chemical profile.

Filters

Flare V2 enables the definition of filters (Figure 6) to show only the ligands that conform to a desired set of rules. This includes filtering on numerical values, text data, Boolean values, tags, and ligand structure, using either a SMARTS string or a substructure sketched in the Flare Molecule Editor.

Figure 6. The Filters window in Flare V2.

Storyboard

Use the Storyboard window to capture and replay scenes recording all details from the 3D window. Each scene can be easily annotated and recalled when needed.

Figure 7. The Storyboard in Flare V2 showing four scenes and their titles together with notes about each scene.

Flare Python® API

The new Python API lets you create your own workflows, automate your common tasks, expand Flare with Python modules and add custom controls. It gives full access to all of Flare’s capabilities, including the RDKit cheminformatics toolkit.

Flare can be upgraded with Python modules for graphing statistics, Jupyter® notebook integration and much more.

Flare V2 makes advanced structure-based design techniques, such as Electrostatic Complementarity, multiparametric scoring and Python scripting, accessible through an intuitive GUI.

See the benefits Flare V2 can bring to your project

With over 200 new or improved features, Flare V2 is built to make structure-based design easy and accessible while incorporating cutting edge scientific methods. I encourage you to upgrade your version of Flare at your earliest convenience.

If you are not currently a Flare customer, please request a free evaluation.

Contact us if you have queries relating to this release.

Reference

  1. Wimley WC & White SH (1996). Nature Struct. Biol. 3:842-848.

 

A sneak peek into Flare V2: Python API and new science

Less than a year ago we released Flare for structure-based design. At the time, we promised you that a Python API would be central to the future of this novel application. In keeping with that promise, Flare V2 will include a new Flare Python API, new science and significant improvements to functionality and usability.

Let’s have a quick look at the Flare Python API and what it can do for you.

What is the Flare Python API?

The Flare Python API lets you access Flare functionality from Python and customize the Flare interface. Python scripts can be run from the Flare graphical user interface (GUI) or from the command line using pyFlare.

To access the power of Python from the Flare GUI we created a dedicated new tabbed menu named ‘Python’ (Figure 1).


Figure 1: The Python tabbed menu in Flare.

From the Python menu you can:

  • Run simple, one line scripts from the Python console
  • Write, load, run and save one or more scripts using the Python Interpreter
  • Manage the Python scripts using the Manager and Log buttons
  • Access the built-in Python documentation.

As you create your own scripts, you can choose to add them to Flare as new ‘buttons’ either in a new dedicated tab or in an existing tab.

How can Python scripts written with the Flare API make my project life easier?

Automate routine workflow

With the Flare Python API, you can automate routine workflows which you normally carry out for every project. For example, the ‘Prep_n_align’ script (Figure 2) starts from a Flare project where you have imported your protein structures, sequence aligns and superimposes all proteins to the first, then prepares the proteins and extracts the ligands, and finally adds a surface to the active site and focuses the view on the ligands. It also creates a button in a new ‘My Tab’ ribbon so that it takes just a click to launch the script whenever you need it.


Figure 2: A Python script to automate protein preparation, alignment and display.

Add functionality to your Flare project

Importing standard Python libraries will significantly enhance the capabilities of Flare. For example, using the matplotlib library, you can easily write a script which will create an x/y plot from data columns in your Flare project.
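A minimal sketch of such a plotting script is shown below. It covers only the matplotlib side: the two columns are passed in as plain Python lists, the values are made-up placeholders, and `plot_xy` is a hypothetical helper name. How the columns are read from a Flare project is not shown here, as it depends on the Flare API.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt


def plot_xy(x, y, xlabel, ylabel, path):
    """Save an x/y scatter plot of two data columns to an image file."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.scatter(x, y)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)  # release the figure once it has been written
    return path


# Made-up placeholder columns, standing in for two data columns of a project.
molecular_weight = [300.0, 310.0, 305.0, 330.0, 325.0]
activity = [5.0, 5.5, 6.1, 6.4, 7.0]
```

Calling `plot_xy(molecular_weight, activity, "MW", "pIC50", "mw_vs_activity.png")` writes the plot to a PNG file in the current directory.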

Make your workflows talk to Flare

Alongside the GUI access to the Python API is a new command line binary, ‘pyFlare’. This will give you access to all Flare methods directly from the command line, enabling you to automate and script your workflows.

Sounds great, but what about the science?

It’s not like Cresset to forget about the science and Flare V2 is no exception. Stay tuned for the release announcement next month to find out more about ensemble docking, electrostatic complementarity, new protein and ligand surfaces, and lots of improvements to functionality and usability.

Hands-on training

To learn more about Flare and the other Cresset applications, register for one of the hands-on workshops which will be held as part of the Cresset User Group Meeting on June 21 – 22, 2018.