I am delighted to announce that a new release of Cresset workflow components for the Pipeline Pilot™ environment is now available. The release includes new components for accessing Flare™ functions through the Flare Python API; new and enhanced components for Forge™, including the new Machine Learning methods for building Quantitative Structure-Activity Relationship (QSAR) regression and classification models; and significant enhancements to the Spark™ components. New example protocols are also available to illustrate the new functionality.
The new ‘pyflare’ Pipeline Pilot component provides full access to all the Flare functionality by means of Python snippets and scripts using the Flare Python API. The new ‘Flare Ligand and Protein Viewer’ component enables the visualization of ligands and proteins in the Flare GUI.
These new components are used in several steps of the example Spark ligand joining protocol shown in Figure 1. To test this protocol, I used an example previously presented in this poster, based on the crystal structures of inhibitors of RPA protein-protein interactions.1
Figure 1. The Spark ligand joining protocol uses the new Flare Pipeline Pilot components in different stages of the pre- and post-processing of the Spark search.
The crystal structures of the protein-bound fragments 1XS and 1DZ (PDB: 4LUV, Figure 2 – left) are used as the starting point for the Spark ligand joining experiment. These two fragments bind to adjacent sites in the basic cleft of RPA70N, the N-terminal domain of the 70 kDa subunit of replication protein A (RPA), a heterotrimeric complex essential for eukaryotic DNA replication, damage response and repair. The crystal structure of a linked compound1 is also available with PDB code: 4LUZ (Figure 2 – right), and I used it to perform a visual comparison with the Spark results.
Figure 2. Left: Crystal structure of the PDB: 4LUV protein-ligand complex, showing 1XS and 1DZ bound to adjacent sites in the basic cleft of RPA70N. The groups to be replaced in the Spark joining experiment are highlighted. Right: Crystal structure of the PDB: 4LUZ protein in complex with a linked compound.
In the ligand joining protocol, the ‘pyflare’ component is used to prepare the 4LUV and 4LUZ proteins (steps 2 and 8), extract the residue chains of the 4LUV protein by removing the ligands and the water molecules (step 3; the chains are used as an excluded volume in the Spark search), extract the combined ligands of 4LUV to be used as the starter molecules (step 4), calculate Electrostatic Complementarity™ (EC) scores for the Spark results towards 4LUV (step 6), and align and superimpose 4LUV and 4LUZ (step 9). Finally, the results are visualized with the ‘Flare Ligand and Protein Viewer’ (step 10).
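To make steps 3 and 4 more concrete, the following is a minimal sketch of the underlying idea: separating protein chains, named ligands and water molecules in a PDB file. It deliberately does not use the Flare Python API; it reads the fixed-column PDB text format directly, and the function name `split_pdb_records` and the `WATER_NAMES` set are illustrative assumptions rather than anything shipped with the components.

```python
# Illustrative only: plain-text PDB handling, NOT the Flare Python API.
# Residue names commonly used for water (an assumption for this sketch).
WATER_NAMES = {"HOH", "WAT", "H2O"}


def split_pdb_records(pdb_text, ligand_names):
    """Split PDB text into protein atom lines, ligand atom lines and a water count.

    pdb_text     -- contents of a PDB file as a single string
    ligand_names -- residue names of the ligands to extract, e.g. {"1XS", "1DZ"}
    """
    protein_lines = []
    ligand_lines = []
    water_atoms = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()          # columns 1-6: record name
        if record not in ("ATOM", "HETATM"):
            continue                       # skip headers, remarks, etc.
        res_name = line[17:20].strip()     # columns 18-20: residue name
        if res_name in WATER_NAMES:
            water_atoms += 1               # waters are dropped, only counted
        elif record == "HETATM" and res_name in ligand_names:
            ligand_lines.append(line)      # starter molecules (step 4)
        elif record == "ATOM":
            protein_lines.append(line)     # residue chains (step 3)
        # any other HETATM (ions, buffer molecules) is discarded
    return protein_lines, ligand_lines, water_atoms
```

Calling `split_pdb_records(text, {"1XS", "1DZ"})` on the contents of a 4LUV file would return the chain atoms for the excluded volume and the two fragments for the starter, under the assumption that the ligands are stored as HETATM records with those residue names.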
The Spark search was run on the ChEMBL_common and Commercial Very_Common databases, replacing the groups in the starter molecules shown in Figure 2. The ‘Spark Database Search’ parameters were set up to mimic the ligand joining calculation method in the Spark wizard. In particular, the search speed was set to exhaustive, the maximum fragment molecular weight and the maximum number of rotatable bonds for the fragment were set to 500 and 5 respectively, and the gradient cut-off for minimization was set to 0.1. These conditions relax the geometry and size constraints and carry out a more exhaustive search, which may find linker fragments that match the geometric constraints less well but still match the starter ligands in terms of fields. The search is followed by a thorough minimization of the result molecules to reduce the strain in the linker bonds.
Some favorite results are shown in Figure 3. Linkers similar to that of 1XT (the linked compound from 4LUZ) can be found among the top scoring Spark results, which also show high EC scores towards the 4LUV protein. However, the linkers in the Spark results are typically one atom longer than the linker in the 4LUZ ligand, as Spark cannot predict the rotation of the phenyl ring which occurs in 1XT upon joining the two fragments 1XS and 1DZ.
Figure 3. Favorite results from the Spark ligand joining experiment. Color coding: green, 1XT; violet, Spark results.
New features for the ‘Spark Database Search’ component are the capability to use reference molecules to guide the search, to define pharmacophore constraints for the starter molecule, and to use additional similarity metrics to score the results (Tversky and Tanimoto). I used this enhanced functionality in the ligand growing protocol shown in Figure 4, based on a P38-alpha case study.
Figure 4. The Spark ligand growing protocol uses some of the new features of the ‘Spark Database Search’ component, enabling the use of reference molecules to guide the search and the definition of pharmacophore constraints for the starter molecule.
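For readers less familiar with the two additional similarity metrics named above, the sketch below shows their general set-based definitions. This is an illustration of the formulas only: Spark scores molecules on fields and shape, not on sets of features, and the function names and the default `alpha` and `beta` values here are assumptions made for the example.

```python
def tanimoto(a, b):
    """Tanimoto similarity: shared features divided by all features present."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tversky(a, b, alpha=0.5, beta=0.5):
    """Tversky similarity: an asymmetric generalization of Tanimoto.

    alpha weights the features found only in `a`, beta those found only in `b`.
    With alpha = beta = 1 the result equals the Tanimoto similarity.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0
```

The asymmetry is the practical difference: with `alpha=1, beta=0` a result is not penalized for features that only the second molecule has, which is useful when asking whether one molecule is contained in another rather than whether the two are globally alike.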
As shown in Figure 5 – left, the ligand in PDB: 3K3I specifically binds to the inactive form of P38-alpha known as the ‘DFG-out’ kinase protein conformation. In this form, the activation loop is distorted and the catalytic residues are displaced from their usual position and are thus incapable of binding ATP. In contrast, inhibitors like the ligand in PDB: 3ROC (Figure 5 – right) are more selective for P38, despite being ATP competitive, and interact with an active conformation of P38.
Figure 5. Left: The ligand in PDB: 3K3I binds to the inactive form of P38-alpha (‘DFG-out’ kinase protein conformation), and was used as the molecule to grow in the Spark experiment. Right: The ligand in PDB: 3ROC interacts with an active conformation of P38, and was used as the molecule to guide growth. Relevant pharmacophoric interactions constrained in the Spark experiment are highlighted.
In the Spark ligand growing protocol, the 3K3I ligand is used as the molecule to grow (starter), while the 3ROC ligand is used as the molecule to guide the fragment growth (reference), with the objective of identifying novel, potent and selective inhibitors which are non-ATP competitive.
After preparing the PDB: 3K3I molecule (step 2), the starter molecule is extracted from the protein-ligand complex and assigned a 20% weight (steps 3 and 4). A weight of 80% is instead assigned to the reference molecule (step 9). The residue chains in 3K3I are extracted to be used as an excluded volume by Spark (step 5). Starter, reference and excluded volume are sent to ‘Spark Database Search’ (step 6), and the results are visualized with the ‘Flare Ligand and Protein Viewer’ (step 10).
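The effect of the 20% and 80% weights can be illustrated with a small sketch. It assumes that the per-molecule similarities are combined as a weighted average; the text above only states the weights, so the combination rule, the function name `weighted_score` and the example similarity values are all assumptions for illustration.

```python
def weighted_score(similarities, weights):
    """Combine per-molecule similarities into one score by weighted average.

    similarities -- mapping of molecule label to similarity in [0, 1]
    weights      -- mapping of the same labels to non-negative weights
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(similarities[name] * w for name, w in weights.items()) / total


# Weights from the protocol: 20% starter (3K3I ligand), 80% reference (3ROC ligand).
weights = {"starter": 0.2, "reference": 0.8}

# Hypothetical similarities for two candidate results.
resembles_starter = weighted_score({"starter": 0.9, "reference": 0.5}, weights)
resembles_reference = weighted_score({"starter": 0.5, "reference": 0.9}, weights)
```

Under this assumption the second candidate ranks higher than the first, which is the intended behavior: the heavier reference weight pulls the search towards results that fill the region occupied by the 3ROC ligand.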
The ‘Spark Database Search’ parameters were set up to mimic the ligand growing calculation method in the Spark wizard. In particular, the search speed was set to exhaustive, and the maximum fragment molecular weight and the maximum number of rotatable bonds for the fragment were set to 500 and 20 respectively. These conditions relax the default size constraints to allow the starter ligand to grow towards the regions occupied by the reference molecule. In addition, pharmacophore constraints were added to both the starter and reference molecules as shown in Figure 5, to ensure that relevant ligand-protein interactions are maintained in the Spark results.
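As a compact summary, the settings quoted for the joining and growing experiments can be written side by side. The class and field names below are hypothetical and do not correspond to the actual parameter names of the ‘Spark Database Search’ component; only the values come from the text, and the gradient cut-off is left unset for growing because none is stated.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SparkSearchSettings:
    """Hypothetical container for the search settings quoted in the text."""
    search_speed: str
    max_fragment_mw: float
    max_rotatable_bonds: int
    gradient_cutoff: Optional[float] = None  # None: not stated in the text


# Values as quoted for the two example protocols.
JOINING = SparkSearchSettings("exhaustive", 500, 5, gradient_cutoff=0.1)
GROWING = SparkSearchSettings("exhaustive", 500, 20)


def differing_fields(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    a_values, b_values = asdict(a), asdict(b)
    return {
        name: (a_values[name], b_values[name])
        for name in a_values
        if a_values[name] != b_values[name]
    }
```

`differing_fields(JOINING, GROWING)` makes the contrast explicit: the two setups share the exhaustive speed and the molecular weight limit, and differ in the rotatable bond limit (5 for a linker, 20 for a grown fragment) and in whether a minimization gradient cut-off is quoted.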
Some favorite results are shown in Figure 6.
Figure 6. Favorite results from the Spark ligand growing experiment. Color coding: orange, 3ROC ligand; green, 3K3I ligand; violet/capped sticks, Spark results.
The 2.5.0 release adds the new Machine Learning algorithms for building QSAR regression and classification models to the Forge Build component, as shown in the example protocol in Figure 7.
Figure 7. The Forge Build component can now generate Machine Learning regression and classification models.
The main enhancements to the Forge Align component are the availability of additional similarity metrics (Tanimoto and Tversky), the capability to set pharmacophore constraints on the reference molecules, and the option to bias the substructure alignment with a SMARTS pattern.
This release also includes the new ‘Forge Surface Writer’ component, which generates ligand field surfaces that can be exported in CCP4, Cube, Insight or MOE format.