In silico methods to streamline optimization

It’s a long journey from hit to lead, and the path is called optimization.

Your discovery program started with some promising hits from a high-throughput screen. The biology has been done and the chemistry has shown that there is an effect in the biochemical assays. You have a series of hit molecules, a set of required parameters, thousands of data points, a team of chemists, a budget and a deadline. Now the optimization work begins.

In silico methods streamline the optimization process by giving you more understanding of your target and your hits, and by making it easier to manage your data. There are methods to:

  • Identify any gaps in your data so that you can decide what further work needs to be done. The more you know, the greater choice of intelligent steps you have.
  • Make sure you have the best possible understanding of your molecule-target interaction, giving you as many optimization options as possible.
  • Balance diverse properties. Visualization and multi-parameter optimization tools can transform your ability to understand the impact of changes across different compounds.
  • Stay in the active window as you make minor changes. Building an activity pharmacophore helps you to understand how far you can take the changes.
  • Escape any liability from toxicity or pre-existing patents. Fragment replacement methods can be invaluable for moving to new areas of chemical space and adding new ideas and directions to your research.

Are there gaps in your data?

The optimization decisions you can make depend on the data you already have.

Activity Atlas, a component of Forge, summarizes the SAR for a series into a 3D model that can help you find any gaps in your data. You can calculate:

  • Activity Cliff Summary: What do the activity cliffs tell us about the SAR?
  • Average of actives: What do active molecules have in common?
  • Regions explored: Where have I been? For a new molecule, would making it increase our understanding? This analysis also calculates a novelty score for each molecule.

This approach is also helpful in looking at toxicity and other liabilities. For example, you may be optimizing a molecule that was identified from a screen. It is active, but has some undesirable chemistry. If you can understand as much as possible about the SAR, electrostatics and shape you are more likely to discover a better way to escape a liability. The more you know, the greater choice of intelligent steps you have.


Figure 1: Activity Atlas condenses your structure-activity data into highly visual 3D maps that inform the design and optimization of new compounds.

Understand the ligand-protein interaction

Understanding how your hits interact with the target helps you to optimize the affinity of your compound. In structure-enabled projects Flare, our new structure-based design application, can be used to analyze the protein-ligand system, calculate the energetics of ligand binding and analyze the water stability and energetics.

Without knowledge of the target structure, you may need to go back and deduce ligand-protein interactions from your hit so that you have a clearer understanding of the binding mechanism. This knowledge makes it far easier to optimize the affinity of your compound.

You get an extra level of insight into this process with Cresset techniques. Our electrostatic, hydrophobic and shape-based analyses make it clear which chemical changes can have the largest biological impact.


Figure 2: Flare GUI.

Visualization tools help with multi-parameter optimization

Compounds that come out of the screening process generally have weaker potency than is required, so one of the first tasks is to increase the activity. This is usually done by making small changes to the molecule and testing their effect. However, there is a range of properties besides activity that need to be optimized.

Firstly, drug molecules need to be stable and small. Larger molecules are more likely to have off-target effects and have more problems travelling through membranes. They are also likely to have more complicated chemistry, making them harder and more expensive to synthesize. One of the key steps in optimization is to retain as many components as are required to make the compound active, but no more. The difficulty is that when you change one property you tend to change others.

Computational visualization for multi-parameter optimization shows you how the changes you are making affect other molecular properties. For example, you may want to simultaneously optimize the polar surface area and the LogP. The Torch and Forge radial plots can be set up to define acceptable project ranges for project data and in silico–calculable properties.

 

Contact Cresset Discovery Services for a confidential discussion about how we can streamline your optimization.

Molecular design towards Protein-Protein Interaction inhibitors

In December 2016 I attended the SCI Protein-Protein Interaction symposium. Armed with Cresset’s powerful ligand-centric molecular modeling suite, Forge, and an embryonic version of our new structure-based design application, Flare, I was keen to see what could usefully be done with PPIs.

Prof. Richard Baylis (University of Leeds, UK) presented new data on the interaction of N-MYC with Aurora A. N-MYC is a disordered multi-domain protein with a host of interaction partners. Dysregulation of N-MYC has been linked to a range of cancers. N-MYC is short-lived in vivo and its usual fate is to be ubiquitinylated and degraded. Binding to Aurora A protects N-MYC from this process, allowing its various tumorigenic effects to persist. The Baylis group provided the first X-ray evidence showing how N-MYC interacts at an allosteric site of Aurora A, stabilising an active conformation of the kinase (figure 1).


Figure 1: Aurora A kinase with N-MYC – light green (left), and detail of the N-MYC short helical domain 74-89 (right).

Baylis suggested that DFG-out inhibitors of Aurora A distort the kinase in a way that would prevent MYC binding, whereas ATP-competitive inhibitors would not. Evidence of potential beneficial effects for the former type of kinase inhibitor, but not the latter, may be explained by this difference, and led to the suggestion that this may be an effective therapeutic strategy for some types of cancer, such as neuroblastoma.

An alternative computational strategy, which occurred to Cresset at the time, was to employ a structure-based approach to furnish molecular designs that could directly prevent this protein-protein interaction. For this purpose, an initial analysis of the surface interaction, including both electrostatic and lipophilic hot-spots, would be vital.

During the talk, I used Flare to quickly download the relevant PDB file (5G1X) and to load the protein coordinates directly into the application. An automated protein prep protocol (build-model) was used to refine the PDB structure before generating the surface interaction maps using Cresset’s proprietary XED force field (figure 2).


Figure 2: Positive protein electrostatic isopotential surface of Aurora A (left), negative protein electrostatic isopotential surface (center), and neutral isopotential surface with some key residues of N-MYC (right).

These isopotential maps show discrete positive (red), negative (blue) and neutral (yellow) surface regions that represent key interaction sites between N-MYC and Aurora A, which allowed the assignment of the N-MYC residues on which to focus. The N-MYC protein was similarly used to generate and visualise the complementary fields, as the other half of the PPI (figure 3).


Figure 3: Negative protein electrostatic isopotential surface of N-MYC short helical region (left), and positive protein electrostatic isopotential surface of the same (right).

In keeping with other known PPIs such as the MDM2 system, residues Met81 and Trp77 in the short helical domain (N-MYC 74-89) were identified as key lipophilic contacts. Much of the rest of the helix serves largely for structural integrity and for stabilising solvent, except for the NH of Trp77 and Glu84, which provide additional polar contacts, the latter capping an adjacent helix from Aurora A. Further along the N-MYC peptide, towards the N-terminus, the Pro74-Pro75 motif (figure 1) marks a change in secondary structure leading to another lipophilic contact, Val61, and another polar contact, Ser64 (not shown).

We can exploit this information to generate chemical starting points once each important set of residues is identified and mapped. From the 3D shape and detailed electrostatic information we can conduct de novo design experiments to furnish ideas for synthesis, or use virtual screening (Blaze) to search for commercial compounds to purchase and test.

The distance between the two main hot spot regions was not ideal (27 Å, Val61 to Trp77). Linking them might have been possible using a fragment linking or growing technique, e.g., using Spark (Using Cresset’s Spark to grow and link distant fragment hits with sensible chemistry), but we chose to tackle them independently with a de novo design technique. I used the key residues Pro75, Trp77, Glu80, Met81 and Glu84 from the short helical domain as a molecular reference. We used this reference to score our molecular ideas against, and to optimize them via iterative ‘molecular design > alignment > scoring’ cycles in Forge. This powerful technique scores 3D shape, electrostatics and protein steric clashes whilst simultaneously calculating and/or filtering on in silico physicochemical properties. The method as described is limited only by the imagination of the user. In conjunction with Spark as the idea generator, however, the limit is set only by the availability of appropriate fragments in the Spark databases, which are a substantial resource.

Later, when we returned, we also ran a virtual screening test on this system using the Blaze demo server. Results of this quick virtual screen against a sub-set of the ChEMBL database are shown below (figure 4).


Figure 4: Forge ‘tile view‘ of example diverse 2D output results of the virtual screen using the Blaze Demo server against a sub-set of ChEMBL (left) and 3D alignments of two of these (pink and green sticks) against the reference N-MYC peptide (blue lines) bound to Aurora A (Forge screenshot).

Some of the Blaze examples retrieved were interesting, and very good considering that this was a very small database of fewer than 200k compounds, but good shape scores and field scores were not generally observed simultaneously. The ‘new’ addition of ‘pharmacophoric atom features in Blaze’ ensured we retrieved some of the key contacts, such as the indole H-bond. However, we felt that design was probably the best way to achieve the precise set of contacts we were looking to mimic. Afterwards, I expanded on the ‘initial’ de novo design ideas and provided around 20 further designs which had more reasonable properties and synthetic tractability (figure 5).

A powerful combination of cutting edge ligand and structure-based modeling

Figure 5: Flare screenshot of the structure of an initial idea (left) superimposed on N-MYC hot spot residues, plus its calculated properties, and (right) a space filling model of a further example with superior properties, improved fit, better synthetic tractability and … an IP position.

Although this is only a thought experiment (until the point at which any of these molecular designs are synthesized and tested), it illustrates how the powerful combination of ligand-centric and structure-based techniques in Flare, Forge, and perhaps also Spark, could be used to generate specific ideas that address the types of challenges presented by PPIs or fragment-enabled drug discovery projects. This is not untypical of the portfolio of tasks we might suggest to Cresset Discovery Services clients.

Download an evaluation of Flare, Forge, Spark and Blaze, or contact us to find out how Cresset Discovery Services can enhance your project with insightful and creative delivery of powerful molecular modeling.

Help with writing your grant application

Don’t wait until you have funding to talk to Cresset Discovery Services about working together. We can help before you even start writing your grant application.

Our experience in helping write grant applications for academic and government funding shows that working together at the very start of a project reaps rewards. Being involved from the beginning gives all parties the flexibility to see exactly how and when we can make the maximum contribution to your research.

The University of Newcastle and Sygnature Discovery engaged our services from the earliest stages of an MRC funding application: this three-way collaboration involved writing the successful funding proposal together, ensuring the project used the strengths of each collaborator in the most efficient way.

When it comes to writing your application, we can give you as much help as you want. At a very minimum you will receive detailed scientific input, with descriptions of our methods, deliverables and estimates of time and cost.

Cost and benefit analysis of our methods compared to alternative approaches can also be provided. For example, virtual screening can be an incredibly cost-effective way of finding a chemical starting point. We recently carried out ligand-protein docking for a customer as a virtual screen of available library compounds, leading to the selection and purchase of a small sub-set of the available compounds. Cresset Discovery Services helped the customer make an estimated five-fold saving over traditional HTS approaches without the pre-selection of likely hits.

Milestones, deliverables and reporting are part and parcel of grant funding. We will work with you on these, and of course we are happy to receive staged payments.

If you prefer, we can also contribute to the writing of the whole proposal. We can even work as a project manager to other collaborators. For example, we can manage your procurement process and work with CROs to outsource assays.

It can often help to have a third party, such as Cresset Discovery Services, bring expertise to your grant application. It will certainly reduce your workload!

Flexibility is key in how you engage our services. Contact us to find out how we can help with your next grant application.

Water stability is key to designing novel patentable chemistry

An analysis of the water stability and positions in a ligand-protein complex informed the design of novel ligands for a customer target. This work led to new active chemistry that the customer went on to patent.

A Cresset Discovery Services customer had identified a novel target with a natural ligand and were looking for new chemistry that would be active at the target site. Our scientists carried out an initial project to learn more about the protein-ligand system. The Cresset field approach, used to analyse the structure and interactions, gave the customer valuable insights into the active features of the ligand.

The customer used this information to develop analogous synthetic compounds and example molecules. They asked us to work with them again to computationally align the example molecules and prioritize them for synthesis.

We carried out an initial alignment and then modeled the system in detail. It appeared that part of the molecule that was important for the interaction was not making any contact with the protein.

The PDB had some crystal structures of related proteins, but not of the target of interest. We studied the available protein data to learn as much as possible about the binding pocket, paying particular attention to the positions and stability of the water molecules. This led to us putting forward the hypothesis that an important part of the ligand interaction included the stabilization of water.

Based on this hypothesis we prioritized the molecules that bridged the observed gap between the natural ligand and the target while also stabilizing the free waters.

Water analysis was carried out by manually superimposing multiple crystal structures, viewing the crystallographic waters that clustered together, and mapping on their temperature factors. This process allowed us to determine the importance of each water molecule in the solvation sphere around the ligand and protein pocket. With the advent of the new 3D-RISM method in Flare, a similar computational workflow is available that is far more efficient for this type of analysis. It is a more systematic approach that enables us to calculate the position and stability of all water molecules around a proposed ligand in a binding pocket. Moreover, because it does not need any crystallographic water data, it is more widely applicable as well as more convenient. Ultimately, these data can be used to assess or compare ligands in terms of how well they might stabilize essential water.
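
The manual analysis described above can be illustrated with a minimal Python sketch. It is not the workflow used in the project and it is not 3D-RISM. It assumes the crystal structures have already been superimposed, and the water coordinates, B-factors, structure names and the 1.5 Å clustering cutoff are all hypothetical values chosen for illustration.

```python
from math import dist
from statistics import mean

# Hypothetical input: water oxygens taken from structures that have ALREADY
# been superimposed onto a common frame. Each entry is
# (structure_id, (x, y, z) in angstroms, B-factor).
waters = [
    ("xtal_1", (10.0, 5.0, 2.0), 18.0),
    ("xtal_2", (10.3, 5.1, 2.2), 22.0),
    ("xtal_3", (9.8, 4.9, 1.9), 20.0),
    ("xtal_1", (15.0, 8.0, 3.0), 45.0),
    ("xtal_3", (15.4, 8.2, 3.1), 55.0),
    ("xtal_2", (20.0, 1.0, 7.0), 60.0),
]


def cluster_waters(waters, cutoff=1.5):
    """Greedy clustering: a water joins the first cluster whose centroid
    lies within `cutoff` angstroms, otherwise it starts a new cluster."""
    clusters = []
    for structure_id, xyz, b_factor in waters:
        for cluster in clusters:
            if dist(xyz, cluster["centroid"]) <= cutoff:
                cluster["members"].append((structure_id, xyz, b_factor))
                coords = [member[1] for member in cluster["members"]]
                # Recompute the centroid as the mean of each coordinate axis.
                cluster["centroid"] = tuple(mean(axis) for axis in zip(*coords))
                break
        else:
            clusters.append(
                {"centroid": xyz, "members": [(structure_id, xyz, b_factor)]}
            )
    return clusters


def summarize(clusters, n_structures):
    """Report how conserved each water site is and its mean B-factor."""
    rows = []
    for cluster in clusters:
        structures = {member[0] for member in cluster["members"]}
        rows.append(
            {
                "centroid": cluster["centroid"],
                "conservation": len(structures) / n_structures,
                "mean_b_factor": mean(m[2] for m in cluster["members"]),
            }
        )
    # Most conserved sites first; ties broken by lower (more ordered) B-factor.
    rows.sort(key=lambda row: (-row["conservation"], row["mean_b_factor"]))
    return rows


summary = summarize(cluster_waters(waters), n_structures=3)
for row in summary:
    print(row)
```

In this toy data set the first site appears in all three structures with low temperature factors, which is the kind of water one would treat as important when comparing ligands.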

Based on our equivalent ‘hands-on’ analysis, we worked with the customer to choose the best candidates for synthesis. These newly-designed ligands resulted in new active chemistry for the customer that was valuable enough for them to patent.


The position and energetics of water molecules in and around the active site is of crucial importance when designing novel ligands. Knowing which water molecules are energetically favorable can give valuable insights into the best positions for ligand molecules. 3D-RISM analysis is one of the methods available in Flare for structure-based drug design.

Move from hit to lead

A long-standing customer had a hit series with good activity but poor properties. Cresset Discovery Services worked closely with the customer to formulate a plan of action to optimize the compound properties while maintaining potency.

Cresset Discovery Services has worked closely with a customer on a target. We ran virtual screens, aligned literature chemotypes and proprietary chemotypes in order to arrive at a robust binding hypothesis and a field hypothesis. This enabled the customer to find a new chemotype that had good activity at the target. However, the hit series had poor properties. They needed to optimize the compounds, potentially sacrificing some potency while balancing this against improving the properties of the molecules.

The target in question required lipophilic molecules, so the set of compounds had reasonably high lipophilicity which can be a liability in drug development. The compounds also had high protein binding and, we suspected, high clearance.

 
Figure 1: Forge radial plot of the selected ‘highlighted’ set (Figure 3). Compounds with better properties give a larger area in the radial plot.

 
Figure 2: Forge graphical radial plot parameters. For each property a function is used to describe perfect, acceptable and unacceptable values. Perfect values are plotted at the edge of the radial plot and unacceptable values at the center. A single ‘Radial Plot Score’ is created to represent the fit of a compound to the chosen set of parameters, using the specified weighting scheme.

 
Figure 3: Forge property plots. Radial plot scores as an MPO method: compounds with a radial plot score greater than the chosen cut-off (bottom left) were selected and hence are highlighted in all the plots. These compounds have a good balance of activity and other physicochemical properties.
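
The idea behind the radial plot score in Figures 2 and 3 can be sketched in a few lines of Python. This is an illustration only and not the function Forge uses: it assumes a piecewise-linear mapping from each property to a value between 0 (unacceptable, plot center) and 1 (perfect, plot edge), combined as a weighted mean. The property names, ranges, weights, compounds and cut-off are all hypothetical.

```python
def desirability(value, perfect, acceptable):
    """Return 1.0 inside the `perfect` range, falling linearly to 0.0 at the
    limits of the wider `acceptable` range, and 0.0 beyond those limits."""
    p_lo, p_hi = perfect
    a_lo, a_hi = acceptable
    if not (a_lo < p_lo <= p_hi < a_hi):
        raise ValueError("acceptable range must strictly contain perfect range")
    if p_lo <= value <= p_hi:
        return 1.0
    if value < p_lo:
        return max(0.0, (value - a_lo) / (p_lo - a_lo))
    return max(0.0, (a_hi - value) / (a_hi - p_hi))


def radial_plot_score(compound, profile):
    """Weighted mean of the per-property desirabilities (assumed scheme)."""
    total_weight = sum(spec["weight"] for spec in profile.values())
    weighted = sum(
        spec["weight"]
        * desirability(compound[name], spec["perfect"], spec["acceptable"])
        for name, spec in profile.items()
    )
    return weighted / total_weight


# Hypothetical project profile: activity is weighted twice as heavily.
profile = {
    "pIC50": {"perfect": (8.0, 12.0), "acceptable": (6.0, 14.0), "weight": 2.0},
    "logP": {"perfect": (1.0, 3.0), "acceptable": (0.0, 5.0), "weight": 1.0},
    "TPSA": {"perfect": (60.0, 90.0), "acceptable": (40.0, 140.0), "weight": 1.0},
}

# Hypothetical compounds.
compounds = {
    "cpd_A": {"pIC50": 8.5, "logP": 2.0, "TPSA": 75.0},
    "cpd_B": {"pIC50": 7.0, "logP": 4.0, "TPSA": 75.0},
    "cpd_C": {"pIC50": 9.0, "logP": 5.5, "TPSA": 150.0},
}

scores = {name: radial_plot_score(c, profile) for name, c in compounds.items()}

# Select compounds above a chosen cut-off, as in Figure 3.
cutoff = 0.6
selected = [name for name, score in scores.items() if score > cutoff]
print(scores)
print(selected)
```

Here `cpd_C` is the most potent compound but is rejected because its lipophilicity and polar surface area both fall outside the acceptable ranges, which is the trade-off a multi-parameter score is meant to expose.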

The best course of action for optimization was to search for compounds that were reasonably active but far more polar than the bulk of the molecules in the compound set. We designed suitable variants of these candidates and checked them using the alignment model in field space that we had developed earlier in the project. New combinations of functional groups on the core were selected to address the overall lipophilicity whilst maintaining the essential interaction features. This solution had the potential to address both the protein binding and clearance issues.

Over the course of the following weeks the customer worked through these ideas with a high degree of success. We collaborated closely with the customer throughout the project. They shared not only their activity data, but also their property data. We assessed both the activity and property landscapes and refined the suggested sets with the aid of multi-parameter optimization. This enabled us to suggest which compounds they should make and progress to get past the hurdles they were experiencing. The customer has now arrived at a set of compounds with far better properties without a significant loss of potency.

The customer has retained our services to help them to optimize the potencies of the back-up series, which may help them to choose which chemotypes to progress next.

Contact us to see how we can help you with similar projects.

Conduct ligand-protein docking

A long-standing customer of Cresset Discovery Services asked us to identify new compounds that could be active at their protein target. We conducted ligand-protein docking to narrow down their 50k compound library to the best 1.5k compounds. The cost of the consulting project plus the chemistry for 1.5k compounds was about 20% of what it would have cost to buy and screen the entire 50k library.

Ligand-protein docking can be an excellent way to build up knowledge about the binding pocket. It can also form the basis for a virtual screen to identify new active compounds.

Cresset Discovery Services had been working with this customer on a particular ligand for some time, but there was very little information available about the protein target. There were homologues in the literature, but they were distantly related and nothing very similar had been crystallized.

Detailed preparatory work to model the protein active site

It was necessary to do a lot of modeling work to build up the relationship between the human target and the distantly related proteins available from the literature. We built sequence alignments and compared them, enabling us to build up 3D models of the target and its interaction with the ligand.

Some mutagenesis data was available on the known ligands, so we were able to use this to refine the 3D models and check that the correct residues were in the right places on the active site. This enabled us to define the active site for the ligands. We went on to calculate the energies for the protein-ligand interactions to make sure we had identified poses that made sense.

This was a complex system that required a great deal of protein preparation. This preparatory work was essential for successful docking and required expert knowledge, experience and skill.

Docking and virtual screening using different scenarios

At the end of this process we had a good model of the protein-ligand system. The next step was to remove the ligand and carry out docking.

Docking was first tested on the molecules that were known to bind to the target. This resulted in excellent retrieval rates, showing that the model would also be able to retrieve new compounds.
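
A retrieval test of this kind is often summarized as the fraction of known binders recovered near the top of the ranked list. The sketch below is a generic illustration rather than the protocol used in the project. It assumes the known binders are docked alongside a background set, that a lower docking score is better, and that all names and scores are hypothetical.

```python
def top_of_ranking(scores, fraction):
    """Return the best-scoring `fraction` of compounds (lower score = better)."""
    ranked = sorted(scores, key=scores.get)
    n_top = max(1, round(len(ranked) * fraction))
    return ranked[:n_top]


def recovery(scores, known_binders, fraction):
    """Fraction of the known binders found in the top `fraction` of the list."""
    top = set(top_of_ranking(scores, fraction))
    return len(top & known_binders) / len(known_binders)


def enrichment_factor(scores, known_binders, fraction):
    """Hit rate in the top `fraction` divided by the hit rate in the whole set."""
    top = top_of_ranking(scores, fraction)
    hit_rate_top = len(set(top) & known_binders) / len(top)
    hit_rate_all = len(known_binders) / len(scores)
    return hit_rate_top / hit_rate_all


# Hypothetical docking scores: 16 background compounds and 4 known binders.
scores = {f"background_{i:02d}": -5.0 - 0.1 * i for i in range(16)}
scores.update(
    {"binder_a": -9.1, "binder_b": -8.7, "binder_c": -8.2, "binder_d": -5.55}
)
known_binders = {"binder_a", "binder_b", "binder_c", "binder_d"}

print(recovery(scores, known_binders, fraction=0.2))
print(enrichment_factor(scores, known_binders, fraction=0.2))
```

A model that places most known binders in the top few percent of the ranking gives some confidence that untested compounds ranked alongside them are worth inspecting.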

There were a number of different binding sites on the protein so we decided to carry out the virtual screening using different scenarios for the protein. We:

  • Kept the ligand intact in the binding site
  • Removed the ligand completely
  • Looked at partly bound situations and un-bound situations for each of the binding sites.

The customer provided us with a set of 50k ligands and we docked each of these against the binding pockets. A docking scoring system was used to rank the top 2k compounds from each of the screens.
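
The ranking step can be sketched as follows. This is a minimal illustration, assuming each screening scenario yields one score per compound and that lower scores are better. The scenario names, compound identifiers and scores are hypothetical, and the toy shortlist size of 2 stands in for the 2k used in the project.

```python
import heapq


def top_n_per_scenario(results, n):
    """results maps scenario -> {compound_id: docking score}.
    Returns the n best-scoring compound ids for each scenario."""
    return {
        scenario: heapq.nsmallest(n, scores, key=scores.get)
        for scenario, scores in results.items()
    }


def merge_shortlists(shortlists):
    """Union of the per-scenario shortlists, recording which scenarios
    selected each compound so that it can be reviewed in context."""
    merged = {}
    for scenario, compound_ids in shortlists.items():
        for compound_id in compound_ids:
            merged.setdefault(compound_id, []).append(scenario)
    return merged


# Hypothetical docking results for two protein scenarios.
results = {
    "ligand_present": {"c1": -9.0, "c2": -7.5, "c3": -6.0, "c4": -8.2},
    "ligand_removed": {"c1": -8.9, "c2": -9.4, "c3": -8.8, "c4": -5.0},
}

shortlists = top_n_per_scenario(results, n=2)
merged = merge_shortlists(shortlists)
print(shortlists)
print(merged)
```

Keeping track of which scenario selected each compound makes the later visual inspection easier, because each docking pose can be viewed against the protein state that produced it.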

Analyzing the results and compiling a purchasing list

The top 2k compounds from the four screens were analysed in detail. We visualized every one of the top 2k compounds and looked at each of the docking poses. The docking gave us good geometries for the ligands and we used Cresset software to check that the electrostatics made sense. Any compounds that were unlikely to bind well were rejected.

A final, ranked list was provided to the customer with a very high degree of confidence that it included compounds that were active at the protein target. They were able to procure about 75% of the compounds from the hit list, giving them a final set of 1.5k compounds to test.

An incredible saving in time and money

Carrying out virtual screening to focus the library in this way represented an incredible saving in time and money for our customer. The alternative approach would have been to buy and test the whole 50k compound set. Not only would the customer have needed to purchase all of the compounds, they would also have had to ship, store, plate and screen them, and then still analyse the results.

The estimated cost of doing this for all 50k compounds would have been about five times the cost of the combined tasks of the Cresset Discovery Services project plus buying and testing 1.5k compounds.

Cost of {buying and testing 50k compounds} = 5 × Cost of {Cresset Discovery Services project + buying and testing 1.5k hit list}
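
The figures quoted in this case study fit together as simple arithmetic, shown here in relative terms because no absolute costs are given. The sketch assumes a hit list of roughly 2k compounds, which is how the 75% procurement rate leads to the 1.5k compounds tested.

```python
library_size = 50_000      # compounds in the customer's library
hit_list_size = 2_000      # assumed size of the final ranked hit list
procured_fraction = 0.75   # share of the hit list the customer could buy

# Number of compounds actually bought and tested: 1500.
tested = round(hit_list_size * procured_fraction)

# Share of the original library that had to be bought and tested.
library_share_tested = tested / library_size

# The full screen was estimated at five times the cost of the focused route
# (consulting project plus buying and testing the shortlist).
full_to_focused_cost_ratio = 5.0
focused_cost_share = 1 / full_to_focused_cost_ratio
saving_share = 1 - focused_cost_share

print(f"Compounds tested: {tested}")
print(f"Share of library tested: {library_share_tested:.0%}")
print(f"Focused route as a share of full-screen cost: {focused_cost_share:.0%}")
print(f"Estimated saving: {saving_share:.0%}")
```

The five-fold ratio and the "about 20%" figure quoted at the start of the case study are the same statement expressed two ways.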

Contact us to find out how we can add value to your project.

Dr Martin Slater, Director of Consulting Services

Homology modeling and ligand electrostatics play a key role in elucidating the binding mode and molecular interactions of a new class of antifungal drugs

Last month F2G published a paper in PNAS [1] describing F901318, the leading representative of a novel class of antifungal drug. Dr Martin Slater, Director of Cresset Discovery Services, is a co-author on the paper. He describes how modeling work carried out by Cresset Discovery Services was critical to predicting the binding mode of the inhibitor and important interacting amino acid residues. F901318 is currently in clinical development for the treatment of invasive aspergillosis.

There is an important medical need for new antifungal agents with novel mechanisms of action to treat the increasing number of patients with life-threatening systemic fungal disease and to overcome the growing problem of resistance to current therapies.

F2G are a UK-based antifungal drug discovery and development company who have identified F901318 as a leading representative of the orotomides, a novel class of antifungal drug. Their identification of dihydroorotate dehydrogenase (DHODH) as the mechanism by which F901318 inhibits and kills Aspergillus fumigatus has been a major breakthrough differentiating F901318 from other systemic antifungal agents.

From hit to lead with medicinal chemistry

F2G had a large amount of proprietary cellular activity data developed over time against their antifungal screening platform. After an initial hit-finding campaign, significant progress had been made using classical medicinal chemistry approaches.

F2G were keen to inform and assist the development process by gaining a molecular level understanding of the target protein ligand system. They approached Cresset Discovery Services for help in elucidating the molecular interaction of the target protein-ligand system.

A detailed molecular understanding with modeling

Cresset’s unique approach of defining the electrostatics around the active chemotype made it possible to identify the precise nature of the various sites on the active molecules. In conjunction with sequence analysis across the wider DHODH family, Cresset scientists were able to match these subtle ligand features to the patterns of residues that were likely to be key.

Subsequent homology and ligand protein interaction modeling of Aspergillus fumigatus DHODH using the XED force field identified a predicted binding mode of the inhibitor and important interacting amino acid residues.

We combined a detailed ligand centric approach using Forge with protein modeling using a prototype of the new Cresset protein tool to arrive at a binding hypothesis consistent with the selectivity profile. The modeling process is fully reported in the paper [1].

Testing in silico hypotheses in vitro

Once a binding hypothesis had been made, F2G initiated a number of lab experiments to check the predictions, e.g., using site-directed mutagenesis.

Most satisfyingly, the lab results supported our predictions.

F901318 is currently in late Phase 1 clinical trials, offering hope that the antifungal armamentarium can be expanded to include a class of agent with a mechanism of action distinct from currently marketed antifungals.

Cresset’s consulting work with F2G provided valuable insight into the predicted interaction pattern of the main chemical series with the Aspergillus DHODH target protein. As with many research projects, any level of understanding achieved is often a prelude to even deeper questions, and there are many remaining to be answered for this unique system. Cresset continues to work closely with F2G, providing software and services to support them in their ongoing projects.

References

1 http://www.pnas.org/content/113/45/12809.abstract

Dr Martin Slater

Director, Cresset Discovery Services

Build and cluster diverse 3D libraries

Cresset Discovery Services (CDS) worked with BioBlocks to analyze their fragment library to maximize coverage of 3D chemical space. As part of the project, we developed an innovative clustering method that made it possible to assess the 3D similarity across their virtual database of over 1.5 million fragments.

The goal of the project was to help BioBlocks build the maximum 3D diversity into a fragment library of manageable size from a starting pool of over a million compounds. Existing techniques would have required an infeasible amount of computing power, so CDS developed an entirely novel rapid clustering method especially for the project. The solution was still extremely computationally challenging, but we were able to use our expertise in distributing calculations to the cloud to deliver the results that BioBlocks needed on time and within budget.
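
To show why avoiding all-against-all comparisons matters at this scale, here is a minimal sketch of leader (sphere-exclusion) clustering, a generic technique that compares each item only against the current cluster leaders. It is not the method CDS developed for this project. The Tanimoto coefficient on toy feature sets stands in for a real 3D similarity measure, and the fragment names, features and threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two feature sets (stand-in for 3D similarity)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def leader_cluster(items, similarity, threshold):
    """Assign each item to the first leader it resembles closely enough,
    otherwise promote it to a new leader. Comparisons scale with the number
    of leaders rather than with the number of items squared."""
    leaders = []
    clusters = {}
    for name, descriptor in items.items():
        for leader in leaders:
            if similarity(descriptor, items[leader]) >= threshold:
                clusters[leader].append(name)
                break
        else:
            leaders.append(name)
            clusters[name] = [name]
    return clusters


# Hypothetical fragments described by toy pharmacophore-like feature sets.
fragments = {
    "frag_1": frozenset({"donor_N", "ring_A", "acceptor_S"}),
    "frag_2": frozenset({"donor_N", "ring_A", "acceptor_S", "hydrophobe_E"}),
    "frag_3": frozenset({"ring_B", "cation_W"}),
    "frag_4": frozenset({"ring_B", "cation_W", "donor_N"}),
    "frag_5": frozenset({"hydrophobe_E"}),
}

clusters = leader_cluster(fragments, tanimoto, threshold=0.6)
print(clusters)

# One representative per cluster gives a small, diverse subset.
diverse_subset = list(clusters)
print(diverse_subset)
```

The result depends on input order and on the threshold, so in practice the trade-off between library size and coverage would be explored by varying both.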

“Working with Cresset has been a positive experience from start to finish,” said Warren Wade, VP of Chemistry at BioBlocks. “Because our fragments are designed to be new chemical matter, they challenged the limits of existing structural descriptions. Cresset worked closely with us to overcome these limits and produce a high value compound set”.

The final result was a 3D fragment library that contains a significant number of compounds with novel core structures that are now viable candidates for fragment screening. BioBlocks envisions this Comprehensive Fragment Library to be a drug discovery tool available only to collaborators who will be able to leverage this new chemical space for their lead discovery programs. Hits from the library are entry points to BioBlocks’ collaborative medicinal chemistry processes, developed to increase the probability of generating commercially viable leads.


3D similarity-based clustering workflow

Read more about this project: Large scale compound clustering in 3D.

Contact Cresset Discovery Services to find out more about how we can help you design large scale libraries for your project.

Develop bespoke software

Cresset software focuses on novel methods to discover, design, perfect or view compounds and their data in easy to use applications. Our applications are firmly founded in the experience of our customers and the most common problems that they face. However, our development expertise is not limited by our applications. We have an excellent track record of delivering novel scientific plugins, command line applications and workflows that go beyond our commercial offerings.

Custom integration

Many customers have extensive in-house computational chemistry tools that their chemists appreciate and derive value from. They like to access the unique benefits of Cresset software from within their existing solutions. In this situation Cresset Discovery Services (CDS) can develop software that is seamlessly integrated with the customer’s existing framework. From viewing our excellent electrostatic interaction potentials to detecting 3D activity cliffs we have the expertise to plug Cresset directly into their world.

Unique science

Customers regularly approach us with specific workflows or novel approaches for which they require a truly bespoke solution. We can either develop bespoke software under an exclusive license, or co-develop novel methods to the customer’s requirements before developing them for a wider commercial market. In either case we use our extensive experience of creating software for chemistry workflows to deliver usable, stable and functional solutions that meet the customer’s needs.

“Bespoke software gives customers access to Cresset software without having to change their software environment.”
Dr Martin Slater, Director, Cresset Discovery Services

Case study

Cresset were recently asked to develop a tool that could look at a customer’s SAR data and prioritize new molecules for synthesis based on the information that they would add to the project, as well as potential for activity. The outcome of this work was a ground-breaking 3D-QSAR application. As part of the customer agreement, Cresset went on to commercialize the application as Activity Atlas, which is now a component of Forge.

If you have a situation that requires a novel solution, get in touch for a free, confidential discussion to find out how Cresset Discovery Services can work with you to develop a bespoke application.

What’s in the CDS virtual screening toolbox?

Cresset is very well known for providing fast and accurate ligand-based virtual screening through Blaze. We have now added the Lead Finder docking engine to our virtual screening toolbox, giving Cresset Discovery Services (CDS) the most comprehensive virtual screening capabilities available anywhere in the industry.

Based on an informal survey of our contacts and customers, I estimate that something like 50% of all current pharma SME projects are ‘structure enabled’. Lead discovery and lead optimization are driven through the use of in-house structures, public structures (typically from the PDB) and homology models. These structures inform lead optimization programs by explaining observed SAR and providing feedback and a detailed context for the design of further analogues.

CDS routinely uses the Cresset software Blaze for ligand-based virtual screening. Although we already had access to structure-based methods, we are pleased to have brought Lead Finder in-house, giving us full capability in ligand-protein docking.

Ligand-based virtual screening with Blaze

Virtual screening with Blaze remains one of the most consistently requested projects for CDS. What makes Blaze extremely useful for our customers is:

  • Virtual screening is probably the only way to really sample adequate chemical diversity
  • Virtual screens are far more cost effective than wet HTS
  • Excellent enrichments can be achieved
  • The chemotype diversity in the output is second to none.
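Enrichment is usually quantified with an enrichment factor: the hit rate in the top-ranked slice of the library divided by the hit rate of the library as a whole. The sketch below is a minimal illustration of that standard calculation, not code from Blaze; the function name, the compound IDs and the 1% cut-off are assumptions made for the example.

```python
def enrichment_factor(ranked_ids, active_ids, top_fraction=0.01):
    """Enrichment factor of a ranked list at a given top fraction.

    EF = (actives in top slice / size of top slice)
         / (actives in library / size of library)
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    n_total = len(ranked_ids)
    actives = set(active_ids) & set(ranked_ids)
    if n_total == 0 or not actives:
        raise ValueError("need a non-empty library containing at least one active")
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(1 for cid in ranked_ids[:n_top] if cid in actives)
    # Algebraically the same as the ratio of the two hit rates.
    return (hits_in_top * n_total) / (n_top * len(actives))


# Hypothetical library of 1,000 compounds, already ranked by a virtual screen.
ranked = [f"c{i}" for i in range(1000)]
known_actives = {"c0", "c3", "c7", "c500", "c900"}
ef_1pct = enrichment_factor(ranked, known_actives, top_fraction=0.01)
print(f"EF at 1%: {ef_1pct:.1f}")
```

An enrichment factor of 1 means the ranking is no better than random selection; larger values mean that more of the known actives are concentrated at the top of the list.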

Blaze also relies on two very simple premises:

  1. A bioactive conformation encodes, in its shape and electrostatic field, the properties, recognition features and solvation pattern optimised for interaction with its protein target site.
  2. A molecule conformation with increasing ‘shape and field’ similarity to that bioactive conformation has an increasing probability of also being active.
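The second premise amounts to ranking a library by similarity to the reference conformation. The sketch below illustrates the idea only: it assumes each conformation has already been reduced to a numeric descriptor vector (a hypothetical stand-in for shape and field information) and uses the continuous Tanimoto coefficient as the similarity measure. It is not the scoring function used by Blaze, and all names and values are made up for the example.

```python
def tanimoto(a, b):
    """Continuous Tanimoto coefficient between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom else 0.0


def rank_by_similarity(reference, library):
    """Return (name, similarity) pairs, most similar to the reference first."""
    scored = [(name, tanimoto(reference, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical descriptor vectors for a reference and three library conformations.
reference = [1.0, 0.0, 2.0]
library = {
    "far": [0.0, 3.0, 0.0],
    "near": [1.0, 1.0, 2.0],
    "same": [1.0, 0.0, 2.0],
}
for name, score in rank_by_similarity(reference, library):
    print(f"{name}: {score:.3f}")
```

Compounds at the top of such a list are the ones the second premise predicts are most likely to be active, whatever their 2D structure.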

So, the key determinant of real activity among the compounds in a hit list (other than whether the template was truly the ‘bioactive conformation’) is often just how relevant the matching conformation is and how well populated it is among the conformations that the hit molecule can adopt. This is fundamentally why our ligand-centric screening invariably works extremely well. Because a molecule can adopt a similar shape, and project the same electrostatic pattern, from a completely different chemical architecture, the output is very diverse.

Structure-based virtual screening with Lead Finder

The Lead Finder software has been developed to provide cutting-edge docking for an array of typical tasks, from high-throughput virtual screening to best-in-class prediction of bioactive conformations to accurate prediction of binding energies. In combination with the companion Build Model protein preparation tool, Lead Finder has been shown to match or outperform the historically leading docking solutions.

When preparing ligands for virtual screening in Blaze, CDS scientists use modeling to help define the best ‘hand-crafted’ estimate of a bioactive conformation, based on the widest data for any given system. We apply the same care to exploring and preparing protein targets prior to structure-based virtual screens. We take advantage of three main approaches. Firstly, Lead Finder includes the excellent Build Model protein preparation tool. Secondly, we are privileged to be able to model proteins and ligands using the same proprietary XED force field used to give the accurate electrostatics that all Cresset software is based on. Finally, at CDS we have access to the latest Cresset software that is still under development. This gives us capability to provide protein electrostatic field maps and water analysis, providing a very reliable starting position for structure-based virtual screening.

Lead Finder uses a stochastic ligand sampling workflow, with conformations generated on-the-fly, and a genetic algorithm for processing these into pools of the best docking poses. Multiple interaction grids are generated from the protein target and combined to define a scoring system for poses. More importantly, the scoring method has been shown to outperform some of the more conventional docking engines currently available commercially.
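To show what a genetic algorithm over poses looks like in general terms, here is a toy sketch. It is not Lead Finder's algorithm or scoring function: the pose is reduced to three numbers, `toy_score` is a hypothetical stand-in for a grid-based docking score (lower is better), and every parameter value is an assumption chosen for illustration.

```python
import random


def toy_score(pose):
    """Hypothetical stand-in for a docking score; lower is better."""
    best_pose = (1.0, -2.0, 0.5)
    return sum((p - t) ** 2 for p, t in zip(pose, best_pose))


def genetic_pose_search(score, n_dims=3, pop_size=40, n_keep=10,
                        generations=60, seed=0):
    """Evolve random poses and return the pool of best-scoring ones."""
    rng = random.Random(seed)
    # Stochastic sampling: start from randomly generated poses.
    population = [tuple(rng.uniform(-5.0, 5.0) for _ in range(n_dims))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        pool = population[:n_keep]  # the best poses survive unchanged
        children = []
        while len(pool) + len(children) < pop_size:
            mum, dad = rng.sample(pool, 2)
            # Crossover: take each coordinate from one parent,
            # then mutate it with a small Gaussian step.
            child = tuple(rng.choice(pair) + rng.gauss(0.0, 0.2)
                          for pair in zip(mum, dad))
            children.append(child)
        population = pool + children
    population.sort(key=score)
    return population[:n_keep]


best_poses = genetic_pose_search(toy_score)
print("best score:", toy_score(best_poses[0]))
```

Because the best poses are carried over unchanged in every generation, the best score in the pool can never get worse from one generation to the next.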

Structure-based or ligand-based?

What are the advantages of having both structure-based and ligand-based virtual screening? And how do we choose the best approach for a project?

Ligand-based virtual screening is less computationally intensive, making it a preferred option when there is a known ligand available. A protein is a far larger system than a ligand: an average protein of 400 amino acids has several thousand atoms and bonds and in excess of 50 charges, making it more challenging to model.

However, even when there is a known ligand there are some situations when a ligand-based virtual screening is not viable, such as when the known ligand does not exploit all the interactions available in an active site or when a protein has an unattractive orthosteric site and attractive allosteric sites with no known ligands. In these cases, we prefer to use a structure-based method.

In the case of protein-protein interaction sites and protein-DNA/RNA sites, Blaze can take DNA and protein fragments as a template in place of a ligand. However, it is useful to have a structure-based approach available for comparison.

In fact, we often find it useful to combine different virtual screening techniques. In lead discovery, one of the key requirements for virtual screening is to maximise the diversity of hits returned. All virtual screening techniques, be they ligand-based or structure-based, are probabilistic: they are used to increase the likelihood of getting hits from a wet screen. No technique is guaranteed to give absolute binding energies (at least not in the context of virtual screening on any realistic size of screening library), but they do give good rank ordering of compounds and can, therefore, be used as a means of selection and prioritisation.

Ligand-based techniques, whether 2D or 3D, are algorithmically distinct from structure-based techniques such as docking and, therefore, give different rankings to compounds. Different approaches return different hits and the results can be combined into an enriched final list.

Combining the results of structure-based and ligand-based techniques provides further diversity, leading to better hit rates and more interesting hits.
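One simple way to merge ranked lists from different methods is reciprocal rank fusion, in which each compound earns 1/(k + rank) from every list it appears in. The sketch below uses that generic technique purely as an illustration; it is not necessarily how CDS combines results, and the function name, the compound IDs and the constant k = 60 are assumptions.

```python
def fuse_rankings(rankings, k=60):
    """Merge several ranked ID lists by reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to a compound's score, so compounds
    ranked highly by more than one method rise to the top of the fused list.
    """
    scores = {}
    for ranked in rankings:
        for position, cid in enumerate(ranked, start=1):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + position)
    # Highest fused score first; ties are broken alphabetically by ID.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical top hits from a ligand-based and a structure-based screen.
ligand_based = ["A", "B", "C", "D"]
structure_based = ["C", "A", "E"]
print(fuse_rankings([ligand_based, structure_based]))
```

Compounds found by only one method still appear in the fused list, which preserves the extra diversity that the second technique contributes.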

A one-stop shop for virtual screening

By combining the strengths of Blaze in the ligand-based world with Lead Finder for docking, CDS now has the most comprehensive virtual screening capabilities available anywhere in the industry. Both Blaze and Lead Finder are available to purchase as software or as a service through CDS. CDS is now truly a one-stop shop for virtual screening, and for much more besides.

Download a free evaluation of Lead Finder or access the Blaze demo server.