Water stability is key to designing novel patentable chemistry

An analysis of the water stability and positions in a ligand-protein complex informed the design of novel ligands for a customer target. This work led to new active chemistry that the customer went on to patent.

A Cresset Discovery Services customer had identified a novel target with a natural ligand and were looking for new chemistry that would be active at the target site. Our scientists carried out an initial project to learn more about the protein-ligand system. The Cresset field approach, used to analyse the structure and interactions, gave the customer valuable insights into the active features of the ligand.

The customer used this information to develop analogous synthetic compounds and example molecules. They asked us to work with them again to computationally align the example molecules and prioritize them for synthesis.

We carried out an initial alignment and then modeled the system in detail. It appeared that part of the molecule that was important for the interaction was not making any contact with the protein.

The PDB had some crystal structures of related proteins, but not of the target of interest. We studied the available protein data to learn as much as possible about the binding pocket, paying particular attention to the positions and stability of the water molecules. This led us to put forward the hypothesis that an important part of the ligand interaction included the stabilization of water.

Based on this hypothesis we prioritized the molecules that bridged the observed gap between the natural ligand and the target while also stabilizing the free waters.

Water analysis was carried out by manually superimposing multiple crystal structures, viewing the crystallographic waters that clustered together, and mapping on their temperature factors. This process allowed us to determine the importance of each water molecule in the solvation sphere around the ligand and protein pocket. The new 3D-RISM method in Flare now offers a comparable computational workflow that is far more efficient for this type of analysis. It is a more systematic approach that calculates the position and stability of all water molecules around a proposed ligand in a binding pocket. Because it does not need any crystallographic water data, it is also more widely applicable and more convenient. Ultimately, these data can be used to assess or compare ligands in terms of how well they might stabilize essential water.
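The manual analysis described above can be illustrated with a minimal Python sketch. It is not the procedure or software we used: the coordinates, B-factors, the 1.0 Å cutoff and the function names cluster_waters and summarize are all hypothetical, and the structures are assumed to be superimposed already. It groups crystallographic water oxygens that fall close together across structures and reports, for each cluster, how many structures contain it and its mean temperature factor.

```python
from math import dist
from statistics import mean

# Hypothetical input: water oxygens from structures that are ALREADY superimposed.
# Each entry is (structure_id, (x, y, z) in Angstrom, B-factor in Angstrom^2).
waters = [
    ("A", (10.0, 5.0, 3.0), 18.0),
    ("B", (10.3, 5.1, 2.9), 22.0),
    ("C", (9.9, 4.8, 3.2), 20.0),
    ("A", (15.0, 8.0, 1.0), 45.0),
    ("B", (20.0, 2.0, 7.0), 60.0),
]


def centroid(cluster):
    """Mean position of the waters in one cluster."""
    return tuple(mean(axis) for axis in zip(*(w[1] for w in cluster)))


def cluster_waters(waters, cutoff=1.0):
    """Greedy clustering: a water joins the first cluster whose centroid
    lies within `cutoff` Angstrom, otherwise it starts a new cluster."""
    clusters = []
    for water in waters:
        for cluster in clusters:
            if dist(water[1], centroid(cluster)) <= cutoff:
                cluster.append(water)
                break
        else:
            clusters.append([water])
    return clusters


def summarize(clusters, n_structures):
    """Rank clusters: most conserved first, then lowest mean B-factor."""
    rows = []
    for cluster in clusters:
        rows.append({
            "centroid": centroid(cluster),
            "conservation": len({w[0] for w in cluster}) / n_structures,
            "mean_b": mean(w[2] for w in cluster),
        })
    return sorted(rows, key=lambda r: (-r["conservation"], r["mean_b"]))


summary = summarize(cluster_waters(waters), n_structures=3)
for row in summary:
    print(row)
```

In this toy ranking, a water site seen in every structure with a low mean temperature factor would be treated as a candidate "essential" water, while isolated, high B-factor waters would be given less weight.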

Based on our equivalent ‘hands-on’ analysis, we worked with the customer to choose the best candidates for synthesis. These newly-designed ligands resulted in new active chemistry for the customer that was valuable enough for them to patent.


The position and energetics of water molecules in and around the active site are of crucial importance when designing novel ligands. Knowing which water molecules are energetically favorable can give valuable insights into the best positions for ligand molecules. 3D-RISM analysis is one of the methods available in Flare for structure-based drug design.

Homology modeling and ligand electrostatics play a key role in elucidating the binding mode and molecular interactions of a new class of antifungal drugs

Last month F2G published a paper in PNAS [1] describing F901318, the leading representative of a novel class of antifungal drug. Dr Martin Slater, Director of Cresset Discovery Services, is a co-author on the paper. He describes how modeling work carried out by Cresset Discovery Services was critical to predicting the binding mode of the inhibitor and important interacting amino acid residues. F901318 is currently in clinical development for the treatment of invasive aspergillosis.

There is an important medical need for new antifungal agents with novel mechanisms of action to treat the increasing number of patients with life-threatening systemic fungal disease and to overcome the growing problem of resistance to current therapies.

F2G are a UK-based antifungal drug discovery and development company who have identified F901318 as a leading representative of the orotomides, a novel class of antifungal drug. Their identification of dihydroorotate dehydrogenase (DHODH) inhibition as the mechanism by which F901318 kills Aspergillus fumigatus has been a major breakthrough, differentiating F901318 from other systemic antifungal agents.

From hit to lead with medicinal chemistry

F2G had a large amount of proprietary cellular activity data, built up over time on their antifungal screening platform. After an initial hit-finding campaign, significant progress had been made using classical medicinal chemistry approaches.

F2G were keen to inform and assist the development process by gaining a molecular-level understanding of the target protein-ligand system, and approached Cresset Discovery Services for help in elucidating its molecular interactions.

A detailed molecular understanding with modeling

Cresset’s unique approach of defining the electrostatics around the active chemotype made it possible to identify the precise nature of the various sites on the active molecules. In conjunction with sequence analysis across the wider DHODH family, Cresset scientists were able to match these subtle ligand features to the patterns of residues that were likely to be key.

Subsequent homology and ligand-protein interaction modeling of Aspergillus fumigatus DHODH using the XED force field identified a predicted binding mode of the inhibitor and important interacting amino acid residues.

We combined a detailed ligand-centric approach using Forge with protein modeling using a prototype of the new Cresset protein tool to arrive at a binding hypothesis consistent with the selectivity profile. The modeling process is fully reported in the paper [1].

Testing in silico hypotheses in vitro

Once the binding hypothesis was in place, F2G initiated a number of lab experiments to check the predictions, for example using site-directed mutagenesis.

Most satisfyingly, the lab results supported our predictions.

F901318 is currently in late Phase 1 clinical trials, offering hope that the antifungal armamentarium can be expanded to include a class of agent with a mechanism of action distinct from currently marketed antifungals.

Cresset’s consulting work with F2G provided valuable insight into the predicted interaction pattern of the main chemical series with the Aspergillus DHODH target protein. As with many research projects, any level of understanding achieved is often a prelude to even deeper questions, and there are many remaining to be answered for this unique system. Cresset continues to work closely with F2G, providing software and services to support them in their ongoing projects.

References

1 http://www.pnas.org/content/113/45/12809.abstract

Dr Martin Slater

Director, Cresset Discovery Services

Build and cluster diverse 3D libraries

Cresset Discovery Services (CDS) worked with BioBlocks to analyze their fragment library to maximize coverage of 3D chemical space. As part of the project, we developed an innovative clustering method that made it possible to assess the 3D similarity across their virtual database of over 1.5 million fragments.

The goal of the project was to help BioBlocks build the maximum 3D diversity into a fragment library of manageable size from a starting pool of over a million compounds. Existing techniques would have required an infeasible amount of computing power, so CDS developed an entirely novel rapid clustering method especially for the project. The solution was still extremely computationally challenging, but we were able to use our expertise in distributing calculations to the cloud to deliver the results that BioBlocks needed on time and within budget.

“Working with Cresset has been a positive experience from start to finish,” said Warren Wade, VP of Chemistry at BioBlocks. “Because our fragments are designed to be new chemical matter, they challenged the limits of existing structural descriptions. Cresset worked closely with us to overcome these limits and produce a high value compound set”.

The final result was a 3D fragment library that contains a significant number of compounds with novel core structures that are now viable candidates for fragment screening. BioBlocks envisions this Comprehensive Fragment Library to be a drug discovery tool available only to collaborators who will be able to leverage this new chemical space for their lead discovery programs. Hits from the library are entry points to BioBlocks’ collaborative medicinal chemistry processes, developed to increase the probability of generating commercially viable leads.


3D similarity-based clustering workflow

Read more about this project: Large scale compound clustering in 3D.

Contact Cresset Discovery Services to find out more about how we can help you design large scale libraries for your project.

Engaging with Cresset Discovery Services

Cresset Discovery Services (CDS) offers bespoke in silico services for small molecule discovery. We do a lot of work in drug discovery and optimization for the pharmaceutical industry but we also work extensively in agrochemicals, flavors, fragrances – in fact, in any industry that involves work with small organic molecules.

This post explains the process that we go through when customers work with Cresset Discovery Services, from the first contact to the final deliverables.

Enquire

At the enquiry stage we talk with customers about their requirements in general terms to get an idea of whether we will be able to help them. The answer is usually yes, but we will certainly let you know if we think that our approach would not be the best match for your project.

These initial discussions will involve members of both our sales team and the scientific team. Everything at the enquiry stage is free, but the discussions will not go into great depth since confidential details cannot yet be shared.

Establish

Once both sides have agreed to proceed, we exchange confidentiality agreements and can then get down to the details. The customer will share their confidential data and CDS will prepare a detailed proposal of the work they will carry out.

This stage will involve a detailed meeting to gather the data and another to present the proposal. The proposal will include full details of pricing and milestones. If the work is a collaboration, then all partners will be involved at this stage.

Execute

Close collaboration is key to any successful project. Depending on the size and complexity of the project, there may be several long meetings at the start of the project. These could involve many members of the customer team. The goal is to focus on the project and to scope out exactly what needs to be achieved.

Work then moves to the details – for example, what to do, with which molecules and which conformations. This could involve conversations several times a week until everything is in place to run the study.

Frequent reviews take place throughout the project between the customer and CDS. Each customer has a personal point of contact who remains consistent throughout the project.

At each stage of the project there will be several conversations to make sure that the customer is getting exactly what they wanted. These will be tied in to agreed milestone reviews and deliverables.

Project deliverables are likely to be available throughout the project, not only at the end. No matter when they are delivered, the approach remains the same: we make sure that the customer gets the maximum value out of the results.

For example, typical results for a large screening project with multiple compounds may be between 10,000 and 20,000 hits. But CDS will make sure that the customer gets more than a list from the project. We will always ensure that the customer fully understands and can interpret the results in the context of the project in order to get the best out of them.

Evaluate

No project is complete without a project review of what went well and what could go better. As part of this process we agree the next steps, which could range from a follow-on project, to advice on the next research steps.

Many of our customers remain customers for the long term. In fact, when we do lose a services customer it’s usually because they have decided to buy our software and hire a computational chemist to work full time. This case study describes how we helped one customer to hire and train computational chemists. Even then, customers still come back to us for projects if they need the extra resource.

 

Contact us today to start the process of working with CDS.

What can the cloud offer computational chemistry?

The latest edition of Innovations in Pharmaceutical Technology (IPT) includes the article Sky’s the Limit by Tim Cheeseright and Katriona Scoffin of Cresset outlining some of the key benefits of the cloud for computational chemistry.

They point out that, “computational chemistry methods all involve a trade-off between accuracy and computational resources”. Cloud computing makes it easy to access computing power on a flexible basis, translating to “better results faster and cheaper”. Other benefits include, “flexible access to computing resources meaning users only pay for what they need” and “easy to use web interfaces that remove the need for local installation”.

Issues of security around using the cloud are also discussed, notably that, “cloud computing is an infrastructure and, in that sense, the security is as good as the product that is built upon it.”

The article also includes a recent example of how Cresset used Blaze Cloud to cluster a large database to create a diverse compound library.

Displacing crystallographic water molecules with Spark

Abstract

Cresset’s Spark [1] software for bioisosteric replacement was used to carry out a water displacement experiment starting from the X-ray crystal structure of a selective inhibitor of Bruton’s tyrosine kinase [2]. The use of databases derived from available reagents ensured that the results could be tethered to molecules that were readily synthetically accessible. The availability of a sufficiently diverse source of reagents was crucial in demonstrating the feasibility of this approach.

Introduction

Bruton’s tyrosine kinase (Btk) is a member of the Tec family of non-receptor tyrosine kinases. Recent literature findings [2] indicate that Btk inhibition could be an attractive approach for the treatment of autoimmune diseases such as rheumatoid arthritis, a progressive autoimmune disease characterized by swelling and erosion of the joints [3].

A fragment-based drug design approach was recently [2] applied to the discovery of potent, non-covalent inhibitors of Btk with selectivity over Lck (lymphocyte-specific protein tyrosine kinase, a target that plays a key role in T-cell activation).

Among the most interesting hits identified with this approach, compound 2 (Table 1) was selected for further optimization. Position 8 of the cinnoline ring of fragment 2 was explored using the Suzuki−Miyaura [4] synthetic methodology, starting from a series of monocyclic boronic acids/esters. This initial SAR exploration led to the discovery of compound 8 (Table 1), which shows improved potency and selectivity with respect to fragment 2.

The published X-ray crystal structure of compound 8 in the active site of Btk (PDB 4ZLZ) shows a water-mediated hydrogen bond from the pyridyl nitrogen to the P-loop backbone residues Phe413 and Gly414 of Btk [2] (Figure 1 – left). The replacement of 4-methylpyridin-3-yl in compound 8 with small bicyclic heterocycles that displace the water molecule and make direct H-bond interactions with the P-loop led to the discovery of compounds 10 and 11 (Table 1), with a 10-fold improved potency towards Btk.

The 3D structure of compound 8 and the bridging water molecule were used as the starting point for this Spark case study. The aim of this experiment is to verify whether our methodology is able to displace the bridging water molecule and correctly identify the same alternative indazole fragments.

Table 1. SAR exploration of fragment hit 2

Spark reagent databases: accessing available chemical diversity

Spark’s approach to scaffold hopping and R-group replacement uses Cresset’s field-based technology [5, 6] to identify viable replacements for a selected portion of a reference compound using a series of fragments. In this case study we chose to use standard reagent databases [7] supplied by Cresset which are based on the available chemicals directory. This gives the opportunity to rapidly search all R-groups that could be introduced at a selected position. However, an optional Database Generator module enables the creation of fragment databases that are derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment.

Method

The published X-ray crystal structure of compound 8 bound in the active site of Btk (PDB 4ZLZ) was downloaded into Forge [8]. The structure of the ligand was minimized and then combined with the water molecule mediating the H-bond interaction with the P-loop backbone residues of Btk to make a single molecule entry. The two 3D structures were merged using the ‘combine selected pair into single molecule’ feature available in Forge. The single entry thus created (see Figure 1 – right) was used as the Starter molecule for the Spark experiment (Figure 2 – left).

In this water displacement experiment, we want the Spark search to be driven mainly by the electrostatic fields, rather than by the usual combination of fields and shape.

For this reason a constraint was added to the negative and positive field points of the water molecule using the Spark Field Constraints Editor (Figure 2 – right). This introduced a score penalty for those results that did not match the constrained field points.

Furthermore, the ‘Normal’ conditions for scoring the Spark search results were fine-tuned to 90% Field and 10% shape, using the Btk protein as a ‘hard’ excluded volume, to constrain the size of the potential replacement fragments.
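To make the effect of these settings concrete, the sketch below shows one way a 90:10 field-to-shape weighting and a constraint penalty could combine into a single number. It is purely illustrative and is not Spark's scoring function: the function name combined_score, the penalty size and the similarity values for the two made-up fragments are all assumptions.

```python
def combined_score(field_sim, shape_sim, matched, constrained,
                   field_weight=0.9, penalty_per_miss=0.1):
    """Illustrative score: weighted field/shape similarity minus a penalty
    for every constrained field point that the candidate fails to match."""
    base = field_weight * field_sim + (1.0 - field_weight) * shape_sim
    missed = constrained - matched
    return max(0.0, base - penalty_per_miss * missed)


# Hypothetical candidates: (field similarity, shape similarity, matched constraints).
# Two field points are assumed to be constrained on the water molecule.
candidates = {
    "fragment_a": (0.80, 0.60, 2),  # matches both constrained field points
    "fragment_b": (0.82, 0.75, 1),  # better raw similarity, misses one constraint
}

ranked = sorted(
    candidates,
    key=lambda name: combined_score(*candidates[name], constrained=2),
    reverse=True,
)
print(ranked)
```

With these made-up numbers, the fragment that matches both constrained field points ranks first even though its raw similarity is slightly lower, which is the behavior the constraint is intended to produce.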


Figure 1. Left: X-ray crystal structure of compound 8 in the active site of Btk making a water mediated hydrogen bond with the P-loop backbone. Right: 3D structures and field points of compound 8 and of the bridging water molecule combined into a single entry.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

The gradient cutoff for minimization was set to 0.200 kcal/mol/Å. At the same time, the automatic constraint on fragment size was removed to ensure that the results of the search were not unduly biased by the size of the starter molecule.

Finally, to focus the experiment on small bicyclic heterocycles, monocyclic fragments were filtered out from the list of potential results using an appropriate SMARTS filter.

Two runs of Spark were carried out using the above conditions. The initial experiment was run on a database of 775 boronic acids to closely replicate the chemistry used in the original publication [2, 4].

Figure 2. Left: the combined 3D structures of compound 8 and the bridging water molecule used as a starter molecule in the Spark experiment. Right: constraints associated to the field points of the water molecule.
Color coding of field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.

In the second experiment, the ZINC [7] database of commercial aromatic halides (41K fragments) was also searched to explore a larger chemical diversity, starting from the assumption that the appropriate boronic acid/ester could be obtained from any interesting commercial aryl halide at the cost of an additional synthetic effort.

Results

The top scoring compound from the initial search (boronic acids only) is compound 10 (Table 1). As can be seen in Fig. 3 – right, this compound superimposes very well with the starter molecule and matches the constrained field points in a satisfactory manner. However, compound 11, which would presumably superimpose even better with the conformation of the ortho-methyl-pyridin-3-yl group of compound 8, was not found in this search, due to the limited chemical diversity of the database searched.

In the second Spark search, which was run on a much larger collection of reagents (boronic acids and aryl halides), compound 11 (Fig. 3 – center and Fig. 4) is the top scoring result, while compound 10 ranks 4th in the list (Fig. 4).

The original paper [2] also reports the indole-substituted compound 9 (Table 1), quite similar in terms of 2D structure to the much more potent indazole compounds 10 and 11. This fragment is available in both the databases searched, but is not retrieved by Spark. The indole fragment cannot in fact match the constrained negative field point of the bridging water molecule, as shown in Fig. 5, where compound 9 is shown superimposed on the starter molecule in Forge. The lack of this relevant interaction explains the much lower potency of compound 9, with a Btk IC50 = 850 nM (Table 1).

Figure 4 shows a tile view of the 16 top scoring results from the second Spark experiment. Several different flavors of the indazole fragment carrying different substitution patterns are represented in this list. Alternative bicyclic fragments are also proposed, which may provide useful ideas for a further exploration of this target.

Figure 3. Left: electrostatics of starter molecule. Center: compound 11 (Btk IC50 = 4.0 nM). Right: compound 10 (Btk IC50 = 12 nM)
Color coding of fields/field points: blue = negative; red = positive; yellow = steric; gold = hydrophobic.


Figure 4. Tile view of the top scoring Spark results for the second experiment.


Figure 5. Compound 9 (right) superimposed on the starter molecule of the Spark experiment (left).

Conclusions

In this case study Spark successfully managed to displace the crystallographic water molecule bridging the interaction between compound 8 and the P-loop of Btk, replacing it with small, synthetically accessible bicyclic heterocycles.

Availability of appropriate sources of chemical diversity is still a key factor in determining the success of any bioisosteric replacement experiment.

For this reason, the creation of fragment databases derived from corporate compound registries or inventory systems, linking your available chemistry directly to the Spark experiment, is highly recommended.

References and links

1. http://www.cresset-group.com/products/spark/
2. Smith, C. R. et al., J. Med. Chem. 2015, 58, 5437−5444
3. Firestein, G. S., Nature 2003, 423 (6937), 356−361
4. Miyaura, N., Suzuki, A. et al., J. Am. Chem. Soc. 1989, 111 (1), 314−321.
5. J. Chem. Inf. Model., 2006, 46, 665-676.
6. http://www.cresset-group.com/science/field-technology/
7. Spark fragment databases come from commercial compounds, ChEMBL, ZINC and VEHICLe.
8. http://www.cresset-group.com/products/forge/

Identifying bioisosteres of the benzazepine scaffold

Drug discovery projects continuously explore novel and diverse structures with the objective of optimizing existing leads, improving IP position, or identifying new leads by switching scaffolds completely. The identification of novel chemotypes can be particularly difficult for those targets where the crystallographic information is scarce or unavailable (for example GPCRs, ion channels and novel targets). In this case study, working from just a 2D fragment of a known active D3 antagonist, we show how Spark was able to quickly identify a variety of alternative scaffolds, some of which have proven D3 activity.

Figure 2. Results for the first run of Spark searches. Lime green: SB-414796; cyan: known D3 scaffolds; magenta: other Spark bioisosteres.

Read the case study.

Elucidating the bioactive conformation of CCR5 Chemokine Receptor inhibitors

There are still many projects that do not have a relevant protein-ligand crystal structure to drive compound design. These include projects targeting GPCRs and ion channels as well as those working with phenotypic or whole-organism screens. In such cases, field pharmacophore modeling as implemented in FieldTemplater can help to decipher how active compounds interact with a common protein target, and which parts of those molecules are involved in binding, in the absence of any protein information.

FieldTemplater generates a series of conformations that the ligands might adopt at physiological conditions. It analyzes these conformations to find sets that show a high molecular field similarity (and hence have similar shape/binding properties). Where all the ligands with a common activity align well, it is very likely that this is the bioactive conformation.
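The idea of searching for a mutually similar set of conformations can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not FieldTemplater's algorithm: each conformer is reduced to a hypothetical two-number descriptor, the similarity function is invented for the example, and the exhaustive enumeration used here would not scale to real conformer ensembles.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

# Hypothetical data: for each ligand, a list of (conformer_name, descriptor) pairs.
# Real field similarity needs 3D alignment; a 2D vector stands in for it here.
conformers = {
    "lig1": [("lig1_c1", (1.0, 0.0)), ("lig1_c2", (0.0, 1.0))],
    "lig2": [("lig2_c1", (0.9, 0.1)), ("lig2_c2", (0.0, 0.2))],
    "lig3": [("lig3_c1", (0.5, 0.5)), ("lig3_c2", (1.0, 0.1))],
}


def similarity(conf_a, conf_b):
    """Toy similarity in (0, 1]: identical descriptors give 1.0."""
    return 1.0 / (1.0 + dist(conf_a[1], conf_b[1]))


def best_template(conformers, similarity):
    """Choose one conformer per ligand so that the mean pairwise
    similarity across the chosen set is as high as possible.
    Requires at least two ligands."""
    ligands = sorted(conformers)
    best_combo, best_score = None, float("-inf")
    for combo in product(*(conformers[lig] for lig in ligands)):
        score = mean(similarity(a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best_combo, best_score = combo, score
    chosen = {lig: conf[0] for lig, conf in zip(ligands, best_combo)}
    return chosen, best_score


template, score = best_template(conformers, similarity)
print(template, round(score, 3))
```

The set that scores best is the one in which every ligand can adopt a conformation resembling the others, which is the reasoning behind treating a well-aligned set as the likely bioactive conformation.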

The case study Elucidating the bioactive conformation of CCR5 Chemokine Receptor inhibitors shows how FieldTemplater, working from just a few 2D structures of known active CCR5 Chemokine Receptor inhibitors, was able to correctly reproduce the bioactive conformation of the CCR5 receptor inhibitor Maraviroc as derived from the 4MBS PDB crystal structure, without making use of the X-ray information about the binding mode of this ligand. Additionally, FieldTemplater indicates the relative alignments and likely bioactive conformations of 3 further CCR5 inhibitors enabling the transfer of SAR between series. The case study gives full experimental details and results and is used in a web clip to show the power of the Cresset Engine Broker to accelerate computationally intensive experiments.

Dr Giovanna Tedesco, Product Manager

Deciphering complex aromatic SAR

The substitution of aromatic groups provides a unique tool to moderate the potency and physicochemical properties of drug-like molecules. However, the huge variety of substitutions that are possible can give rise to SAR that is almost impossible to understand, with small changes resulting in large shifts in potency. In these circumstances, understanding the causes of the observed activity cliffs is critical to progressing the project aims. This is an area where we at Cresset have always felt that using molecular interaction fields gives you a head start, as you can model the electrostatic and shape properties of the molecule accurately. The release of the Activity Miner module for Forge and Torch significantly improves this process by automatically detecting activity cliffs in the SAR. Below we present a case study on a small set of changes around a set of reported DPP-IV inhibitors and show how the Activity Miner interface helps find the root causes of the changes in activity.

A set of DPP-IV inhibitors related to the ligands from PDB codes 2QOE and 2P8S was extracted from BindingDB together with IC50 values for enzyme inhibition. Using Forge, PDB 2QOE was downloaded and split into reference ligand and protein. The ligand from PDB code 2P8S was downloaded as a fixed conformation, aligned to the 2QOE reference using the default ‘normal’ settings, and then added as an additional reference molecule. The remaining 31 compounds in the dataset were aligned to these references using the ‘Substructure’ method, with the maximum score against any reference being used to choose the alignment. The resulting alignments are shown below.

The aligned dataset was transferred to the Activity Miner module to study the SAR around the terminal phenyl substituent. The activity view focused on the most active compound (shown below) highlights that the SAR around this substituent is complicated, with many small changes resulting in significantly worse IC50 values. The activity view presents a central (focus) molecule, with the molecules most similar to it displayed in a wheel around it. The size of each segment represents the distance between the two molecules, and the segment is colored by the disparity between the pair. Highly colored segments represent changes that result in disproportionately large changes in activity (red is worse activity, green is better).
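As a rough guide to what disparity measures, the sketch below assumes a commonly used definition: the change in activity divided by the distance between the two molecules, with distance taken as 1 minus similarity. The function name, the minimum-distance guard and the similarity values are hypothetical; the pIC50 values are those of rows 1, 2 and 5 of the SAR table in this post.

```python
def disparity(focus_activity, other_activity, similarity, min_distance=0.01):
    """Assumed definition: activity change per unit of molecular distance,
    where distance = 1 - similarity. Negative values mean the neighbor is
    less active than the focus molecule."""
    distance = max(1.0 - similarity, min_distance)  # guard against division by zero
    return (other_activity - focus_activity) / distance


focus_pic50 = 8.2  # row 1 (2,4,5-triF), the most active compound

# (label, pIC50 from the SAR table, HYPOTHETICAL similarity to the focus molecule)
neighbors = [
    ("row 2 (2-Cl-4,5-diF)", 7.1, 0.90),
    ("row 5 (2,5-diF)", 7.6, 0.95),
]

# Largest absolute disparity first: these are the steepest activity cliffs.
ranked = sorted(
    ((label, disparity(focus_pic50, pic50, sim)) for label, pic50, sim in neighbors),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for label, value in ranked:
    print(f"{label}: {value:.1f}")
```

Under this definition a small activity change between two very similar molecules can produce a larger disparity than a bigger change between less similar ones, which is why the most strongly colored segments are not always those with the largest drop in pIC50.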

It is interesting to contrast the activity view above with a classic SAR table:

Row  Phenyl substitution  Activity (pIC50)
1    2,4,5-triF           8.2
2    2-Cl-4,5-diF         7.1
3    3,4-diF              6.9
4    2,4,6-triF           7.1
5    2,5-diF              7.6
6    3,4-diCl             5.8
7    3-F                  6.9
8    2,3,5-triF           6.1
9    4-F                  6.6

Clearly the SAR around the phenyl substituent is critical to activity but it is very difficult to decipher. However, with the combination of Activity Miner, field differences and the protein crystal structure we can get some pretty good hypotheses. (Note that all pictures below show field differences, not absolute fields: regions where one molecule is more positive (red) or more negative (blue) than the other.)

1. The 2- substituent should have a negative field

The change of F to Cl in the 2-position (compare row 1 to row 2) gives a slight increase in size but also introduces a small positive field at the end of the chlorine atom. It is interesting to note that the phenyl ring is slightly less electron poor when changing to chlorine (Cl is a better π-donor than F). Taken together with the change of 2-F to 2-H (row 1 to row 3), there is a strong suggestion that this substituent should present a negative “end”. This is consistent with the protein crystal structure, which indicates interactions with an arginine side chain and the NH2 of an asparagine side chain.

Comparing row 1 to row 2 (top) and row 1 to row 3 (bottom) shows the less active molecules (right) are more positive at the end of the ortho substituent.

2. The 4 position prefers negativity at the end

Removing the 4-F from row 1 gives row 5. Moving the fluorine atom in this position round the ring one position gives row 8. In both cases the activity is reduced by the change. The smaller change in activity when going from F→H suggests that introducing a negative region in the 3 position is additionally unfavorable. Neither of these hypotheses is obvious from the protein crystal structure, where both the 3 and 4 positions interact with a number of residues of various types.

Comparing row 1 to row 5 (top) and row 1 to row 8 (bottom) shows the less active molecules (right) are more positive at the end of the para substituent.

3. The 5 position must be negative at the end

All the changes that remove the negativity from the end of the 5 position result in significant drops in activity, whilst those that retain the negativity, even in the absence of other favorable interactions, retain some activity. For example, row 4 has both the 2- and 4-fluoro substituents but is only pIC50 7.1. The reason for this becomes evident on examination of the protein crystal structure. This atom points directly at the edges of the indole from tryptophan-659 and the phenyl of tyrosine-670 (numbers from PDB 2QOE).

Comparing row 1 to row 4 (top) shows the less active molecules (right) are more positive at the end of the 5-substituent. Bottom shows the interaction of this substituent with the protein.

4. The electron density of the phenyl substituent is important

This hypothesis is harder to establish as it comes from many observations. The most obvious is the change from row 3 to row 6, where there is a drop in activity from pIC50 6.9 to 5.8. Clearly this could be due to the increased size of the chlorine atoms, but equally likely is the change in the electronic properties of the phenyl ring, where highly electron-poor rings have higher activity. This change is also observed where any of the fluorines of row 1 are deleted or where any atom is switched from fluorine to chlorine. Again, the protein crystal structure helps to validate this hypothesis, as the catalytic serine together with a couple of tyrosine residues point their respective alcohol oxygen atoms at the face of this ring.
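
To put the pIC50 values quoted above on a more intuitive scale, the short sketch below converts a pIC50 difference into a fold change in IC50. It relies only on the standard definition pIC50 = -log10(IC50 in mol/L); the function name `fold_change` is introduced here for illustration and is not part of any Cresset software.

```python
def fold_change(pic50_reference: float, pic50_other: float) -> float:
    """Return IC50(other) / IC50(reference) for two pIC50 values.

    Because pIC50 = -log10(IC50), a difference of d log units
    corresponds to a 10**d fold change in IC50.
    """
    return 10 ** (pic50_reference - pic50_other)


# pIC50 values quoted in the text for row 3 and row 6.
drop = fold_change(6.9, 5.8)
print(f"Row 3 -> row 6: about {drop:.1f}-fold weaker")  # about 12.6-fold weaker
```

A drop of 1.1 log units is therefore roughly a twelve-fold loss in potency.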

Comparing row 3 to row 6 (top) shows the less active molecules (right) are more electron rich. Bottom shows the interaction of this phenyl ring with alcohols from the protein.

Conclusion

Many of our hypotheses could have been guessed at from studying the crystal structure of the 2,3,5-tri-fluorophenyl analogue in detail. However, the use of the field difference mode in Activity Miner brings the interactions into sharp focus and helps us rationalize the observations that we have. Subtle effects such as the difference between electron-rich aromatic and electron-poor aromatic rings are clearly visualized, explaining difficult and complex SAR in a way that is easy to interpret.

Our hypotheses can now be used in the design of new ligands with better IP or physicochemical properties with each design being validated against the regions of positive or negative field that we conclude to be important. Equally we could look for new ideas for this section of the molecule by using Spark together with the new reagent databases to suggest compounds (that we could make today!) that would retain the activity we have in this series while driving us into new regions of chemical space.

Delivering high quality library design

Libraries of chemical compounds are the lifeblood of modern drug discovery programs. The quality of library design can determine a project’s success or failure.

Both molecular modeling and cheminformatics techniques are important for the production of chemical libraries. The Cresset Consulting Services team has the analysis and design experience that is vital for the delivery of successful chemical libraries.

Different types of library design

Library design as a concept is not new, but it only became a popular paradigm in drug discovery a decade or so ago. Over time the field of library design has split to encompass two main types of library, both of which are commonly used by medicinal chemists for their drug discovery campaigns:

Diverse

  1. Diverse compound libraries for the discovery phase
  2. Diverse lead-like libraries for the discovery phase
  3. Diverse fragment libraries for fragment based drug discovery

Knowledge-based

  1. Focused libraries for the discovery phase
  2. Libraries for the lead optimization phase

Modern drug discovery now rarely proceeds simply via the classical route of making serial changes and acting on the output of testing. Rather, activity is explored using SAR explosions at discrete points in the process.

Designing a diverse library

Diverse sets of compounds – be they drug-sized, lead-like or fragments – are usually created by selecting compounds from a greater pool using some measure of diversity on the pool. The pool could be commercially available compounds (singles or libraries), internal collections or synthetically accessible library space. Often a combination of these sources is used to get the widest possible range of compounds into the final library. In most cases the selection of compounds to include in the diverse library proceeds by using a combination of 2D similarity matrices and property calculations. This is essentially the process used by big pharma to get the most out of their compound screening file.
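
As an illustration of the similarity-driven part of such a selection, the sketch below implements a greedy MaxMin pick over Tanimoto similarities. It is a minimal example under stated assumptions, not a description of any particular company's protocol: fingerprints are represented as plain sets of 'on' bit indices, the compound names and bit patterns are hypothetical toy data, and property filtering is assumed to have been applied to the pool beforehand.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two fingerprints stored as sets of 'on' bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def maxmin_pick(pool: Dict[str, FrozenSet[int]], n_pick: int, seed: str) -> List[str]:
    """Greedy MaxMin selection: repeatedly add the candidate whose most
    similar already-picked compound is the least similar to it."""
    picked = [seed]
    # For each candidate, track its highest similarity to anything picked so far.
    nearest = {name: tanimoto(fp, pool[seed]) for name, fp in pool.items() if name != seed}
    while nearest and len(picked) < n_pick:
        # Ties are broken alphabetically so the result is reproducible.
        best = min(nearest, key=lambda name: (nearest[name], name))
        picked.append(best)
        del nearest[best]
        for name in nearest:
            nearest[name] = max(nearest[name], tanimoto(pool[name], pool[best]))
    return picked


# Hypothetical toy fingerprints (sets of 'on' bit indices).
pool = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 3, 5}),
    "cmpd_C": frozenset({10, 11, 12}),
    "cmpd_D": frozenset({3, 4, 20, 21}),
    "cmpd_E": frozenset({10, 11, 13}),
}
selection = maxmin_pick(pool, n_pick=3, seed="cmpd_A")
print(selection)  # ['cmpd_A', 'cmpd_C', 'cmpd_D']
```

Close analogues of compounds already chosen (here cmpd_B and cmpd_E) are passed over, which is the behavior a diversity selection is after. In practice the fingerprints would come from a cheminformatics toolkit and the pool would be far larger.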

Although there are established methods for this, which work OK for generic screening molecules from vendors, there is no standard protocol and each company may have a different preferred set derived from the same commercially available pool.

Diverse fragment libraries

With the rise of fragment based drug discovery over the last 5-10 years a thirst has emerged for libraries containing smaller lead-like and fragment-like diversity. The type of analysis required to gauge redundancy in this case becomes tricky: the smaller the molecules become, the more difficult it is to create meaningful, robust measures of chemical similarity, and many of the 2D similarity methods lose their discriminatory ability. Thus fragment libraries or lead-like libraries may require special treatment.

We have become interested in using our own description of molecules – their shape and electrostatic character – to describe compound collections. We presented some initial work in this space at the spring 2012 ACS meeting. In this blog post Tim describes how we are looking again at the diversity of compound and specifically fragment collections using the computational efficiency available from BlazeGPU.

Knowledge based library design – Focused libraries

To design a focused library, computational input becomes a critical factor. Focused libraries are by definition designed by leveraging existing knowledge. However, this knowledge can be applied in different ways. Two clear approaches are common in this space, each with differing factors that dictate the course of the library design workflow.

The technique typically used by compound vendors is to filter their compound collection based on the fit of molecules to activity models that have been developed (e.g. using physical property, pharmacophore or 2D similarity models). The usefulness of the classification is entirely dependent on the details of how the model has been constructed and applied.

The alternative technique, often employed by specialist vendors and bigger drug discovery organisations, is to design novel scaffolds and substitutions to address specific biological target areas of interest. These include application of structure or ligand based designs targeting protein families or sets of related targets using medicinal chemistry principles. Unlike the filtering approach above, all molecules would have to be synthesized, with the inherent advantages (notably IP) and disadvantages (cost) that come with this.

The latter undoubtedly requires the greatest engagement of time and resource to provide a suitable level of insight into the problem from which to develop innovative chemical solutions.

Case study

S-adenosyl methionine (SAM) is a co-factor used as a biological methylation synthon. It is employed in a host of enzymatic methyl transferase processes which are important in a number of disease areas. In the area of epigenetics, the lysine methyl transferases (KMTs) are responsible for methylating lysine residues on histones – a process which mediates gene expression by changing the stability of the nucleosome.

A quick analysis of the binding conformation of SAM across the PDB (Figure 1) reveals a small number of clusters of SAM bioactive conformations. The conformations of SAM found in KMTs form a tight cluster which is distinct from those in the more diverse generic SAM utilising enzymes. Interestingly, the analysis shows that DOT1L, which is also thought to be a KMT, is an outlier and more closely related to the generic enzyme set than to the other KMTs.

Figure 1. SAM conformations from SAM utilising enzymes observed from the PDB
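
The article does not say how the conformations were grouped. As one possible illustration, the sketch below clusters pre-aligned, atom-matched conformations by RMSD with a simple leader algorithm. The coordinates, conformation labels and the 1.0 Å cutoff are hypothetical placeholders chosen only to make the example run; they are not taken from any PDB entry.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """RMSD between two pre-aligned conformations with identical atom ordering."""
    if len(a) != len(b):
        raise ValueError("conformations must contain the same atoms in the same order")
    total = sum((p - q) ** 2 for atom_a, atom_b in zip(a, b) for p, q in zip(atom_a, atom_b))
    return math.sqrt(total / len(a))


def leader_cluster(confs: Dict[str, Coords], cutoff: float) -> List[List[str]]:
    """Put each conformation in the first cluster whose leader (first member)
    lies within `cutoff`; otherwise start a new cluster."""
    clusters: List[List[str]] = []
    for name, coords in confs.items():
        for members in clusters:
            if rmsd(coords, confs[members[0]]) <= cutoff:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical three-atom 'conformations' (coordinates in Angstrom).
confs = {
    "conf_1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "conf_2": [(0.0, 0.0, 0.0), (1.5, 0.2, 0.0), (3.0, 0.3, 0.0)],
    "conf_3": [(0.0, 0.0, 0.0), (1.5, 1.5, 0.0), (1.5, 3.0, 0.0)],
    "conf_4": [(0.0, 0.0, 0.0), (1.4, 1.6, 0.0), (1.5, 3.1, 0.0)],
}
clusters = leader_cluster(confs, cutoff=1.0)
print(clusters)  # [['conf_1', 'conf_2'], ['conf_3', 'conf_4']]
```

Leader clustering depends on input order and is used here only because it is short; a hierarchical method on the full RMSD matrix would be a reasonable alternative for a real analysis.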

Assuming we wished to pursue a SAM mimetic design as a paradigm for KMT or DOT1L inhibitor generation, a number of issues would need to be addressed from a molecular design point of view. One major issue, already noted, is that SAM is ubiquitously used as a cofactor, so a close mimetic may have unwanted side interactions. Clearly a DOT1L SAM mimetic design will have more issues with generic SAM enzyme crossover, while a design aimed at other KMTs (e.g. SMYD2) would have selectivity issues just within the specific KMT family.

Designing away from potential crossover activity could be achieved by a full SAM mimetic design, since both the adenine and Met chains adopt different vectors and shapes in the different sub-classes. Alternatively, concentrating on the adenine mimetic alone, the H-bonding patterns and solvent exposure are distinct in the two enzyme sub-classes, as shown in Figure 2.

Figure 2. Differences in recognition of adenine in the two ‘DOT1L-like’ v ‘KMT-like’ systems

This simple example shows how some background knowledge on the system can impact on the scope and potential success of any given design.

We described in our previous blog how our fragment replacement tools can be used to search for novel bioisosteric replacements. In this case, using the Spark software with adenine as the molecular input, you can find suitable replacements as seeds for a library. As the template is extracted from a protein context, all the ideas would be generated in the same coordinate frame and thus could be visualized and assessed for fit into the protein.

Alternatively the whole SAM 3D conformation from whichever sub-class could be submitted to Blaze to search for commercial vendor molecules that fit specific field patterns from the specific SAM conformation.

Figure 3. Library design idea for a SMYD-like KMT inhibitor (Left: SAM from SMYD2 and Right: virtual molecule)

The output of these virtual exercises, rather than being molecules to test (which is the usual scenario), would be molecular scaffolding ideas that would be potential starting seeds for a design. Ideally we would be looking for a good molecular fit to the interaction patterns (Figure 3), and especially for ideas which also provide appropriate synthetic vectors from which to explore the allowed variation defined from the starting binding pose.

In this case Spark has provided us with a design idea which matches well to the field patterns and interaction patterns required by the KMT SAM conformation in SMYD2 (PDB: 3S7F) and provides three potential vectors for a library: R1 for the substrate pocket, R2 for the open solvated pocket and R3 for the ribose pocket (Figures 3 and 4).

Figure 4. Interaction patterns and putative library design substitution vectors.

A standard protocol for constructing the library might proceed as follows:

  1. Synthetically accessible variants (i.e., commercially available building blocks) of the above library would be gathered and a method outlined.
  2. This could involve intermediate route scouting to incorporate the R2 and R3 variants first.
  3. A final array would then be fulfilled by elaborating R1.
  4. A virtual ‘all-combinations’ library would be constructed.
  5. The enumerated library would be analyzed in terms of predicted ‘drug-like’ properties [MWT, LogP, TPSA, (HBD, HBA, Rot.bnd)-counts etc]. Combinations which provide poor properties would be discarded.
  6. Chemistry validation of the synthetic route and scope for the decoration transformations would be established.
  7. Stability studies would be carried out on a sub-set.
  8. The final synthetic library would be constructed.
  9. The library would be purified and plated (i.e., 96 well plates for screening).
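
The enumeration and property-filtering steps of this protocol can be sketched in a few lines. The example below is a minimal illustration under loud assumptions: the scaffold, building-block names and property contributions are hypothetical placeholders, properties are treated as simply additive (a crude approximation), and the cut-offs are rule-of-five style values chosen for the example rather than thresholds stated in the text. A real workflow would enumerate structures and compute MWT, LogP, TPSA and the various counts with a cheminformatics toolkit.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Block:
    """A scaffold or substituent with crude, additive property contributions."""
    name: str
    mw: float  # approximate mass contribution
    hbd: int   # hydrogen-bond donors contributed
    hba: int   # hydrogen-bond acceptors contributed


# Hypothetical scaffold and building blocks; every value is a placeholder.
CORE = Block("core", 250.0, 1, 3)
R_GROUPS: Dict[str, List[Block]] = {
    "R1": [Block("r1_a", 15.0, 0, 0), Block("r1_b", 91.0, 0, 0), Block("r1_c", 240.0, 2, 4)],
    "R2": [Block("r2_a", 1.0, 0, 0), Block("r2_b", 86.0, 0, 2)],
    "R3": [Block("r3_a", 17.0, 1, 1), Block("r3_b", 45.0, 1, 2)],
}


def enumerate_library(r_groups: Dict[str, List[Block]]) -> List[Tuple[Block, ...]]:
    """Return every R1 x R2 x R3 combination (the virtual 'all-combinations' library)."""
    positions = sorted(r_groups)
    return list(product(*(r_groups[pos] for pos in positions)))


def passes_filters(combo: Tuple[Block, ...], max_mw: float = 500.0,
                   max_hbd: int = 5, max_hba: int = 10) -> bool:
    """Keep a combination only if its summed properties stay within the cut-offs."""
    parts = (CORE,) + tuple(combo)
    return (sum(p.mw for p in parts) <= max_mw
            and sum(p.hbd for p in parts) <= max_hbd
            and sum(p.hba for p in parts) <= max_hba)


all_combos = enumerate_library(R_GROUPS)
kept = [combo for combo in all_combos if passes_filters(combo)]
print(f"{len(all_combos)} enumerated, {len(kept)} kept")  # 12 enumerated, 8 kept
```

With these placeholder values, the four combinations containing the heaviest R1 block exceed the mass cut-off and are discarded, leaving eight candidates for the chemistry validation steps.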

Our library design service offering

Cresset computational chemists have wide knowledge of, and experience in, delivering projects involving all of the library scenarios described above, which we are now able to offer as a service. Contact us for more information.