Blaze V10.2 released

We are delighted to announce the release of Blaze V10.2. This version of our unique virtual screening software contains major workflow enhancements that improve results and reduce search time, a new REST interface to enable enhanced integration, and an improved security system. We have also released a new, fully functional Blaze Cloud demo server.

Figure: Blaze search cascade (FieldPrint, then FieldClique, then FieldSimplex)

We’ve taken a fresh look at the workflow of a typical Blaze experiment. Blaze uses a search cascade (shown above) to process large databases of compounds (typically up to 10 million) in a time-efficient but detailed manner. The cascade uses field fingerprints (FieldPrints) to do a rapid search across the entire database. However, we know that methods that distil the field down to a simple vector representation are not as accurate as performing full alignments, so we developed a quick alignment method, ‘FieldClique’, to do exactly this. In FieldClique we perform a rapid alignment and single-point score to get a good estimate of the final optimized alignment score. This method is significantly more accurate than FieldPrint but still less accurate than the fully optimized alignment score that we get from a FieldSimplex routine. The differing accuracy of these three methods determines how many compounds pass through each stage of the virtual screening experiment: all compounds receive a FieldPrint score, around 30-50% receive a FieldClique score and around 5% receive the most accurate FieldSimplex score.
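The cascade idea can be illustrated with a minimal Python sketch. It is not Blaze's implementation: the function name `run_cascade`, the three scoring callables and the default fractions are assumptions chosen to mirror the description above (everything gets the cheap score, a top slice gets the intermediate score, and a smaller top slice gets the expensive score).

```python
from typing import Callable, Sequence


def run_cascade(
    compounds: Sequence[str],
    fast_score: Callable[[str], float],
    medium_score: Callable[[str], float],
    slow_score: Callable[[str], float],
    medium_fraction: float = 0.4,   # assumed value inside the quoted 30-50% range
    slow_fraction: float = 0.05,    # roughly the 5% quoted in the text
) -> list[tuple[str, float]]:
    """Score everything cheaply, then re-score shrinking top slices.

    Both fractions are taken relative to the full database size.
    The scoring callables are hypothetical stand-ins for FieldPrint,
    FieldClique and FieldSimplex; higher scores are assumed to be better.
    """
    total = len(compounds)

    # Stage 1: every compound receives the cheap score.
    ranked = sorted(compounds, key=fast_score, reverse=True)

    # Stage 2: only the best-ranked slice receives the intermediate score.
    keep = max(1, round(total * medium_fraction))
    ranked = sorted(ranked[:keep], key=medium_score, reverse=True)

    # Stage 3: only the best of those receive the expensive score.
    keep = max(1, round(total * slow_fraction))
    final = [(name, slow_score(name)) for name in ranked[:keep]]
    return sorted(final, key=lambda pair: pair[1], reverse=True)
```

The design point is that each stage only has to be accurate enough not to discard compounds that the next, slower stage would have ranked highly.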

Greater throughput of molecules

We have been looking at improving the performance of the screening cascade for some time, with a desire to fully align more molecules. For customers with GPU clusters this is less of a problem, as the throughput of the GPU code is so good. In this release of Blaze we have introduced a new algorithm for the FieldClique method that enables a much greater throughput of molecules at the cost of some accuracy. The new algorithm is around four times faster than the previous one, enabling four times as many molecules to be processed in the same time. In testing we have found that the loss of accuracy is more than compensated for by the ability to process more molecules in a full 3D alignment method and by subsequent FieldSimplex refinements. We recommend that, whenever computing resources are available, you refine more than 50% of the FieldPrint results using the new fast FieldClique routine.

Number of compounds scored at each stage

Mode         FieldPrint   FieldClique   FieldSimplex   Typical time (250 CPU cluster)
Normal Mode  4 500 000    1 500 000     150 000        4 hours
Fast Mode    4 500 000    3 000 000     150 000        2.3 hours
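As a quick check on the figures above, the short Python sketch below works out what fraction of the FieldPrint results reaches each later stage and how the two wall-clock times compare. The dictionary layout and function name are illustrative only; the numbers are copied from the table.

```python
# Compound counts and typical times copied from the table above.
modes = {
    "Normal": {"FieldPrint": 4_500_000, "FieldClique": 1_500_000,
               "FieldSimplex": 150_000, "hours": 4.0},
    "Fast": {"FieldPrint": 4_500_000, "FieldClique": 3_000_000,
             "FieldSimplex": 150_000, "hours": 2.3},
}


def stage_fractions(mode: dict) -> dict:
    """Fraction of the FieldPrint-scored database reaching each later stage."""
    total = mode["FieldPrint"]
    return {stage: mode[stage] / total for stage in ("FieldClique", "FieldSimplex")}


for name, mode in modes.items():
    fractions = stage_fractions(mode)
    print(f"{name}: FieldClique {fractions['FieldClique']:.0%}, "
          f"FieldSimplex {fractions['FieldSimplex']:.1%}, {mode['hours']} h")

# Ratio of typical run times, Normal Mode over Fast Mode.
time_ratio = modes["Normal"]["hours"] / modes["Fast"]["hours"]
```

On these figures, Fast Mode passes about two thirds of the database through FieldClique (against one third in Normal Mode) while finishing in a little under 60% of the time.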

 

Interact with Blaze through many different environments

This version of Blaze comes with a new RESTful web service that enables you to interact with Blaze through many different environments. The fully documented methods provide access to all of Blaze’s features including searching, collection management and result retrieval. This highly requested feature provides easy integration with workflow solutions such as KNIME and Pipeline Pilot and enables Blaze integration with custom software solutions as well as enhancing the integration with future versions of Cresset’s desktop applications such as Forge.
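To give a flavour of what scripting against a REST service of this kind might look like, here is a minimal Python sketch that builds, but does not send, a search request. Everything specific in it is an assumption made for illustration: the `/searches` path, the JSON field names, the bearer-token header and the example host are hypothetical and are not taken from the Blaze documentation, which should be consulted for the real method names and parameters.

```python
import json
import urllib.request


def build_search_request(base_url: str, token: str,
                         query_molfile: str, collection: str) -> urllib.request.Request:
    """Build (without sending) a POST request that would start a search.

    Hypothetical throughout: the '/searches' endpoint, the payload keys
    and the Authorization scheme are placeholders, not the Blaze API.
    """
    payload = json.dumps({"query": query_molfile, "collection": collection})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/searches",
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Example with placeholder values; urllib.request.urlopen(request) would send it.
request = build_search_request(
    "https://blaze.example.com/api/", "my-token", "<molfile text>", "demo"
)
```

Because the request is plain HTTP with a JSON body, the same call could be issued from a KNIME or Pipeline Pilot node, or from any language with an HTTP client.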

Figure: A KNIME workflow for searching Blaze

Enhanced security features

One of the key requests from our customers has been to simplify and unify the excellent security features of Blaze with their corporate policies. In this release we have achieved this by adding support for the Apache web server authentication modules. This new integration enables you to roll Blaze out to a wider audience without the previously required management of user names and passwords. This will prove especially useful as we roll out enhancements to the desktop products that enable full control of Blaze from within those applications.

Free demo server

The new features above have enabled us to enhance our Blaze SaaS offering to include access from KNIME and to offer a demo server to the community at large. The demo server is fully functional – you can search a small collection of compounds using the standard search cascades, manage compound collections and so on. Blaze is remarkably easy to use, but should you require advice, the software comes with a full manual and context-sensitive help on every page. You are welcome to use it from a web browser or through the REST interface. Register for your username and password at the Blaze demo signup page.

BlazeGPU released

We are delighted to announce the release of BlazeGPU. This update to Blaze is available to all customers at no extra charge and includes all of the infrastructure needed to convert a standard Blaze install to the GPU version.

Our 18-month project to convert our core algorithms to run in a GPU environment has achieved a fantastic 50-fold speedup (see the graph on the BlazeGPU page). This massive increase in speed has been achieved without losing any accuracy; in fact, BlazeGPU gives slightly better results than previous versions. Additionally, we have found that we get significant speed increases on consumer devices as well as on high-performance accelerated devices.

Figure: BlazeGPU speed data

We are looking forward to using the new capabilities of BlazeGPU to investigate problems that we had not previously been able to look at. For example, we have been looking at 3D molecular similarity in compound collections using our technology and expect to be able to increase the pace of this research with our new code.

BlazeGPU has enabled us to improve the throughput of projects using our Consulting and Software as a Service offerings. We have invested in new hardware at a cost of only $2000 that effectively doubles the throughput of our 150-node Linux cluster. We are investigating the creation of a portable cluster that will be available for our clients to use on their premises. This ‘COW’ will shortly be available for our clients to use on virtual screening campaigns. Please get in touch with us to find out more.

We look forward to bringing similar speed increases to our desktop applications in due course.

Porting Virtual Screening Applications

The classic process of drug discovery is long and laborious. A chemical lead must first be found, often by computationally screening millions of chemical compounds against a target. Hundreds or even thousands of compounds must then be synthesised around this lead to try to identify one with the correct combination of properties to become a drug. This iterative, trial-and-error process typically takes about five years, and if a new drug is discovered, the US FDA approval process then requires another five years of clinical trials. The parallel computing challenge is to speed up the screening and lead optimization process, thus reducing the time to find new drugs and increasing the chance of saving lives.

In Scientific Computing World, Simon Krige discusses how GPUs are speeding up the screening and lead optimization process. Read the full article.



Simon Krige,
Software Developer

April 2013 Newsletter

Dear colleague,

Secure your place at Cresset’s North American and European user group meetings and workshops. Whether or not you are a customer, we hope you will join us. Registration is free!

This month we: look at GSK’s discovery of a series of potent and selective inhibitors of PKR-like endoplasmic reticulum kinase (PERK); release a new version of Cresset’s KNIME nodes; give further details on our plan to dramatically change the landscape for field based virtual screening.

Kind regards,

The Cresset Team

Field Based Chemistry North America


With just over three weeks to go, space is filling up fast, so secure your place now. Scientific program presentations are from Cresset’s experts plus Broad Institute, Novartis, DCAM Pharma, Utica College and Drexel University. Network with Cresset users from large pharma, biotech and academia.

Free training is available at our hands-on software workshops for computational and medicinal chemists. The workshops are: “Rapidly generating new scaffolds” and “Designing imaginative and effective compounds”. Limited space is available to ensure you receive maximum benefit from these small groups, so book your place now.

Join us for an informative day and dinner as our guest at Cambridge Brewing Company. Register FREE.

Triaging Ideas for Synthesis – A Study of Recent PERK Inhibitors


Following GSK’s publication of a series of potent and selective inhibitors of PKR-like endoplasmic reticulum kinase (PERK), Cresset asks “What is a good way to triage ideas for synthesis?”. We use Torch and Spark to examine the SAR data and design new compounds. Read more …

Release of V2.0.0 of Cresset’s KNIME Nodes


Includes: addition of Spark and the 3D-QSAR functionalities of Forge resulting in 6 new nodes; expanded capabilities of the XedMin node; updated molecule viewer node. Read more …

Dramatically Changing the Landscape for Field Based Virtual Screening


We’re rewriting the core Blaze functionality in OpenCL to give you a dramatic speed increase. The changes mean that you will be able to screen a database of a few million compounds easily overnight using a single desktop box with 4 GPUs rather than requiring a Linux cluster. Alternatively, you could use a small cluster equipped with GPU coprocessors to screen virtual libraries of tens or hundreds of millions of molecules. Read more …

Field Based Chemistry Europe, Cambridge, UK


Scientific program includes presentations from: GSK; Merck Serono; Astex Pharmaceuticals; e-Therapeutics; Trinity College Dublin as well as Cresset’s experts.

Workshop options for CompChem and MedChem: “Using cloud based virtual screening to find new leads”; “Rapidly generating new scaffolds”; “Deciphering complex SAR” and “Designing imaginative and effective compounds”.

Join us for an informative day plus a tour of Madingley Hall and dinner as our guest. Register FREE.

Dramatically Changing the Landscape for Field Based Virtual Screening

BlazeGPU Sneak Preview

As those of you who have been regularly reading the Cresset newsletter will know, we have had an ongoing collaboration with Simon McIntosh-Smith at the University of Bristol to bring some of his group’s expertise in parallel and multicore computing to Cresset. The initial aim of the collaboration is to port our field based alignment and similarity algorithm to run on graphics cards (GPUs). This was no small task, as it involved rewriting a number of highly advanced and complicated algorithms in OpenCL, a new language designed explicitly to allow massively parallel computation.

GPUs are strange beasts. The architecture was originally designed for shunting vast numbers of triangles and textures around in order to make computer games prettier. They have increasingly become useful for more general-purpose computing tasks, but their origins mean that while they can be blazingly fast at some types of computation they do other things very poorly. Provided you can tweak your algorithm into an appropriate form you can get remarkable performance out of a single cheap graphics card, but not all algorithms are amenable to such tweaking and it takes an expert to get the best results. Luckily we have an expert on hand in the form of Simon Krige, who joined us last year from Bristol!

Simon has been working hard rewriting the core Blaze functionality in OpenCL, from our initial field point clique-matching algorithm to shape similarity calculations to the core field similarity computations that are at the center of Cresset’s ability to sensibly compare molecules’ electrostatic properties. The results look set to dramatically change the landscape for field based virtual screening. As you can see from the graph below, a single NVIDIA or AMD graphics card has the same screening performance as more than 40 modern CPU cores!


This dramatic speed increase means that a database of a few million compounds can easily be screened overnight using a single desktop box with 4 GPUs, rather than requiring a Linux cluster. Alternatively, a small cluster equipped with GPU coprocessors (which are becoming increasingly common) will be able to screen virtual libraries of tens or hundreds of millions of molecules, a database size that was previously accessible only to 2D methods. The best thing is that you can access this new, blazingly fast technology through the same intuitive web-based interface as the CPU version of Blaze.

BlazeGPU is scheduled for release later this year. Contact us for more information on Blaze or BlazeGPU.

Accelerating Ligand-Based Virtual Screening

Today the UK’s most powerful GPU-based supercomputer, ‘Emerald’, will enter into service alongside the ‘Iridis 3’ system at the Science and Technology Facilities Council’s Rutherford Appleton Laboratory (RAL) in Oxfordshire, UK. These two High Performance Computing systems will give businesses and academics unprecedented access to super-fast processing capability.

Cresset is collaborating with the high performance computing group at the University of Bristol, UK to implement new GPU based algorithms within the core of our field technology. The following poster will be presented at today’s meeting. For further details of our project with the University of Bristol refer to our Fields at Warp Speed blog post.

Accelerating Ligand-Based Virtual Screening

Mark Mackey†, Simon McIntosh-Smithµ, Simon Krige†, Rob Scoffin
†Cresset Biomolecular Discovery Ltd, BioPark, Broadwater Rd, Welwyn Garden City, Herts, AL7 3AX, UK
µDepartment of Computer Science, University of Bristol, Woodland Road, Clifton, BS8 1UB, UK

Introduction

It has long been known that small molecule drugs are recognized by and bind to proteins on the basis of their 3D electronic and shape properties, yet the drug discovery cycle has traditionally described and protected 2D structures.

Cresset is using field point descriptions of molecules to close the gap between chemistry and biology, bringing the features that are recognized by proteins to the desktop of our customers.

Field Points

Field Points are a condensed representation of electrostatic, hydrophobic and shape properties (protein’s view).

Figure: Molecular field extrema and the corresponding field points

Molecular Similarity Scoring Algorithm

Given an alignment:
– For a given field point on molecule A, calculate what the field value is at the corresponding point in molecule B. The score of the field point is the product of its size and B’s field value.
– Repeat for all field points on A and calculate the sum of scores
– Repeat for the field points on B sampling the field of A, and normalise to a similarity
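The three steps above can be sketched in Python as follows. This is a toy illustration under stated assumptions rather than Cresset's implementation: the field of each molecule is approximated here as a sum of Gaussians centred on its own field points, and the final normalisation divides by the self-scores (a Dice-style choice). The poster does not specify either detail, and all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldPoint:
    x: float
    y: float
    z: float
    size: float  # signed magnitude of the field extremum


def field_value(points, x, y, z, width=1.0):
    """Toy field at (x, y, z): a sum of Gaussians on the field points.

    Assumption for illustration only; a real molecular field is
    calculated from the molecule, not from its field points.
    """
    total = 0.0
    for p in points:
        d2 = (p.x - x) ** 2 + (p.y - y) ** 2 + (p.z - z) ** 2
        total += p.size * math.exp(-d2 / (2.0 * width ** 2))
    return total


def directed_score(a, b):
    """Sum over A's field points of (point size) x (B's field at that point)."""
    return sum(p.size * field_value(b, p.x, p.y, p.z) for p in a)


def similarity(a, b):
    """Score A against B and B against A, then normalise by the self-scores.

    The normalisation is an assumed choice; identical, identically
    aligned molecules score 1.0. Both molecules must be non-empty.
    """
    cross = directed_score(a, b) + directed_score(b, a)
    self_scores = directed_score(a, a) + directed_score(b, b)
    return cross / self_scores


# Two already-aligned toy molecules; mol_b is mol_a shifted by 0.5 along y.
mol_a = [FieldPoint(0.0, 0.0, 0.0, -2.0), FieldPoint(1.5, 0.0, 0.0, 1.0)]
mol_b = [FieldPoint(p.x, p.y + 0.5, p.z, p.size) for p in mol_a]
```

Sampling in both directions makes the score symmetric, so the result does not depend on which molecule is treated as A.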

Figure: Molecular alignment and similarities

Results

Current Situation

Large database of molecules (~5 million)
– Compute time: 2-5 s per molecule on a single CPU core
– Full screening takes ~35 hours on 200 CPUs
– Full screening costs ~$500 on CPU

Figure: FieldScreen database

Using GPUs and the Emerald Cluster

We have run the prototype FieldScreen GPU port on Emerald nodes. Speedup results are relative to the serial code running on 12 Intel i7 CPU cores.
– Using OpenCL: currently ~40x faster for a GPU vs a CPU.
– Full screening costs ~$20 on GPU (25 times cheaper than CPU).
– Full screening takes ~30 min using all Emerald GPUs!
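The CPU timing and the cost comparison quoted in this poster can be cross-checked with a few lines of arithmetic. The Python sketch below assumes perfect parallel scaling across cores, which is a simplification, and the function name is illustrative.

```python
def screening_hours(n_molecules: int, seconds_per_molecule: float, n_cores: int) -> float:
    """Wall-clock hours for a screen, assuming perfect scaling over n_cores."""
    return n_molecules * seconds_per_molecule / n_cores / 3600.0


# ~5 million molecules at 2-5 s each, spread over 200 CPUs.
hours_low = screening_hours(5_000_000, 2.0, 200)
hours_high = screening_hours(5_000_000, 5.0, 200)

# Quoted costs of a full screen: ~$500 on CPU and ~$20 on GPU.
cost_ratio = 500 / 20

print(f"CPU screen: {hours_low:.1f} to {hours_high:.1f} hours")
print(f"GPU screen is {cost_ratio:.0f} times cheaper")
```

The upper end of the range, at 5 s per molecule, comes to just under 35 hours, which matches the quoted full-screening time, and the two quoted costs give the stated factor of 25.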

Figure: Number of GPUs

Conclusions

The Emerald Cluster is giving us the opportunity to screen large virtual libraries (more than 100 million compounds) in very little time. The speed and cost advantages of GPUs have made them our technology of choice.

Acknowledgements

Funded by a TSB Knowledge Transfer Partnership.