ChEMBL leadlike compounds freely searchable on Blaze demo server

The Blaze virtual screening demo server has proved popular since its launch last year. However, we wanted to extend the range of compounds available for users to search, and we have now done so by introducing three new collections of ChEMBL compounds. These collections provide leadlike compounds for drug, agrochemical, and flavor and fragrance discovery, and are suitable for evaluating Blaze in these areas. The collections are open to all registered Blaze demo users, whether accessed through a web page, KNIME, or Forge, Torch, or TorchLite.

Creating collections of molecules for searching in Blaze

Blaze is a full virtual screening system that is integrated with queuing systems such as SGE for database population and searching, so creating a new collection to be searched is easy. All that is required is to tell Blaze about the collection and then upload an SDF file to the server. Blaze takes care of the difficult part: splitting uploads into different sizes, identifying and linking duplicates, exploding unspecified chirality, and populating the conformations of new molecules. Choosing what to upload is more difficult. On the main Blaze server that we use for our consulting projects we have 10,000,000 molecules arranged in collections from compound suppliers. On the demo server it is not possible to use such large numbers of compounds, and until now it has held only a few thousand. Here we expand that to over 400,000 compounds, derived largely from ChEMBL.

Creating the ChEMBL collections

To generate collections with appropriate properties, ChEMBL was filtered in KNIME using the physico-chemical property ranges shown in the table below.

| Property | Chembl20_filtered leadlike collection for drug discovery | leadlike collection for agrochemical discovery | ChEMBL filtered for fragrance like molecules |
|---|---|---|---|
| MW | 200 – 400 | 200 – 430 | 30 – 300 |
| TPSA | 40 – 80 | N/A | < 60 |
| RotBonds | 0 – 5 | < 5 | 0 – 4 |
| Aryl rings | 0 – 3 | N/A | N/A |
| HBD | 0 – 3 | 2 – 3 | 0 – 1 |
| HBA | 0 – 6 | 2 – 12 | 0 – 3 |
| SlogP | -1 – 4 | 0.75 – 4.5 | > 1 |
| Elements | C,N,S,O,F,Cl,Br,I | C,N,S,O,F,Cl | C,H,N,O |
| Total molecules available for searching | 202,895 | 136,457 | 45,383 |
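The filtering itself was done in KNIME, but the logic of a range-based property filter can be illustrated with a minimal Python sketch. It uses the drug discovery ranges from the table above and assumes that the properties have already been calculated by some other tool. The function name, the dictionary keys, the treatment of the range limits as inclusive, and the decision to ignore hydrogen in the element check are all assumptions made for illustration, not a description of the actual workflow.

```python
# Range limits copied from the drug discovery column of the table.
# ASSUMPTION: both ends of each range are treated as inclusive.
DRUG_LEADLIKE_RANGES = {
    "mw": (200.0, 400.0),
    "tpsa": (40.0, 80.0),
    "rot_bonds": (0, 5),
    "aryl_rings": (0, 3),
    "hbd": (0, 3),
    "hba": (0, 6),
    "slogp": (-1.0, 4.0),
}

# Allowed elements from the same column.
DRUG_ALLOWED_ELEMENTS = frozenset({"C", "N", "S", "O", "F", "Cl", "Br", "I"})


def passes_leadlike_filter(props, elements,
                           ranges=DRUG_LEADLIKE_RANGES,
                           allowed=DRUG_ALLOWED_ELEMENTS):
    """Return True if every property is in range and every element is allowed.

    props    -- dict of precomputed property values (hypothetical key names)
    elements -- iterable of element symbols present in the molecule
    """
    in_range = all(low <= props[name] <= high
                   for name, (low, high) in ranges.items())
    # ASSUMPTION: hydrogen is ignored, since the drug column lists only
    # heavy atoms.
    heavy_atoms = set(elements) - {"H"}
    return in_range and heavy_atoms <= allowed


# Made-up property values, for illustration only.
candidate = {"mw": 312.4, "tpsa": 58.2, "rot_bonds": 4, "aryl_rings": 2,
             "hbd": 1, "hba": 4, "slogp": 2.7}
too_heavy = dict(candidate, mw=455.0)

print(passes_leadlike_filter(candidate, ["C", "H", "N", "O"]))
print(passes_leadlike_filter(too_heavy, ["C", "H", "N", "O"]))
```

The same function could be reused for the other two collections by passing a different `ranges` dictionary and `allowed` set, with open-ended limits such as "> 1" expressed using `float("inf")`; whether those limits are strict or inclusive would need to be decided separately.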

Additionally, for the drug discovery library we removed compounds that we considered to be toxic or undruglike (acyl halides, sulfate esters, etc.) and compounds containing specific functional groups that have regularly appeared as false positives with Blaze (thioethers, hydrazones and imines).

The filtering was performed in KNIME workflows (represented for the drug discovery collection below).

The upload is usually done through Blaze’s web interface, but on this occasion we chose to extend our KNIME protocol to upload the compounds to Blaze using the REST interface. This feature was introduced in Blaze 10.2 and has proved a popular and easy way to keep Blaze in sync with corporate databases. While we are using KNIME here, the protocol would work equally well with Pipeline Pilot. The upload workflow is shown below with the filtering steps reduced to metanodes.


Using the new collections

The new collections can be searched using the standard Blaze web interface or through the REST interface, which enables searching from KNIME and Pipeline Pilot as well as from Cresset’s desktop applications Forge, Torch, and TorchLite. The desktop applications must first be configured (in the preferences) with the address of the Blaze server together with a username and password for access. Once this is done, the Run menu → Send to Blaze option and the right-click menu ‘Send to Blaze’ option will open a dialog box for configuring the Blaze search.

The advantage of submitting a Blaze search from within the desktop applications is that your current field constraints and the protein excluded volume are transferred to Blaze and used without extensive interaction or file uploading.

Note that results can also be downloaded from within the desktop applications. Selecting the File menu → Download Blaze Search Results brings up a dialog containing a tree view of Blaze searches. One tip here: make sure that you select the best results, which are those from the simplex refinement, not those from the initial search.

To try the new Blaze collections for yourself, please register for a username and password. If you think that there are other sets that we could usefully include, or that we could improve the filters used here, please contact us to discuss your suggestion.