News

Ensure novel ideas for your project with the new Spark databases

To accompany the release of Spark™ V10.6, the Spark fragment and reagent databases have been updated and are now available for download. Derived by fragmenting compounds and reagents from commercial sources and the literature, these databases are a great source of novel ideas for your drug discovery projects, while ensuring that the results found by Spark are always associated with real, synthetically accessible compounds.

Fragment databases

The Spark ‘Commercial’ databases in this release are derived from the eMolecules Screening Compounds. With more than 6 million fragments to search overall, they provide an excellent source of chemical diversity for your experiments.

The Spark ‘ChEMBL’ databases have also been updated. Based on release 26 of ChEMBL, they provide approximately 1.5 million additional fragments to search, derived from compounds in the chemical literature.

Compounds in both original source collections are filtered to remove molecules containing potentially toxic or reactive groups before the databases are created. Each compound is then fragmented independently by breaking the bonds to heteroatoms, carbonyls, thiocarbonyls and rings. Rings and specific functional groups, such as carboxylic acids and nitro groups, are not fragmented. The frequency with which a given fragment occurs is recorded, together with the number of bonds that were broken to disconnect the fragment from the parent molecule.

All resulting fragments are subject to limits on molecular weight, the number of H-bond acceptors and donors, and the number of rotatable bonds. They are then sorted by frequency and labelled as shown in Table 1.

Table 1: Fragment databases sorted by frequency.

Spark category	Database	Total number of fragments (to nearest 1,000)	Frequency
Commercial	Very Common	68,000	Fragments which appear in more than 725 molecules
	Common	68,000	Fragments which appear in 215-724 molecules
	Less Common	212,000	Fragments which appear in 65-214 molecules
	Rare	280,000	Fragments which appear in 25-64 molecules
	Very Rare	527,000	Fragments which appear in 9-24 molecules
	Extremely Rare	534,000	Fragments which appear in 5-8 molecules
	Ultra Rare	770,000	Fragments which appear in 3-4 molecules
	Doubleton*	1,053,000	Fragments which appear in 2 molecules
	Singleton*	2,526,000	Fragments which appear in a single molecule
ChEMBL	Common	232,000	Fragments which appear in more than 12 molecules
	Rare	232,000	Fragments which appear in 4-12 molecules
	Very Rare	382,000	Fragments which appear in 2-3 molecules
	Extremely Rare*	641,000	Fragments which appear in a single molecule

*Contact us for further details.
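The frequency bands in Table 1 can be illustrated with a minimal Python sketch. This is not Cresset's implementation: the function names, the toy fragment identifiers and the choice to count a fragment once per parent molecule are assumptions made for illustration. Table 1 lists 'more than 725' and '215-724' molecules, leaving exactly 725 unassigned; the sketch assumes that value belongs to Very Common.

```python
from collections import Counter

# Lower bounds of each 'Commercial' frequency band, taken from Table 1.
# Assumption: a count of exactly 725 is treated as Very Common, because the
# table lists "more than 725" and "215-724" and leaves 725 itself unassigned.
COMMERCIAL_BANDS = [
    (725, "Very Common"),
    (215, "Common"),
    (65, "Less Common"),
    (25, "Rare"),
    (9, "Very Rare"),
    (5, "Extremely Rare"),
    (3, "Ultra Rare"),
    (2, "Doubleton"),
    (1, "Singleton"),
]


def count_parent_molecules(fragments_per_molecule):
    """Count, for each fragment, how many molecules it appears in."""
    counts = Counter()
    for fragments in fragments_per_molecule:
        # set() so a fragment repeated within one molecule counts once
        # (an assumption based on the "appear in N molecules" wording).
        counts.update(set(fragments))
    return counts


def commercial_label(frequency):
    """Map a molecule count to its Table 1 'Commercial' database label."""
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    for lower_bound, label in COMMERCIAL_BANDS:
        if frequency >= lower_bound:
            return label


# Toy input: each inner list holds the fragment identifiers of one molecule.
molecules = [["c1ccccc1", "C(=O)O"], ["c1ccccc1", "c1ccccc1"], ["C(=O)O", "N"]]
counts = count_parent_molecules(molecules)
labels = {fragment: commercial_label(n) for fragment, n in counts.items()}
```

The same approach would apply to the ChEMBL bands by swapping in the thresholds from the lower half of Table 1.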

We typically recommend installing only the databases whose fragments appear at least 3-4 times in the original collections. The databases of less frequent fragments (Singleton, Doubleton and ChEMBL Extremely Rare) are very large and may contain fragments derived from unrealistic or incorrect structures in the original collections. If you do wish to use these databases, please contact Cresset Support for download instructions.

The number of fragments in each database per connection point count (excluding the databases containing only singletons and doubletons) is shown in Figure 1.

Figure 1: Count of fragments in Spark ‘Commercial’ and ‘ChEMBL’ databases split by the number of connection points of each fragment.

The most common fragments in the ChEMBL and Commercial databases have a significant overlap (Table 2). However, comparing the rarer fragments from each database shows significantly less overlap, highlighting the different areas of chemical space each database occupies.

Table 2: Overlap of the most common fragments in the ChEMBL and Commercial databases.

% overlap	Very Common	Common	Less Common	Rare	Very Rare	Extremely Rare	Ultra Rare	Doubleton*	Singleton*	Unique
ChEMBL common	17%	14%	13%	8%	8%	4%	4%	3%	5%	24%
ChEMBL rare	2%	6%	9%	9%	10%	6%	5%	4%	6%	43%
ChEMBL very rare	1%	2%	4%	5%	7%	5%	5%	5%	7%	58%
ChEMBL extremely rare*	0%	1%	2%	3%	5%	4%	4%	4%	9%	68%

*Contact us for further details.
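An overlap table of this kind could be computed along the following lines. The article does not say how the percentages in Table 2 were calculated, so this is only a sketch under stated assumptions: fragments are compared by exact identifier, each fragment is assigned to the first reference database that contains it, and the function name and toy data are hypothetical.

```python
def overlap_percentages(query_fragments, reference_databases):
    """Percent of query fragments found in each reference database.

    Each fragment is assigned to the first reference database that contains
    it, or to 'Unique' if none do, so the percentages add up to 100.
    """
    query = set(query_fragments)
    if not query:
        raise ValueError("query database is empty")
    tally = {name: 0 for name in reference_databases}
    tally["Unique"] = 0
    for fragment in query:
        for name, fragments in reference_databases.items():
            if fragment in fragments:
                tally[name] += 1
                break
        else:
            # No reference database contains this fragment.
            tally["Unique"] += 1
    return {name: 100.0 * n / len(query) for name, n in tally.items()}


# Toy data: single letters stand in for fragment identifiers.
chembl_common = {"A", "B", "C", "D"}
commercial = {"Very Common": {"A"}, "Common": {"B", "X"}}
result = overlap_percentages(chembl_common, commercial)
```

With this toy data, one of the four query fragments falls in each Commercial database and two are found in neither, mirroring the layout of one row of Table 2.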

With more than 6.9 million unique fragments to search, the Spark fragment databases provide an extremely large source of novel bioisosteres for Spark projects. They can be further complemented by fragments generated from your corporate collection with the Spark Database Generator, a dedicated and user-friendly interface for creating custom databases within Spark.

Reagent databases

Monthly updates of the Spark reagent databases, derived from the eMolecules building blocks using an enhanced set of rules for chemical transformation, are included in the Spark V10.6 release. The November update includes over 314,000 reagents with up-to-date availability information, to make it easy for you to order the reagents you require to synthesize your favorite Spark results.

	Total	1-50	51-100	101-150	151-200	201-250
eMolecules_acidCO	23,983	3	401	6,732	13,361	3,486
eMolecules_acid	41,545	43	2,811	15,618	17,939	5,134
eMolecules_alcohol	18,032	11	1,435	7,521	7,193	1,872
eMolecules_alcoholO	19,634	3	468	6,773	9,666	2,724
eMolecules_aliphatic_halide	8,808	13	924	3,651	3,421	799
eMolecules_alkyne	2,851	27	505	1,420	781	118
eMolecules_aromatic_alcoholO	8,625	0	44	1,927	5,023	1,631
eMolecules_aromatic_aminesN	18,557	0	111	4,207	10,567	3,672
eMolecules_aromatic_halide	40,110	8	451	13,592	22,762	3,297
eMolecules_boronic	4,496	0	128	1,894	2,093	381
eMolecules_cyano	15,118	20	1,086	5,662	6,283	2,067
eMolecules_isocyanateCO	555	0	20	170	287	78
eMolecules_olefin	3,273	16	524	1,419	1,089	225
eMolecules_primary_aliphatic_amine	19,016	6	1,366	8,495	7,763	1,386
eMolecules_primary_aliphatic_amineN	11,571	0	398	5,234	5,101	838
eMolecules_primary_aliphatic_halide	6,875	12	627	2,886	2,705	645
eMolecules_primary_aromatic_amines	23,350	0	325	6,581	12,171	4,273
eMolecules_reductive_amination	22,127	3	818	6,551	10,683	4,072
eMolecules_secondary_aliphatic_amineN	15,061	1	277	4,270	8,413	2,100
eMolecules_sulfonicacid	5,066	31	602	2,265	1,761	407
eMolecules_sulfonicacidSO2	3,075	0	13	302	1,584	1,176
eMolecules_thiol	721	7	206	330	164	14
eMolecules_thiolS	1,986	1	38	537	1,078	332

The eMolecules IDs for your favorite reagents can be easily exported from the Spark results table and used to purchase the compounds from the eMolecules building blocks database, as shown in the web clip ‘How to use the eMolecules reagents databases in Spark and access ordering information for the result’.

Crystallographic fragments database

Spark V10.6 also includes the new ‘COD’ database (Figure 2). This contains more than 440K fragments in their crystallographic conformation, derived from the Crystallography Open Database and available for download to all Spark customers.

Figure 2: The new ‘COD’ database is available to all Spark customers and includes more than 440K fragments in their crystallographic conformation.

Create your own Spark databases

If you have access to large collections of proprietary chemistry or specialized reagents, or if you want to consider only fragments from reagents you have in stock, you can add value to your Spark experiments by creating your own custom databases.

These can be easily prepared using the Database Generator (Figure 3), a dedicated and user-friendly interface to custom database creation within Spark, or using the equivalent functionality from the command line.

Figure 3: Use the Spark Database Generator to create your own fragment and reagent databases.

Conclusion

This new release of the fragment and reagent databases, combined with custom databases from corporate collections generated with the Spark Database Generator, will provide an outstanding range of bioisosteres for your project.

Spark and your project

Please contact us to update to the latest databases, to learn how to make the best use of the Spark Database Generator, or to find out how Spark can impact your project.
