Downloadable Data Files

The DataWarrior installation already comes with various sample data files. These include FDA-approved drugs, compound collections with physico-chemical properties or measured pKa values, kinase ligands, and a few files with non-chemical content to illustrate various program features.

This page contains download links to larger data files that are not included in the DataWarrior installers, either because they would significantly increase the installer size or because they may not be of general interest.


March 2018 Snapshot of the Crystallography Open Database (COD)

Since version 4.2.2, DataWarrior has been able to generate conformers. The algorithm combines self-organization with a rule-based approach. The latter is based on statistical data derived from a large number of diverse, 3-dimensional organic structures from a crystallographic database. The de facto standard source for crystallographic structures of organic molecules would be the Cambridge Structural Database (CSD). Its license, however, does not permit deriving and publishing geometrical statistical data as part of an open source package. Luckily, there is an open alternative, the Crystallography Open Database (COD). While this database consists of one CIF file per structure, Saulius Grazulis and Antanas Vaitkus from the COD have built an automatic procedure, using Perl, Java and OpenChemLib, that converts the database into a format better suited to cheminformaticians. Here you may download a COD snapshot with 215,995 quality-checked 3D-structures in DataWarrior format (112,670 organic, 94,913 metalorganic, 8,412 inorganic structures, 292.7 MByte, COD snapshot, March 24, 2018).

DataWarrior displaying a COD entry


DrugBank Version 5.0.10 (Subset in DataWarrior format)

The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains 8250 drug entries including 2016 FDA-approved small molecule drugs, 229 FDA-approved biotech (protein/peptide) drugs, 94 nutraceuticals and over 6000 experimental drugs. This DataWarrior file is a subset of DrugBank 5.0.3 downloaded from https://www.drugbank.ca.

DrugBank is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes (including internal use) requires a license. We ask that users who download significant portions of the database cite the DrugBank paper in any resulting publications.

Citing DrugBank: Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D668-72. PMID: 16381955.

DataWarrior showing general information about the Vitamin E entry of DrugBank