ChemistryOA
Open Access to Scholarly Literature, Data and Lab Experiments in Chemistry
---- A Subject Guide
Open Data Resources
Chemistry is a data-rich science. The following open databases aim to capture, publish and preserve factual and non-textual information about chemical compounds and structures that is of significance to chemists and other researchers in the field.
- PubChem (http://pubchem.ncbi.nlm.nih.gov/)
PubChem is a freely accessible database of small molecules. Millions of compound structures and descriptive datasets can be downloaded free of charge via FTP. The system is maintained by the National Center for Biotechnology Information (NCBI), part of the U.S. National Library of Medicine. It provides services similar to those of the ACS’s Chemical Abstracts Service.
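As a small illustration of programmatic access, the sketch below builds a lookup URL for PubChem's PUG REST web service. This is an assumption on our part: the guide itself only mentions FTP downloads, and PUG REST is a different access route. The function name `build_property_url` and the default property list are hypothetical choices made for this example.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service (not mentioned in the guide itself).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Return a PUG REST URL requesting the given properties for a compound name.

    `build_property_url` is a hypothetical helper written for this guide;
    it only assembles the URL and performs no network request.
    """
    encoded_name = quote(name, safe="")      # percent-encode spaces, slashes, etc.
    property_list = ",".join(properties)     # PUG REST expects a comma-separated list
    return f"{PUG_REST_BASE}/compound/name/{encoded_name}/property/{property_list}/JSON"


if __name__ == "__main__":
    print(build_property_url("aspirin"))
```

Opening the resulting URL in a browser, or passing it to `urllib.request.urlopen`, should return a JSON record for the compound, provided the service is reachable and the name is recognised.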
- ChEBI (http://www.ebi.ac.uk/chebi/)
Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on “small” chemical compounds, including any distinct atom, molecule, ion, ion pair, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity. All data in the database are non-proprietary or derived from a non-proprietary source, so the database is freely accessible and available to anyone. In addition, each data item is fully traceable and explicitly referenced to its original source.
- eMolecules
It pulls information from several government databases, as well as catalogue data from about 150 chemical vendors, and allows chemists to search by structure and substructure. Searchers use eMolecules mainly to find which compounds are available, and from which suppliers, among over 8 million unique chemicals. Projects are underway to allow the search engine to retrieve structures from published patents and peer-reviewed literature.
- The Pherobase
It is a free-to-access database of insect behavior-modifying chemicals. Contributed by interested specialists, “its main objective is to convert scientific data and knowledge about behavioral modifying chemicals in insects from the literature and peer-reviewed information into electronically searchable database entries”. With over 5000 entries and about 3000 molecules, it is the world's largest database of behavior-modifying chemicals.
- ChemBank (http://chembank.broad.harvard.edu/welcome.htm)
ChemBank is a public, web-based informatics environment created by the Broad Institute’s Chemical Biology Program. It includes freely available data derived from small molecules and small-molecule screens, along with resources for studying the data. ChemBank is intended to guide chemists synthesizing novel compounds or libraries and to assist biologists searching for small molecules.
- ChemIDplus (http://chem.sis.nlm.nih.gov/chemidplus/chemidlite.jsp)
ChemIDplus is a free, web-based search system that provides access to the structure and nomenclature authority files used to identify chemical substances cited in National Library of Medicine databases. It provides structure searching and direct links to many biomedical resources at NLM and on the Internet for chemicals of interest. The database contains over 380,000 chemical records, of which 289,000 include chemical structures. It is searchable by name, synonym, CAS registry number, molecular formula, classification code, locator code, structure, toxicity, and/or physical properties.
- ChemSpider
It is a free-access service providing a structure-centric community for chemists. It was built with the intention of aggregating and indexing chemical structures and their associated information into a single free, searchable repository available to everybody. ChemSpider is the richest single source of structure-based chemistry information. Nature is now depositing chemical data in ChemSpider.
- Chempedia
Chempedia is a free and continuously updated online encyclopaedia of chemical compounds. It is built upon two of the biggest free and open chemical information repositories, Wikipedia and PubChem. Chempedia includes all of Wikipedia's 6,000-plus chemical compound monographs. Each monograph gives information about the structure, uses, history, and significance of a chemical compound. Users can search for compounds by structure or by text, using a compound name, CAS Number, or PubChem CID.
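Several of the resources above (ChemIDplus and Chempedia, for example) accept a CAS Registry Number as a search key. Before submitting one, it can help to check that it is well formed. The sketch below uses the standard CAS check-digit rule: the final digit equals the weighted sum of the preceding digits, counted from the right with weights 1, 2, 3, ..., taken modulo 10. The function name `is_valid_cas` is our own choice for illustration and is not part of any of the databases listed here.

```python
import re

# CAS Registry Numbers have the form NNNNNNN-NN-N:
# two to seven digits, two digits, and a single check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas):
    """Return True if `cas` has the CAS format and a correct check digit."""
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)   # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the body.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


if __name__ == "__main__":
    for candidate in ("7732-18-5", "50-78-2", "7732-18-4"):
        print(candidate, is_valid_cas(candidate))
```

A passing check only shows that the number is internally consistent. It does not guarantee that the number is assigned to a substance or that a given database holds a record for it.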
Peter Murray-Rust, a chemist and advocate of Open Data, reported in December 2007 that Microsoft is funding a repository interoperability project with partners including PubChem (see above), Cornell, Los Alamos National Laboratory and several universities. The project involves creating “well-populated molecular repositories with heterogeneous content (for example, everything from crystallography to Wikipedia chemicals)” that will be openly available.
Created by Qiong Yang
Instructor: Heather Morrison
LIBR 559K: Topics In Computer-Based Information Systems: Open Access
June 21, 2008