Biological databases are libraries of life sciences information, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analyses. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics.[1] Information contained in biological databases includes gene function, structure, localization (both cellular and chromosomal), and clinical effects of mutations, as well as similarities of biological sequences and structures.
Relational database concepts from computer science and information retrieval concepts from digital libraries are important for understanding biological databases. Biological database design, development, and long-term management form a core area of the discipline of bioinformatics.[2] Data contents include gene sequences, textual descriptions, attributes and ontology classifications, citations, and tabular data. These are often described as semi-structured data, and can be represented as tables, key-delimited records, and XML structures. Cross-references among databases are common, using database accession numbers.
Overview
Biological databases have become an important tool in helping scientists understand and explain a host of biological phenomena, from the structure of biomolecules and their interactions, to the whole metabolism of organisms, to the evolution of species. This knowledge helps in the fight against diseases, assists in the development of medications, and aids in discovering basic relationships among species in the history of life.
Biological knowledge is distributed among many different general and specialized databases, which sometimes makes it difficult to ensure the consistency of information. Biological databases cross-reference other databases with accession numbers as one way of linking their related knowledge together.
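As a rough illustration of how a key-delimited record can carry cross-references by accession number, here is a minimal Python sketch. The record layout, the line keys (ID, DE, DR), the database names, and the accession values are all hypothetical and chosen only for illustration; real databases each define their own flat-file conventions.

```python
from collections import defaultdict

# Hypothetical key-delimited record. The keys and every value below are
# made up for illustration and do not come from any real database.
RECORD = """\
ID   EXAMPLE_0001
DE   Hypothetical example protein
DR   SEQDB; X00001
DR   STRUCTDB; 1ABC
"""


def parse_record(text):
    """Group the values of a key-delimited record by their line key."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The key is everything before the first space; the rest is the value.
        key, _, value = line.partition(" ")
        fields[key].append(value.strip())
    return dict(fields)


def cross_references(fields):
    """Return (database, accession) pairs taken from the DR lines."""
    refs = []
    for entry in fields.get("DR", []):
        database, _, accession = entry.partition(";")
        refs.append((database.strip(), accession.strip()))
    return refs


record = parse_record(RECORD)
xrefs = cross_references(record)
```

Here `xrefs` holds one (database, accession) pair per DR line, which is the kind of link a reader could follow from this record to the corresponding entry in another database.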
An important resource for finding biological databases is a special yearly issue of the journal Nucleic Acids Research (NAR). The Database Issue of NAR is freely available, and categorizes many of the publicly available online databases related to biology and bioinformatics.
Output
Biological data comes in many formats, including text, sequence data, protein structures, and links. Each is available from particular sources. For example, text formats are provided by PubMed and OMIM. Sequence data are provided by GenBank for DNA and by UniProt for protein. Protein structures are provided by PDB, SCOP, and CATH.
Example public databases for molecular biology (from www.kokocinski.net)
Primary sequence databases
The International Nucleotide Sequence Database (INSD) consists of the following databases.
The four largest databases are GenBank (the U.S. collection of various biological data), EMBL (Europe's collection of nucleotide sequence data), DDBJ (the DNA Data Bank of Japan), and UniProt (the Universal Protein Resource). GenBank is a service provided by NCBI, which stores sequence data and "biological sequence related data." EMBL is a service provided by EBI, the European Bioinformatics Institute, and provides a collection of nucleotide sequence data. DDBJ is a nucleotide database. UniProt is a high-quality and comprehensive universal protein resource. It provides translations of sequences from EMBL, GenBank, and DDBJ in its UniProt Knowledgebase (UniProtKB). These databanks represent the current knowledge about the sequences of all organisms. They interchange the stored information and are the source for many other databases. Note that GenBank, EMBL, and DDBJ work very closely with one another, so what can be found in one of them can also be found in the other two.
Meta-databases
Strictly speaking, a meta-database can be considered a database of databases, rather than any one integration project or technology. Meta-databases collect data from different sources and usually make them available in a new and more convenient form, or with an emphasis on a particular disease or organism.
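The idea of collecting data from different sources into one more convenient view can be sketched in Python as follows. This is only a toy under stated assumptions: the two source dictionaries, their field names, and the accession numbers are invented for the example, and a real integration project would also have to reconcile conflicting fields and differing identifiers.

```python
# Two hypothetical source databases keyed by a shared accession number.
# All accessions and field values are invented for illustration.
SEQUENCE_SOURCE = {
    "X00001": {"organism": "Escherichia coli", "length": 1200},
    "X00002": {"organism": "Drosophila melanogaster", "length": 950},
}
LITERATURE_SOURCE = {
    "X00001": {"citations": 3},
    "X00003": {"citations": 1},
}


def integrate(sources):
    """Merge per-accession records from several named sources into one view."""
    merged = {}
    for source_name, records in sources.items():
        for accession, fields in records.items():
            # Create the combined entry on first sight of an accession.
            entry = merged.setdefault(accession, {"sources": []})
            entry["sources"].append(source_name)  # remember provenance
            entry.update(fields)                  # later sources overwrite clashes
    return merged


merged = integrate({
    "sequence": SEQUENCE_SOURCE,
    "literature": LITERATURE_SOURCE,
})
```

Keeping a `sources` list on each merged entry is one simple way to record where a piece of information came from, which matters when the underlying databases disagree.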
Genome Databases
These databases collect organism genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold the genomes of many species, or the genome of a single model organism.
Genome Browsers
Genome browsers enable researchers to visualize and browse entire genomes (most hold many complete genomes) with annotated data including gene prediction and structure, proteins, expression, regulation, variation, and comparative analysis. Annotated data usually comes from multiple diverse sources.
Protein sequence databases
Protein structure Databases
Protein-protein interactions
Signaling Pathway Databases
Metabolic pathway Databases
Microarray databases
Main article: Microarray databases
Mathematical Model Databases
PCR / Real time PCR primer Databases
Specialized databases (in alphabetical order)
Wiki style databases
Problems Associated with Protein Databases
Because of the 3D nature of protein structure, discovery in the area of protein structure has not progressed as quickly as discovery in the area of sequence data, so less structural information is available. Nonetheless, structure data can be accessed through the RCSB Protein Data Bank (http://www.pdb.org), SCOP, the Structural Classification of Proteins ([13]), and CATH ([14]).
Frequently Used
Species-specific databases are also available for some species, mainly those that are often used in research. These databases provide extensive detail for the species in question. For example, Colibase ([15]) is an E. coli database. Other popular species-specific databases include FlyBase ([16]) for Drosophila and WormBase ([17]) for nematodes.
Harvesting
With so much information available, it is impossible to obtain everything needed in one place. However, one website is working toward that goal: http://harvester.embl.de/.
Be Careful
With the large amount of information available, one must be wary of false data.
References
http://www.avatar.se/molbioinfo2001/databases.html