WHAT ARE DATABASES?
Databases are electronic filing cabinets used for storing, retrieving, and managing data.
A biological database is a large, organized body of persistent data collected from various laboratories.
E.g. NCBI, Swiss-Prot
Types of databases
-
Primary database:
These databases are also known as archival databases. They are populated with experimental data such as nucleotide sequences and protein sequences. The data come directly from the laboratories that produced them and are stored without any changes.
A stored record is never altered by the database itself; when a record is revised, only its version number changes.
Examples of primary databases: ENA, GenBank, DDBJ
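The version numbering described above can be illustrated with a small sketch. It assumes the common "accession.version" identifier style used by the nucleotide archives (an accession, a dot, then an integer version). The function names split_accession and is_newer_version, the simplified pattern, and the identifiers in the usage note are illustrative assumptions, not part of any database's own software.

```python
import re

# Simplified pattern (an assumption for this sketch): an accession made of
# letters, digits, or underscores, then a dot, then an integer version.
ACCESSION_VERSION = re.compile(r"^([A-Za-z0-9_]+)\.(\d+)$")


def split_accession(identifier):
    """Split an 'accession.version' string into (accession, version)."""
    match = ACCESSION_VERSION.match(identifier)
    if match is None:
        raise ValueError(f"not an accession.version identifier: {identifier!r}")
    return match.group(1), int(match.group(2))


def is_newer_version(candidate, reference):
    """Return True if both identifiers share an accession and
    the candidate carries the higher version number."""
    acc_c, ver_c = split_accession(candidate)
    acc_r, ver_r = split_accession(reference)
    if acc_c != acc_r:
        raise ValueError("identifiers refer to different records")
    return ver_c > ver_r
```

For example, split_accession("U49845.1") gives the accession "U49845" and the version 1. A later revision of the same record would keep the accession and carry a higher version, which is what is_newer_version checks.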
-
Secondary databases:
These databases contain data derived from the analysis of primary data. Secondary databases are highly curated, often with the help of complex computational algorithms. They are built manually or automatically and contain more refined information, for example about structure.
Ex. PROSITE, Swiss-Prot, PRINTS
-
Composite databases:
These databases compile and filter data from various other databases, both primary and secondary.
Ex. NCBI
The need for databases:
Databases act as a storehouse where we can store lots of information.
Databases are used to store and organize data that can be retrieved easily.
Databases allow researchers to find connections between pieces of information.
They allow data to be indexed.
Secondary databases have become the molecular biologist's reference library.
Some important databases of bioinformatics
-
NCBI
https://www.ncbi.nlm.nih.gov
NCBI stands for NATIONAL CENTER FOR BIOTECHNOLOGY INFORMATION
NLM stands for NATIONAL LIBRARY OF MEDICINE
NIH stands for NATIONAL INSTITUTES OF HEALTH
NCBI was created in 1988 and is a very important resource for biotechnology. NCBI has its own search engine, known as Entrez.
NCBI is a composite database; it is connected with several other databases, such as:
PubMed for biomedical literature (journal articles and books).
GenBank for nucleotide sequences.
PubChem for chemical molecules.
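As a rough illustration of how one search engine can reach several NCBI databases, the sketch below builds a search URL for the Entrez programming utilities (E-utilities). It only constructs the URL and does not contact NCBI. The function name build_esearch_url, the default retmax value, and the example search terms are assumptions made for this sketch.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=5):
    """Build an ESearch URL for one Entrez database.

    db     -- Entrez database name, e.g. "pubmed" or "nucleotide"
    term   -- the search query
    retmax -- maximum number of record IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # The same kind of query, pointed at two different databases.
    print(build_esearch_url("pubmed", "biological databases"))
    print(build_esearch_url("nucleotide", "BRCA1[Gene Name]"))
```

Only the db parameter changes between the two calls, which reflects the idea of a single search system sitting on top of many linked databases. Opening either URL in a browser, or fetching it with an HTTP client, should return the matching record IDs.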
-
DDBJ
DDBJ stands for DNA Data Bank of Japan
It is a biological database that collects DNA sequences and is located at the National Institute of Genetics in Shizuoka (Japan). DDBJ exchanges its data with EMBL (European Molecular Biology Laboratory) at EBI (European Bioinformatics Institute) and with GenBank (NCBI). All three are members of the INSDC (International Nucleotide Sequence Database Collaboration).
-
UniProt
It is a freely accessible database of protein sequences that provides information about their function and structure. From this database we can collect protein names, functions, enzyme-specific information, and protein-protein interactions.
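To show what a protein record from this database looks like in practice, the sketch below builds the download address for one entry in FASTA format and reads the fields of a FASTA header line. It assumes the usual header layout "db|accession|entry name description", where "sp" marks a reviewed Swiss-Prot entry and "tr" an unreviewed TrEMBL entry. The function names and the sample header are illustrative assumptions; the code does not contact the database.

```python
# Base address of the UniProt REST service for protein entries.
UNIPROT_BASE = "https://rest.uniprot.org/uniprotkb"


def fasta_url(accession):
    """Return the address of one entry in FASTA format."""
    return f"{UNIPROT_BASE}/{accession}.fasta"


def parse_fasta_header(header):
    """Split a header such as '>sp|P69905|HBA_HUMAN Hemoglobin ...'
    into its database code, accession, entry name, and description."""
    if not header.startswith(">"):
        raise ValueError("a FASTA header must start with '>'")
    identifier, _, description = header[1:].partition(" ")
    parts = identifier.split("|")
    if len(parts) != 3:
        raise ValueError(f"unexpected identifier layout: {identifier!r}")
    db_code, accession, entry_name = parts
    return {
        "reviewed": db_code == "sp",  # "sp" = Swiss-Prot, "tr" = TrEMBL
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }


if __name__ == "__main__":
    # Sample header written out by hand for illustration.
    sample = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens"
    print(fasta_url("P69905"))
    print(parse_fasta_header(sample))
```

The description part of the header carries the protein name and organism, so even this one line already gives some of the information listed above.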