Biological Database Ecosystem
Run the Biological Database Ecosystem MicroSim Fullscreen
About This MicroSim
This MicroSim maps the major bioinformatics databases as an interactive network. Each node represents a database, colored by category (sequence, structure, pathway, interaction, ontology/annotation, or disease/variation), and each edge represents a cross-reference link between two databases. The network helps students see how biological data is organized and interconnected across the field.
Database Categories
Databases are grouped by the type of biological data they store:
- Sequence databases — GenBank, UniProt, RefSeq
- Structure databases — PDB, AlphaFold DB
- Pathway databases — KEGG, Reactome
- Interaction databases — STRING, IntAct, BioGRID
- Ontology/annotation — Gene Ontology, InterPro
- Disease/variation — OMIM, ClinVar, dbSNP
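The grouping above can be sketched as a simple lookup table. This is a minimal illustration, not the MicroSim's actual data format: the names `DATABASE_CATEGORIES` and `category_of` are hypothetical, and only the databases listed above are included.

```python
# Category -> example databases, copied from the list above.
# The dictionary name and key spellings are assumptions made for illustration.
DATABASE_CATEGORIES = {
    "sequence": ["GenBank", "UniProt", "RefSeq"],
    "structure": ["PDB", "AlphaFold DB"],
    "pathway": ["KEGG", "Reactome"],
    "interaction": ["STRING", "IntAct", "BioGRID"],
    "ontology/annotation": ["Gene Ontology", "InterPro"],
    "disease/variation": ["OMIM", "ClinVar", "dbSNP"],
}


def category_of(database):
    """Return the category a database is listed under, or None if it is not listed."""
    for category, members in DATABASE_CATEGORIES.items():
        if database in members:
            return category
    return None


if __name__ == "__main__":
    for name in ("Reactome", "ClinVar", "BioGRID"):
        print(name, "->", category_of(name))
```

A node's color in the network corresponds to the category such a lookup would return for it.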
Cross-References
Edges between databases show how entries link to each other. For example, a UniProt protein entry cross-references its 3D structure in PDB, the nucleotide sequence of its gene in GenBank, its pathways in KEGG, and its known variants in ClinVar.
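One way to picture the network is as an adjacency map built from labeled edges. The sketch below encodes only the four UniProt links named in the example above; the full MicroSim contains more edges. Treating edges as undirected is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict

# (database, database, what the link points to), taken from the UniProt example.
# Only these four edges come from the text; no others are implied.
CROSS_REFERENCES = [
    ("UniProt", "PDB", "3D structure"),
    ("UniProt", "GenBank", "gene"),
    ("UniProt", "KEGG", "pathways"),
    ("UniProt", "ClinVar", "known variants"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: database -> {neighbor: link label}."""
    adjacency = defaultdict(dict)
    for source, target, label in edges:
        adjacency[source][target] = label
        adjacency[target][source] = label  # assumption: links can be followed both ways
    return adjacency


def linked_databases(adjacency, database):
    """Return the alphabetically sorted neighbors of a database (empty list if none)."""
    return sorted(adjacency.get(database, {}))


if __name__ == "__main__":
    network = build_adjacency(CROSS_REFERENCES)
    for neighbor in linked_databases(network, "UniProt"):
        print("UniProt ->", neighbor, "(" + network["UniProt"][neighbor] + ")")
```

Following an edge in the MicroSim corresponds to looking up one neighbor in a map like this.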
How to Use
- Hover over any database node to see its full name, description, and data type
- Drag nodes to rearrange the layout
- Zoom and pan to explore different regions of the ecosystem
- Follow edges to understand how databases cross-reference each other
Lesson Plan
Grade Level
College introductory bioinformatics
Duration
15-20 minutes
Prerequisites
- Basic understanding of biological data types (sequences, structures, pathways)
- Awareness that bioinformatics relies on public databases
Activities
- Exploration (5 min): Identify all sequence databases in the network. How do they differ? (GenBank stores nucleotide sequences, UniProt stores protein sequences, RefSeq provides curated reference sequences.)
- Cross-Reference Tracing (5 min): Start at GenBank and follow edges to all connected databases. List each connected database and explain what type of information you would find there.
- Discussion (5 min): You discover a new gene. Which databases would you need to search, and in what order, to fully characterize its sequence, structure, function, associated diseases, and pathways?
- Assessment (5 min): Answer the reflection questions below.
Assessment
- Name three major categories of biological databases and give one example of each.
- Why is cross-referencing between databases important for biological research?
- What is the difference between GenBank and UniProt in terms of the data they store?
- If you wanted to find all known disease-causing mutations in a gene, which databases would you consult?