Taxonomy Distribution Report

Overview

  • Total Concepts: 480
  • Number of Taxonomies: 14
  • Average Concepts per Taxonomy: 34.3

Distribution Summary

Category                   TaxonomyID  Count  Percentage  Status
Pathways and Disease       PATH           53       11.0%
Graph Theory               GRTH           52       10.8%
Graph Databases            GRDB           45        9.4%
Knowledge Graphs           KNOW           40        8.3%
Sequence Analysis          SEQA           34        7.1%
Phylogenetics              PHYL           34        7.1%
Structural Bioinformatics  STRU           34        7.1%
Tools and Capstone         TOOL           31        6.5%
Foundation Concepts        FOUND          30        6.2%
Protein Interactions       PPIS           29        6.0%
Genomics                   GENO           29        6.0%
Transcriptomics            TRNS           29        6.0%
Biological Databases       DBAS           25        5.2%
Data Formats               DFMT           15        3.1%
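
The overview figures and the Percentage column follow directly from the per-category counts. A minimal sketch of that arithmetic, assuming the counts are held in a plain dictionary keyed by TaxonomyID (the variable names are illustrative and not taken from the report script):

```python
# Per-category concept counts, copied from the Distribution Summary.
counts = {
    "PATH": 53, "GRTH": 52, "GRDB": 45, "KNOW": 40, "SEQA": 34,
    "PHYL": 34, "STRU": 34, "TOOL": 31, "FOUND": 30, "PPIS": 29,
    "GENO": 29, "TRNS": 29, "DBAS": 25, "DFMT": 15,
}

total = sum(counts.values())       # total number of concepts
average = total / len(counts)      # average concepts per taxonomy

# Each category's share of all concepts, in percent.
percentages = {tid: 100 * n / total for tid, n in counts.items()}

print(f"Total Concepts: {total}")
print(f"Number of Taxonomies: {len(counts)}")
print(f"Average Concepts per Taxonomy: {average:.1f}")
for tid, pct in percentages.items():
    print(f"{tid:<6}{counts[tid]:>4}{pct:>7.1f}%")
```

Rounding each share to one decimal place reproduces the percentages listed in the summary.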

Visual Distribution

Pathways and Disease      █████  53 ( 11.0%)
Graph Theory              █████  52 ( 10.8%)
Graph Databases           ████  45 (  9.4%)
Knowledge Graphs          ████  40 (  8.3%)
Sequence Analysis         ███  34 (  7.1%)
Phylogenetics             ███  34 (  7.1%)
Structural Bioinformatics ███  34 (  7.1%)
Tools and Capstone        ███  31 (  6.5%)
Foundation Concepts       ███  30 (  6.2%)
Protein Interactions      ███  29 (  6.0%)
Genomics                  ███  29 (  6.0%)
Transcriptomics           ███  29 (  6.0%)
Biological Databases      ██  25 (  5.2%)
Data Formats              █  15 (  3.1%)
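
The bar lengths in the chart are consistent with one block per 2 percentage points, rounded down. That rule is inferred from the figures shown here and is not confirmed against the report script, so the sketch below is an assumption; `render_row` and `pct_per_block` are hypothetical names.

```python
# (category name, concept count) pairs from the distribution summary.
rows = [
    ("Pathways and Disease", 53), ("Graph Theory", 52),
    ("Graph Databases", 45), ("Knowledge Graphs", 40),
    ("Sequence Analysis", 34), ("Phylogenetics", 34),
    ("Structural Bioinformatics", 34), ("Tools and Capstone", 31),
    ("Foundation Concepts", 30), ("Protein Interactions", 29),
    ("Genomics", 29), ("Transcriptomics", 29),
    ("Biological Databases", 25), ("Data Formats", 15),
]
total = sum(count for _, count in rows)


def render_row(name, count, total, pct_per_block=2):
    """Format one chart line: padded name, bar, count, percentage."""
    pct = 100 * count / total
    # Assumed rule: one block per `pct_per_block` percentage points, floored.
    bar = "█" * int(pct // pct_per_block)
    return f"{name:<25} {bar}  {count} ({pct:5.1f}%)"


for name, count in rows:
    print(render_row(name, count, total))
```

Flooring rather than rounding keeps small categories such as Data Formats at a single block.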

Balance Analysis

✅ No Over-Represented Categories

All 14 categories are under the 30% over-representation threshold; the largest, Pathways and Disease, holds 11.0% of concepts. Good balance!
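
The balance check can be sketched as a filter over category shares. The 30% threshold comes from this report; whether the script compares with `>` or `>=` is an assumption, as are the names used below.

```python
OVER_REPRESENTED_THRESHOLD = 30.0  # percent, as stated in the report

counts = {
    "Pathways and Disease": 53, "Graph Theory": 52, "Graph Databases": 45,
    "Knowledge Graphs": 40, "Sequence Analysis": 34, "Phylogenetics": 34,
    "Structural Bioinformatics": 34, "Tools and Capstone": 31,
    "Foundation Concepts": 30, "Protein Interactions": 29, "Genomics": 29,
    "Transcriptomics": 29, "Biological Databases": 25, "Data Formats": 15,
}
total = sum(counts.values())

# Collect every category whose share exceeds the threshold (assumed strict >).
over_represented = {
    name: 100 * n / total
    for name, n in counts.items()
    if 100 * n / total > OVER_REPRESENTED_THRESHOLD
}

if over_represented:
    for name, pct in over_represented.items():
        print(f"Over-represented: {name} ({pct:.1f}%)")
else:
    print("No over-represented categories")
```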

Category Details

Pathways and Disease (PATH)

Count: 53 concepts (11.0%)

Concepts:

    1. Metabolic Network
    2. Metabolite
    3. Enzyme
    4. Enzyme Kinetics
    5. Metabolic Pathway
    6. Bipartite Metabolic Graph
    7. KEGG Pathways
    8. Reactome Pathways
    9. BioCyc Pathways
    10. Flux Balance Analysis
    11. Constraint-Based Modeling
    12. Stoichiometric Matrix
    13. Objective Function
    14. Metabolic Flux
    15. Genome-Scale Model
  • ...and 38 more

Graph Theory (GRTH)

Count: 52 concepts (10.8%)

Concepts:

    1. Graph Theory
    2. Nodes and Edges
    3. Directed Graphs
    4. Undirected Graphs
    5. Weighted Graphs
    6. Bipartite Graphs
    7. Labeled Property Graph
    8. Multigraph
    9. Hypergraph
    10. Subgraph
    11. Graph Properties
    12. Degree Distribution
    13. In-Degree
    14. Out-Degree
    15. Clustering Coefficient
  • ...and 37 more

Graph Databases (GRDB)

Count: 45 concepts (9.4%)

Concepts:

    1. Graph Database
    2. Relational Database
    3. Graph vs Relational Model
    4. Neo4j
    5. Memgraph
    6. Cypher Query Language
    7. GQL Query Language
    8. MATCH Clause
    9. WHERE Clause
    10. RETURN Clause
    11. CREATE Clause
    12. MERGE Clause
    13. Graph Pattern Matching
    14. Variable-Length Paths
    15. Path Queries
  • ...and 30 more

Knowledge Graphs (KNOW)

Count: 40 concepts (8.3%)

Concepts:

    1. Knowledge Graph
    2. Biomedical Knowledge Graph
    3. Gene Ontology
    4. GO Molecular Function
    5. GO Biological Process
    6. GO Cellular Component
    7. GO Term Enrichment
    8. Disease Ontology
    9. Human Phenotype Ontology
    10. Ontology Structure
    11. Ontology Reasoning
    12. Semantic Similarity
    13. Heterogeneous Data
    14. Data Integration
    15. Schema Mapping
  • ...and 25 more

Sequence Analysis (SEQA)

Count: 34 concepts (7.1%)

Concepts:

    1. Sequence Alignment
    2. Pairwise Alignment
    3. Global Alignment
    4. Local Alignment
    5. Smith-Waterman Algorithm
    6. Needleman-Wunsch Algorithm
    7. Dynamic Programming
    8. Scoring Matrices
    9. BLOSUM Matrix
    10. PAM Matrix
    11. Substitution Model
    12. Gap Penalties
    13. Affine Gap Penalty
    14. BLAST
    15. BLAST E-Value
  • ...and 19 more

Phylogenetics (PHYL)

Count: 34 concepts (7.1%)

Concepts:

    1. Phylogenetic Tree
    2. Phylogenetics
    3. Molecular Phylogenetics
    4. Distance Matrix
    5. Neighbor-Joining Method
    6. UPGMA Method
    7. Maximum Parsimony
    8. Maximum Likelihood Method
    9. Bayesian Inference
    10. Markov Chain Monte Carlo
    11. Bootstrap Analysis
    12. Branch Support Values
    13. Molecular Clock
    14. Substitution Rate
    15. Trees as DAGs
  • ...and 19 more

Structural Bioinformatics (STRU)

Count: 34 concepts (7.1%)

Concepts:

    1. Primary Structure
    2. Secondary Structure
    3. Alpha Helix
    4. Beta Sheet
    5. Tertiary Structure
    6. Quaternary Structure
    7. Protein Folding
    8. Protein Folding Problem
    9. Homology Modeling
    10. Threading
    11. Ab Initio Prediction
    12. AlphaFold
    13. AlphaFold Database
    14. Protein Contact Map
    15. Contact Map as Graph
  • ...and 19 more

Tools and Capstone (TOOL)

Count: 31 concepts (6.5%)

Concepts:

    1. Python for Bioinformatics
    2. Biopython
    3. NetworkX
    4. Pandas for Bioinformatics
    5. Scikit-Learn
    6. Jupyter Notebooks
    7. Matplotlib
    8. Seaborn
    9. Neo4j Python Driver
    10. Cytoscape API
    11. Data Wrangling
    12. Reproducible Analysis
    13. Version Control for Science
    14. Workflow Managers
    15. Conda Environments
  • ...and 16 more

Foundation Concepts (FOUND)

Count: 30 concepts (6.2%)

Concepts:

    1. Bioinformatics
    2. Computational Biology
    3. Central Dogma
    4. DNA Structure
    5. RNA Structure
    6. Protein Structure
    7. Amino Acids
    8. Nucleotides
    9. Codons
    10. Gene
    11. Genome
    12. Transcription
    13. Translation
    14. Gene Expression
    15. Sequence Data
  • ...and 15 more

Protein Interactions (PPIS)

Count: 29 concepts (6.0%)

Concepts:

    1. Protein Interaction Network
    2. Interactome
    3. Yeast Two-Hybrid
    4. Co-Immunoprecipitation
    5. Affinity Purification MS
    6. Cross-Linking Mass Spec
    7. PPI Confidence Scoring
    8. Binary vs Complex PPIs
    9. Network Hubs
    10. Network Bottlenecks
    11. Network Modules
    12. Hub-and-Spoke Topology
    13. Date Hubs vs Party Hubs
    14. Essential Proteins
    15. Protein Complex Detection
  • ...and 14 more

Genomics (GENO)

Count: 29 concepts (6.0%)

Concepts:

    1. Genome Assembly
    2. De Bruijn Graph
    3. K-mer
    4. K-mer Spectrum
    5. Contig
    6. Scaffold
    7. N50 Metric
    8. Assembly Quality Metrics
    9. Reference Genome
    10. Reference Bias
    11. Pangenome
    12. Pangenome Graph
    13. Variation Graph
    14. VG Toolkit
    15. Read Mapping to Graphs
  • ...and 14 more

Transcriptomics (TRNS)

Count: 29 concepts (6.0%)

Concepts:

    1. Transcriptome
    2. RNA-Seq Pipeline
    3. Read Quality Trimming
    4. Read Alignment
    5. Transcript Quantification
    6. Differential Expression
    7. Fold Change
    8. Statistical Testing for DE
    9. False Discovery Rate
    10. Transcription Factor
    11. Promoter Region
    12. Enhancer Region
    13. Cis-Regulatory Element
    14. Operon
    15. Gene Regulatory Network
  • ...and 14 more

Biological Databases (DBAS)

Count: 25 concepts (5.2%)

Concepts:

    1. Biological Databases
    2. NCBI
    3. GenBank Database
    4. UniProt
    5. Swiss-Prot
    6. TrEMBL
    7. Protein Data Bank
    8. Ensembl
    9. KEGG Database
    10. Reactome Database
    11. BioGRID Database
    12. STRING Database
    13. IntAct Database
    14. COSMIC Database
    15. Gene Ontology Database
  • ...and 10 more

Data Formats (DFMT)

Count: 15 concepts (3.1%)

Concepts:

    1. FASTA Format
    2. FASTQ Format
    3. GenBank Format
    4. GFF3 Format
    5. OWL Format
    6. PDB File Format
    7. VCF Format
    8. SAM and BAM Format
    9. BED Format
    10. SBML Format
    11. BioPAX Format
    12. CSV for Bioinformatics
    13. JSON for Bioinformatics
    14. Data Format Conversion
    15. Data Quality Control

Recommendations

  • Excellent balance: Categories are evenly distributed (spread: 7.9 percentage points, from 11.0% for the largest category to 3.1% for the smallest)
  • MISC category minimal: No catch-all MISC category appears in the distribution, which indicates good categorization specificity
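
The reported spread of 7.9% matches the gap between the largest and smallest category shares. A short sketch of that calculation, assuming this is how the spread is defined (the report does not state the formula):

```python
# Concept counts per category, largest to smallest, from the summary table.
category_counts = [53, 52, 45, 40, 34, 34, 34, 31, 30, 29, 29, 29, 25, 15]
total = sum(category_counts)

# Share of each category in percent.
shares = [100 * n / total for n in category_counts]

# Assumed definition: spread = largest share minus smallest share.
spread = max(shares) - min(shares)
print(f"spread: {spread:.1f}%")
```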

Educational Use Recommendations

  • Use taxonomy categories for color-coding in graph visualizations
  • Design curriculum modules based on taxonomy groupings
  • Create filtered views for focused learning paths
  • Use categories for assessment organization
  • Enable navigation by topic area in interactive tools
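
As one way to approach the color-coding suggestion, the sketch below assigns each TaxonomyID an evenly spaced hue and attaches the resulting color to concept records. The record layout (`label`, `taxonomy`, `color`) and the `taxonomy_colors` helper are hypothetical; a real learning graph may store these fields differently.

```python
import colorsys

# TaxonomyIDs from the distribution summary.
taxonomy_ids = [
    "PATH", "GRTH", "GRDB", "KNOW", "SEQA", "PHYL", "STRU",
    "TOOL", "FOUND", "PPIS", "GENO", "TRNS", "DBAS", "DFMT",
]


def taxonomy_colors(ids, saturation=0.6, value=0.9):
    """Map each ID to a hex color with hues spread evenly around the wheel."""
    colors = {}
    for i, tid in enumerate(ids):
        r, g, b = colorsys.hsv_to_rgb(i / len(ids), saturation, value)
        colors[tid] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return colors


colors = taxonomy_colors(taxonomy_ids)

# Hypothetical concept records; labels and categories are from this report.
concepts = [
    {"label": "Metabolic Network", "taxonomy": "PATH"},
    {"label": "Neo4j", "taxonomy": "GRDB"},
    {"label": "FASTA Format", "taxonomy": "DFMT"},
]
for concept in concepts:
    concept["color"] = colors[concept["taxonomy"]]
    print(concept["label"], concept["color"])
```

The same mapping can back a legend or a category filter, since both only need the TaxonomyID as a key.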

Report generated by learning-graph-reports/taxonomy_distribution.py