Concept List

The following is a numbered list of 480 concepts for the Bioinformatics course, which focuses on graph-based data modeling.

Foundations of Bioinformatics (1–30)

  1. Bioinformatics
  2. Computational Biology
  3. Central Dogma
  4. DNA Structure
  5. RNA Structure
  6. Protein Structure
  7. Amino Acids
  8. Nucleotides
  9. Codons
  10. Gene
  11. Genome
  12. Transcription
  13. Translation
  14. Gene Expression
  15. Sequence Data
  16. Molecular Biology
  17. Cell Biology Basics
  18. Genetic Code
  19. Open Reading Frame
  20. Complementary Base Pairing
  21. Chromosomes
  22. Mutations
  23. Single Nucleotide Polymorphism
  24. Insertion and Deletion
  25. Structural Variant
  26. Copy Number Variation
  27. Epigenetics
  28. DNA Methylation
  29. Histone Modification
  30. Central Dogma Exceptions

Biological Databases (31–55)

  31. Biological Databases
  32. NCBI
  33. GenBank Database
  34. UniProt
  35. Swiss-Prot
  36. TrEMBL
  37. Protein Data Bank
  38. Ensembl
  39. KEGG Database
  40. Reactome Database
  41. BioGRID Database
  42. STRING Database
  43. IntAct Database
  44. COSMIC Database
  45. Gene Ontology Database
  46. Disease Ontology Database
  47. Human Phenotype Ontology Database
  48. BioCyc Database
  49. OMIM Database
  50. Hetionet Database
  51. Database Cross-References
  52. Programmatic Database Access
  53. REST APIs for Biology
  54. Batch Data Download
  55. Data Provenance

Data Formats (56–70)

  56. FASTA Format
  57. FASTQ Format
  58. GenBank Format
  59. GFF3 Format
  60. OWL Format
  61. PDB File Format
  62. VCF Format
  63. SAM and BAM Format
  64. BED Format
  65. SBML Format
  66. BioPAX Format
  67. CSV for Bioinformatics
  68. JSON for Bioinformatics
  69. Data Format Conversion
  70. Data Quality Control

Graph Theory Fundamentals (71–110)

  71. Graph Theory
  72. Nodes and Edges
  73. Directed Graphs
  74. Undirected Graphs
  75. Weighted Graphs
  76. Bipartite Graphs
  77. Labeled Property Graph
  78. Multigraph
  79. Hypergraph
  80. Subgraph
  81. Graph Properties
  82. Degree Distribution
  83. In-Degree
  84. Out-Degree
  85. Clustering Coefficient
  86. Centrality Measures
  87. Degree Centrality
  88. Betweenness Centrality
  89. Closeness Centrality
  90. Eigenvector Centrality
  91. PageRank
  92. Connected Components
  93. Strongly Connected Components
  94. Graph Traversal
  95. Breadth-First Search
  96. Depth-First Search
  97. Shortest Path Algorithms
  98. Dijkstra's Algorithm
  99. Graph Density
  100. Graph Diameter
  101. Scale-Free Networks
  102. Small-World Networks
  103. Power-Law Distribution
  104. Random Graph Models
  105. Erdos-Renyi Model
  106. Barabasi-Albert Model
  107. Network Motifs
  108. Graph Isomorphism
  109. Adjacency Matrix
  110. Edge List Representation

Graph Databases and Queries (111–145)

  111. Graph Database
  112. Relational Database
  113. Graph vs Relational Model
  114. Neo4j
  115. Memgraph
  116. Cypher Query Language
  117. GQL Query Language
  118. MATCH Clause
  119. WHERE Clause
  120. RETURN Clause
  121. CREATE Clause
  122. MERGE Clause
  123. Graph Pattern Matching
  124. Variable-Length Paths
  125. Path Queries
  126. Aggregation in Cypher
  127. Graph Schema Design
  128. Node Labels
  129. Relationship Types
  130. Property Keys
  131. Indexes and Constraints
  132. RDF Triple Model
  133. Subject-Predicate-Object
  134. SPARQL Query Language
  135. LPG vs RDF Comparison
  136. Graph Data Loading
  137. CSV Import to Graph DB
  138. ETL for Graph Databases
  139. Graph Query Optimization
  140. Query Profiling
  141. Distributed Graph Databases
  142. Graph Partitioning
  143. Graph Scalability
  144. Graph Transactions
  145. Graph Access Control

Sequence Alignment (146–180)

  146. Sequence Alignment
  147. Pairwise Alignment
  148. Global Alignment
  149. Local Alignment
  150. Smith-Waterman Algorithm
  151. Needleman-Wunsch Algorithm
  152. Dynamic Programming
  153. Scoring Matrices
  154. BLOSUM Matrix
  155. PAM Matrix
  156. Substitution Model
  157. Gap Penalties
  158. Affine Gap Penalty
  159. BLAST
  160. BLAST E-Value
  161. BLAST Heuristics
  162. PSI-BLAST
  163. Sequence Homology
  164. Orthologs
  165. Paralogs
  166. Sequence Identity
  167. Sequence Similarity
  168. Sequence Similarity Network
  169. Graph Model for Similarity
  170. Multiple Sequence Alignment
  171. Clustal
  172. MUSCLE Aligner
  173. Progressive Alignment
  174. Consensus Sequence
  175. Sequence Profile
  176. Hidden Markov Model
  177. Profile HMM
  178. Sequence Motif
  179. Regular Expressions
  180. Motif Discovery

Phylogenetics and Evolution (181–215)

  181. Phylogenetic Tree
  182. Phylogenetics
  183. Molecular Phylogenetics
  184. Distance Matrix
  185. Neighbor-Joining Method
  186. UPGMA Method
  187. Maximum Parsimony
  188. Maximum Likelihood Method
  189. Bayesian Inference
  190. Markov Chain Monte Carlo
  191. Bootstrap Analysis
  192. Branch Support Values
  193. Molecular Clock
  194. Substitution Rate
  195. Trees as DAGs
  196. Phylogenetic Networks
  197. Reticulate Evolution
  198. Horizontal Gene Transfer
  199. Recombination
  200. Incomplete Lineage Sorting
  201. Graph Model for Evolution
  202. Cladogram
  203. Phylogram
  204. Monophyletic Group
  205. Paraphyletic Group
  206. Outgroup
  207. Rooted vs Unrooted Trees
  208. Tree Topology Comparison
  209. Robinson-Foulds Distance
  210. Ancestral Reconstruction
  211. Divergence Time Estimation
  212. Gene Tree vs Species Tree
  213. Coalescent Theory
  214. Phylogenomics
  215. Comparative Genomics

Structural Bioinformatics (216–250)

  216. Primary Structure
  217. Secondary Structure
  218. Alpha Helix
  219. Beta Sheet
  220. Tertiary Structure
  221. Quaternary Structure
  222. Protein Folding
  223. Protein Folding Problem
  224. Homology Modeling
  225. Threading
  226. Ab Initio Prediction
  227. AlphaFold
  228. AlphaFold Database
  229. Protein Contact Map
  230. Contact Map as Graph
  231. Residue Interaction Network
  232. Graph Model for Contacts
  233. Structural Alignment
  234. RMSD
  235. Protein Domain
  236. Domain Classification
  237. SCOP Database
  238. Pfam Database
  239. Protein Surface Analysis
  240. Binding Site Prediction
  241. Molecular Docking
  242. Ligand-Protein Interaction
  243. Drug-Likeness
  244. ADMET Properties
  245. Protein-Ligand Graph
  246. Molecular Fingerprints
  247. Chemical Similarity
  248. Structure-Activity Relationship
  249. Protein Function Inference
  250. Structural Genomics

Protein-Protein Interactions (251–280)

  251. Protein Interaction Network
  252. Interactome
  253. Yeast Two-Hybrid
  254. Co-Immunoprecipitation
  255. Affinity Purification MS
  256. Cross-Linking Mass Spec
  257. PPI Confidence Scoring
  258. Binary vs Complex PPIs
  259. Network Hubs
  260. Network Bottlenecks
  261. Network Modules
  262. Graph Model for PPIs
  263. Hub-and-Spoke Topology
  264. Date Hubs vs Party Hubs
  265. Essential Proteins
  266. Protein Complex Detection
  267. Clique Detection
  268. Dense Subgraph Mining
  269. Network Rewiring
  270. Dynamic PPI Networks
  271. Tissue-Specific PPIs
  272. Host-Pathogen PPIs
  273. Viral Interactome
  274. PPI Prediction Methods
  275. Interaction Domain Pairs
  276. Co-Evolution Analysis
  277. Network Alignment
  278. Network Comparison
  279. Graphlet Analysis
  280. Network Centrality in PPIs

Genomics and Assembly (281–310)

  281. Genome Assembly
  282. De Bruijn Graph
  283. K-mer
  284. K-mer Spectrum
  285. Contig
  286. Scaffold
  287. N50 Metric
  288. Assembly Quality Metrics
  289. Reference Genome
  290. Reference Bias
  291. Pangenome
  292. Pangenome Graph
  293. Variation Graph
  294. VG Toolkit
  295. Graph Model for Variants
  296. Read Mapping to Graphs
  297. Genome Annotation
  298. Gene Prediction
  299. Next-Gen Sequencing
  300. Short Reads
  301. Long Reads
  302. Sequencing Depth
  303. Coverage
  304. Variant Calling
  305. SNP Calling
  306. Structural Variant Calling
  307. Genotyping
  308. Haplotype
  309. Phasing
  310. Population Reference Graph

Transcriptomics and Regulation (311–340)

  311. Transcriptome
  312. RNA-Seq Pipeline
  313. Read Quality Trimming
  314. Read Alignment
  315. Transcript Quantification
  316. Differential Expression
  317. Fold Change
  318. Statistical Testing for DE
  319. False Discovery Rate
  320. Transcription Factor
  321. Promoter Region
  322. Enhancer Region
  323. Cis-Regulatory Element
  324. Operon
  325. Gene Regulatory Network
  326. Co-Expression Network
  327. WGCNA
  328. ARACNE
  329. GENIE3
  330. Mutual Information
  331. Network Inference Methods
  332. Boolean Network Model
  333. Bayesian Network Model
  334. Graph Model for Regulation
  335. Single-Cell RNA-Seq
  336. Cell Type Clustering
  337. Trajectory Analysis
  338. Spatial Transcriptomics
  339. Alternative Splicing
  340. Non-Coding RNA

Metabolic Pathways (341–365)

  341. Metabolic Network
  342. Metabolite
  343. Enzyme
  344. Enzyme Kinetics
  345. Metabolic Pathway
  346. Bipartite Metabolic Graph
  347. KEGG Pathways
  348. Reactome Pathways
  349. BioCyc Pathways
  350. Flux Balance Analysis
  351. Constraint-Based Modeling
  352. Stoichiometric Matrix
  353. Objective Function
  354. Metabolic Flux
  355. Graph Model for Metabolism
  356. Genome-Scale Model
  357. Essential Reaction
  358. Minimal Growth Medium
  359. Metabolic Engineering
  360. Synthetic Biology
  361. Pathway Enrichment
  362. Metabolomics
  363. Mass Spec for Metabolomics
  364. Metabolic Network Comparison
  365. Metabolic Graph Alignment

Signaling and Disease (366–395)

  366. Cell Signaling Cascade
  367. Signal Transduction
  368. Receptor
  369. Kinase Cascade
  370. Second Messenger
  371. Directed Signaling Graph
  372. Feedback Loop
  373. Feed-Forward Loop
  374. Network Medicine
  375. Disease Module
  376. Network Proximity
  377. Guilt by Association
  378. Drug Target
  379. Drug Target Validation
  380. Drug Repurposing
  381. Drug-Target-Disease Graph
  382. Graph Model for Repurposing
  383. Pharmacogenomics
  384. Cancer Driver Genes
  385. Tumor Suppressor Gene
  386. Oncogene
  387. Cancer Network Analysis
  388. Precision Medicine
  389. Biomarker Discovery
  390. Clinical Network Analysis
  391. Side Effect Prediction
  392. Drug-Drug Interaction Graph
  393. Adverse Event Network
  394. Comorbidity Network
  395. Disease Gene Prioritization

Knowledge Graphs and Ontologies (396–425)

  396. Knowledge Graph
  397. Biomedical Knowledge Graph
  398. Gene Ontology
  399. GO Molecular Function
  400. GO Biological Process
  401. GO Cellular Component
  402. GO Term Enrichment
  403. Disease Ontology
  404. Human Phenotype Ontology
  405. Ontology Structure
  406. Ontology Reasoning
  407. Semantic Similarity
  408. Heterogeneous Data
  409. Data Integration
  410. Schema Mapping
  411. Entity Resolution
  412. Graph Embeddings
  413. Node2Vec
  414. TransE
  415. Knowledge Graph Embedding
  416. Link Prediction
  417. Triple Classification
  418. Relation Extraction
  419. Named Entity Recognition
  420. Text Mining for Biology
  421. Graph Neural Networks
  422. Message Passing
  423. GNN for Molecules
  424. Graph Model for Knowledge
  425. Hetionet

Multi-Omics and Visualization (426–449)

  426. Multi-Omics Integration
  427. Genomics Layer
  428. Transcriptomics Layer
  429. Proteomics Layer
  430. Metabolomics Layer
  431. Unified Omics Graph
  432. Graph Model for Multi-Omics
  433. Community Detection
  434. Louvain Algorithm
  435. Leiden Algorithm
  436. Modularity Score
  437. Graph Clustering
  438. Spectral Clustering
  439. Graph Visualization
  440. Vis-Network Library
  441. Cytoscape Tool
  442. Force-Directed Layout
  443. Hierarchical Layout
  444. Network Layout Algorithms
  445. Patient Similarity Network
  446. Clinical Data Graph
  447. Survival Analysis
  448. Patient Stratification
  449. Network-Based Biomarkers

Python Tools and Libraries (450–464)

  450. Python for Bioinformatics
  451. Biopython
  452. NetworkX
  453. Pandas for Bioinformatics
  454. Scikit-Learn
  455. Jupyter Notebooks
  456. Matplotlib
  457. Seaborn
  458. Neo4j Python Driver
  459. Cytoscape API
  460. Data Wrangling
  461. Reproducible Analysis
  462. Version Control for Science
  463. Workflow Managers
  464. Conda Environments

Capstone and Applied Concepts (465–480)

  465. Capstone Project Design
  466. Graph Data Model Design
  467. Antibiotic Resistance Graph
  468. Resistance Gene Network
  469. Mobile Genetic Elements
  470. Rare Disease Knowledge Graph
  471. Phenotype-Gene Mapping
  472. Metabolic Model Comparison
  473. Cross-Species Graph Alignment
  474. Protein Function Prediction
  475. GO Annotation Prediction
  476. Multi-Omics Stratification
  477. Patient Subgroup Discovery
  478. Graph-Based Discovery
  479. Bench to Bedside Pipeline
  480. Future of Graph Bioinformatics