Concept List
The following is a numbered list of 480 concepts for the Bioinformatics course with a focus on graph-based data modeling.
Foundations of Bioinformatics (1–30)
- Bioinformatics
- Computational Biology
- Central Dogma
- DNA Structure
- RNA Structure
- Protein Structure
- Amino Acids
- Nucleotides
- Codons
- Gene
- Genome
- Transcription
- Translation
- Gene Expression
- Sequence Data
- Molecular Biology
- Cell Biology Basics
- Genetic Code
- Open Reading Frame
- Complementary Base Pairing
- Chromosomes
- Mutations
- Single Nucleotide Polymorphism
- Insertion and Deletion
- Structural Variant
- Copy Number Variation
- Epigenetics
- DNA Methylation
- Histone Modification
- Central Dogma Exceptions
Biological Databases (31–55)
- Biological Databases
- NCBI
- GenBank Database
- UniProt
- Swiss-Prot
- TrEMBL
- Protein Data Bank
- Ensembl
- KEGG Database
- Reactome Database
- BioGRID Database
- STRING Database
- IntAct Database
- COSMIC Database
- Gene Ontology Database
- Disease Ontology Database
- Human Phenotype Ontology Database
- BioCyc Database
- OMIM Database
- Hetionet Database
- Database Cross-References
- Programmatic Database Access
- REST APIs for Biology
- Batch Data Download
- Data Provenance
Data Formats (56–70)
- FASTA Format
- FASTQ Format
- GenBank Format
- GFF3 Format
- OWL Format
- PDB File Format
- VCF Format
- SAM and BAM Format
- BED Format
- SBML Format
- BioPAX Format
- CSV for Bioinformatics
- JSON for Bioinformatics
- Data Format Conversion
- Data Quality Control
Graph Theory Fundamentals (71–110)
- Graph Theory
- Nodes and Edges
- Directed Graphs
- Undirected Graphs
- Weighted Graphs
- Bipartite Graphs
- Labeled Property Graph
- Multigraph
- Hypergraph
- Subgraph
- Graph Properties
- Degree Distribution
- In-Degree
- Out-Degree
- Clustering Coefficient
- Centrality Measures
- Degree Centrality
- Betweenness Centrality
- Closeness Centrality
- Eigenvector Centrality
- PageRank
- Connected Components
- Strongly Connected Components
- Graph Traversal
- Breadth-First Search
- Depth-First Search
- Shortest Path Algorithms
- Dijkstra Algorithm
- Graph Density
- Graph Diameter
- Scale-Free Networks
- Small-World Networks
- Power-Law Distribution
- Random Graph Models
- Erdős-Rényi Model
- Barabási-Albert Model
- Network Motifs
- Graph Isomorphism
- Adjacency Matrix
- Edge List Representation
Graph Databases and Queries (111–145)
- Graph Database
- Relational Database
- Graph vs Relational Model
- Neo4j
- Memgraph
- Cypher Query Language
- GQL Query Language
- MATCH Clause
- WHERE Clause
- RETURN Clause
- CREATE Clause
- MERGE Clause
- Graph Pattern Matching
- Variable-Length Paths
- Path Queries
- Aggregation in Cypher
- Graph Schema Design
- Node Labels
- Relationship Types
- Property Keys
- Indexes and Constraints
- RDF Triple Model
- Subject-Predicate-Object
- SPARQL Query Language
- LPG vs RDF Comparison
- Graph Data Loading
- CSV Import to Graph DB
- ETL for Graph Databases
- Graph Query Optimization
- Query Profiling
- Distributed Graph Databases
- Graph Partitioning
- Graph Scalability
- Graph Transactions
- Graph Access Control
Sequence Alignment (146–180)
- Sequence Alignment
- Pairwise Alignment
- Global Alignment
- Local Alignment
- Smith-Waterman Algorithm
- Needleman-Wunsch Algorithm
- Dynamic Programming
- Scoring Matrices
- BLOSUM Matrix
- PAM Matrix
- Substitution Model
- Gap Penalties
- Affine Gap Penalty
- BLAST
- BLAST E-Value
- BLAST Heuristics
- PSI-BLAST
- Sequence Homology
- Orthologs
- Paralogs
- Sequence Identity
- Sequence Similarity
- Sequence Similarity Network
- Graph Model for Similarity
- Multiple Sequence Alignment
- Clustal
- MUSCLE Aligner
- Progressive Alignment
- Consensus Sequence
- Sequence Profile
- Hidden Markov Model
- Profile HMM
- Sequence Motif
- Regular Expressions
- Motif Discovery
Phylogenetics and Evolution (181–215)
- Phylogenetic Tree
- Phylogenetics
- Molecular Phylogenetics
- Distance Matrix
- Neighbor-Joining Method
- UPGMA Method
- Maximum Parsimony
- Maximum Likelihood Method
- Bayesian Inference
- Markov Chain Monte Carlo
- Bootstrap Analysis
- Branch Support Values
- Molecular Clock
- Substitution Rate
- Trees as DAGs
- Phylogenetic Networks
- Reticulate Evolution
- Horizontal Gene Transfer
- Recombination
- Incomplete Lineage Sorting
- Graph Model for Evolution
- Cladogram
- Phylogram
- Monophyletic Group
- Paraphyletic Group
- Outgroup
- Rooted vs Unrooted Trees
- Tree Topology Comparison
- Robinson-Foulds Distance
- Ancestral Reconstruction
- Divergence Time Estimation
- Gene Tree vs Species Tree
- Coalescent Theory
- Phylogenomics
- Comparative Genomics
Structural Bioinformatics (216–250)
- Primary Structure
- Secondary Structure
- Alpha Helix
- Beta Sheet
- Tertiary Structure
- Quaternary Structure
- Protein Folding
- Protein Folding Problem
- Homology Modeling
- Threading
- Ab Initio Prediction
- AlphaFold
- AlphaFold Database
- Protein Contact Map
- Contact Map as Graph
- Residue Interaction Network
- Graph Model for Contacts
- Structural Alignment
- RMSD
- Protein Domain
- Domain Classification
- SCOP Database
- Pfam Database
- Protein Surface Analysis
- Binding Site Prediction
- Molecular Docking
- Ligand-Protein Interaction
- Drug-Likeness
- ADMET Properties
- Protein-Ligand Graph
- Molecular Fingerprints
- Chemical Similarity
- Structure-Activity Relationship
- Protein Function Inference
- Structural Genomics
Protein-Protein Interactions (251–280)
- Protein Interaction Network
- Interactome
- Yeast Two-Hybrid
- Co-Immunoprecipitation
- Affinity Purification MS
- Cross-Linking Mass Spec
- PPI Confidence Scoring
- Binary vs Complex PPIs
- Network Hubs
- Network Bottlenecks
- Network Modules
- Graph Model for PPIs
- Hub-and-Spoke Topology
- Date Hubs vs Party Hubs
- Essential Proteins
- Protein Complex Detection
- Clique Detection
- Dense Subgraph Mining
- Network Rewiring
- Dynamic PPI Networks
- Tissue-Specific PPIs
- Host-Pathogen PPIs
- Viral Interactome
- PPI Prediction Methods
- Interaction Domain Pairs
- Co-Evolution Analysis
- Network Alignment
- Network Comparison
- Graphlet Analysis
- Network Centrality in PPIs
Genomics and Assembly (281–310)
- Genome Assembly
- De Bruijn Graph
- K-mer
- K-mer Spectrum
- Contig
- Scaffold
- N50 Metric
- Assembly Quality Metrics
- Reference Genome
- Reference Bias
- Pangenome
- Pangenome Graph
- Variation Graph
- VG Toolkit
- Graph Model for Variants
- Read Mapping to Graphs
- Genome Annotation
- Gene Prediction
- Next-Gen Sequencing
- Short Reads
- Long Reads
- Sequencing Depth
- Coverage
- Variant Calling
- SNP Calling
- Structural Variant Calling
- Genotyping
- Haplotype
- Phasing
- Population Reference Graph
Transcriptomics and Regulation (311–340)
- Transcriptome
- RNA-Seq Pipeline
- Read Quality Trimming
- Read Alignment
- Transcript Quantification
- Differential Expression
- Fold Change
- Statistical Testing for DE
- False Discovery Rate
- Transcription Factor
- Promoter Region
- Enhancer Region
- Cis-Regulatory Element
- Operon
- Gene Regulatory Network
- Co-Expression Network
- WGCNA
- ARACNE
- GENIE3
- Mutual Information
- Network Inference Methods
- Boolean Network Model
- Bayesian Network Model
- Graph Model for Regulation
- Single-Cell RNA-Seq
- Cell Type Clustering
- Trajectory Analysis
- Spatial Transcriptomics
- Alternative Splicing
- Non-Coding RNA
Metabolic Pathways (341–365)
- Metabolic Network
- Metabolite
- Enzyme
- Enzyme Kinetics
- Metabolic Pathway
- Bipartite Metabolic Graph
- KEGG Pathways
- Reactome Pathways
- BioCyc Pathways
- Flux Balance Analysis
- Constraint-Based Modeling
- Stoichiometric Matrix
- Objective Function
- Metabolic Flux
- Graph Model for Metabolism
- Genome-Scale Model
- Essential Reaction
- Minimal Growth Medium
- Metabolic Engineering
- Synthetic Biology
- Pathway Enrichment
- Metabolomics
- Mass Spec for Metabolomics
- Metabolic Network Comparison
- Metabolic Graph Alignment
Signaling and Disease (366–395)
- Cell Signaling Cascade
- Signal Transduction
- Receptor
- Kinase Cascade
- Second Messenger
- Directed Signaling Graph
- Feedback Loop
- Feed-Forward Loop
- Network Medicine
- Disease Module
- Network Proximity
- Guilt by Association
- Drug Target
- Drug Target Validation
- Drug Repurposing
- Drug-Target-Disease Graph
- Graph Model for Repurposing
- Pharmacogenomics
- Cancer Driver Genes
- Tumor Suppressor Gene
- Oncogene
- Cancer Network Analysis
- Precision Medicine
- Biomarker Discovery
- Clinical Network Analysis
- Side Effect Prediction
- Drug-Drug Interaction Graph
- Adverse Event Network
- Comorbidity Network
- Disease Gene Prioritization
Knowledge Graphs and Ontologies (396–425)
- Knowledge Graph
- Biomedical Knowledge Graph
- Gene Ontology
- GO Molecular Function
- GO Biological Process
- GO Cellular Component
- GO Term Enrichment
- Disease Ontology
- Human Phenotype Ontology
- Ontology Structure
- Ontology Reasoning
- Semantic Similarity
- Heterogeneous Data
- Data Integration
- Schema Mapping
- Entity Resolution
- Graph Embeddings
- Node2Vec
- TransE
- Knowledge Graph Embedding
- Link Prediction
- Triple Classification
- Relation Extraction
- Named Entity Recognition
- Text Mining for Biology
- Graph Neural Networks
- Message Passing
- GNN for Molecules
- Graph Model for Knowledge
- Hetionet
Multi-Omics and Visualization (426–449)
- Multi-Omics Integration
- Genomics Layer
- Transcriptomics Layer
- Proteomics Layer
- Metabolomics Layer
- Unified Omics Graph
- Graph Model for Multi-Omics
- Community Detection
- Louvain Algorithm
- Leiden Algorithm
- Modularity Score
- Graph Clustering
- Spectral Clustering
- Graph Visualization
- Vis-Network Library
- Cytoscape Tool
- Force-Directed Layout
- Hierarchical Layout
- Network Layout Algorithms
- Patient Similarity Network
- Clinical Data Graph
- Survival Analysis
- Patient Stratification
- Network-Based Biomarkers
Python Tools and Libraries (450–464)
- Python for Bioinformatics
- Biopython
- NetworkX
- Pandas for Bioinformatics
- Scikit-Learn
- Jupyter Notebooks
- Matplotlib
- Seaborn
- Neo4j Python Driver
- Cytoscape API
- Data Wrangling
- Reproducible Analysis
- Version Control for Science
- Workflow Managers
- Conda Environments
Capstone and Applied Concepts (465–480)
- Capstone Project Design
- Graph Data Model Design
- Antibiotic Resistance Graph
- Resistance Gene Network
- Mobile Genetic Elements
- Rare Disease Knowledge Graph
- Phenotype-Gene Mapping
- Metabolic Model Comparison
- Cross-Species Graph Alignment
- Protein Function Prediction
- GO Annotation Prediction
- Multi-Omics Stratification
- Patient Subgroup Discovery
- Graph-Based Discovery
- Bench to Bedside Pipeline
- Future of Graph Bioinformatics