Learning Graph Quality Metrics Report

Overview

  • Total Concepts: 480
  • Foundational Concepts (no prerequisites, other concepts depend on them): 9
  • Terminal Nodes (nothing depends on them, but have prerequisites): 250
  • Orphaned Nodes (completely disconnected, no edges): 0
  • Concepts with Dependencies: 471
  • Average Dependencies per Concept: 1.41 (averaged over the 471 concepts that have dependencies; 1.38 across all 480 concepts)

Graph Structure Validation

  • Valid DAG Structure: ✅ Yes
  • Self-Dependencies: None detected ✅
  • Cycles Detected: 0
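A check like the one summarized above can be sketched with Kahn's algorithm. This is a minimal illustration, not necessarily how analyze_graph.py does it. The function name `find_unresolved_nodes` and the edge format, a list of `(concept_id, prerequisite_id)` pairs, are assumptions made for the example.

```python
from collections import defaultdict, deque


def find_unresolved_nodes(edges):
    """Return nodes that cannot be topologically ordered (empty set for a DAG).

    edges: iterable of (concept_id, prerequisite_id) pairs (assumed format).
    The returned nodes lie on a cycle or downstream of one; a
    self-dependency such as (5, 5) is reported as well.
    """
    dependents = defaultdict(list)  # prerequisite -> concepts that need it
    remaining = defaultdict(int)    # concept -> prerequisites not yet processed
    nodes = set()
    for concept, prereq in edges:
        nodes.update((concept, prereq))
        dependents[prereq].append(concept)
        remaining[concept] += 1

    # Kahn's algorithm: repeatedly process nodes whose prerequisites are done.
    queue = deque(n for n in nodes if remaining[n] == 0)
    processed = set()
    while queue:
        node = queue.popleft()
        processed.add(node)
        for child in dependents[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return nodes - processed
```

An empty result corresponds to "Valid DAG Structure: Yes". Any node left over would need its dependencies reviewed.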

Foundational Concepts

These concepts have no prerequisites:

  • 1: Bioinformatics
  • 71: Graph Theory
  • 112: Relational Database
  • 152: Dynamic Programming
  • 176: Hidden Markov Model
  • 179: Regular Expressions
  • 330: Mutual Information
  • 450: Python for Bioinformatics
  • 462: Version Control for Science

Dependency Chain Analysis

  • Maximum Dependency Chain Length: 16

Longest Learning Path:

  1. Graph Theory (ID: 71)
  2. Nodes and Edges (ID: 72)
  3. Graph Properties (ID: 81)
  4. Degree Distribution (ID: 82)
  5. Power-Law Distribution (ID: 103)
  6. Scale-Free Networks (ID: 101)
  7. Protein Interaction Network (ID: 251)
  8. Network Modules (ID: 261)
  9. Disease Module (ID: 375)
  10. Network Medicine (ID: 374)
  11. Drug Target (ID: 378)
  12. Drug Repurposing (ID: 380)
  13. Pharmacogenomics (ID: 383)
  14. Precision Medicine (ID: 388)
  15. Biomarker Discovery (ID: 389)
  16. Network-Based Biomarkers (ID: 449)
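One way to find a longest learning path in an acyclic graph is memoized recursion over prerequisites. The sketch below assumes a hypothetical `prereqs` dictionary mapping each concept to its direct prerequisites, and uses made-up labels rather than data from this graph. The report's own implementation may differ.

```python
from functools import lru_cache


def longest_chain(prereqs):
    """Return one longest prerequisite chain, ordered from foundation to endpoint.

    prereqs: dict mapping concept -> list of direct prerequisites
    (assumed format; the graph must be acyclic).
    """
    @lru_cache(maxsize=None)
    def best(node):
        # Longest chain that ends at `node`, stored as a tuple.
        options = [best(p) for p in prereqs.get(node, [])]
        return max(options, key=len, default=()) + (node,)

    nodes = set(prereqs) | {p for ps in prereqs.values() for p in ps}
    return list(max((best(n) for n in nodes), key=len, default=()))


# Hypothetical toy graph: C needs B, B needs A, D needs A.
toy = {"B": ["A"], "C": ["B"], "D": ["A"]}
print(longest_chain(toy))  # ['A', 'B', 'C']
```

Memoization means each concept is evaluated once, so the cost grows with the number of nodes and edges rather than the number of distinct paths.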

Terminal Nodes Analysis

Terminal nodes are concepts that have prerequisites but that nothing else depends on. They represent natural endpoints of learning paths: culminating or specialized concepts.

  • Total Terminal Nodes: 250 (52.1% of all concepts)
  • Healthy Range: 5-40% of total concepts
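The node categories used in this report (foundational, terminal, orphaned) can be derived from two sets: concepts that have a prerequisite and concepts that serve as one. The following is a minimal sketch under assumptions: the function name `classify_nodes` and the `(concept_id, prerequisite_id)` edge format are hypothetical.

```python
def classify_nodes(all_nodes, edges):
    """Group concepts by role in the graph.

    all_nodes: iterable of concept IDs.
    edges: list of (concept_id, prerequisite_id) pairs (assumed format).
    """
    nodes = set(all_nodes)
    has_prereq = {concept for concept, _ in edges}  # concepts with dependencies
    is_prereq = {prereq for _, prereq in edges}     # concepts others depend on
    return {
        "foundational": (nodes & is_prereq) - has_prereq,
        "terminal": (nodes & has_prereq) - is_prereq,
        "orphaned": nodes - has_prereq - is_prereq,
    }


# The terminal share reported above: 250 of 480 concepts.
terminal_pct = round(100 * 250 / 480, 1)
print(terminal_pct)  # 52.1
```

Concepts that both have prerequisites and act as prerequisites fall into none of the three groups. They are the interior of the graph.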

Concepts at the end of learning paths:

  • 2: Computational Biology
  • 16: Molecular Biology
  • 19: Open Reading Frame
  • 20: Complementary Base Pairing
  • 24: Insertion and Deletion
  • 26: Copy Number Variation
  • 28: DNA Methylation
  • 29: Histone Modification
  • 30: Central Dogma Exceptions
  • 35: Swiss-Prot
  • 36: TrEMBL
  • 38: Ensembl
  • 41: BioGRID Database
  • 42: STRING Database
  • 43: IntAct Database
  • 49: OMIM Database
  • 51: Database Cross-References
  • 53: REST APIs for Biology
  • 54: Batch Data Download
  • 55: Data Provenance

...and 230 more

Orphaned Nodes Analysis

Orphaned nodes are completely disconnected concepts with neither inbound nor outbound edges. They indicate a quality problem, since every concept should connect to the graph.

  • Total Orphaned Nodes: 0

✅ No orphaned nodes detected. All concepts are connected to the graph.

Connected Components

  • Number of Connected Components: 1

✅ All concepts are connected in a single graph.
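Counting connected components usually means ignoring edge direction and traversing the graph. A minimal sketch under that assumption follows. The function name `count_components` and the edge format are hypothetical, and the report's script may use a different method.

```python
from collections import defaultdict


def count_components(all_nodes, edges):
    """Count weakly connected components (edge direction is ignored).

    all_nodes: iterable of concept IDs.
    edges: iterable of (concept_id, prerequisite_id) pairs (assumed format).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = 0
    for start in all_nodes:
        if start in seen:
            continue
        components += 1  # first visit to a new component
        stack = [start]
        seen.add(start)
        while stack:     # depth-first traversal of that component
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components
```

A result of 1 matches the finding above. Under this definition each orphaned node would count as its own component.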

Indegree Analysis

Top 10 concepts that are prerequisites for the most other concepts:

Rank  Concept ID  Concept Label                Indegree
1     72          Nodes and Edges              27
2     251         Protein Interaction Network  26
3     31          Biological Databases         23
4     56          FASTA Format                 15
5     127         Graph Schema Design          15
6     181         Phylogenetic Tree            15
7     73          Directed Graphs              14
8     111         Graph Database               10
9     220         Tertiary Structure           10
10    4           DNA Structure                9
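A ranking like this one can be produced by counting how often each concept appears as a prerequisite. This is a sketch under assumptions: the helper name `top_prerequisites` and the `(concept_id, prerequisite_id)` edge format are hypothetical.

```python
from collections import Counter


def top_prerequisites(edges, n=10):
    """Return the n concepts that serve as a prerequisite most often.

    edges: iterable of (concept_id, prerequisite_id) pairs (assumed format).
    Result: list of (prerequisite_id, indegree), highest indegree first.
    """
    indegree = Counter(prereq for _, prereq in edges)
    return indegree.most_common(n)


# Hypothetical toy edges: concept 1 is a prerequisite three times, concept 2 once.
print(top_prerequisites([(2, 1), (3, 1), (4, 1), (4, 2)]))  # [(1, 3), (2, 1)]
```

Ties, such as the three concepts with indegree 15 above, are returned in first-encountered order by `Counter.most_common`. Add a secondary sort key if a stable, documented ordering matters.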

Outdegree Distribution

Dependencies  Number of Concepts
0             9
1             282
2             186
3             3
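The distribution above can be cross-checked against the Overview figures with a few lines of arithmetic. The variable names are illustrative, and the numbers are copied from the table.

```python
# Outdegree distribution from the table: dependencies -> number of concepts.
distribution = {0: 9, 1: 282, 2: 186, 3: 3}

total_concepts = sum(distribution.values())                  # 480
total_edges = sum(k * v for k, v in distribution.items())    # 663
with_dependencies = total_concepts - distribution[0]         # 471

avg_all = total_edges / total_concepts                  # about 1.38
avg_with_dependencies = total_edges / with_dependencies  # about 1.41

print(total_concepts, total_edges, with_dependencies)
print(round(avg_all, 2), round(avg_with_dependencies, 2))
```

The reported average of 1.41 matches the second figure, which suggests the average is taken over the 471 concepts that have at least one dependency.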

Recommendations

  • ℹ️ High terminal node percentage (52.1%, above the 5-40% healthy range): Consider whether some terminal concepts should be prerequisites for advanced concepts
  • ✅ DAG structure verified: Graph supports valid learning progressions
  • ℹ️ Long dependency chains (maximum length 16): Ensure students can follow extended learning paths
  • ℹ️ Consider adding cross-dependencies: Only 3 concepts have more than 2 dependencies, so more connections could create richer learning pathways

Report generated by learning-graph-reports/analyze_graph.py