Learning Graph Quality Metrics Report
Overview
- Total Concepts: 480
- Foundational Concepts (no prerequisites, other concepts depend on them): 9
- Terminal Nodes (have prerequisites, but nothing depends on them): 250
- Orphaned Nodes (completely disconnected, no edges): 0
- Concepts with Dependencies: 471
- Average Dependencies per Concept: 1.41 (averaged over the 471 concepts that have dependencies; 663 dependency edges in total)
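
The categories above can be derived from the edge list alone. The following is a minimal sketch, not the code in analyze_graph.py: it assumes the graph is available as a dictionary mapping each concept ID to the list of its prerequisite IDs, and the function name `classify_concepts` and the toy data are hypothetical.

```python
def classify_concepts(deps):
    """Split concepts into foundational, terminal, and orphaned IDs.

    deps: dict mapping concept ID -> list of prerequisite IDs
          (assumed input format, not taken from the report).
    """
    # Every ID that appears as a prerequisite has at least one dependent.
    has_dependents = set()
    for prereqs in deps.values():
        has_dependents.update(prereqs)

    foundational, terminal, orphaned = [], [], []
    for cid, prereqs in deps.items():
        if not prereqs and cid in has_dependents:
            foundational.append(cid)   # no prerequisites, others depend on it
        elif prereqs and cid not in has_dependents:
            terminal.append(cid)       # has prerequisites, nothing depends on it
        elif not prereqs and cid not in has_dependents:
            orphaned.append(cid)       # no edges at all
    return foundational, terminal, orphaned


# Toy graph: 2 and 3 build on 1; 4 is disconnected.
toy = {1: [], 2: [1], 3: [1, 2], 4: []}
foundational, terminal, orphaned = classify_concepts(toy)
print(foundational, terminal, orphaned)
```

Concept 2 in the toy graph falls into none of the three lists because it has both prerequisites and dependents.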
Graph Structure Validation
- Valid DAG Structure: ✅ Yes
- Self-Dependencies: None detected ✅
- Cycles Detected: 0
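
One way to reproduce these checks is with the standard library's `graphlib` module (Python 3.9+). This is a sketch under the same assumed input format (concept ID mapped to a list of prerequisite IDs). The helper name `validate_dag` is hypothetical, and the report does not say which method analyze_graph.py actually uses.

```python
from graphlib import TopologicalSorter, CycleError


def validate_dag(deps):
    """Check for self-dependencies and cycles in a prerequisite graph.

    deps: dict mapping concept ID -> list of prerequisite IDs (assumed format).
    """
    # A self-dependency is a concept that lists itself as a prerequisite.
    self_deps = [cid for cid, prereqs in deps.items() if cid in prereqs]
    try:
        # static_order() yields prerequisites before their dependents and
        # raises CycleError if the graph contains a cycle.
        order = list(TopologicalSorter(deps).static_order())
        is_dag = not self_deps
    except CycleError:
        order, is_dag = [], False
    return {"is_dag": is_dag, "self_dependencies": self_deps, "order": order}


print(validate_dag({1: [], 2: [1], 3: [2]}))
print(validate_dag({1: [2], 2: [1]})["is_dag"])
```

When the graph is a valid DAG, the returned `order` is one possible learning sequence in which every prerequisite comes before the concepts that need it.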
Foundational Concepts
These concepts have no prerequisites:
- 1: Bioinformatics
- 71: Graph Theory
- 112: Relational Database
- 152: Dynamic Programming
- 176: Hidden Markov Model
- 179: Regular Expressions
- 330: Mutual Information
- 450: Python for Bioinformatics
- 462: Version Control for Science
Dependency Chain Analysis
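- Maximum Dependency Chain Length: 16 concepts (15 prerequisite links)
Longest Learning Path (in learning order, from foundational concept to terminal concept):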
- Graph Theory (ID: 71)
- Nodes and Edges (ID: 72)
- Graph Properties (ID: 81)
- Degree Distribution (ID: 82)
- Power-Law Distribution (ID: 103)
- Scale-Free Networks (ID: 101)
- Protein Interaction Network (ID: 251)
- Network Modules (ID: 261)
- Disease Module (ID: 375)
- Network Medicine (ID: 374)
- Drug Target (ID: 378)
- Drug Repurposing (ID: 380)
- Pharmacogenomics (ID: 383)
- Precision Medicine (ID: 388)
- Biomarker Discovery (ID: 389)
- Network-Based Biomarkers (ID: 449)
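
A longest chain like the one above can be found with a memoized depth-first search over the prerequisite edges. The sketch below assumes an acyclic graph stored as a dictionary from concept ID to prerequisite IDs, with every ID present as a key. The function name `longest_chain` and the toy graph are hypothetical, and chain length is counted in concepts, matching the 16 entries listed.

```python
from functools import lru_cache


def longest_chain(deps):
    """Return the longest prerequisite chain, foundational concept first.

    deps: dict mapping concept ID -> list of prerequisite IDs.
    Assumes the graph is acyclic and every ID appears as a key.
    """
    @lru_cache(maxsize=None)
    def best(cid):
        # Longest chain that ends at cid, as a tuple of concept IDs.
        longest = ()
        for prereq in deps[cid]:
            candidate = best(prereq)
            if len(candidate) > len(longest):
                longest = candidate
        return longest + (cid,)

    return list(max((best(cid) for cid in deps), key=len, default=()))


# Toy graph: 5 is reachable via 1 -> 2 -> 3 -> 5 and via 1 -> 4 -> 5.
toy_chain = {1: [], 2: [1], 3: [2], 4: [1], 5: [3, 4]}
path = longest_chain(toy_chain)
print(len(path), path)
```

If several chains share the maximum length, this sketch returns the first one it encounters; a report generator may break ties differently.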
Terminal Nodes Analysis
Terminal nodes are concepts that have prerequisites but that no other concept depends on. They represent natural endpoints of learning paths: culminating or specialized concepts.
- Total Terminal Nodes: 250 (52.1% of all concepts)
- Healthy Range: 5-40% of total concepts
Concepts at the end of learning paths:
- 2: Computational Biology
- 16: Molecular Biology
- 19: Open Reading Frame
- 20: Complementary Base Pairing
- 24: Insertion and Deletion
- 26: Copy Number Variation
- 28: DNA Methylation
- 29: Histone Modification
- 30: Central Dogma Exceptions
- 35: Swiss-Prot
- 36: TrEMBL
- 38: Ensembl
- 41: BioGRID Database
- 42: STRING Database
- 43: IntAct Database
- 49: OMIM Database
- 51: Database Cross-References
- 53: REST APIs for Biology
- 54: Batch Data Download
- 55: Data Provenance
...and 230 more
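
The percentage and the range check are simple arithmetic on the counts reported above. A minimal sketch, with a hypothetical helper name and the 5-40% range passed in as a parameter:

```python
def terminal_share(terminal_count, total_count, healthy_range=(5.0, 40.0)):
    """Return the terminal-node percentage and whether it is in range."""
    percent = 100.0 * terminal_count / total_count
    low, high = healthy_range
    return round(percent, 1), low <= percent <= high


# Counts taken from the report: 250 terminal nodes out of 480 concepts.
share, in_range = terminal_share(250, 480)
print(share, in_range)  # 52.1 False
```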
Orphaned Nodes Analysis
Orphaned nodes are completely disconnected concepts with neither inbound nor outbound edges. They indicate a quality problem, because every concept should connect to the graph.
- Total Orphaned Nodes: 0
✅ No orphaned nodes detected. All concepts are connected to the graph.
Connected Components
- Number of Connected Components: 1
✅ All concepts are connected in a single graph.
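
Component counting treats prerequisite edges as undirected, so two concepts are in the same component if any chain of edges links them in either direction. The sketch below uses a breadth-first search. The input format (concept ID mapped to prerequisite IDs) and the name `connected_components` are assumptions for illustration.

```python
from collections import deque


def connected_components(deps):
    """Return the weakly connected components as sorted lists of IDs.

    deps: dict mapping concept ID -> list of prerequisite IDs (assumed format).
    """
    # Build an undirected adjacency map from the directed prerequisite edges.
    neighbors = {cid: set() for cid in deps}
    for cid, prereqs in deps.items():
        for prereq in prereqs:
            neighbors[cid].add(prereq)
            neighbors.setdefault(prereq, set()).add(cid)

    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


# Toy graph with two separate islands: {1, 2} and {3, 4}.
toy_islands = {1: [], 2: [1], 3: [], 4: [3]}
print(connected_components(toy_islands))
```

A result with exactly one component corresponds to the single-graph finding reported here.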
Indegree Analysis
Top 10 concepts that are prerequisites for the most other concepts:
| Rank | Concept ID | Concept Label | Indegree |
|---|---|---|---|
| 1 | 72 | Nodes and Edges | 27 |
| 2 | 251 | Protein Interaction Network | 26 |
| 3 | 31 | Biological Databases | 23 |
| 4 | 56 | FASTA Format | 15 |
| 5 | 127 | Graph Schema Design | 15 |
| 6 | 181 | Phylogenetic Tree | 15 |
| 7 | 73 | Directed Graphs | 14 |
| 8 | 111 | Graph Database | 10 |
| 9 | 220 | Tertiary Structure | 10 |
| 10 | 4 | DNA Structure | 9 |
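
A ranking like this can be produced by counting how often each concept ID appears in other concepts' prerequisite lists. This is a sketch with a hypothetical function name and toy data. Ties are broken here by ascending concept ID, which is consistent with the table above but is an assumption about the generator.

```python
from collections import Counter


def top_prerequisites(deps, n=10):
    """Return the n concepts with the highest indegree as (ID, indegree).

    deps: dict mapping concept ID -> list of prerequisite IDs (assumed format).
    """
    # Each appearance in a prerequisite list is one inbound dependency.
    indegree = Counter(p for prereqs in deps.values() for p in prereqs)
    # Sort by indegree descending, then by concept ID ascending.
    ranked = sorted(indegree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


toy_rank = {1: [], 2: [1], 3: [1, 2], 4: [2], 5: [2]}
print(top_prerequisites(toy_rank, n=2))
```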
Outdegree Distribution
| Dependencies per Concept | Number of Concepts |
|---|---|
| 0 | 9 |
| 1 | 282 |
| 2 | 186 |
| 3 | 3 |
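
The distribution above is enough to recover the total edge count and the average. The short check below uses only the numbers in this table. It suggests that the reported average of 1.41 is taken over the 471 concepts that have at least one dependency, since averaging over all 480 concepts gives about 1.38.

```python
# Outdegree distribution from the table: dependencies -> number of concepts.
distribution = {0: 9, 1: 282, 2: 186, 3: 3}

total_concepts = sum(distribution.values())
total_edges = sum(k * count for k, count in distribution.items())
with_deps = total_concepts - distribution[0]

mean_all = total_edges / total_concepts
mean_with_deps = total_edges / with_deps

print(total_concepts, total_edges)   # 480 663
print(round(mean_all, 2))            # 1.38
print(round(mean_with_deps, 2))      # 1.41
```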
Recommendations
- ℹ️ High terminal node percentage (52.1%, above the 5-40% healthy range): Consider whether some terminal concepts should become prerequisites for more advanced concepts
- ✅ DAG structure verified: Graph supports valid learning progressions
- ℹ️ Long dependency chains (up to 16 concepts): Ensure students can follow extended learning paths
- ℹ️ Consider adding cross-dependencies: More connections could create richer learning pathways
Report generated by learning-graph-reports/analyze_graph.py