References: Genome Assembly and Variation Graphs
-
De Bruijn Graph - Wikipedia - Mathematical foundation of de Bruijn graphs and their application to genome assembly, explaining k-mer decomposition, Eulerian paths, and how sequencing reads are reconstructed into contigs.
-
Genome Assembly - Wikipedia - Overview of genome assembly approaches including overlap-layout-consensus and de Bruijn graph methods, covering scaffolding, gap filling, and quality metrics like N50 and contig statistics.
-
Pan-genome - Wikipedia - Describes the concept of a pan-genome representing all genetic variation within a species, including core and dispensable genomes, and graph-based pangenome reference structures.
-
Genome Assembly and Annotation - Mark Sherlock - Springer - Practical guide to genome assembly workflows covering read preprocessing, assembly algorithms, scaffolding strategies, and quality assessment methods for next-generation sequencing data.
-
Bioinformatics Algorithms: An Active Learning Approach (3rd Edition) - Phillip Compeau and Pavel Pevzner - Active Learning Publishers - Interactive textbook with detailed coverage of de Bruijn graph assembly, read error correction, and genome rearrangement algorithms with programming challenges.
-
vg Toolkit Documentation - vg Team - Wiki documentation for the vg variation graph toolkit, covering graph construction, read mapping with Giraffe, variant calling, and pangenome reference graph operations.
-
Human Pangenome Reference Consortium - HPRC - Resources from the consortium building a human pangenome reference, explaining why graph-based references better represent population genetic diversity than linear references.
-
SPAdes Genome Assembler Manual - Center for Algorithmic Biotechnology - Documentation for the SPAdes assembler using de Bruijn graphs, covering assembly modes for various data types and parameter optimization strategies.
-
Galaxy Training: Genome Assembly - Galaxy Project - Hands-on tutorials for genome assembly workflows in the Galaxy platform, covering quality control, assembly with various tools, and assembly evaluation metrics.
-
GFA Format Specification - GFA-spec - Specification for the Graphical Fragment Assembly format used to represent assembly and variation graphs, defining segment, link, and path records for graph-based genome representations.