FASTQ Quality Control Pipeline
About This MicroSim
This MicroSim presents the FASTQ quality control pipeline as an interactive directed flowchart. Each node represents a processing step or decision point, showing how raw sequencing reads are filtered and trimmed before downstream analysis.
Pipeline Steps
- Raw FASTQ — Unprocessed sequencing reads with quality scores
- Quality Assessment — Tools like FastQC evaluate per-base quality, GC content, adapter contamination, and read length distribution
- Adapter Trimming — Remove adapter sequences ligated during library preparation (Trimmomatic, Cutadapt)
- Quality Filtering — Remove reads below a quality threshold (e.g., Phred < 20)
- Length Filtering — Remove reads shorter than a minimum length after trimming
- Deduplication — Optional removal of PCR duplicate reads
- Clean FASTQ — High-quality reads ready for alignment or assembly
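The steps above can be sketched in a few lines of Python. This is a minimal illustration, not how Trimmomatic or Cutadapt work internally. It assumes Phred+33 encoding, exact-match 3' adapter trimming, and a mean-quality filter. The names `Read`, `trim_adapter`, `mean_phred`, and `run_qc`, the adapter string, and the toy reads are all made up for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class Read(NamedTuple):
    name: str
    seq: str
    qual: str  # Phred+33 encoded quality string, same length as seq


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return sum(ord(c) - offset for c in qual) / len(qual) if qual else 0.0


def trim_adapter(read: Read, adapter: str) -> Read:
    """Cut the read at the first exact occurrence of the adapter.

    Simplification: real trimmers also tolerate mismatches and detect
    partial adapters at the end of the read.
    """
    pos = read.seq.find(adapter)
    if pos == -1:
        return read
    return Read(read.name, read.seq[:pos], read.qual[:pos])


def run_qc(reads: Iterable[Read], adapter: str, min_mean_q: float = 20.0,
           min_len: int = 30, dedup: bool = False) -> Iterator[Read]:
    """Apply the steps in order: trim, quality filter, length filter, dedup."""
    seen = set()
    for read in reads:
        read = trim_adapter(read, adapter)
        if mean_phred(read.qual) < min_mean_q:   # quality filtering
            continue
        if len(read.seq) < min_len:              # length filtering
            continue
        if dedup:                                # optional deduplication
            if read.seq in seen:
                continue
            seen.add(read.seq)
        yield read


# Toy input (illustrative only). 'I' encodes Q40 and '#' encodes Q2.
ADAPTER = "AGATCGGAAGAGC"  # assumed adapter sequence for this example
reads = [
    Read("r1", "ACGTACGTAC", "I" * 10),           # clean read
    Read("r2", "ACGTACGTAC", "#" * 10),           # low quality
    Read("r3", "ACGT" + ADAPTER, "I" * 17),       # too short after trimming
    Read("r4", "ACGTACGTAC", "I" * 10),           # duplicate of r1
]

kept = [r.name for r in run_qc(reads, ADAPTER, min_len=8, dedup=True)]
print(kept)  # ['r1']
```

With `dedup=False`, the same input keeps both `r1` and `r4`, which is why deduplication is listed as optional: whether identical reads are PCR artifacts depends on the experiment.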
Decision Nodes
The flowchart includes decision points where reads pass or fail quality criteria, illustrating how different QC steps filter the data and how the filtering criteria affect the final read count.
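To make the effect of a decision point concrete, the sketch below counts how many reads survive a mean-quality check at several thresholds. It assumes Phred+33 encoding and a "keep if mean quality is at or above the threshold" rule. The quality strings and the helper names `mean_phred` and `count_passing` are invented for illustration.

```python
def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a Phred+33 encoded quality string."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def count_passing(quals: list[str], threshold: float) -> int:
    """Number of reads whose mean quality is at or above the threshold."""
    return sum(1 for q in quals if mean_phred(q) >= threshold)


# Toy quality strings: 'I' = Q40, '5' = Q20, '+' = Q10, '#' = Q2
quals = [
    "IIIIIIII",  # mean 40
    "IIII5555",  # mean 30
    "55555555",  # mean 20
    "++++++++",  # mean 10
    "IIII####",  # mean 21
]

survivors = {t: count_passing(quals, t) for t in (10, 20, 30, 40)}
print(survivors)  # {10: 5, 20: 4, 30: 2, 40: 1}
```

Raising the threshold from 20 to 30 halves the surviving reads in this toy set, which is the quality-versus-quantity trade-off the decision nodes are meant to show.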
How to Use
- Click any pipeline step to see its description, the tools commonly used, and the parameters involved
- Follow the flow — Trace reads from raw input through each processing step to clean output
- Identify decision points — See where reads are filtered out and why
Lesson Plan
Grade Level
College introductory bioinformatics
Duration
15-20 minutes
Prerequisites
- Understanding of DNA sequencing (Illumina short reads)
- Concept of Phred quality scores
- Awareness that raw data requires preprocessing
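As a quick refresher on the second prerequisite, a Phred score Q relates to the base-call error probability P by Q = -10 log10(P), and FASTQ files commonly store each score as a single ASCII character. The sketch below assumes the Phred+33 encoding used by current Illumina output. The function names are hypothetical.

```python
def decode_phred33(qual: str) -> list[int]:
    """Convert a Phred+33 quality string into integer Phred scores."""
    return [ord(c) - 33 for c in qual]


def phred_to_error_prob(q: float) -> float:
    """Error probability implied by a Phred score: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


scores = decode_phred33("I5+#")
print(scores)                   # [40, 20, 10, 2]
print(phred_to_error_prob(20))  # about 0.01, i.e. roughly 1 error in 100 calls
```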
Activities
- Exploration (5 min): Click each pipeline step in order. For each, note what problem it addresses and what tool is commonly used.
- Parameter Discussion (5 min): What happens if you set the quality threshold too high? Too low? Discuss the trade-off between read quality and read quantity.
- Pipeline Design (5 min): Would you change any QC parameters when analyzing RNA-seq data rather than whole-genome sequencing data? Why or why not?
- Assessment (5 min): Answer the reflection questions below.
Assessment
- Why is quality control the first step in any sequencing analysis pipeline?
- What is a Phred quality score, and what does a Phred score of 30 mean?
- Why must adapter sequences be removed before alignment?
- What are the risks of over-filtering (too strict QC) vs. under-filtering (too lenient QC)?