AlphaFold Prediction Pipeline
About This MicroSim
This MicroSim walks through the AlphaFold protein structure prediction pipeline as an animated flowchart. Students step through each stage to understand how DeepMind's AlphaFold2 system predicts a protein's 3D structure starting from its amino acid sequence.
Pipeline Stages
- Amino Acid Sequence — The input: a protein's primary structure in FASTA format
- Multiple Sequence Alignment (MSA) — Homologous sequences are retrieved and aligned, revealing co-evolutionary patterns that encode structural constraints
- Evoformer — A deep neural network processes the MSA and pairwise features through attention layers, learning residue-residue relationships
- Structure Module — Converts the Evoformer output into 3D atomic coordinates using an iterative refinement process
- 3D Model — The predicted structure with per-residue confidence scores (pLDDT)
- Refinement — Energy minimization and side-chain optimization to produce the final structure
- Final Predicted Structure — The output in PDB format, ready for analysis
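To make the confidence scores attached to the 3D Model and Final Predicted Structure stages concrete, the sketch below reads per-residue pLDDT values from PDB-format text and labels each one. It relies on the convention that AlphaFold output files store pLDDT in the B-factor column. The band names and cutoffs follow those shown in the AlphaFold Protein Structure Database; how exact boundary values (90, 70, 50) are assigned is an assumption. The helper `make_ca_line` and the toy scores are hypothetical and exist only to build well-formed sample input.

```python
def plddt_band(score):
    """Map a pLDDT score (0-100) to a confidence band label."""
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


def per_residue_plddt(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB ATOM records are fixed-width (1-indexed columns):
        # atom name 13-16, chain 22, residue number 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def make_ca_line(serial, resname, chain, resnum, bfactor):
    """Hypothetical helper: build one fixed-width CA ATOM record (toy coordinates)."""
    return "ATOM  {:5d}  CA  {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, resname, chain, resnum, 0.0, 0.0, 0.0, 1.0, bfactor
    )


# Toy input: three residues with made-up pLDDT values.
toy_pdb = "\n".join([
    make_ca_line(1, "MET", "A", 1, 95.3),
    make_ca_line(2, "LYS", "A", 2, 62.0),
    make_ca_line(3, "GLY", "A", 3, 41.5),
])

for (chain, resnum), score in per_residue_plddt(toy_pdb).items():
    print(chain, resnum, score, plddt_band(score))
```

Reading only the CA atom gives one value per residue, which is enough here because AlphaFold reports pLDDT per residue rather than per atom.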
Why This Matters
At the CASP14 assessment in 2020, AlphaFold2 predicted many protein structures with accuracy competitive with experimental methods, a breakthrough on a 50-year grand challenge in biology. Understanding its pipeline helps students appreciate how machine learning is transforming bioinformatics, and why the AlphaFold Protein Structure Database now contains over 200 million predicted structures.
How to Use
- Next / Previous buttons — Step through the pipeline stages one at a time
- Read the descriptions — Each stage includes an explanation of what happens and why it matters
- Follow the flow — Watch how information transforms from 1D sequence to 3D structure across the pipeline
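The Next / Previous stepping described above can be sketched as a small state holder over the seven stage names. This is an illustration in Python, not the MicroSim's own p5.js code; the class name `PipelineStepper` is hypothetical, and stopping at the first and last stage (rather than wrapping around) is an assumption about the behavior.

```python
STAGES = [
    "Amino Acid Sequence",
    "Multiple Sequence Alignment (MSA)",
    "Evoformer",
    "Structure Module",
    "3D Model",
    "Refinement",
    "Final Predicted Structure",
]


class PipelineStepper:
    """Track which pipeline stage is currently shown."""

    def __init__(self, stages):
        self.stages = list(stages)
        self.index = 0  # start at the input stage

    @property
    def current(self):
        return self.stages[self.index]

    def next(self):
        # Assumption: stay on the last stage instead of wrapping around.
        self.index = min(self.index + 1, len(self.stages) - 1)
        return self.current

    def previous(self):
        # Assumption: stay on the first stage instead of wrapping around.
        self.index = max(self.index - 1, 0)
        return self.current


stepper = PipelineStepper(STAGES)
print(stepper.current)
print(stepper.next())
print(stepper.previous())
```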
Lesson Plan
Grade Level
College introductory bioinformatics
Duration
15-20 minutes
Prerequisites
- Understanding of protein primary and tertiary structure
- Familiarity with sequence alignment concepts
- Basic awareness of machine learning / neural networks
Activities
- Exploration (5 min): Step through all stages. At each stage, note what the input is, what processing occurs, and what the output is.
- Guided Practice (5 min): Go back to the MSA stage. Why are homologous sequences important for structure prediction? What information does co-evolution provide that a single sequence cannot?
- Discussion (5 min): Before AlphaFold, experimental methods (X-ray crystallography, cryo-EM, NMR) were the only way to determine protein structure. What are the advantages and limitations of computational prediction vs. experimental determination?
- Assessment (5 min): Answer the reflection questions below.
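For the Guided Practice question on co-evolution, instructors may find a small numeric illustration useful. The sketch below computes mutual information between pairs of alignment columns in a made-up toy MSA. AlphaFold2 does not compute this statistic; the Evoformer learns from the MSA directly. Mutual information is used here only as a simpler, classical stand-in for "these two positions change together", and the sequences and function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up): positions 1 and 3 always change together
# (K pairs with E, D pairs with K), while position 4 varies independently.
toy_msa = [
    "AKLEG",
    "AKLES",
    "ADLKG",
    "ADLKS",
]

print(mutual_information(toy_msa, 1, 3))  # 1.0 bit: perfectly co-varying
print(mutual_information(toy_msa, 1, 4))  # 0.0 bits: independent
print(mutual_information(toy_msa, 0, 1))  # 0.0 bits: position 0 never varies
```

A single sequence gives every column exactly one residue, so no pair of positions can show covariation. That is the information a deep alignment adds.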
Assessment
- What is the input to AlphaFold, and what is the output?
- Why is the multiple sequence alignment step critical for prediction accuracy?
- What does the pLDDT confidence score tell you about a predicted structure, and how should you interpret low-confidence regions?
- How has AlphaFold changed the landscape of structural biology and drug discovery?