Foundation Model Training Pipeline

About This MicroSim

This interactive infographic illustrates the three-stage pipeline that transforms raw data into a useful AI assistant:

  1. Pre-Training -- A foundation model learns language patterns by predicting the next token across trillions of tokens from books, websites, and code repositories.
  2. Fine-Tuning -- The foundation model is adapted using instructions, conversations, and human feedback (RLHF) to follow instructions, be helpful, and refuse harmful requests.
  3. Deployment -- The fine-tuned model generates responses to your prompts token by token through an API or chat interface.

The arrows between stages show how transfer learning carries knowledge forward and how the final model is accessed via an API or chat interface.
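The pipeline above can be sketched in miniature. The toy model below is a hypothetical illustration, not the MicroSim's actual implementation: it "pre-trains" by counting which token follows which in raw text (real foundation models use neural networks over trillions of tokens), then generates a response token by token, as in the Deployment stage.

```python
# Toy sketch of the pre-training objective and token-by-token generation.
# A bigram counter stands in for a neural network: it learns, for each
# token, which token most often follows it in the training text.
from collections import Counter, defaultdict

def train_bigram(text):
    """'Pre-train' by counting next-token frequencies in raw text."""
    tokens = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model, token):
    """Predict the most frequent next token seen during training."""
    if token not in model:
        return None
    return model[token].most_common(1)[0][0]

def generate(model, start, n=4):
    """Deployment stage in miniature: emit one token at a time."""
    out = [start]
    for _ in range(n):
        nxt = predict_next(model, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # cat ("cat" follows "the" twice, "mat" once)
print(generate(model, "on", 4))    # on the cat sat on
```

Even this toy model shows why fine-tuning is needed: a next-token predictor continues text, but nothing in the pre-training objective teaches it to follow instructions or be helpful.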

How to Use

  • Hover over any stage to see a detailed explanation of what happens at that step.
  • Click a stage to expand an example panel showing sample data for that stage (pre-training text, a fine-tuning instruction pair, or a prompt and response).
  • Click again or click outside to close the expanded panel.

Iframe Embed Code

You can embed this MicroSim in any web page by adding the following to your HTML:

<iframe src="https://dmccreary.github.io/prompt-class/sims/foundation-model-pipeline/main.html"
        height="522"
        width="100%"
        scrolling="no"></iframe>

Lesson Plan

Grade Level

Grades 9-12 and undergraduate

Duration

5-10 minutes

Prerequisites

A high-level understanding of what AI and machine learning are.

Activities

  1. Exploration (3 min): Have students hover over each stage and read the detailed descriptions. Ask them to note the key transformation that happens at each stage.
  2. Guided Practice (4 min): Click each stage to see example data. Discuss: How does pre-training data differ from fine-tuning data? Why does the model need both?
  3. Discussion (3 min): Ask students to explain in their own words why a model trained only on raw text (pre-training) would not be a good assistant without fine-tuning.
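For the Guided Practice discussion, it may help to show the two data formats side by side. The snippets below are hypothetical examples (not data from the MicroSim): pre-training sees raw, unlabeled text, while fine-tuning sees structured instruction/response pairs written or rated by humans.

```python
# Hypothetical examples contrasting the data seen at each training stage.

# Pre-training data: raw text scraped from books or websites.
# No questions, no answers, no labels.
pretraining_example = (
    "Photosynthesis converts light energy into chemical energy. "
    "Plants use chlorophyll to absorb sunlight."
)

# Fine-tuning data: a structured instruction/response pair.
finetuning_example = {
    "instruction": "Explain photosynthesis in one sentence.",
    "response": (
        "Photosynthesis is the process by which plants turn "
        "sunlight, water, and carbon dioxide into food."
    ),
}

# Pre-training teaches the model what language looks like;
# fine-tuning teaches it how to respond when given an instruction.
print(sorted(finetuning_example))  # ['instruction', 'response']
```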

Assessment

Students should be able to:

  • Name and describe the three stages of the foundation model pipeline
  • Explain the role of transfer learning between pre-training and fine-tuning
  • Describe why fine-tuning with human feedback is necessary
