Linear Regression Interactive Visualizer
Run the Linear Regression Visualizer Fullscreen
Edit the MicroSim in the p5.js Editor
About This MicroSim
This visualization demonstrates linear regression as an optimization problem where we find the line that best fits the data by minimizing the sum of squared errors (residuals).
Key Features
- Draggable Data Points: Click and drag any point to see how it affects the regression line
- Manual Parameter Control: Adjust the slope (w) and intercept (b) sliders to explore the loss landscape
- Fit OLS Button: Instantly compute the optimal parameters using Ordinary Least Squares
- Residual Visualization: See the vertical distances from each point to the fitted line
- Loss Surface Heatmap: Visualize the Mean Squared Error as a function of w and b
- Real-time Statistics: View current and optimal parameters, loss values, and R-squared
- Add Noise: Introduce random noise to see how it affects the fit
The Linear Regression Problem
Given data points \((x_i, y_i)\) for \(i = 1, \ldots, n\), we want to find the line:

$$\hat{y} = wx + b$$

that minimizes the sum of squared residuals:

$$L(w, b) = \sum_{i=1}^{n} \bigl(y_i - (w x_i + b)\bigr)^2$$
The Normal Equations
The optimal parameters satisfy the normal equations:

$$
\begin{aligned}
w \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i &= \sum_{i=1}^{n} x_i y_i \\
w \sum_{i=1}^{n} x_i + b \, n &= \sum_{i=1}^{n} y_i
\end{aligned}
$$

where:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$

denote the sample means used in the closed-form solution below.
Closed-Form Solution
The optimal slope and intercept are:

$$
w^* = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b^* = \bar{y} - w^* \bar{x}
$$
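As a quick check of these formulas, here is a minimal NumPy sketch (the data values are illustrative, not the MicroSim's dataset); np.polyfit fits the same least-squares line and should agree:

```python
import numpy as np

# Illustrative data points (not the MicroSim's dataset)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Closed-form OLS solution for y_hat = w*x + b
w_star = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b_star = y_bar - w_star * x_bar
print(w_star, b_star)

# np.polyfit fits the same least-squares line; it should agree
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)
```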
How to Use
Exploring the Loss Surface
- Move the sliders for w (slope) and b (intercept)
- Watch the red dot move on the loss surface heatmap
- Notice how the loss increases as you move away from the optimal point (green dot)
- The bowl shape of the loss surface shows this is a convex optimization problem
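The heatmap is simply the loss evaluated over a grid of parameter values. A minimal NumPy sketch of that computation, assuming an illustrative dataset and arbitrary grid ranges:

```python
import numpy as np

# Evaluate the MSE on a grid of (w, b) values -- the quantity the heatmap shows.
# Data and grid ranges are illustrative.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

w_grid = np.linspace(0.0, 4.0, 200)
b_grid = np.linspace(-2.0, 2.0, 200)
W, B = np.meshgrid(w_grid, b_grid)

# Predictions for every (w, b) pair at every x_i via broadcasting
pred = W[..., None] * x + B[..., None]      # shape (200, 200, 5)
mse = np.mean((pred - y) ** 2, axis=-1)     # shape (200, 200)

# The grid minimum approximates the OLS optimum (w*, b*)
i, j = np.unravel_index(np.argmin(mse), mse.shape)
print(W[i, j], B[i, j], mse[i, j])
```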
Understanding Residuals
- Enable the Residuals checkbox to see vertical error lines
- Each residual line shows the vertical distance from a data point to the fitted line
- The small squares visualize the squared error being minimized
- Notice that residuals are vertical (not perpendicular to the line)
Interactive Exploration
- Drag data points to change the dataset
- Watch the optimal line (green dashed) update instantly
- Use Fit OLS to snap to the optimal solution
- Compare your manual fit to the computed optimal
Visual Elements
| Element | Color | Meaning |
|---|---|---|
| Data points | Blue | Observed data \((x_i, y_i)\) |
| Fitted line | Red | Current line \(\hat{y} = wx + b\) |
| Optimal line | Green dashed | OLS solution |
| Residuals | Green dashed | Vertical errors \(y_i - \hat{y}_i\) |
| Loss surface | Blue-White-Red | MSE as function of (w, b) |
| Current position | Red dot | Current (w, b) on loss surface |
| Optimal position | Green dot | Optimal \((w^*, b^*)\) |
Learning Objectives
After using this MicroSim, students will be able to:
- Explain why we minimize squared errors (not absolute errors)
- Interpret the loss surface as a function of model parameters
- Understand why the OLS solution is at the minimum of the loss surface
- Derive and apply the normal equations
- Calculate and interpret the R-squared goodness-of-fit measure
- Recognize that vertical residuals differ from perpendicular distances
Key Insights
Why Squared Errors?
- Differentiable: Allows calculus-based optimization
- Penalizes large errors: Outliers have a disproportionately large influence on the fit
- Unique minimum: The convex loss surface guarantees a single global optimum (see the gradient-descent sketch after this list)
- Statistical properties: Least squares is the maximum-likelihood estimate (MLE) under a Gaussian noise assumption
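Because the loss is differentiable and convex, simple gradient descent on \(L(w, b)\) reaches the same optimum as the closed-form OLS solution. A minimal sketch, with an arbitrary learning rate and illustrative data:

```python
import numpy as np

# Illustrative data; assumes the same simple model y_hat = w*x + b
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
n = len(x)

w, b = 0.0, 0.0      # start far from the optimum
lr = 0.02            # learning rate (arbitrary choice)

for _ in range(5000):
    residuals = y - (w * x + b)
    # Gradients of the mean squared error L(w, b) = (1/n) * sum(residuals**2)
    grad_w = -2.0 / n * np.sum(residuals * x)
    grad_b = -2.0 / n * np.sum(residuals)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # converges to the OLS solution because the loss is convex
```

In the MicroSim, moving the sliders by hand plays the role of this search; the convex bowl means any downhill path ends at the same green dot.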
Why Vertical Residuals?
Linear regression minimizes errors in y (the dependent variable), not perpendicular distance to the line. This makes sense when:
- x is known precisely (independent variable)
- y has measurement error (dependent variable)
For errors in both variables, use Total Least Squares instead.
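For comparison, a total-least-squares line (which minimizes perpendicular distances) can be obtained from the leading singular vector of the centered data. A minimal sketch with illustrative data:

```python
import numpy as np

# Total least squares for a line: minimizes perpendicular distances.
# The best-fit direction is the leading right singular vector of the
# centered data matrix, and the line passes through the centroid.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

X = np.column_stack([x - x.mean(), y - y.mean()])
_, _, Vt = np.linalg.svd(X)
direction = Vt[0]                      # direction of maximum variance
w_tls = direction[1] / direction[0]    # slope of the TLS line
b_tls = y.mean() - w_tls * x.mean()    # intercept from the centroid
print(w_tls, b_tls)
```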
The R-Squared Statistic
- \(R^2 = 1\): Perfect fit (all points on line)
- \(R^2 = 0\): Model no better than predicting the mean
- \(R^2 < 0\): Model is worse than predicting the mean (possible here when the sliders are set far from the OLS solution)
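R-squared compares the residual sum of squares to the total sum of squares, \(R^2 = 1 - \mathrm{SS}_{\text{res}} / \mathrm{SS}_{\text{tot}}\). A minimal sketch of the computation (the helper name and data are illustrative):

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative check: predicting the mean everywhere gives R^2 = 0
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
print(r_squared(y, np.full_like(y, y.mean())))  # 0.0
```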
Lesson Plan
Introduction (5 minutes)
Pose the question: "Given scattered data, how do we find the 'best' line through it?"
Exploration Phase (10 minutes)
- Let students drag sliders to find their best fit manually
- Discuss what makes a line "good" or "bad"
- Reveal the loss surface and ask: where does your manual solution sit?
- Click "Fit OLS" to see the optimal solution
Mathematical Foundation (10 minutes)
- Introduce the loss function \(L(w, b)\)
- Show why the minimum is where partial derivatives equal zero
- Derive the normal equations
- Connect to matrix form \(X^T X \beta = X^T y\)
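A minimal sketch of the matrix form, assuming a design matrix with an intercept column and illustrative data:

```python
import numpy as np

# Matrix form of the normal equations: X^T X beta = X^T y,
# where the design matrix X has a column for x and a column of ones
# for the intercept. Data are illustrative.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

X = np.column_stack([x, np.ones_like(x)])
beta = np.linalg.solve(X.T @ X, X.T @ y)      # beta = [w*, b*]
print(beta)

# np.linalg.lstsq solves the same least-squares problem (more stably)
print(np.linalg.lstsq(X, y, rcond=None)[0])
```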
Interactive Experiments (10 minutes)
- Drag an outlier: How much does one point affect the fit?
- Add noise: How does noise affect R-squared? (see the sketch after this list)
- Change slope: What happens to the loss surface?
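One way to run the noise experiment offline is to fit lines to increasingly noisy samples of a known linear relationship and watch R-squared fall. A minimal sketch (the true slope, intercept, and noise levels are arbitrary choices):

```python
import numpy as np

# How noise affects R^2: fit a line to increasingly noisy samples of a
# known linear relationship (all values here are illustrative).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)

for sigma in (0.5, 2.0, 5.0):
    y = 2.0 * x + 1.0 + rng.normal(0.0, sigma, size=x.shape)
    w, b = np.polyfit(x, y, deg=1)
    y_hat = w * x + b
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    print(f"sigma = {sigma}: R^2 = {r2:.3f}")
```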
Discussion Questions
- Why do we square the residuals instead of using absolute values?
- What would happen if we minimized perpendicular distance instead?
- How would you extend this to multiple independent variables?
- When might linear regression be inappropriate?
Connections to Linear Algebra
Linear regression connects to several key concepts:
- Least Squares: Finding the best approximate solution when \(Ax = b\) has no exact solution
- Projection: The fitted values \(\hat{y} = X\beta\) are the projection of \(y\) onto \(\operatorname{Col}(X)\)
- Normal Equations: \(X^T X \beta = X^T y\) ensures the residual is orthogonal to \(\operatorname{Col}(X)\)
- Pseudoinverse: \(\beta = (X^T X)^{-1} X^T y = X^+ y\) when \(X\) has full column rank
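A minimal sketch of the projection view, using the pseudoinverse and illustrative data; the printout checks that the residual is orthogonal to the columns of \(X\):

```python
import numpy as np

# Least squares as projection: y_hat = X @ beta is the projection of y
# onto Col(X), so the residual is orthogonal to every column of X.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
X = np.column_stack([x, np.ones_like(x)])

beta = np.linalg.pinv(X) @ y    # beta = X^+ y
y_hat = X @ beta                # projection of y onto Col(X)
residual = y - y_hat

print(X.T @ residual)           # ~[0, 0] up to floating-point round-off
```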
References
- Chapter 8: Vector Spaces and Subspaces - Least Squares section
- Chapter 9: Solving Linear Systems - Normal equations
- 3Blue1Brown: Linear Regression