
Neural Network Layer

Run the Neural Network Layer MicroSim Fullscreen

Edit the Neural Network Layer MicroSim with the p5.js editor

You can include this MicroSim on your website using the following iframe:

<iframe src="https://dmccreary.github.io/linear-algebra/sims/neural-network-layer/main.html" height="502px" scrolling="no"></iframe>

Description

This MicroSim visualizes how a fully connected neural network layer is implemented as matrix-vector multiplication. Each layer computes:

\[h = \sigma(Wx + b)\]

where:

  • x is the input vector (left neurons)
  • W is the weight matrix (connection lines)
  • b is the bias vector (optional, shown as nodes above outputs)
  • σ is the activation function (ReLU, sigmoid, tanh, or none)
  • h is the output vector (right neurons)
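
The same computation takes only a few lines of NumPy. The sketch below is illustrative Python, not the MicroSim's p5.js source; the function name and the weight, input, and bias values are arbitrary placeholders chosen for this example.

```python
import numpy as np

def layer_forward(W, x, b=None, activation="relu"):
    """One fully connected layer: h = sigma(W x + b)."""
    z = W @ x                      # matrix-vector product, shape (outputs,)
    if b is not None:
        z = z + b                  # add one bias per output neuron
    if activation == "relu":
        return np.maximum(0.0, z)
    if activation == "sigmoid":
        return 1.0 / (1.0 + np.exp(-z))
    if activation == "tanh":
        return np.tanh(z)
    return z                       # "none": keep the linear output

# Arbitrary example values: 3 inputs, 2 outputs
W = np.array([[0.5, -1.0, 2.0],
              [1.5,  0.0, -0.5]])  # shape (2, 3): outputs x inputs
x = np.array([1.0, 2.0, 3.0])      # input vector (left neurons)
b = np.array([0.1, -0.2])          # bias vector
print(layer_forward(W, x, b, "relu"))   # -> [4.6, 0.0]; second output clipped by ReLU
```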

Key Features:

  • Visual Weights: Connection thickness shows weight magnitude; blue = positive, red = negative
  • Activation Functions: Compare ReLU, sigmoid, tanh, or linear (none)
  • Adjustable Architecture: Change the number of inputs and outputs
  • Bias Toggle: Show or hide bias terms
  • Random Initialization: Generate new weights or inputs

The Matrix View

The weight matrix W has dimensions (outputs × inputs). Each row of W corresponds to one output neuron and contains the weights for all connections to that neuron.

For 3 inputs and 2 outputs:

\[W = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix}\]

The output is computed as:

\[h_i = \sigma\left(\sum_{j=1}^{n} w_{ij} x_j + b_i\right)\]
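
For example, with made-up weights and input, zero bias, and ReLU activation, a layer with 3 inputs and 2 outputs computes:

\[W = \begin{bmatrix} 1 & -2 & 0.5 \\ 0 & 3 & -1 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad Wx = \begin{bmatrix} 1 - 4 + 1.5 \\ 0 + 6 - 3 \end{bmatrix} = \begin{bmatrix} -1.5 \\ 3 \end{bmatrix}, \quad h = \mathrm{ReLU}(Wx) = \begin{bmatrix} 0 \\ 3 \end{bmatrix}\]

The negative pre-activation in the first output is clipped to zero by ReLU.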

Activation Functions

| Function | Formula | Range | Properties |
|----------|---------|-------|------------|
| None | σ(z) = z | (-∞, ∞) | Linear, no nonlinearity |
| ReLU | σ(z) = max(0, z) | [0, ∞) | Sparse activation, fast |
| Sigmoid | σ(z) = 1/(1 + e^(-z)) | (0, 1) | Smooth, probability interpretation |
| Tanh | σ(z) = tanh(z) | (-1, 1) | Zero-centered, smooth |
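
The sketch below compares the four options on the same sample pre-activation values; it is illustrative Python (NumPy), not tied to the MicroSim's code.

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])          # sample pre-activation values

activations = {
    "none":    lambda z: z,                         # identity: range (-inf, inf)
    "relu":    lambda z: np.maximum(0.0, z),        # clips negatives to 0
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),  # squashes into (0, 1)
    "tanh":    np.tanh,                             # squashes into (-1, 1)
}

for name, f in activations.items():
    print(f"{name:8s} {np.round(f(z), 3)}")
```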

Lesson Plan

Learning Objectives

After using this MicroSim, students will be able to:

  1. Explain how a neural network layer implements matrix-vector multiplication
  2. Describe the role of weights, biases, and activation functions
  3. Calculate the output of a simple layer by hand
  4. Connect linear algebra concepts to deep learning

Guided Exploration (7-10 minutes)

  1. Start Simple: Set inputs=2, outputs=2, activation=none, no bias
  2. Observe Weights: Click "Random W" and watch the connection lines change thickness and color
  3. Add Nonlinearity: Switch to ReLU and note how negative pre-activations become 0
  4. Enable Bias: Toggle bias on and observe the bias nodes appear

Key Discussion Points

  • Why do we need activation functions? (Without them, the whole network is just one big linear transformation)
  • What does a large positive weight vs large negative weight mean visually?
  • How many parameters (weights + biases) does this layer have?

Assessment Questions

  1. For a layer with 4 inputs and 3 outputs, what are the dimensions of W?
  2. If all weights are positive, can an output ever be negative (with ReLU)?
  3. How many total parameters does a 10-input, 5-output layer have (with bias)?
