MicroSim Similarity Map
Version: 1.0.0 | Generated: 2026-02-02 | Source: docs/reports/micros-similarity-map.md
Overview
This interactive visualization displays all MicroSims in a 2D similarity map where similar MicroSims appear close together. The map is generated using Principal Component Analysis (PCA) to reduce high-dimensional semantic embeddings into two dimensions that can be visualized.
Interactive Map
How to Use
- Hover over any point to see the MicroSim title, description, and subject area
- Click on any point to open the MicroSim in a new tab
- Legend filtering - Click on subject areas in the legend to show/hide them
- Check All / Uncheck All - Use the buttons to quickly toggle all subject areas
- Zoom and pan - Use the Plotly toolbar to zoom into clusters of interest
Understanding the Map
What Does Position Mean?
Points that are close together represent MicroSims with similar:
- Learning objectives
- Subject matter
- Concepts covered
- Technical approaches
- Target audience
Points that are far apart represent MicroSims that differ significantly in their educational content or purpose.
How It Works
- Metadata Extraction: Each MicroSim's metadata (title, description, learning objectives, subject area, etc.) is combined into a text representation
- Semantic Embeddings: The all-MiniLM-L6-v2 model from SentenceTransformers converts each text into a 384-dimensional vector that captures semantic meaning
- Dimensionality Reduction: PCA reduces these 384 dimensions to 2 dimensions while preserving as much variance (similarity relationships) as possible
- Visualization: The 2D coordinates are plotted using Plotly with color-coding by subject area
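The first three steps can be sketched roughly as follows. This is a minimal illustration, not the actual generator script: the field names in `build_text`, the way fields are joined, and the helper names are all assumptions, and PCA is done here with a plain NumPy SVD rather than whatever library the real pipeline uses.

```python
import numpy as np

def build_text(meta):
    """Combine selected metadata fields into one string for embedding.

    The field names below are assumed for illustration.
    """
    fields = ["title", "description", "learning_objectives", "subject_area"]
    parts = []
    for key in fields:
        value = meta.get(key)
        if not value:
            continue
        # Lists (e.g. several learning objectives) are flattened into one phrase.
        parts.append("; ".join(value) if isinstance(value, list) else str(value))
    return ". ".join(parts)

def embed_texts(texts):
    """Encode texts with all-MiniLM-L6-v2; returns an array of shape (n, 384)."""
    # Imported lazily so the rest of the sketch runs without the model installed.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return model.encode(texts)

def reduce_to_2d(embeddings):
    """Project embeddings onto their first two principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```

Given a list of metadata dictionaries `metadata_list` (a hypothetical name), the map coordinates would then be `reduce_to_2d(embed_texts([build_text(m) for m in metadata_list]))`, one `(x, y)` row per MicroSim.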
Why PCA?
Principal Component Analysis finds the directions of maximum variance in the data. The first principal component (x-axis) captures the direction of greatest variation between MicroSims, while the second (y-axis) captures the next most significant direction of variation.
This means:
- The x-axis typically separates MicroSims by their most distinguishing characteristic (often subject area or primary learning domain)
- The y-axis captures secondary distinguishing features (often visualization type, complexity level, or sub-topic)
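The ordering of the axes by variance can be checked numerically. The sketch below uses synthetic 5-dimensional data as a stand-in for the 384-dimensional embeddings (the data, scales, and variable names are assumptions for illustration) and computes how much of the total variance each principal component explains.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic stand-in for embeddings: 200 points in 5 dimensions, with
# per-dimension spread shrinking from 3.0 down to 0.1.
scales = np.array([3.0, 2.0, 0.5, 0.2, 0.1])
data = rng.normal(size=(200, 5)) * scales

centered = data - data.mean(axis=0)
# Singular values relate to per-component variance: var_i = s_i**2 / (n - 1).
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
variance = singular_values**2 / (len(data) - 1)
explained_ratio = variance / variance.sum()

coords = centered @ vt[:2].T          # x = first component, y = second component
retained = explained_ratio[:2].sum()  # share of total variance kept by the 2D map
```

`explained_ratio` is sorted in decreasing order, so the x-axis always carries at least as much variance as the y-axis. For real embeddings, `retained` indicates how much of the overall variation the 2D map keeps; a low value would mean that distances on the map should be read as a rough guide rather than an exact measure of similarity.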
Clusters and Patterns
Look for:
- Subject area clusters - MicroSims of the same subject often form visible groups
- Cross-subject bridges - MicroSims that span multiple disciplines appear between clusters
- Outliers - Unique or highly specialized MicroSims appear at the edges
Technical Details
| Aspect | Value |
|---|---|
| Embedding Model | all-MiniLM-L6-v2 |
| Original Dimensions | 384 |
| Reduced Dimensions | 2 |
| Reduction Method | PCA |
| Visualization Library | Plotly.js |
Related Resources
- Faceted Search - Search MicroSims by filters
- Similar MicroSims Finder - Find similar MicroSims for any given URL
- Metadata Quality Report - Quality metrics for MicroSim metadata
This visualization is generated from embeddings created by src/embeddings/generate-embeddings.py