Intent Classification Pipeline
Before a chatbot can act, it must decide what the user wants — the intent. This diagram traces the query "I need to change my reservation for tomorrow" through text processing, two competing feature-extraction approaches (TF-IDF and BERT), a neural classifier, and a confidence check that decides whether to act or escalate to a human. Hover over any box for details.
Interactive Demo
You can embed this MicroSim in your own page with an iframe whose src attribute points to the MicroSim's URL.
Overview
The pipeline has four labeled layers:
- Text processing (blue) — normalization (lowercasing, punctuation removal), tokenization, and optional stopword filtering (shown dashed because it is not always used).
- Semantic encoding (purple) — two parallel paths. TF-IDF is a simple, fast sparse vectorization; BERT encoding (highlighted) produces contextual embeddings and is recommended for better generalization.
- Classification (orange) — a neural network with a softmax output layer maps the feature vector to a probability for every candidate intent.
- Prediction and confidence (gray/yellow/green/red) — the probability table shows modify_reservation: 0.87 as the winner. A confidence threshold of 0.70 decides the outcome: predictions at or above it proceed with the action (green); anything below it escalates to a human agent (red).
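The text-processing layer can be sketched in a few lines of Python. This is a minimal illustration, not the MicroSim's own implementation: the function name `preprocess` and the small `STOPWORDS` set are assumptions made up for this example, and real systems usually take their stopword list from an NLP library.

```python
import string

# Illustrative stopword list (an assumption; not taken from the MicroSim).
STOPWORDS = {"i", "to", "my", "for", "a", "the"}


def preprocess(text, remove_stopwords=False):
    """Normalize and tokenize a query, with optional stopword filtering."""
    # Normalization: lowercase and strip ASCII punctuation.
    normalized = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Tokenization: split on whitespace.
    tokens = normalized.split()
    # Optional step (the dashed box in the diagram).
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


query = "I need to change my reservation for tomorrow"
print(preprocess(query))
# ['i', 'need', 'to', 'change', 'my', 'reservation', 'for', 'tomorrow']
print(preprocess(query, remove_stopwords=True))
# ['need', 'change', 'reservation', 'tomorrow']
```

Stopword filtering is off by default here because contextual encoders such as BERT typically work on the full sentence, while sparse methods such as TF-IDF are the ones that more often benefit from it.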
Showing both TF-IDF and BERT lets students compare a classic and a modern approach side by side, while the threshold step models how production systems avoid acting on low-confidence guesses.
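The softmax and threshold steps can be sketched as follows. This is a hedged illustration under stated assumptions: the helper names `softmax` and `decide`, the extra intents `cancel_reservation` and `new_reservation`, and the logit values are hypothetical. Only `modify_reservation`, the 0.87 score, and the 0.70 threshold come from the diagram.

```python
import math

CONFIDENCE_THRESHOLD = 0.70  # threshold value shown in the diagram


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(probabilities, threshold=CONFIDENCE_THRESHOLD):
    """Return (intent, confidence, outcome) for a dict of intent probabilities."""
    intent = max(probabilities, key=probabilities.get)
    confidence = probabilities[intent]
    # At or above the threshold: act. Below it: hand off to a human agent.
    outcome = "proceed" if confidence >= threshold else "escalate"
    return intent, confidence, outcome


# Hypothetical intents and logits, chosen for illustration only.
intents = ["modify_reservation", "cancel_reservation", "new_reservation"]
logits = [3.0, 1.0, 0.2]
probs = dict(zip(intents, softmax(logits)))
print(decide(probs))

# The diagram's example score (0.87) and a lower, hypothetical score (0.55).
print(decide({"modify_reservation": 0.87, "cancel_reservation": 0.13}))
print(decide({"modify_reservation": 0.55, "cancel_reservation": 0.45}))
```

With the 0.70 threshold, the 0.87 prediction proceeds and the 0.55 prediction escalates. Raising the threshold sends more queries to human agents; lowering it lets the bot act on weaker guesses.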
Lesson Plan
- Trace the pipeline: Have students follow the example query through each layer and name what changes at each step.
- Compare features: Why might BERT generalize better than TF-IDF? When would TF-IDF still be a reasonable choice?
- Reason about the threshold: What happens to the example query at 0.87? What would happen if the top intent scored only 0.55?
- Discuss escalation: Why is escalating to a human better than acting on a low-confidence prediction?
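For the feature-comparison question, a small TF-IDF sketch may help make the sparse side concrete. It is an illustration under assumptions, not the MicroSim's code: the three sample phrases are hypothetical, tokenization is a plain whitespace split, and the smoothed IDF formula is just one common variant. The point it shows is that TF-IDF only matches exact words, so two paraphrases with no shared vocabulary get a similarity of zero.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return the sorted vocabulary and one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # Term frequency (count / document length) times IDF.
        vectors.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical phrases: the first two share an intent but no words.
docs = ["change my reservation", "modify the booking", "change my password"]
vocab, vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # 0.0: same intent, no shared vocabulary
print(cosine(vecs[0], vecs[2]))  # positive: shared words, different intent
```

A contextual encoder is expected to place the first two phrases closer together than the first and third, which is one way to frame the generalization argument. TF-IDF remains attractive when speed, simplicity, or a small and stable vocabulary matter more.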