
Response Generation Architecture

Not every chatbot answer is generated the same way. A well-designed system routes each request to the cheapest strategy that can answer it correctly. This flowchart shows the full pipeline from user input to a validated response, including the three-way strategy decision and the quality-check feedback loop. Hover any stage for details.

Interactive Demo

Run MicroSim Fullscreen

To embed this MicroSim in your own page, use the following iframe:

<iframe src="main.html" width="100%" height="472" scrolling="no"></iframe>

Overview

The pipeline flows left to right:

  1. User Input (blue) enters the system.
  2. Intent Classification (orange decision) identifies the kind of request.
  3. Response Strategy (orange decision) routes the request down one of three paths:
       • Template Engine for simple FAQ queries.
       • Retrieval System for factual questions.
       • LLM Generator for complex, open-ended questions.
  4. Response Formatter (green) combines the chosen output with injected context.
  5. Quality Checker (red) validates the response. On pass it goes to User Output (blue); on failure (dashed line) it loops back to the LLM Generator.

Context Retrieval feeds context into both strategy selection and formatting (dotted "Context injection" arrows).
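
The stages above can be sketched in a few lines of Python. This is only an illustrative sketch, not the MicroSim's implementation: every function name, the keyword rules, the sample data, the retry cap, and the word-count quality rule are assumptions made up for the example, and the LLM Generator is a stub that makes no model call.

```python
MAX_RETRIES = 2  # assumed cap so the feedback loop cannot run forever

# Made-up sample data for the sketch.
KNOWLEDGE_BASE = {"what is the return window?": "Returns are accepted within 30 days."}


def classify_intent(text: str) -> str:
    """Toy keyword classifier standing in for Intent Classification."""
    lowered = text.lower()
    if "opening hours" in lowered:
        return "faq"
    if lowered.startswith(("what is", "who is", "when did")):
        return "factual"
    return "open_ended"


def retrieve_context(text: str) -> dict:
    """Stand-in for Context Retrieval; returns a fixed placeholder."""
    return {"user_name": "Sam"}


def choose_strategy(intent: str, context: dict) -> str:
    """First context injection point (the context is unused in this toy rule)."""
    return {"faq": "template", "factual": "retrieval"}.get(intent, "llm")


def template_engine(text: str) -> str:
    return "We are open 9:00-17:00 on weekdays."


def retrieval_system(text: str) -> str:
    # Returns an empty string on a miss so the quality check can catch it.
    return KNOWLEDGE_BASE.get(text.lower().strip(), "")


def llm_generator(text: str, attempt: int) -> str:
    return f"[LLM stub, attempt {attempt}] draft answer to: {text}"


def format_response(raw: str, context: dict) -> str:
    """Second context injection point: merge the raw output with context."""
    return f"{context['user_name']}: {raw}".strip()


def quality_check(response: str) -> bool:
    """Toy rule: reject responses with fewer than three words."""
    return len(response.split()) >= 3


def respond(text: str) -> tuple[str, list[str]]:
    """Run the pipeline; return the response and the strategies that were used."""
    context = retrieve_context(text)
    strategy = choose_strategy(classify_intent(text), context)
    if strategy == "template":
        raw = template_engine(text)
    elif strategy == "retrieval":
        raw = retrieval_system(text)
    else:
        raw = llm_generator(text, 0)
    path = [strategy]
    attempts = 0
    while True:
        response = format_response(raw, context)
        if quality_check(response):
            return response, path
        if attempts == MAX_RETRIES:
            return "Sorry, I could not produce a reliable answer.", path
        # Failed validation: loop back to the LLM Generator (the dashed line).
        attempts += 1
        raw = llm_generator(text, attempts)
        path.append("llm")
```

Calling `respond("What is the capital of Atlantis?")` takes the retrieval path, misses the toy knowledge base, fails the quality check, and falls back to the LLM stub, so the returned path is `["retrieval", "llm"]`.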

Lesson Plan

  • Pick the path. Give students sample questions and have them choose the template, retrieval, or LLM path and justify it on cost and accuracy.
  • Why validate? Discuss what the quality checker catches and why a feedback loop to the generator is worth the extra latency.
  • Context injection. Have students explain why context is injected at two points rather than one.
  • Cost ordering. Rank the three strategies by typical cost and latency, and relate that to the order a system should try them.
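
For the cost-ordering activity, one way to make the idea concrete is a cheapest-first cascade: try the strategies in ascending cost order and stop at the first one that produces an answer. The sketch below is a classroom aid under stated assumptions. The `Strategy` type, the three toy answer functions, and the relative cost units are invented for illustration and are not measurements of any real system.

```python
from typing import Callable, NamedTuple, Optional


class Strategy(NamedTuple):
    name: str
    relative_cost: int  # made-up units, used only to order the strategies
    answer: Callable[[str], Optional[str]]  # returns None when it cannot answer


def template_answer(question: str) -> Optional[str]:
    return "We are open 9:00-17:00 on weekdays." if "hours" in question.lower() else None


def retrieval_answer(question: str) -> Optional[str]:
    return "Returns are accepted within 30 days." if "return" in question.lower() else None


def llm_answer(question: str) -> Optional[str]:
    return f"[LLM stub] draft answer to: {question}"  # stub: always answers


# Deliberately listed out of order; cheapest_first sorts them by cost.
STRATEGIES = [
    Strategy("llm", 100, llm_answer),
    Strategy("template", 1, template_answer),
    Strategy("retrieval", 10, retrieval_answer),
]


def cheapest_first(strategies, question):
    """Try strategies in ascending cost; return (name, answer, total cost spent)."""
    spent = 0
    for strategy in sorted(strategies, key=lambda s: s.relative_cost):
        spent += strategy.relative_cost
        result = strategy.answer(question)
        if result is not None:
            return strategy.name, result, spent
    return None, None, spent
```

With these assumed costs, a question the template can answer costs 1 unit, while a question that falls through to the LLM stub costs 111 units because the two cheaper attempts are paid for first. Students can compare this cascade with routing directly from the intent classifier, which skips the failed attempts but depends on the classifier being right.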

References