The CCGCC Pattern: Content → Concepts → Graph → Compression → Chatbot
The CCGCC pattern adds a critical stage between the knowledge graph and the chatbot: graph compression. Rather than dumping raw subgraph data into the LLM's context window, a compression algorithm transforms the selected subgraph into a compact, structured markdown representation designed to maximize the information density per token.
Of the three patterns (CVC, CCGC, and CCGCC), this one produces the best results because the compressed graph context gives the LLM precisely the structured knowledge it needs, no more and no less.
Overview
```
Content ──► Concepts ──► Graph ──► Compression ──► Chatbot
```
The five components of the CCGCC pattern are:
- Content — the source documents
- Concepts — entities and relationships extracted by an NLP pipeline
- Graph — a knowledge graph connecting concepts through typed relationships
- Compression — an algorithm that transforms a relevant subgraph into token-efficient markdown
- Chatbot — the conversational interface that uses the compressed context to generate responses
Why Compression Matters
The Context Window Is a Precious Resource
The LLM's context window is finite and shared across several competing demands:
- System prompt — instructions, persona, safety rules
- Conversation history — prior turns in the dialogue
- Retrieved context — the knowledge the model needs to answer
- Generation budget — tokens reserved for the model's response
Every token spent on verbose or redundant context is a token unavailable for richer knowledge or longer conversation history. Managing this shared budget well is the difference between a chatbot that gives precise, well-grounded answers and one that runs out of room and drops critical information.
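The budgeting above can be made concrete with a minimal sketch. The window size and per-component allowances below are illustrative assumptions, not values from any particular model:

```python
# Sketch of an explicit context-window budget. All numbers are assumed
# for illustration; real deployments tune them per model and use case.

CONTEXT_WINDOW = 8192      # total tokens available (assumed)
SYSTEM_PROMPT = 600        # instructions, persona, safety rules
GENERATION_BUDGET = 1024   # reserved for the model's response

def knowledge_budget(history_tokens: int) -> int:
    """Tokens left for retrieved knowledge after fixed costs and history."""
    remaining = CONTEXT_WINDOW - SYSTEM_PROMPT - GENERATION_BUDGET - history_tokens
    return max(remaining, 0)

# A long conversation directly shrinks the room for retrieved knowledge.
print(knowledge_budget(500))   # short dialogue: most of the window is free
print(knowledge_budget(6000))  # long dialogue: far less room for knowledge
```

This is why the compression stage matters: the knowledge budget is whatever is left over, and it shrinks as the conversation grows.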
Raw Subgraphs Are Wasteful
In the CCGC pattern, a subgraph serialized as raw triples or verbose natural language consumes far more tokens than necessary:
```
(Transformer, is-a, neural architecture)
(Transformer, developed-by, Google)
(Transformer, introduced-in, 2017)
(BERT, is-a, Transformer)
(BERT, developed-by, Google)
(GPT, is-a, Transformer)
```
The compression stage eliminates this waste.
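A rough way to see the waste is to count tokens in each representation. The sketch below uses whitespace word count as a crude stand-in for a real tokenizer, and the two snippets are illustrative, not output from any actual pipeline:

```python
# Crude demonstration that raw triples repeat subjects and predicates
# that a compressed representation can drop. Whitespace word count is
# a rough proxy for tokens.

raw_triples = """\
(Transformer, is-a, neural architecture)
(Transformer, developed-by, Google)
(Transformer, introduced-in, 2017)
(BERT, is-a, Transformer)
(BERT, developed-by, Google)
"""

compressed = """\
## Transformer (Google, 2017)
Neural architecture; variants: BERT (Google)
"""

def rough_tokens(text: str) -> int:
    return len(text.split())

print(rough_tokens(raw_triples), rough_tokens(compressed))
```

The repeated subjects and spelled-out predicates in the triple form carry no information the compressed form loses, yet they consume a large share of the tokens.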
The Compression Algorithm
Graph compression takes a selected subgraph and produces a structured markdown document optimized for LLM consumption. The algorithm has three phases.
Phase 1: Subgraph Selection
Using the traversal strategies from the CCGC pattern — seed node identification, neighborhood expansion, community retrieval, or path-based retrieval — the system selects the relevant portion of the knowledge graph.
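Neighborhood expansion, the simplest of those strategies, can be sketched as a bounded breadth-first search from the seed nodes. The toy adjacency dict below is an assumption for illustration; a real system would query a graph database:

```python
from collections import deque

# Sketch of Phase 1 via neighborhood expansion: select every node within
# `hops` edges of a seed node. Graph and node names are illustrative.

def expand_neighborhood(graph, seeds, hops=2):
    """Breadth-first expansion bounded by hop count."""
    selected = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in selected:
                selected.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return selected

graph = {
    "Transformer": ["BERT", "GPT", "attention"],
    "BERT": ["masked LM"],
    "GPT": ["next-token prediction"],
    "attention": [],
}
print(sorted(expand_neighborhood(graph, ["Transformer"], hops=1)))
```

The hop limit is the main lever: one hop keeps the subgraph tight around the seeds, while two hops pulls in second-order context at the cost of a larger selection to compress.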
Phase 2: Structural Analysis
The compression algorithm analyzes the subgraph to identify:
- Central nodes — concepts with the highest degree or betweenness centrality within the subgraph
- Hierarchical relationships — is-a, part-of, and category membership edges that form natural outlines
- Lateral relationships — connections between peer concepts at the same level of abstraction
- Attribute clusters — groups of properties attached to a single entity
This structural analysis determines how the information will be organized in the compressed output.
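A minimal sketch of this phase, using plain degree counting as the centrality measure and a hand-picked set of hierarchical predicates (both simplifying assumptions):

```python
# Sketch of Phase 2: rank nodes by degree within the subgraph and split
# edges into hierarchical vs lateral. The predicate set is an assumed,
# domain-tunable list.

HIERARCHICAL = {"is-a", "part-of", "member-of"}

def analyze(edges):
    """edges: list of (subject, predicate, object) triples."""
    degree = {}
    hierarchy, lateral = [], []
    for s, p, o in edges:
        degree[s] = degree.get(s, 0) + 1
        degree[o] = degree.get(o, 0) + 1
        (hierarchy if p in HIERARCHICAL else lateral).append((s, p, o))
    central = max(degree, key=degree.get)  # most-connected node
    return central, hierarchy, lateral

edges = [
    ("BERT", "is-a", "Transformer"),
    ("GPT", "is-a", "Transformer"),
    ("Transformer", "uses", "attention"),
]
central, hierarchy, lateral = analyze(edges)
print(central)
```

The central node becomes the top-level heading of the compressed output, hierarchical edges become its nesting, and lateral edges are rendered as explicit bullets.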
Phase 3: Markdown Generation
The algorithm renders the subgraph as compact, hierarchical markdown that mirrors the structure a human expert would use to explain the topic:
```markdown
## Transformer (Google, 2017)

Neural architecture for sequence modeling built on attention.

### Key Properties
- Components: self-attention, positional encoding, feed-forward layers
- Training: fully parallelizable, unlike recurrent networks

### Variants
- BERT (Google): bidirectional encoder, masked language modeling
- GPT (OpenAI): autoregressive decoder, next-token prediction
- T5 (Google): encoder-decoder, text-to-text framing
```
Compare this to the raw triple format — the markdown version conveys more information in fewer tokens by:
- Using hierarchy to imply relationships (indentation replaces explicit "is-a" triples)
- Grouping attributes under their entity (eliminates repeated subject references)
- Omitting obvious predicates ("developed by" is implied by the parenthetical)
- Using natural shorthand that the LLM can parse fluently
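The rendering step can be sketched as a small function that turns an entity's attributes and hierarchical children into exactly this kind of markdown. The input shapes and section names are illustrative assumptions:

```python
# Sketch of Phase 3: render one entity as compact markdown. Headings and
# nesting carry the relationships, so no explicit "is-a" labels appear.

def render(entity, attrs, children):
    """attrs: dict of property -> value; children: list of (name, note)."""
    # The developer goes in a parenthetical, implying "developed by".
    developer = attrs.pop("developed by", "unknown")
    lines = [f"## {entity} ({developer})"]
    for prop, value in attrs.items():
        lines.append(f"- {prop}: {value}")
    if children:
        # The heading itself states the relationship to the parent.
        lines.append("### Variants")
        for name, note in children:
            lines.append(f"- {name}: {note}")
    return "\n".join(lines)

md = render(
    "Transformer",
    {"developed by": "Google", "introduced": "2017"},
    [("BERT", "bidirectional encoder"), ("GPT", "autoregressive decoder")],
)
print(md)
```

Note that no "is-a" or "developed by" predicate survives into the output; the heading hierarchy and the parenthetical carry that information for free.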
Compression Techniques
The algorithm applies several specific techniques:
Hierarchical Nesting
Parent-child and category relationships become heading levels and bullet indentation. The structure itself carries meaning, so explicit relationship labels can be dropped.
Attribute Consolidation
Multiple properties of a single entity are merged into a single bullet or comma-separated list rather than one triple per property.
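A minimal sketch of consolidation, grouping per-property triples by subject into one bullet each (triple values are illustrative):

```python
# Sketch of attribute consolidation: many (entity, property, value)
# triples collapse into a single comma-separated bullet per entity.

def consolidate(triples):
    by_entity = {}
    for s, p, o in triples:
        by_entity.setdefault(s, []).append(f"{p} {o}")
    return [f"- {s}: " + ", ".join(props) for s, props in by_entity.items()]

triples = [
    ("BERT", "developed by", "Google"),
    ("BERT", "released", "2018"),
    ("BERT", "parameters", "340M"),
]
print(consolidate(triples)[0])
# - BERT: developed by Google, released 2018, parameters 340M
```

Three triples that each repeat the subject "BERT" become one line that states it once.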
Redundancy Elimination
If a fact can be inferred from the structure (e.g., all items under "### Variants" are variants of the parent heading), the explicit relationship edge is omitted.
Priority Ranking
When the subgraph is too large to fit the token budget even after compression, facts are ranked by relevance to the user's question. Lower-ranked facts are dropped first, preserving the most important information.
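One way to sketch this is a greedy fill: score each fact, sort, and keep facts until the budget runs out. Keyword overlap and word-count costing stand in for a real relevance model and tokenizer; both are simplifying assumptions:

```python
# Sketch of priority ranking: score facts by keyword overlap with the
# question, then keep the highest-ranked facts that fit the budget.

def rank_and_trim(facts, question, budget):
    """facts: list of strings; budget: rough token allowance."""
    q_words = set(question.lower().split())
    scored = sorted(
        facts,
        key=lambda f: len(q_words & set(f.lower().split())),
        reverse=True,  # most relevant first; ties keep original order
    )
    kept, used = [], 0
    for fact in scored:
        cost = len(fact.split())  # crude per-fact token cost
        if used + cost <= budget:
            kept.append(fact)
            used += cost
    return kept

facts = [
    "BERT uses masked language modeling",
    "GPT uses next-token prediction",
    "T5 frames every task as text to text",
]
print(rank_and_trim(facts, "How does BERT model language?", budget=6))
```

With a tight budget only the BERT fact survives; a generous budget keeps all three, so nothing is dropped unless the window forces it.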
Edge Summarization
Dense clusters of similar relationships are summarized rather than enumerated. Instead of listing 20 individual "uses" edges, the compressed output might say "Used by most modern NLP systems including BERT, GPT, and T5."
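A sketch of that summarization rule, where the enumeration threshold is an assumed tuning parameter:

```python
# Sketch of edge summarization: when one predicate fans out to many
# objects, enumerate only a few and summarize the rest. The threshold
# max_listed is an assumption to be tuned per domain.

def summarize_edges(subject, predicate, objects, max_listed=3):
    if len(objects) <= max_listed:
        return f"{subject} {predicate} " + ", ".join(objects)
    shown = ", ".join(objects[:max_listed])
    return f"{subject} {predicate} {len(objects)} systems including {shown}"

users = ["BERT", "GPT", "T5", "RoBERTa", "XLNet"]
print(summarize_edges("attention", "used by", users))
# attention used by 5 systems including BERT, GPT, T5
```

Twenty near-identical edges cost twenty lines in triple form but one sentence after summarization, at the price of losing the full enumeration.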
A Neuro-Symbolic Design
The CCGCC pattern is a neuro-symbolic architecture — it combines two fundamentally different reasoning approaches:
| Component | Paradigm | Strength |
|---|---|---|
| Graph traversal | Symbolic — classical, deterministic | Precise, explainable, follows exact relationships |
| LLM generation | Neural — statistical, probabilistic | Fluent, flexible, handles ambiguity and novel phrasing |
The symbolic stage (graph traversal and compression) ensures that the right knowledge reaches the LLM. The neural stage (language generation) ensures that knowledge is communicated naturally to the user.
This division of labor plays to each paradigm's strengths:
- Deterministic graph traversal will never hallucinate a relationship that doesn't exist in the graph. It provides a reliable, auditable chain of reasoning from question to retrieved context.
- The LLM excels at interpreting the user's intent, synthesizing the compressed context into a coherent narrative, and handling the infinite variety of natural language questions.
Neither paradigm alone achieves the quality of the combined system. Vector retrieval (CVC) misses structural relationships. Graph traversal alone cannot generate fluent natural language. The CCGCC pattern gets the best of both worlds.
Comparing All Three Patterns
| Aspect | CVC | CCGC | CCGCC |
|---|---|---|---|
| Context source | Raw text chunks | Subgraph triples | Compressed markdown |
| Token efficiency | Low — verbose fragments | Medium — structured but redundant | High — optimized per token |
| Relationship reasoning | Implicit only | Explicit edges | Explicit + hierarchical structure |
| Context window usage | Wasteful | Better | Best — managed as a precious resource |
| Answer quality | Good for simple questions | Better for connected questions | Best — precise, grounded, structured |
| Architecture | Pure neural | Neural + graph | Neuro-symbolic |
| Complexity | Low | Medium | Higher — requires compression layer |
Strengths of the CCGCC Pattern
- Maximum information density — the compression step packs more knowledge into fewer tokens
- Better answers — the LLM receives well-organized, relevant context that mirrors expert explanation
- Context window discipline — explicit management of the token budget prevents information loss
- Neuro-symbolic synergy — deterministic graph reasoning combined with neural language generation
- Explainable pipeline — each stage produces inspectable intermediate outputs (extracted concepts, graph, compressed markdown, generated answer)
- Scalable to large graphs — compression ensures that even graphs with millions of nodes can serve a fixed-size context window
Limitations of the CCGCC Pattern
- Compression design is non-trivial — the markdown generation algorithm must be tuned to the domain and the LLM's parsing strengths
- Information loss — any compression discards some detail; the ranking heuristic may drop facts that turn out to be relevant
- Pipeline latency — the additional compression stage adds processing time between query and response
- Requires the full CCGC pipeline — graph compression only works if the upstream extraction and graph construction stages are in place