The CCGCC Pattern: Content → Concepts → Graph → Compression → Chatbot

The CCGCC pattern adds a critical stage between the knowledge graph and the chatbot: graph compression. Rather than dumping raw subgraph data into the LLM's context window, a compression algorithm transforms the selected subgraph into a compact, structured markdown representation designed to maximize the information density per token.

This pattern produces the best results of the three patterns (CVC, CCGC, and CCGCC) because the compressed graph context gives the LLM precisely the structured knowledge it needs — no more, no less.

Overview

┌─────────┐    ┌──────────┐    ┌───────┐    ┌─────────────┐    ┌─────────┐
│ Content │───▶│ Concepts │───▶│ Graph │───▶│ Compression │───▶│ Chatbot │
└─────────┘    └──────────┘    └───────┘    └─────────────┘    └─────────┘

The five components of the CCGCC pattern are:

  1. Content — the source documents
  2. Concepts — entities and relationships extracted by an NLP pipeline
  3. Graph — a knowledge graph connecting concepts through typed relationships
  4. Compression — an algorithm that transforms a relevant subgraph into token-efficient markdown
  5. Chatbot — the conversational interface that uses the compressed context to generate responses
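The five stages can be sketched as composable functions. This is a minimal illustration, not a real implementation: every function body here is a toy stand-in, and all names are invented for the example. A production system would plug in an NLP extractor, a graph store, and an LLM client at the corresponding stages.

```python
# Illustrative sketch of the CCGCC pipeline as five composable stages.
# All function bodies are toy stand-ins for demonstration.

def extract_concepts(content: str) -> list[tuple[str, str, str]]:
    # Toy "extraction": returns hard-coded triples instead of running NLP.
    return [("transformer", "uses_mechanism", "self_attention")]

def build_graph(triples):
    # Build a simple adjacency map: subject -> [(predicate, object), ...]
    graph = {}
    for subj, pred, obj in triples:
        graph.setdefault(subj, []).append((pred, obj))
    return graph

def compress(graph, seed):
    # Render the seed node's neighborhood as compact markdown bullets.
    lines = [f"## {seed}"]
    for pred, obj in graph.get(seed, []):
        lines.append(f"- {pred.replace('_', ' ')}: {obj}")
    return "\n".join(lines)

def chatbot_prompt(compressed_context, question):
    # Assemble the final prompt; a real system would call an LLM here.
    return f"Context:\n{compressed_context}\n\nQuestion: {question}"

triples = extract_concepts("...source documents...")
graph = build_graph(triples)
context = compress(graph, "transformer")
prompt = chatbot_prompt(context, "What mechanism do transformers use?")
```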

Why Compression Matters

The Context Window Is a Precious Resource

The LLM's context window is finite and shared across several competing demands:

  • System prompt — instructions, persona, safety rules
  • Conversation history — prior turns in the dialogue
  • Retrieved context — the knowledge the model needs to answer
  • Generation budget — tokens reserved for the model's response

Every token spent on verbose or redundant context is a token unavailable for richer knowledge or longer conversation history. Managing the window well is the difference between a chatbot that gives precise, well-grounded answers and one that runs out of room and drops critical information.

Raw Subgraphs Are Wasteful

In the CCGC pattern, a subgraph serialized as raw triples or verbose natural language consumes far more tokens than necessary:

# Uncompressed — one verbose triple per line for 5 facts
(transformer_architecture, invented_by, Vaswani_et_al)
(transformer_architecture, introduced_in, Attention_Is_All_You_Need)
(transformer_architecture, uses_mechanism, self_attention)
(self_attention, is_a, attention_mechanism)
(attention_mechanism, computes, weighted_sum_of_values)

The compression stage eliminates this waste.

The Compression Algorithm

Graph compression takes a selected subgraph and produces a structured markdown document optimized for LLM consumption. The algorithm has three phases.

Phase 1: Subgraph Selection

Using the traversal strategies from the CCGC pattern — seed node identification, neighborhood expansion, community retrieval, or path-based retrieval — the system selects the relevant portion of the knowledge graph.
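Neighborhood expansion, one of the traversal strategies above, can be sketched as a bounded breadth-first search over a plain adjacency map. This is a minimal illustration: the graph and hop limit are invented for the example, and in a real system the seed nodes would come from matching the user's question against node names.

```python
# Minimal sketch of neighborhood expansion over a dict-based adjacency
# list. Seeds are passed in directly for illustration.
from collections import deque

def expand_neighborhood(graph, seeds, max_hops=2):
    """Collect all nodes within max_hops of any seed node (BFS)."""
    visited = {s: 0 for s in seeds}   # node -> hop distance
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if visited[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited[neighbor] = visited[node] + 1
                queue.append(neighbor)
    return set(visited)

graph = {
    "transformer": ["self_attention", "encoder"],
    "self_attention": ["attention_mechanism"],
    "attention_mechanism": ["weighted_sum"],
}
subgraph = expand_neighborhood(graph, ["transformer"], max_hops=2)
```

Nodes three hops out (here, "weighted_sum") are excluded, which is exactly how the hop limit bounds the subgraph handed to the next phase.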

Phase 2: Structural Analysis

The compression algorithm analyzes the subgraph to identify:

  • Central nodes — concepts with the highest degree or betweenness centrality within the subgraph
  • Hierarchical relationships — is-a, part-of, and category membership edges that form natural outlines
  • Lateral relationships — connections between peer concepts at the same level of abstraction
  • Attribute clusters — groups of properties attached to a single entity

This structural analysis determines how the information will be organized in the compressed output.
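A rough sketch of this analysis, assuming the subgraph arrives as a list of (subject, predicate, object) triples: degree counting stands in for centrality scoring, and a small illustrative set of predicate names separates hierarchical from lateral edges.

```python
# Sketch of the structural-analysis phase: rank nodes by degree within
# the subgraph and split hierarchical edges from lateral ones.
# The predicate names are illustrative, not a fixed vocabulary.
from collections import Counter

HIERARCHICAL = {"is_a", "part_of", "category_of"}

def analyze(edges):
    degree = Counter()
    hierarchy, lateral = [], []
    for subj, pred, obj in edges:
        degree[subj] += 1
        degree[obj] += 1
        (hierarchy if pred in HIERARCHICAL else lateral).append((subj, pred, obj))
    # Nodes with the highest degree are treated as central.
    central = [node for node, _ in degree.most_common()]
    return central, hierarchy, lateral

edges = [
    ("bert", "is_a", "transformer"),
    ("gpt", "is_a", "transformer"),
    ("transformer", "uses", "self_attention"),
]
central, hierarchy, lateral = analyze(edges)
```

A fuller implementation would use betweenness centrality rather than raw degree, but the output shape is the same: central nodes become top-level headings, hierarchical edges become nesting, and lateral edges become cross-reference bullets.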

Phase 3: Markdown Generation

The algorithm renders the subgraph as compact, hierarchical markdown that mirrors the structure a human expert would use to explain the topic:

## Transformer Architecture
- Introduced by Vaswani et al. (2017) in "Attention Is All You Need"
- Core mechanism: self-attention (computes weighted sum of values)
- Components: encoder stack, decoder stack, multi-head attention
- Key property: processes all positions in parallel (unlike RNNs)

### Variants
- **BERT**: encoder-only, bidirectional, developed by Google
- **GPT**: decoder-only, autoregressive, developed by OpenAI
- **T5**: encoder-decoder, text-to-text framework, developed by Google

### Downstream Impact
- Enabled transfer learning for NLP via pre-training
- Foundation for modern RAG pipelines
- Replaced RNN/LSTM as dominant sequence architecture

Compare this to the raw triple format — the markdown version conveys more information in fewer tokens by:

  • Using hierarchy to imply relationships (indentation replaces explicit "is-a" triples)
  • Grouping attributes under their entity (eliminates repeated subject references)
  • Omitting obvious predicates ("developed by" is implied by the parenthetical)
  • Using natural shorthand that the LLM can parse fluently
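A toy renderer makes the generation step concrete. The input shape (an entity with an attribute map and grouped children) and all field names are invented for this sketch; a real renderer would consume the output of the structural-analysis phase.

```python
# Toy renderer for Phase 3: an entity's attributes become bullets, and
# hierarchical children become nested sections under subheadings.

def render(entity, attrs, children, level=2):
    lines = [f"{'#' * level} {entity}"]
    # Attribute cluster: one bullet per property, grouped under the entity.
    for key, value in attrs.items():
        lines.append(f"- {key}: {value}")
    # Hierarchical children: the subheading itself carries the relationship.
    for group, members in children.items():
        lines.append(f"{'#' * (level + 1)} {group}")
        for name, summary in members:
            lines.append(f"- **{name}**: {summary}")
    return "\n".join(lines)

md = render(
    "Transformer Architecture",
    {"Core mechanism": "self-attention", "Introduced": "Vaswani et al. (2017)"},
    {"Variants": [("BERT", "encoder-only"), ("GPT", "decoder-only")]},
)
```

Note that no explicit "is-a" edge is emitted for BERT or GPT: placing them under the "Variants" subheading encodes that relationship structurally.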

Compression Techniques

The algorithm applies several specific techniques:

Hierarchical Nesting

Parent-child and category relationships become heading levels and bullet indentation. The structure itself carries meaning, so explicit relationship labels can be dropped.

Attribute Consolidation

Multiple properties of a single entity are merged into a single bullet or comma-separated list rather than one triple per property.
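The saving from consolidation can be shown directly. The naive regex tokenizer below is a rough stand-in for a real tokenizer, used only to compare the two forms; the triples are invented for the example.

```python
# Consolidating repeated-subject triples into one bullet, with a crude
# token count to illustrate the saving.
import re
from collections import defaultdict

def rough_tokens(text):
    # Naive stand-in for a real tokenizer: words and punctuation marks.
    return len(re.findall(r"\w+|[^\w\s]", text))

triples = [
    ("transformer_architecture", "component", "encoder_stack"),
    ("transformer_architecture", "component", "decoder_stack"),
    ("transformer_architecture", "component", "multi_head_attention"),
]
# One triple per line: the subject and predicate repeat every time.
raw = "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)

# Group by (subject, predicate) and emit a single comma-separated bullet.
grouped = defaultdict(list)
for s, p, o in triples:
    grouped[(s, p)].append(o.replace("_", " "))
consolidated = "\n".join(
    f"- {p}s: {', '.join(objs)}" for (s, p), objs in grouped.items()
)
```

The consolidated form states the subject and predicate once instead of three times, which is where the token saving comes from.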

Redundancy Elimination

If a fact can be inferred from the structure (e.g., all items under "### Variants" are variants of the parent heading), the explicit relationship edge is omitted.
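Because the structure encodes the edge, the relationship remains recoverable even though it is never stored explicitly. A sketch, with an invented naming convention (heading "Variants" implying a `variant_of` predicate):

```python
# Edges implied by the markdown structure can be dropped from explicit
# storage: items under a "Variants" heading are variants of the parent.

def implied_edges(parent, group, members):
    """Recover the relationship edges implied by one nested section."""
    predicate = group.lower().rstrip("s")  # "Variants" -> "variant"
    return [(member, f"{predicate}_of", parent) for member in members]

edges = implied_edges("Transformer", "Variants", ["BERT", "GPT", "T5"])
```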

Priority Ranking

When the subgraph is too large to fit the token budget even after compression, facts are ranked by relevance to the user's question. Lower-ranked facts are dropped first, preserving the most important information.
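A minimal sketch of ranking under a token budget. The relevance score here is deliberately naive (term overlap with the question) and the whitespace split is a crude token estimate; a real system would use embedding similarity and the LLM's actual tokenizer.

```python
# Sketch of priority ranking: score facts by overlap with the question,
# then keep the highest-ranked facts that fit the token budget.

def rank_and_trim(facts, question, token_budget):
    q_terms = set(question.lower().split())

    def score(fact):
        # Naive relevance: count of shared terms with the question.
        return len(q_terms & set(fact.lower().split()))

    ranked = sorted(facts, key=score, reverse=True)
    kept, used = [], 0
    for fact in ranked:
        cost = len(fact.split())  # crude token estimate
        if used + cost <= token_budget:
            kept.append(fact)
            used += cost
    return kept

facts = [
    "self-attention computes a weighted sum of values",
    "transformers process all positions in parallel",
    "BERT was developed by Google",
]
kept = rank_and_trim(facts, "how does self-attention work", token_budget=10)
```

With a 10-token budget only the most relevant fact survives; the lower-ranked facts are the first to be dropped, as the text above describes.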

Edge Summarization

Dense clusters of similar relationships are summarized rather than enumerated. Instead of listing 20 individual "uses" edges, the compressed output might say "Used by most modern NLP systems including BERT, GPT, and T5."
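The summarization threshold and phrasing below are illustrative choices, not fixed rules; the point is that a dense fan-out of identical predicates collapses to one line.

```python
# Sketch of edge summarization: when one relationship type connects an
# entity to many targets, list a few examples and summarize the rest.

def summarize_edges(subject, predicate, targets, max_listed=3):
    if len(targets) <= max_listed:
        # Small cluster: enumerate every edge as its own bullet.
        return [f"- {predicate}: {t}" for t in targets]
    # Dense cluster: one summary line with representative examples.
    shown = ", ".join(targets[:max_listed])
    return [f"- {predicate} {len(targets)} systems including {shown}"]

bullets = summarize_edges(
    "self_attention", "used_by",
    ["BERT", "GPT", "T5", "T0", "BART", "RoBERTa"],
)
```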

A Neuro-Symbolic Design

The CCGCC pattern is a neuro-symbolic architecture — it combines two fundamentally different reasoning approaches:

| Component | Paradigm | Strength |
|---|---|---|
| Graph traversal | Symbolic — classical, deterministic | Precise, explainable, follows exact relationships |
| LLM generation | Neural — statistical, probabilistic | Fluent, flexible, handles ambiguity and novel phrasing |

The symbolic stage (graph traversal and compression) ensures that the right knowledge reaches the LLM. The neural stage (language generation) ensures that knowledge is communicated naturally to the user.

This division of labor plays to each paradigm's strengths:

  • Deterministic graph traversal will never hallucinate a relationship that doesn't exist in the graph. It provides a reliable, auditable chain of reasoning from question to retrieved context.
  • The LLM excels at interpreting the user's intent, synthesizing the compressed context into a coherent narrative, and handling the infinite variety of natural language questions.

Neither paradigm alone achieves the quality of the combined system. Vector retrieval (CVC) misses structural relationships. Graph traversal alone cannot generate fluent natural language. The CCGCC pattern gets the best of both worlds.

Comparing All Three Patterns

| Aspect | CVC | CCGC | CCGCC |
|---|---|---|---|
| Context source | Raw text chunks | Subgraph triples | Compressed markdown |
| Token efficiency | Low — verbose fragments | Medium — structured but redundant | High — optimized per token |
| Relationship reasoning | Implicit only | Explicit edges | Explicit + hierarchical structure |
| Context window usage | Wasteful | Better | Best — managed as a precious resource |
| Answer quality | Good for simple questions | Better for connected questions | Best — precise, grounded, structured |
| Architecture | Pure neural | Neural + graph | Neuro-symbolic |
| Complexity | Low | Medium | Higher — requires compression layer |

Strengths of the CCGCC Pattern

  • Maximum information density — the compression step packs more knowledge into fewer tokens
  • Better answers — the LLM receives well-organized, relevant context that mirrors expert explanation
  • Context window discipline — explicit management of the token budget prevents information loss
  • Neuro-symbolic synergy — deterministic graph reasoning combined with neural language generation
  • Explainable pipeline — each stage produces inspectable intermediate outputs (extracted concepts, graph, compressed markdown, generated answer)
  • Scalable to large graphs — compression ensures that even graphs with millions of nodes can serve a fixed-size context window

Limitations of the CCGCC Pattern

  • Compression design is non-trivial — the markdown generation algorithm must be tuned to the domain and the LLM's parsing strengths
  • Information loss — any compression discards some detail; the ranking heuristic may drop facts that turn out to be relevant
  • Pipeline latency — the additional compression stage adds processing time between query and response
  • Requires the full CCGC pipeline — graph compression only works if the upstream extraction and graph construction stages are in place