RAG Operations Guide

ADR-0049: Multi-Agent LLM Memory Architecture

This guide covers the RAG (Retrieval-Augmented Generation) system operations for Qubinode Navigator.

Overview

The RAG system provides:

Document Storage: ADRs, DAG examples, provider docs
Semantic Search: Find relevant documents by meaning, not just keywords
Troubleshooting Memory: Learn from past solutions
Agent Decisions: Track decision history for learning

Architecture

┌────────────────────────────────────────────────────────────────┐
│                        RAG Store                                │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │
│  │ Embedding   │  │  PgVector   │  │    PostgreSQL           │ │
│  │ Service     │─▶│  Extension  │─▶│    Tables               │ │
│  │ (MiniLM)    │  │  (vector)   │  │                         │ │
│  └─────────────┘  └─────────────┘  │  - rag_documents        │ │
│                                     │  - troubleshooting_     │ │
│                                     │    attempts             │ │
│                                     │  - agent_decisions      │ │
│                                     │  - airflow_providers    │ │
│                                     └─────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘

Database Schema

rag_documents

Stores document embeddings for semantic search.

Column	Type	Description
id	UUID	Primary key
content	TEXT	Document content
content_hash	VARCHAR(64)	SHA256 for deduplication
embedding	vector(384)	MiniLM embedding
doc_type	VARCHAR(50)	adr, dag, provider_doc, etc.
source_path	TEXT	File path or URL
metadata	JSONB	Additional metadata
created_at	TIMESTAMP	Creation time

troubleshooting_attempts

Records troubleshooting history for learning.

Column	Type	Description
id	UUID	Primary key
session_id	UUID	Session identifier
task_description	TEXT	What was attempted
error_message	TEXT	Error encountered
attempted_solution	TEXT	Solution tried
result	VARCHAR(20)	success, failed, partial
embedding	vector(384)	For similarity search
confidence_score	FLOAT	Confidence when attempted
agent	VARCHAR(50)	Which agent made decision

agent_decisions

Logs all agent decisions for auditing.

Column	Type	Description
id	UUID	Primary key
agent	VARCHAR(50)	manager, developer, calling_llm
decision_type	VARCHAR(50)	task_execution, escalation, etc.
decision	TEXT	What was decided
reasoning	TEXT	Why it was decided
confidence	FLOAT	Confidence score
outcome	VARCHAR(20)	success, failed, pending

Operations

Document Ingestion

Via MCP Tool:

ingest_to_rag(
  content="# My Document\n\nContent here...",
  doc_type="guide",
  source="/path/to/file.md",
  metadata={"author": "team", "version": "1.0"}
)

Via Python:

from qubinode.rag_store import get_rag_store

store = get_rag_store()
store.ingest_document(
    content="Document content",
    doc_type="adr",
    source_path="/docs/adrs/adr-0001.md",
    metadata={"title": "ADR-0001"}
)

Document Search

Via MCP Tool:

query_rag(
  query="How do I deploy FreeIPA?",
  doc_types=["adr", "dag", "guide"],
  limit=5,
  threshold=0.7
)

Via Python:

results = store.search_documents(
    query="FreeIPA deployment",
    doc_types=["adr", "dag"],
    limit=5,
    threshold=0.7
)

for doc in results:
    print(f"Score: {doc['similarity']:.2f}")
    print(f"Type: {doc['doc_type']}")
    print(f"Content: {doc['content'][:200]}...")

Troubleshooting History

Log an attempt:

log_troubleshooting_attempt(
  task="Deploy FreeIPA server",
  solution="Added entry to /etc/hosts for DNS resolution",
  result="success",
  error_message="DNS lookup failed for ipa.example.com",
  component="freeipa"
)

Search similar errors:

get_troubleshooting_history(
  error_pattern="DNS",
  component="freeipa",
  only_successful=True
)

Statistics

Via MCP Tool:

get_rag_stats()

Via SQL:

-- Document counts by type
SELECT doc_type, COUNT(*)
FROM rag_documents
GROUP BY doc_type;

-- Troubleshooting success rate
SELECT result, COUNT(*)
FROM troubleshooting_attempts
GROUP BY result;

-- Recent agent decisions
SELECT agent, decision_type, confidence, created_at
FROM agent_decisions
ORDER BY created_at DESC
LIMIT 10;

Embedding Service

Configuration

Variable	Default	Description
`EMBEDDING_PROVIDER`	local	local or openai
`EMBEDDING_MODEL`	sentence-transformers/all-MiniLM-L6-v2	Model name
`EMBEDDING_DIMENSIONS`	384	Vector dimensions
`OPENAI_API_KEY`	-	Required for OpenAI

Local Model (Default)

Uses sentence-transformers/all-MiniLM-L6-v2:

384 dimensions
Works offline (air-gapped)
Good balance of speed and quality
~80MB model size

OpenAI Model (Optional)

Uses text-embedding-ada-002:

1536 dimensions
Requires API key and internet
Higher quality but external dependency

To switch:

export EMBEDDING_PROVIDER=openai
export EMBEDDING_MODEL=text-embedding-ada-002
export EMBEDDING_DIMENSIONS=1536
export OPENAI_API_KEY=your-key-here

Bootstrap DAG

The rag_bootstrap DAG initializes the knowledge base:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  ingest_adrs    │     │ ingest_dags     │     │ingest_providers │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                                 ▼
                    ┌─────────────────────────┐
                    │   verify_rag_health     │
                    └────────────┬────────────┘
                                 │
                                 ▼
                    ┌─────────────────────────┐
                    │ generate_lineage_facets │
                    └─────────────────────────┘

Trigger:

airflow dags trigger rag_bootstrap

Tasks:

ingest_adrs - Ingests docs/adrs/*.md
ingest_dag_examples - Ingests existing DAG files
ingest_provider_docs - Ingests provider documentation
ingest_guides - Ingests troubleshooting guides
verify_rag_health - Verifies system health
generate_lineage_facets - Generates OpenLineage facets

Maintenance

Rebuilding Index

If similarity search is slow:

-- Drop and recreate IVFFlat index
DROP INDEX IF EXISTS idx_rag_documents_embedding;

CREATE INDEX idx_rag_documents_embedding
    ON rag_documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

Cleaning Duplicates

-- Remove duplicate documents by content hash
DELETE FROM rag_documents a
USING rag_documents b
WHERE a.id > b.id
AND a.content_hash = b.content_hash;

Updating Embeddings

If you change embedding models:

from qubinode.rag_store import get_rag_store
from qubinode.embedding_service import get_embedding_service

store = get_rag_store()
embedding_service = get_embedding_service()

# Re-embed all documents
with store._get_connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT id, content FROM rag_documents")
        for row in cur.fetchall():
            doc_id, content = row
            new_embedding = embedding_service.embed(content)
            cur.execute(
                "UPDATE rag_documents SET embedding = %s WHERE id = %s",
                (new_embedding, doc_id)
            )
    conn.commit()

Performance Tuning

Index Lists

The IVFFlat index lists parameter affects performance:

Documents	Recommended Lists
< 1,000	50
1,000 - 10,000	100
10,000 - 100,000	200
> 100,000	400+

Query Optimization

-- Set probes for accuracy vs speed tradeoff
SET ivfflat.probes = 10;  -- Higher = more accurate, slower

Batch Operations

For bulk ingestion:

# Use batch embedding
texts = ["doc1", "doc2", "doc3", ...]
embeddings = embedding_service.embed_batch(texts, batch_size=32)

Troubleshooting

“pgvector extension not found”

CREATE EXTENSION IF NOT EXISTS vector;

Or check Docker image:

docker exec airflow-postgres-1 psql -U airflow -c "SELECT * FROM pg_extension WHERE extname = 'vector'"

“Embedding dimension mismatch”

Ensure EMBEDDING_DIMENSIONS matches your model:

MiniLM-L6: 384
ada-002: 1536

Slow Queries

Check index exists:

SELECT indexname FROM pg_indexes WHERE tablename = 'rag_documents';

Increase probes:
```
SET ivfflat.probes = 20;
```
Rebuild index with more lists (see Rebuilding Index)

Empty Results

Check document count:
```
SELECT COUNT(*) FROM rag_documents;
```

Lower threshold:

results = store.search_documents(query, threshold=0.3)

Run bootstrap:
```
airflow dags trigger rag_bootstrap
```

RAG Operations Guide

Overview

Architecture

Database Schema

rag_documents

troubleshooting_attempts

agent_decisions

Operations

Document Ingestion

Document Search

Troubleshooting History

Statistics

Embedding Service

Configuration

Local Model (Default)

OpenAI Model (Optional)

Bootstrap DAG

Maintenance

Rebuilding Index

Cleaning Duplicates

Updating Embeddings

Performance Tuning

Index Lists

Query Optimization

Batch Operations

Troubleshooting

“pgvector extension not found”

“Embedding dimension mismatch”

Slow Queries

Empty Results

Related Documentation