Training-Friendly RAG Demo

Local LLM + VectorDB RAG Training UI

This UI explains how a knowledge base is stored in VectorDB, how nearest-neighbor retrieval works, and how the LLM behaves before and after retrieval.

Core idea

  • VectorDB stores each document's text together with its vector representation
  • KNN retrieval finds the document vectors nearest to the question vector
  • LLM writes the final answer using the retrieved context
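The three bullets above can be sketched with toy data. This is a minimal illustration, not the real store: the documents, vectors, and 3-D dimensionality here are made up (production vectors are 1536-D), and cosine similarity stands in for VectorDB's internal search.

```python
import numpy as np

# Toy KB: each record pairs text with a vector (3-D here; real stores use 1536-D).
kb = [
    {"text": "Michael is a person.",             "vector": np.array([0.9, 0.1, 0.0])},
    {"text": "VectorDB powers semantic search.", "vector": np.array([0.1, 0.9, 0.2])},
    {"text": "Singapore is in Southeast Asia.",  "vector": np.array([0.0, 0.2, 0.9])},
]

def cosine(a, b):
    # Cosine similarity: 1.0 means identical direction, 0.0 means unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def knn(question_vec, records, top_k=2):
    # Rank records by similarity to the question vector and keep the Top K.
    scored = sorted(records, key=lambda r: cosine(question_vec, r["vector"]), reverse=True)
    return scored[:top_k]

question = np.array([0.8, 0.2, 0.1])   # pretend embedding of the user question
for rec in knn(question, kb):
    print(rec["text"])
```

Raising or lowering `top_k` changes how many of these ranked neighbors reach the LLM, which is why Top K matters below.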

Why 3D visualization is useful

  • Shows conceptually how similar items cluster together
  • Highlights nearest retrieved documents
  • Explains why Top K matters
  • Helps users understand RAG step-by-step

1. System & KB Operations

Real stored vectors are 1536-D. The 3D graph below is a training projection for explanation.
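One way such a training projection can be produced is classic PCA, sketched below with NumPy. The random vectors are stand-ins for stored embeddings; the actual UI may use a different projection method.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10, 1536))   # stand-ins for 10 stored 1536-D embeddings

def project_to_3d(X):
    # Classic PCA: center the data, take the top 3 right-singular vectors, project.
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T          # shape (n_docs, 3): the plotted coordinates

coords = project_to_3d(vectors)
print(coords.shape)   # (10, 3)
```

Similarity search still runs on the full 1536-D vectors; the 3-D coordinates exist only for the chart.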

What a KB is

A KB is a real VectorDB collection. Each record stores human-readable text, metadata, and a vector used for similarity search.

2. Add Document

Stored structure

{
  "id": "doc1",
  "doc_id": "doc1",
  "title": "Sample Note",
  "text": "...",
  "lang": "en",
  "metadata_json": "{...}",
  "vector": [1536 values]
}
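A record matching the structure above could be assembled like this. The `embed` function is a placeholder: a real deployment would call an embedding model returning 1536 floats, while here a tiny deterministic vector is derived just to keep the sketch runnable.

```python
import hashlib
import json

def embed(text):
    # Placeholder embedding: derive a small deterministic vector from the text.
    # A real system would call an embedding model that returns 1536 floats.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

def make_record(doc_id, title, text, lang="en", metadata=None):
    # Mirrors the stored structure shown above: text, metadata, and a vector.
    return {
        "id": doc_id,
        "doc_id": doc_id,
        "title": title,
        "text": text,
        "lang": lang,
        "metadata_json": json.dumps(metadata or {}),
        "vector": embed(text),
    }

record = make_record("doc1", "Sample Note", "Michael is a person.")
print(record["id"], len(record["vector"]))
```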

3. Chat Test

LLM usage

  • Before RAG: LLM sees only the question
  • After RAG: LLM sees the question plus retrieved context
  • Input side: user question triggers retrieval flow
  • Output side: LLM turns retrieved facts into a natural-language answer
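The before/after difference comes down to what goes into the prompt. A minimal sketch (the wording of the template is illustrative, not the UI's exact prompt):

```python
def build_prompt(question, retrieved=None):
    # Before RAG: the LLM sees only the question.
    if not retrieved:
        return question
    # After RAG: retrieved passages are prepended as grounding context.
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

print(build_prompt("Who is Michael?"))
print(build_prompt("Who is Michael?", ["Michael is a person."]))
```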

4. Step-by-Step RAG Flow

Step 1: Store document text and vector in the KB
Step 2: Convert user question into a comparable vector representation
Step 3: Run KNN / nearest-neighbor search in VectorDB
Step 4: Select Top K nearest documents
Step 5: Send question + retrieved context to the LLM
Step 6: LLM generates final answer
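The six steps can be chained in one short sketch. Both `embed` and `generate` are stand-ins (a keyword-match "embedding" and a placeholder LLM), so only the flow, not the models, is real here:

```python
# Each piece below maps to one numbered step above.

def embed(text):                                   # Steps 1-2: text -> vector
    vocab = ["michael", "vectordb", "singapore"]   # toy "embedding" dimensions
    t = text.lower()
    return [float(w in t) for w in vocab]

def knn(qv, kb, top_k=2):                          # Steps 3-4: nearest neighbors
    score = lambda rec: sum(a * b for a, b in zip(qv, rec["vector"]))
    return sorted(kb, key=score, reverse=True)[:top_k]

def generate(prompt):                              # Step 6: placeholder LLM call
    return f"[LLM answer grounded in]\n{prompt}"

kb = [{"text": t, "vector": embed(t)} for t in [   # Step 1: store text + vector
    "Michael is a person.",
    "Singapore is in Southeast Asia.",
]]
question = "Where is Singapore?"
hits = knn(embed(question), kb, top_k=1)           # Steps 2-4: retrieve Top K
prompt = "\n".join(h["text"] for h in hits) + "\n" + question  # Step 5
print(generate(prompt))                            # Step 6: final answer
```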
The graph below visualizes these steps conceptually in a 3D coordinate space.

5. 3D Coordinate Visualization of Retrieval

  • Blue: documents in the KB
  • Red: question point
  • Green: nearest retrieved documents
  • Gray lines: question-to-nearest relationships

How to read this chart

  • Every document is projected to a 3D coordinate for teaching purposes
  • The question is projected the same way
  • Nearest highlighted points are the documents retrieved for RAG
  • This helps explain why the answer was grounded using those records
  • The real database search still runs on the stored higher-dimensional vectors

6. Recommended Demo Flow

1. Create KB
2. Add doc1: Michael is a person.
3. Add doc2: Tencent Cloud VectorDB is used for semantic retrieval and RAG.
4. Add doc3: Singapore is in Southeast Asia.
5. Click List Documents
6. Ask Before RAG
7. Ask After RAG
8. Observe nearest points in the 3D graph
9. Explain that nearest docs are sent to the LLM as context

7. Live Response

Ready.