How WordPress Chatbot RAG (Retrieval-Augmented Generation) Works

April 12, 2026
11 min read
By Sant Chat AI Team

You've heard AI chatbots "learn from your website." But how does that actually work? Let's break down RAG (Retrieval-Augmented Generation) in simple terms.

What is RAG?

RAG = Retrieval-Augmented Generation

Think of it like an open-book exam:

  • Traditional AI: Answers from memory (limited, outdated)
  • RAG-powered AI: Looks up the answer in your "book" (your website), then explains it

Why this matters for WordPress chatbots:

  • AI knows your specific content
  • Answers stay up-to-date as long as you re-sync after changes
  • No training required
  • Works with any website

The Problem RAG Solves

Traditional AI Limitations

GPT-4 by itself:

  • Trained on data up to a fixed cutoff date, so it misses anything newer
  • Doesn't know your business
  • Can't access your website
  • Makes up answers when uncertain ("hallucinations")

Example failure:

Visitor asks: "What's your refund policy?"

GPT-4 alone: "I don't know your specific refund policy." (Or worse, it invents a plausible-sounding policy.)

Not helpful.

How RAG Fixes This

With RAG:

  1. AI searches your website for "refund policy"
  2. Finds your /refund-policy/ page
  3. Reads the content
  4. Generates an answer from that content

Example success:

Visitor asks: "What's your refund policy?"

RAG-powered AI:

  1. Searches your site
  2. Finds: "We offer 30-day money-back guarantees on all purchases"
  3. Answers: "We offer a 30-day money-back guarantee on all purchases. If you're not satisfied, contact us within 30 days for a full refund."

Helpful and accurate.


How RAG Works (Simple Explanation)

Step 1: Ingestion (Reading Your Website)

What happens:

  • Sant Chat AI scans your WordPress sitemap
  • Extracts text from all pages and posts
  • Cleans formatting (removes HTML, navigation, etc.)
  • Splits content into "chunks" (paragraphs or sections)

Example:

Your /shipping-policy/ page has 3 sections:

  1. Domestic Shipping (200 words)
  2. International Shipping (150 words)
  3. Expedited Shipping (100 words)

Result: 3 chunks stored separately

Why chunks?

  • Smaller pieces are easier to search
  • More precise answers
  • Faster retrieval
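
To make the clean-and-split step concrete, here is a minimal sketch. It assumes each page arrives as HTML and that every section starts at an `<h2>` heading. The names `SectionChunker` and `chunk_page` are hypothetical and for illustration only, not Sant Chat AI's actual code. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class SectionChunker(HTMLParser):
    """Collects visible text and starts a new chunk at every <h2> heading."""

    # Assumption: these tags hold navigation or code, not page content.
    SKIP_TAGS = {"nav", "script", "style", "footer"}

    def __init__(self):
        super().__init__()
        self.chunks = []      # finished chunks, one per section
        self._current = []    # text pieces of the section being built
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "h2":
            self._flush()  # a new heading closes the previous section

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._current.append(text)

    def _flush(self):
        if self._current:
            self.chunks.append(" ".join(self._current))
            self._current = []

    def close(self):
        super().close()
        self._flush()  # store the last section


def chunk_page(html):
    parser = SectionChunker()
    parser.feed(html)
    parser.close()
    return parser.chunks


page = """
<nav>Home | Shop</nav>
<h2>Domestic Shipping</h2><p>We ship within 2-3 business days.</p>
<h2>International Shipping</h2><p>International orders take 7-14 days.</p>
<h2>Expedited Shipping</h2><p>Expedited shipping available for $15.</p>
"""
chunks = chunk_page(page)
print(len(chunks))  # 3
```

The navigation text is dropped and each heading plus its paragraph becomes one chunk, matching the "3 chunks stored separately" result above.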

Step 2: Embeddings (Converting to Numbers)

This is where it gets magical.

What are embeddings? Embeddings convert text into numbers that represent meaning.

Simplified example:

Text: "We ship within 2-3 business days"
Embedding: [0.2, 0.8, 0.1, 0.5, ...] (actually 1,536 numbers)

Text: "Delivery takes 2-3 days"
Embedding: [0.21, 0.79, 0.12, 0.51, ...] (very similar numbers!)

Key insight: Phrases with similar meanings have similar embeddings, even if the words are different.

Why this matters:

  • AI understands synonyms ("ship" = "deliver")
  • Understands variations ("2-3 days" = "a few days")
  • Can find relevant content even if exact words don't match
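
You can check the "similar numbers" claim yourself with the four illustrative values from the example. This is a toy sketch: real embeddings come from an embedding model and have far more dimensions. The `unrelated` vector is invented here purely for contrast.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


ship = [0.2, 0.8, 0.1, 0.5]          # "We ship within 2-3 business days"
delivery = [0.21, 0.79, 0.12, 0.51]  # "Delivery takes 2-3 days"
unrelated = [0.9, 0.05, 0.7, 0.02]   # made-up vector for an unrelated sentence

close = cosine_similarity(ship, delivery)
far = cosine_similarity(ship, unrelated)
print(close > far)  # True
```

The two shipping phrases score very close to 1, while the unrelated vector scores much lower, even though no exact words were compared.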

Step 3: Storage (Vector Database)

All those embeddings are stored in a "vector database."

Think of it like a library:

  • Each chunk is a book
  • Embeddings are the Dewey Decimal numbers
  • Books with similar numbers are related

Example vector database entry:

Chunk ID: 123
Text: "We ship within 2-3 business days for domestic orders."
Embedding: [0.2, 0.8, 0.1, ...]
Source: /shipping-policy/

Sant Chat AI stores thousands of these chunks for your website.
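
As a rough picture of what one entry looks like in code, here is a sketch of the record above and a tiny in-memory store. `ChunkRecord` and `InMemoryVectorStore` are hypothetical names. A production vector database adds indexing and persistence that this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChunkRecord:
    chunk_id: int
    text: str
    embedding: List[float]
    source: str  # URL path of the page the chunk came from


class InMemoryVectorStore:
    """Keeps records in a dict keyed by chunk ID (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[int, ChunkRecord] = {}

    def upsert(self, record: ChunkRecord) -> None:
        # Insert a new record or overwrite one with the same ID.
        self._records[record.chunk_id] = record

    def get(self, chunk_id: int) -> Optional[ChunkRecord]:
        return self._records.get(chunk_id)

    def delete_source(self, source: str) -> int:
        # Remove every chunk of one page, e.g. before re-syncing it.
        stale = [cid for cid, r in self._records.items() if r.source == source]
        for cid in stale:
            del self._records[cid]
        return len(stale)

    def __len__(self) -> int:
        return len(self._records)


store = InMemoryVectorStore()
store.upsert(
    ChunkRecord(
        chunk_id=123,
        text="We ship within 2-3 business days for domestic orders.",
        embedding=[0.2, 0.8, 0.1],  # truncated toy embedding
        source="/shipping-policy/",
    )
)
print(len(store))  # 1
```

Keeping the source path on every record is what lets the chatbot trace an answer back to a specific page.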

Step 4: Query (Searching)

When a visitor asks a question:

Visitor asks: "How fast do you ship?"

What happens:

  1. AI converts the question to an embedding: [0.19, 0.81, 0.11, ...]
  2. Compares to all stored chunk embeddings
  3. Finds the closest matches (similar numbers = similar meaning)
  4. Retrieves the top 5 most relevant chunks

Example search results:

| Rank | Chunk | Similarity Score |
|------|-------|------------------|
| 1 | "We ship within 2-3 business days..." | 0.89 |
| 2 | "Expedited shipping available for $15..." | 0.74 |
| 3 | "International orders take 7-14 days..." | 0.68 |

AI now has the relevant content to work with.
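
The search itself can be sketched as a brute-force comparison against every stored chunk. This assumes a small in-memory list of chunks. The three-number vectors are invented, so the scores will not match the table above. A real vector database uses an index instead of scanning everything, and `top_k` is a hypothetical helper name.

```python
import heapq
import math


def normalize(v):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def top_k(query_embedding, chunks, k=3):
    """Return the k best (score, text) pairs, highest score first."""
    q = normalize(query_embedding)
    scored = []
    for text, embedding in chunks:
        e = normalize(embedding)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((score, text))
    return heapq.nlargest(k, scored)


# Toy data: (chunk text, made-up embedding)
chunks = [
    ("Returns must be unused.", [0.9, 0.1, 0.2]),
    ("International orders take 7-14 days.", [0.6, 0.5, 0.4]),
    ("We ship within 2-3 business days.", [0.2, 0.8, 0.1]),
    ("Expedited shipping available for $15.", [0.5, 0.6, 0.3]),
]
query = [0.19, 0.81, 0.11]  # stands in for "How fast do you ship?"

results = top_k(query, chunks, k=3)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

The returns chunk scores lowest and is left out, so only the three shipping-related chunks move on to the next step.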

Step 5: Generation (Crafting the Answer)

AI takes the retrieved chunks and generates a natural answer.

Input to AI:

  • Question: "How fast do you ship?"
  • Context (top chunks):
    • "We ship within 2-3 business days for domestic orders."
    • "Expedited shipping available for $15 (1-2 days)."
    • "International orders take 7-14 days."

AI generates:

"We ship domestic orders within 2-3 business days. If you need it faster, we offer expedited shipping (1-2 days) for $15. International orders typically take 7-14 days."

Why this is better than just showing the chunks:

  • Answer is conversational
  • Combines multiple sources
  • Tailored to the question
  • Easy to understand
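
Under the hood, "input to AI" is usually just one block of text that combines the question with the retrieved chunks. A minimal sketch of that assembly follows. The instruction wording and the `build_prompt` name are assumptions for illustration, not Sant Chat AI's actual prompt.

```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the visitor's question into one prompt."""
    # Number the chunks so the model (and a human debugging) can tell them apart.
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "How fast do you ship?",
    [
        "We ship within 2-3 business days for domestic orders.",
        "Expedited shipping available for $15 (1-2 days).",
        "International orders take 7-14 days.",
    ],
)
print(prompt)
```

Telling the model to stick to the supplied context is one common way to discourage made-up answers.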

The RAG Pipeline in Action

Real Example: Refund Question

1. Visitor asks: "Can I get a refund if I don't like the product?"

2. Query embedding created: [0.15, 0.72, 0.34, ...]

3. Vector search finds relevant chunks:

Top match (0.91 similarity):

"We offer a 30-day money-back guarantee. If you're not satisfied with your purchase, contact us within 30 days for a full refund."

Second match (0.78 similarity):

"Refunds are processed within 5-7 business days after we receive the returned item."

Third match (0.65 similarity):

"Products must be in original packaging and unused for a full refund."

4. AI combines these and generates:

"Yes! We offer a 30-day money-back guarantee. If you're not satisfied, you can get a full refund as long as the product is in original packaging and unused. Contact us within 30 days, and we'll process your refund within 5-7 business days after receiving the return."

5. AI responds to visitor

Visitor is happy!


Why RAG is Better Than Traditional Chatbots

Old Approach: Rule-Based Chatbots

How they work:

  • You define intents: "refund," "shipping," "pricing"
  • For each intent, you write sample phrases: "I want a refund," "How do I get my money back?"
  • You manually write responses

Problems:

  • Takes weeks to set up
  • Can't handle variations ("Can I get my money back?" vs. "Refund please")
  • Breaks when you update your website
  • Requires constant maintenance
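
To see why variations are a problem, here is a deliberately naive sketch of the rule-based approach. It matches only the exact sample phrases you wrote. Real rule-based tools often add fuzzier matching, so treat this as the simplest possible case. `INTENTS` and `rule_based_reply` are hypothetical names.

```python
INTENTS = {
    "refund": {
        "phrases": ["i want a refund", "how do i get my money back?"],
        "response": "We offer a 30-day money-back guarantee.",
    },
}


def rule_based_reply(message):
    """Return the canned response if the message matches a sample phrase."""
    normalized = message.strip().lower()
    for intent in INTENTS.values():
        if normalized in intent["phrases"]:
            return intent["response"]
    return None  # no rule matched


print(rule_based_reply("I want a refund"))  # We offer a 30-day money-back guarantee.
print(rule_based_reply("Refund please"))    # None
```

Every new way of phrasing a question means another hand-written entry, which is where the ongoing maintenance comes from.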

RAG Approach

How it works:

  • Point AI at your website
  • AI reads everything automatically
  • Answers any question about your content
  • Updates when you re-sync after publishing new content

Advantages:

  • 5 minutes to set up
  • Handles infinite variations
  • Stays current with a quick sync after updates
  • No other maintenance required

How Sant Chat AI Uses RAG

The Full Workflow

Initial Setup (one-time):

  1. Install plugin on your WordPress site
  2. Connect API key
  3. Sync knowledge base
    • AI reads your sitemap
    • Extracts all page/post content
    • Creates embeddings for 500-2,000 chunks (depending on site size)
    • Stores in vector database

Time: 30-60 seconds

Every Conversation:

  1. Visitor asks question
  2. Question converted to embedding
  3. Vector search finds relevant chunks
  4. AI generates answer from chunks
  5. Answer sent to visitor

Time: 1-3 seconds

When You Update Content:

  1. Publish new page or update existing page
  2. Click "Sync Knowledge Base" in WordPress
  3. AI re-reads changed content
  4. Updates embeddings

Time: 10-30 seconds
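
One plausible way to "re-read changed content" without redoing the whole site is to compare a fingerprint of each page with the one saved at the last sync. The sketch below is an assumption about how such a step could work, not a description of Sant Chat AI's internals. `pages_to_resync` is a hypothetical helper.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a page's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_to_resync(pages, known_hashes):
    """Return URLs whose content changed since the last sync.

    pages: dict of URL -> current page text
    known_hashes: dict of URL -> hash saved at the last sync (updated in place)
    """
    changed = []
    for url, text in pages.items():
        digest = content_hash(text)
        if known_hashes.get(url) != digest:
            changed.append(url)
            known_hashes[url] = digest
    return changed


hashes = {}
site = {
    "/shipping-policy/": "We ship within 2-3 business days.",
    "/refund-policy/": "We offer a 30-day money-back guarantee.",
}
first_sync = pages_to_resync(site, hashes)   # both pages are new

site["/refund-policy/"] = "We offer a 60-day money-back guarantee."
second_sync = pages_to_resync(site, hashes)  # only the edited page
print(second_sync)  # ['/refund-policy/']
```

Only the pages in the returned list would need new chunks and embeddings, which is consistent with a re-sync being faster than the initial one.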


Technical Deep Dive (Optional)

For the curious: Here's what's happening under the hood.

Vector Similarity Math

How does AI find similar embeddings?

Cosine similarity formula:

similarity = (A · B) / (||A|| × ||B||)

What this means:

  • Compares two vectors (embeddings)
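  • Returns a score from -1 (opposite direction) to 1 (same direction); in practice, scores for text embeddings mostly fall between 0 (unrelated) and 1 (near-identical meaning)
  • Sant Chat AI retrieves only chunks that clear a minimum score of 0.65 (see Retrieval Strategy below)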
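
As a worked example, take two made-up two-dimensional vectors, A = (1, 2) and B = (2, 1). Real embeddings have 1,536 dimensions, but the arithmetic is the same:

```latex
\[
\text{similarity}(A, B)
= \frac{A \cdot B}{\|A\| \, \|B\|}
= \frac{1 \cdot 2 + 2 \cdot 1}{\sqrt{1^2 + 2^2} \, \sqrt{2^2 + 1^2}}
= \frac{4}{\sqrt{5} \, \sqrt{5}}
= 0.8
\]
```

A score of 0.8 says the two vectors point in broadly the same direction. Because the formula divides by both lengths, only the direction matters, not how long the vectors are.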

Embedding Models

Sant Chat AI uses OpenAI's text-embedding-3-small model:

  • Each embedding: 1,536 dimensions
  • Cost: $0.00002 per 1,000 tokens
  • Speed: 10,000 chunks/second
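
To put the price in perspective, here is the arithmetic for a large site. It uses the per-token price quoted above and the upper ends of the chunk count and chunk size mentioned in this article. `embedding_cost` is a hypothetical helper.

```python
PRICE_PER_1K_TOKENS = 0.00002  # USD, the figure quoted above


def embedding_cost(num_chunks, tokens_per_chunk):
    """Estimated one-time cost in USD of embedding every chunk once."""
    total_tokens = num_chunks * tokens_per_chunk
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS


cost = embedding_cost(num_chunks=2000, tokens_per_chunk=500)
print(f"${cost:.2f}")  # $0.02
```

Even at 2,000 chunks of 500 tokens each (1 million tokens in total), embedding the whole site costs about two cents.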

Why this model?

  • High accuracy
  • Fast
  • Multilingual support
  • Affordable

Chunk Size Optimization

Sant Chat AI uses:

  • Target chunk size: 300-500 tokens (~200-350 words)
  • Overlap: 50 tokens between chunks
  • Why overlap? Ensures context isn't lost at boundaries

Example with overlap:

Chunk 1: "...We offer free shipping on orders over $50. International shipping is available..."

Chunk 2: "...International shipping is available to 100+ countries. Delivery times vary by location..."

Notice "International shipping is available" appears in both → better retrieval.
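
Overlapping chunks are typically produced with a sliding window. The sketch below splits on whitespace as a stand-in for tokens, which is an assumption: a real tokenizer counts tokens differently from words. The default of 400 is simply a value inside the 300-500 range above, and `chunk_with_overlap` is a hypothetical helper.

```python
def chunk_with_overlap(tokens, chunk_size=400, overlap=50):
    """Split a token list into windows that share `overlap` tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


text = " ".join(f"word{i}" for i in range(1000))
tokens = text.split()
chunks = chunk_with_overlap(tokens, chunk_size=400, overlap=50)

print(len(chunks))                          # 3
print(chunks[0][-50:] == chunks[1][:50])    # True
```

The last 50 tokens of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still found whole in at least one chunk.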

Retrieval Strategy

Sant Chat AI retrieves:

  • Top 5 chunks by similarity score
  • Minimum score: 0.65 (lower = too irrelevant)
  • Maximum tokens sent to AI: 2,000

Why limit to 5 chunks?

  • More chunks = longer response time
  • More chunks = higher cost
  • Diminishing returns after top 5
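
The three limits above can be applied as a simple filter over chunks that have already been scored. This is a sketch under assumptions: the token counts are invented, the minimum score is treated as inclusive, and `select_chunks` is a hypothetical name.

```python
def select_chunks(scored_chunks, top_k=5, min_score=0.65, max_tokens=2000):
    """Pick chunks to send to the AI.

    scored_chunks: list of (similarity_score, token_count, text) tuples.
    """
    ranked = sorted(scored_chunks, key=lambda c: c[0], reverse=True)
    selected = []
    used_tokens = 0
    for score, token_count, text in ranked[:top_k]:
        if score < min_score:
            break  # everything after this is even less relevant
        if used_tokens + token_count > max_tokens:
            continue  # too big for the remaining budget; try the next one
        selected.append(text)
        used_tokens += token_count
    return selected


candidates = [
    (0.65, 40, "Products must be in original packaging and unused."),
    (0.91, 60, "We offer a 30-day money-back guarantee."),
    (0.40, 50, "Our office is closed on public holidays."),
    (0.78, 45, "Refunds are processed within 5-7 business days."),
]
picked = select_chunks(candidates)
print(len(picked))  # 3
```

The 0.40 chunk is dropped for being below the minimum score, and the remaining three fit easily within the token budget.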

Common Questions About RAG

Q: Does the AI "remember" previous conversations?

A: No. Each question is answered independently using RAG.

But: Some chatbots (like Sant Chat AI Pro+) can maintain conversation history for context.

Example:

  • Visitor: "Do you ship to Canada?"
  • AI: "Yes, we ship to Canada. Delivery takes 7-10 business days."
  • Visitor: "How much does it cost?"
  • AI: knows "it" refers to Canadian shipping → "Shipping to Canada costs $15 for orders under $100, free for orders over $100."
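
One simple way to give the AI that context is to keep the last few turns and send them along with the new question. The sketch below assumes a short rolling window. The `Conversation` class is a hypothetical name, not the Pro+ implementation.

```python
from collections import deque


class Conversation:
    """Remembers the most recent turns of one chat session."""

    def __init__(self, max_turns=3):
        # Older turns fall off automatically once the window is full.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, visitor_message, ai_reply):
        self._turns.append((visitor_message, ai_reply))

    def as_context(self, new_question):
        """Render the history plus the new question as plain text."""
        lines = []
        for visitor_message, ai_reply in self._turns:
            lines.append(f"Visitor: {visitor_message}")
            lines.append(f"AI: {ai_reply}")
        lines.append(f"Visitor: {new_question}")
        return "\n".join(lines)


chat = Conversation(max_turns=3)
chat.add_turn(
    "Do you ship to Canada?",
    "Yes, we ship to Canada. Delivery takes 7-10 business days.",
)
context = chat.as_context("How much does it cost?")
print(context)
```

With the earlier turn included, the model can see that "it" refers to shipping to Canada.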

Q: Can RAG make mistakes?

A: Yes, but rarely.

Possible errors:

  1. No relevant content found → AI says "I don't have information about that"
  2. Wrong content retrieved → AI answers based on unrelated chunk (rare with good embeddings)
  3. Content is contradictory → AI tries to reconcile the conflicting statements and may give a confused answer

Solution: Organize your content clearly, run weekly tests.

Q: Does RAG work with images/videos?

A: Not directly. RAG reads text only.

Workaround:

  • Add image alt text (AI can read this)
  • Add video transcripts or summaries

Q: How big of a website can RAG handle?

A: Very large.

Sant Chat AI limits:

  • Free/Starter: 500 pages
  • Pro: 2,000 pages
  • Business: 10,000 pages

For reference: Most WordPress sites have < 500 pages.

Q: Does RAG slow down my website?

A: No. All processing happens on external servers.

Impact on your site: Minimal. The chatbot loads as a lightweight JavaScript widget, while the embedding, search, and answer generation all run off-site.


Optimizing RAG Performance

1. Write Clear, Structured Content

Good for RAG:

## What is your refund policy?

We offer a 30-day money-back guarantee on all purchases.

**Requirements:**
- Product must be unused
- Original packaging required
- Proof of purchase needed

**Process:**
1. Contact support@yoursite.com
2. We'll send return instructions
3. Refund processed within 5-7 days

Why it works: Clear structure, all info in one place, easy to chunk.

Bad for RAG:

Our policy is great! Just email us if you want to return something. We might accept it depending on various factors.

Why it fails: Vague, no clear answer, AI can't extract useful info.

2. Avoid Contradictions

Problem:

  • Page 1: "Free shipping on orders over $50"
  • Page 2: "Free shipping on orders over $75"

Result: The AI may retrieve either page, so visitors can get the wrong threshold.

Solution: One source of truth per topic.

3. Update Regularly

RAG only knows what you've synced.

Best practice:

  • Publish new content → Sync immediately
  • Update existing content → Sync within 24 hours
  • Set reminder: Monthly sync (catches any missed updates)

4. Use FAQs Strategically

Create an FAQ page for RAG:

  • Top 50 most common questions
  • Direct, complete answers
  • Cover edge cases

Result: AI is far more likely to find the right answer because it's explicitly written.


The Future of RAG

RAG is rapidly evolving.

Current (2026): Text-based RAG (what Sant Chat AI uses)

Coming soon:

  • Multimodal RAG: AI reads images, videos, PDFs
  • Real-time RAG: AI searches live databases (order status, inventory)
  • Hybrid RAG: Combines website content + CRM data + support tickets

What this means for WordPress chatbots:

  • Smarter answers
  • More context
  • Fewer limitations

The Bottom Line

RAG is the technology that makes WordPress AI chatbots actually useful:

  1. Reads your entire website (automatic)
  2. Understands meaning (not just keyword matching)
  3. Finds relevant content (vector search)
  4. Generates natural answers (AI synthesis)
  5. Stays current (sync when you update)

Why it matters:

  • No manual training
  • Works out-of-the-box
  • Accurate answers
  • Scales to any website size

You don't need to understand the math behind RAG to use it. Just know that it works.

Try RAG-powered chat on your WordPress site →

Questions about how RAG works? Ask our chatbot. It's powered by RAG, so it can explain itself!

Tags: RAG, AI Technology, Vector Search, Embeddings, Technical Explanation
