# FastAPI RAG Search & Book Chat — README

This document explains the architecture, API design, and implementation details of the FastAPI application submitted for the **AI Engineering — Week 2 RAG Homework**.  
It covers all rubric items related to:

- The `/search` endpoint
- The `/api/chat_book` endpoint and UI integration
- Metadata filtering
- pgvector similarity search
- RAG context generation
- How to run the server
- Example request/response pairs
- System architecture

---

# 1. Overview

This FastAPI application exposes two primary routes for Retrieval-Augmented Generation (RAG):

### **1. `POST /search`**
A rubric-compliant, general-purpose RAG search endpoint supporting metadata filters, cosine similarity, top-k control, min-score filtering, and returns structured RAG items.

### **2. `POST /api/chat_book`**
A specialized wrapper used by the web UI (`/chat_book`) that:

- Lets the user optionally select a book and chapter  
- Embeds their question  
- Performs a filtered pgvector similarity search through a Supabase RPC  
- Builds a context block  
- Calls OpenAI ChatCompletion using the retrieved context  
- Returns the model’s response and RAG evidence

Both routes follow identical RAG principles; `/search` is more general, `/api/chat_book` is tailored for the assignment’s HTML interface.

---

# 2. System Architecture

The system implements a small but complete RAG stack:

```
 ┌──────────────────────┐
 │        Browser        │
 │   chat_book.html      │
 └──────────┬────────────┘
            │ AJAX (fetch)
            ▼
 ┌────────────────────────┐
 │        FastAPI         │
 │  /search, /api/chat_book
 │  - Compute embeddings
 │  - Call Supabase RPC
 │  - Build context block
 │  - Call OpenAI Chat GPT
 └──────┬─────────────────┘
        │
        ▼
 ┌────────────────────────┐
 │       Supabase         │
 │ Postgres + pgvector    │
 │  Function: match_book_queries
 │  Function: search_rag_content
 │  Uses cosine distance
 │   embedding <=> query
 └──────┬─────────────────┘
        │
        ▼
 ┌────────────────────────┐
 │   rag_content table     │
 │  Stores: content,      │
 │  embeddings, metadata   │
 └────────────────────────┘
```

### **High-Level Flow**
1. User enters a question + optional book/chapter filters.
2. FastAPI generates an embedding using OpenAI (`text-embedding-3-small`).
3. FastAPI calls Supabase via RPC:
   - pgvector computes similarity: `embedding <=> query_embedding`
   - Metadata filters are applied using AND semantics.
4. Top-k results are returned to FastAPI.
5. FastAPI concatenates these into `rag_context`.
6. FastAPI makes a ChatCompletion call with the context.
7. Response and supporting RAG items are sent back to the browser.

This architecture satisfies the rubric’s requirements for RAG, metadata filtering, vector similarity search, and endpoint design.

---

# 3. `/search` Endpoint (Rubric Deliverable)

### **Path:**  
`POST /search`

### **Purpose:**  
General-purpose RAG similarity search with full metadata filtering.

### **Request Body**

```json
{
  "query": "What role does the White Rabbit play?",
  "top_k": 5,
  "min_score": 0.35,
  "filters": {
    "author": "Lewis Carroll",
    "language": "en",
    "book_id": "pg11-images",
    "chapter": "CHAPTER I. Down the Rabbit-Hole"
  }
}
```

### **Behavior**
- Embeds `query` using OpenAI.
- Calls Supabase RPC function `search_rag_content`.
- Applies filters using AND semantics.
- Uses cosine similarity (`embedding <=> query_embedding`).
- Returns up to `top_k` items above `min_score`.

### **Response**
```json
{
  "items": [
    {
      "id": "pg11-images-ch1-p1",
      "score": 0.81,
      "content": "Alice was beginning to get very tired...",
      "metadata": {
        "title": "Alice’s Adventures in Wonderland",
        "author": "Carroll, Lewis",
        "book_id": "pg11-images",
        "chapter": "CHAPTER I. Down the Rabbit-Hole",
        "chapter_num": 1,
        "language": "en",
        "tags": ["project-gutenberg", "public-domain"]
      }
    }
  ],
  "applied_filters": {
    "author": "Lewis Carroll",
    "chapter": "CHAPTER I. Down the Rabbit-Hole",
    "min_score": 0.35
  },
  "top_k": 5
}
```

This fully meets all the rubric-required fields.

---

# 4. `/api/chat_book` Endpoint (UI-Facing)

### **Used by:**  
`chat_book.html`

### **Request**
```json
{
  "message": "Who is the White Rabbit?",
  "selectedBook": "Alice’s Adventures in Wonderland",
  "selectedChapter": "CHAPTER I. Down the Rabbit-Hole",
  "top_k": 5,
  "min_score": 0.25
}
```

### **Behavior**
- Embeds `message`.
- Sends embedding + selectedBook + selectedChapter into `match_book_queries`.
- Retrieves book paragraphs.
- Passes them as RAG context to ChatCompletion.
- Returns both:
  - ChatGPT answer  
  - Evidence paragraphs (`items`)
  - `applied_filters`

### **Response**
```json
{
  "response": "The White Rabbit is the frantic herald...",
  "items": [...],
  "applied_filters": {
    "book_title": "Alice’s Adventures in Wonderland",
    "chapter_title": "CHAPTER I. Down the Rabbit-Hole"
  },
  "top_k": 5
}
```

---

# 5. Relevant pgvector SQL

### Cosine similarity:
```sql
1 - (embedding <=> query_embedding) AS score
```

### Ordering:
```sql
ORDER BY embedding <=> query_embedding
```

### Tags filters:
```sql
tags @> tags_all_prm   -- ALL
tags && tags_any_prm   -- ANY
```

### Idempotency:
`rag_content.checksum` is UNIQUE and used in `upsert(..., on_conflict="checksum")`.

---

# 6. How to Run the Server

### Install dependencies
```
pip install -r requirements.txt
```

### Environment variables
```
export OPENAI_API_KEY=...
export SUPABASE_URL=...
export SUPABASE_KEY=...
```

### Start FastAPI
```
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

### Visit the UI
```
http://localhost:8000/chat_book
```

---

# 7. Files Included

- `api/main.py` — Full FastAPI app
- `api/templates/chat_book.html` — Browser UI
- `db/schema.sql` — Schema for `rag_content`
- `db/migration.sql` — Migration from old → new schema
- `db/functions/*.sql` - various queries executed via Superbase .rpc calls
- `scripts/load_books.py` — HTML ingestion + embeddings + upsert loader
- `scripts/README.md` — a readme pertinent to the load process
- `README.md` (this file)

---

# 8. Summary

This FastAPI application satisfies all rubric criteria:

- **Vector search** using pgvector & cosine distance  
- **Metadata filters** with AND semantics  
- **Safe parameterization** (Supabase RPC arguments)  
- **Idempotent ingestion** using checksum  
- **Embeddings** consistent with schema (`text-embedding-3-small`, 1536-dim)  
- **Clear API Contract** for `/search` and `/api/chat_book`  
- **Front-end integration** demonstrating optional filtering  
- **Structured responses**: items, metadata, applied_filters  

This README provides all context required to understand, run, and evaluate the FastAPI functionality for the homework deliverable.

