Module 10: Retrieval-Augmented Generation

MGMT 675: Generative AI for Finance

Kerry Back, Rice University

Beyond Prompting

Three Ways to Give AI Knowledge

Prompting and skills customize how an LLM responds. But what if you need it to know things it wasn’t trained on?

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '24px'}, 'flowchart': {'nodeSpacing': 80, 'rankSpacing': 80, 'padding': 20, 'useMaxWidth': true}}}%%
flowchart LR
  RAG["<b>RAG</b>"] ~~~ FT["<b>Fine-Tuning</b>"] ~~~ SLM["<b>Small Language<br>Model</b>"]

  style RAG fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:24px,padding:16px
  style FT fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:24px,padding:16px
  style SLM fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:24px,padding:16px

  • RAG: Up-to-date or proprietary facts; data changes frequently
  • Fine-Tuning: Specific tone, format, or domain expertise baked in
  • Small Language Model: Full control, privacy, or a highly specialized task

How RAG Works

What is RAG?

RAG = Retrieval-Augmented Generation. Retrieve relevant documents first, then pass them to the LLM along with the user’s question. The LLM generates an answer grounded in the retrieved text.

  • The LLM’s training data may be stale or lack your proprietary information
  • RAG injects current, domain-specific context at query time
  • No model weights are changed — the base LLM is used as-is
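The "augmented" step is simple in miniature: retrieved text is pasted into the prompt alongside the question, so the model answers from that context rather than from memory. A minimal sketch (the chunk text and revenue figure below are made-up placeholders):

```python
# Minimal sketch of prompt augmentation: retrieved chunks are pasted
# into the prompt so the LLM answers from them, not from its training data.
# The chunk text and revenue figure are made-up placeholders.

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a prompt that grounds the LLM in retrieved context."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What was total revenue?",
    ["Total revenue for fiscal 2024 was $10.0 billion."],
)
print(prompt)
```

The string returned here is what actually reaches the LLM; everything else in the pipeline exists to choose good chunks for that `Context:` section.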

The RAG Pipeline

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '22px'}, 'flowchart': {'nodeSpacing': 60, 'rankSpacing': 80, 'padding': 16, 'useMaxWidth': true}}}%%
flowchart LR
  D["<b>Documents</b>"] --> CE["<b>Chunk &<br>Embed</b>"]
  CE --> VDB["<b>Vector DB</b>"]
  UQ["<b>User Query</b>"] --> R["<b>Retrieve<br>Matches</b>"]
  VDB --> R
  R -->|"query + context"| LLM["<b>LLM</b>"]
  LLM --> A["<b>Grounded<br>Answer</b>"]

  style D fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style CE fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style VDB fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style UQ fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style R fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style LLM fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px
  style A fill:#fff7ed,stroke:#ea580c,stroke-width:2px,color:#0f172a,font-size:22px,padding:14px

RAG: Key Concepts

Embeddings

  • Text converted into numerical vectors
  • Similar meaning → nearby vectors
  • Enables semantic search (not just keyword matching)
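A toy illustration of "similar meaning, nearby vectors," using made-up 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

# Toy 3-dimensional "embeddings" with made-up values, chosen so that
# the two revenue-related phrases point in similar directions and the
# unrelated phrase points elsewhere.
emb = {
    "quarterly revenue grew":        [0.9, 0.1, 0.0],
    "sales increased this quarter":  [0.8, 0.2, 0.1],
    "the CEO resigned":              [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine(emb["quarterly revenue grew"], emb["sales increased this quarter"]))  # high
print(cosine(emb["quarterly revenue grew"], emb["the CEO resigned"]))              # low
```

Note that the revenue phrases share no keywords with each other beyond "quarter"; semantic search works because the vectors, not the words, are compared.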

Vector Database

  • Stores document chunks as vectors
  • Fast similarity search
  • Examples: Pinecone, Chroma, FAISS
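The core interface can be sketched in a few lines. This toy in-memory store does exact cosine search; real vector databases like the ones above use approximate-nearest-neighbor indexes to stay fast at scale:

```python
import math

# A toy in-memory "vector database." The interface mirrors the real
# thing: add (text, vector) pairs, then query by vector similarity.
# The 2-dimensional vectors below are illustrative placeholders.

class ToyVectorDB:
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self.items.append((text, vector))

    def query(self, vector, k=1):
        """Return the k stored texts most similar to the query vector."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)
        ranked = sorted(self.items, key=lambda it: cos(it[1], vector), reverse=True)
        return [text for text, _ in ranked[:k]]

db = ToyVectorDB()
db.add("risk factors include supply chain disruption", [1.0, 0.0])
db.add("dividend policy unchanged", [0.0, 1.0])
print(db.query([0.9, 0.1], k=1))  # → ['risk factors include supply chain disruption']
```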

Chunking

  • Documents are split into small, overlapping pieces (chunks)
  • Chunk size matters: too large = noisy context, too small = lost meaning
  • Typical sizes: 200–1000 tokens per chunk
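A minimal fixed-size chunker with overlap, splitting on words for simplicity (production pipelines typically split on tokens or sentence boundaries):

```python
# Sketch of fixed-size chunking with overlap. Splitting on words keeps
# the example dependency-free; real pipelines split on tokens or sentences.

def chunk(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word chunks of `size`, each sharing `overlap`
    words with the previous chunk so context isn't cut mid-thought."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

doc = " ".join(f"w{i}" for i in range(120))   # a stand-in 120-word document
chunks = chunk(doc, size=50, overlap=10)
print(len(chunks))            # → 3
print(chunks[1].split()[0])   # → w40 (chunk 2 repeats the last 10 words of chunk 1)
```

The overlap is what guards against the "chunking can split important context" problem: a sentence cut at one boundary reappears whole in the next chunk.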

RAG in Finance

Finance Applications of RAG

Document Types

  • 10-K/10-Q filings and earnings transcripts
  • Analyst reports and deal documents
  • Internal policies and memos

Use Cases

  • Compliance Q&A: query regulatory filings, internal policies
  • Due diligence: search deal documents with citations
  • Research synthesis: combine multiple sources

RAG: Strengths and Limitations

Strengths

  • No training required
  • Data can be updated in real time
  • Answers are traceable to source pages

Limitations

  • Answer quality is only as good as the retrieved chunks
  • Context window limits how much can be passed
  • Chunking can split important context
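The context-window limitation is usually handled by packing only the top-ranked chunks into a fixed token budget. A rough sketch, approximating tokens by word counts (real systems use the model's tokenizer):

```python
# The context window caps how much retrieved text can be passed to the
# LLM. A common workaround: add chunks in rank order until a token
# budget is exhausted. Word counts stand in for token counts here.

def fit_to_budget(ranked_chunks: list[str], budget: int) -> list[str]:
    """Keep the highest-ranked chunks that fit within `budget` tokens."""
    selected, used = [], 0
    for c in ranked_chunks:
        cost = len(c.split())          # crude token estimate
        if used + cost > budget:
            break
        selected.append(c)
        used += cost
    return selected

chunks = [("a " * 30).strip(), ("b " * 30).strip(), ("c " * 30).strip()]
print(len(fit_to_budget(chunks, budget=70)))  # → 2 (third chunk won't fit)
```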

NotebookLM: RAG Without Code

What is NotebookLM?

Google NotebookLM is a free, consumer-friendly RAG tool. Upload your documents, and it builds a personal knowledge base you can query with natural language.

  • Available at notebooklm.google
  • Upload up to 50 sources: PDFs, Docs, Slides, web pages, YouTube
  • Ask questions and get answers with inline citations; no code required

NotebookLM Features

Query & Summarize

  • Chat with your documents
  • Answers include inline citations
  • Generate summaries, FAQs, study guides, timelines, briefing docs

Audio Overview

  • Generates a podcast-style audio discussion of your sources
  • Two AI hosts discuss key points conversationally
  • Great for reviewing material on the go

Visual Outputs: Generate slide decks and infographics from your sources — useful for turning research into presentation-ready visuals.

NotebookLM for Finance

  • Earnings analysis: Upload 10-K/10-Q filings and earnings transcripts, ask comparative questions
  • Deal prep: Load pitch books, CIMs, and contracts for quick reference
  • Year-over-year comparison: Upload two years of 10-Ks, ask it to identify changes in risk factors, revenue composition, and guidance

NotebookLM is a practical example of RAG that you can use today.

Building a RAG Pipeline

Under the Hood: Building a RAG Pipeline

For those who want to understand the internals, ask Claude Code to build a RAG pipeline step by step.

  1. Install libraries (langchain, chromadb) and choose an embedding model
  2. Load and chunk a PDF, embed and store in a vector database
  3. Query the pipeline with finance-specific questions
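The steps above can be sketched without any external libraries. Here "embedding" is just a bag-of-words vector and the "vector database" is a list, which is enough to show the mechanics that langchain and chromadb implement properly (the chunk texts and figures are made-up placeholders):

```python
import math
from collections import Counter

# Dependency-free sketch of chunk -> embed -> store -> retrieve -> prompt.
# A real pipeline would use a learned embedding model and a vector
# database such as Chroma; a bag-of-words vector stands in here.

def embed(text: str, vocab: list[str]) -> list[float]:
    """Bag-of-words 'embedding': one dimension per vocabulary word."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

# 1. Documents -> chunks (already chunk-sized in this toy example)
chunks = [
    "total revenue was 10 billion in fiscal 2024",
    "risk factors include supply chain disruption",
    "the board declared a quarterly dividend",
]

# 2. Embed each chunk and store the (text, vector) pairs
vocab = sorted({w for c in chunks for w in c.split()})
store = [(c, embed(c, vocab)) for c in chunks]

# 3. Retrieve the best-matching chunk and assemble the grounded prompt
query = "what was total revenue"
qv = embed(query, vocab)
best = max(store, key=lambda item: cosine(item[1], qv))[0]
prompt = f"Context: {best}\n\nQuestion: {query}"
print(prompt)
```

The query shares the words "total revenue was" with only the first chunk, so that chunk is retrieved and becomes the context the LLM would see.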

Example Questions to Try

Upload a company’s 10-K and ask:

  • What was total revenue in the most recent fiscal year?
  • What are the main risk factors related to supply chain?
  • Summarize management’s outlook for the coming year

Notice how answers are grounded in the actual document — the key benefit of RAG over plain prompting.

Exercises

Exercise 1: NotebookLM Analysis

  1. Upload 3+ financial documents for the same company into NotebookLM (10-K, earnings transcript, analyst report)
  2. Ask 5+ questions across the documents
  3. Note how citations trace back to specific sources
  4. Submit: Q&A pairs + quality assessment (were answers grounded? any hallucinations?)

Exercise 2: RAG Pipeline

  1. Ask Claude Code to build a RAG pipeline that loads a corporate annual report (e.g., Apple 10-K)
  2. Have it chunk and embed the document into a vector database
  3. Ask 5 finance-specific questions and evaluate whether answers are grounded or hallucinated
  4. Submit: code + evaluation

Exercise 3: Document Comparison

  1. Upload two years of 10-Ks for the same company into NotebookLM
  2. Ask NotebookLM to identify the most significant changes in:
    • Risk factors
    • Revenue composition
    • Management guidance and accounting policies
  3. Submit: summary of key changes with source citations

Summary

RAG

  • Retrieve, then generate
  • Grounds answers in sources
  • No training required

NotebookLM

  • Free RAG without code
  • Inline citations
  • Audio overviews

Finance Uses

  • 10-K analysis
  • Earnings call Q&A
  • Due diligence

RAG gives AI knowledge it doesn’t have — grounded in your documents, with citations.