Google NotebookLM — source-grounded Gemini notebook (RAG-shaped tool)
NotebookLM is Google's source-first AI notebook. It answers only from the sources you add (PDFs, Google Docs, web pages, YouTube videos, and audio files), so hallucinations are far rarer than with a generic chatbot. Where 04-gemini-api covers the raw API and 08-google-ai-studio covers the generic playground / Build, NotebookLM is a notebook with RAG built into the UI.
1. Identity
- Site: notebooklm.google.com (or NotebookLM Plus inside Google Workspace)
- Launch: 2023-07 (Project Tailwind) → 2024 GA → 2025 mobile apps
- Models: Gemini 1.5 / 2.0 / 2.5 (auto-upgraded over time)
- Capacity: up to 50 sources / notebook (each 500MB / 500k words on Free; 2M words on Plus)
- One-line: "a Gemini notebook that only reads your sources."
2. Supported source types
| Type | Notes |
|---|---|
| PDF | OCR runs automatically |
| Google Docs | Drive integration |
| Plain text (.txt, .md) | UTF-8 preferred |
| Web URLs | HTML parser |
| YouTube URLs | Caption-based (rejects videos with no captions) |
| Audio (.mp3, .wav, …) | Transcribed (Speech-to-Text) before indexing |
| Pasted text | 10k-word chunks |
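
If long pasted text has to be broken into pieces of roughly 10k words, as the last row suggests, the split can be done before pasting. The sketch below is only an illustration: the function name `split_into_word_chunks` is hypothetical, it splits on whitespace, and it makes no claim about how NotebookLM itself segments text.

```python
def split_into_word_chunks(text: str, max_words: int = 10_000) -> list[str]:
    """Split text into consecutive pieces of at most `max_words` words.

    Words are whitespace-separated tokens; original line breaks are not preserved.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]
```

Each returned string can then be pasted as its own source, which keeps every piece within the assumed 10k-word size.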
3. Core features
3.1 Citation-first answers
Every sentence in an answer carries a clickable citation pin (page / timestamp). You see exactly where it came from — the headline difference vs. ChatGPT/Gemini chat.
3.2 Audio Overview (auto podcast)
Generates a 5–15 minute podcast where two AI hosts discuss the source. Lectures, papers, contracts, manuals — turned into a commute-friendly listen. (English first; other languages improving.)
3.3 Mind Map
Auto-builds a hierarchical mind map of the source set — handy for studying or summarizing.
3.4 Notes (Studio)
Save chat answers as notes inside the same notebook. Notes can be re-fed as input (recursive synthesis).
3.5 Sharing (Plus)
Share whole notebooks. Permissions split into Viewer (ask-only) and Chat (ask + add notes).
4. Free vs Plus
| Item | Free | Plus (Google One AI Premium / Workspace add-on) |
|---|---|---|
| Notebooks | 100 | unlimited |
| Sources / notebook | 50 | 300 |
| Daily chat | 50 | 500 |
| Daily Audio Overviews | 3 | 20 |
| Sharing | ✗ | ✓ |
| Cost | $0 | $20/mo+ |
5. Use cases
| Scenario | Example |
|---|---|
| Learning | 50 lecture PDFs + videos → study notes + podcast |
| Research | 30 papers → comparison table + open-question map |
| Legal / policy | 100-page contract → clause-level Q&A + risk summary |
| Manuals | Internal guides → self-serve Q&A for new hires |
| Meetings | A year of minutes → decision trail per topic |
| Interviews | 10 hours of audio → insight extraction |
| Books | One non-fiction → per-chapter summary + applications |
6. Limits
- Not used for training — uploads aren't used to train models (per policy). On Free, chat inputs may be used (anonymised) for quality improvement → use Plus for sensitive material.
- Refuses out-of-source questions — common-knowledge questions get "not in your sources." Intentional.
- 50-source cap (300 on Plus) — large corpora need partitioning.
- No public API as of 2026-05 — to automate similar RAG, build with Gemini API + your own RAG (02-rag-pgvector).
- Korean text works well; Korean Audio Overviews still trail English.
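
Because of the per-notebook source cap, a large corpus has to be planned as several notebooks. The following is a minimal planning sketch, assuming each source is already labelled with a topic; `partition_sources` and the `(topic, name)` input shape are hypothetical, and since there is no public API the output is just a plan to follow by hand in the UI.

```python
from collections import defaultdict


def partition_sources(
    sources: list[tuple[str, str]], cap: int = 50
) -> list[tuple[str, list[str]]]:
    """Group (topic, source_name) pairs into notebooks of at most `cap` sources.

    Each notebook holds a single topic; a topic with more than `cap`
    sources is spread over numbered notebooks.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    by_topic: dict[str, list[str]] = defaultdict(list)
    for topic, name in sources:
        by_topic[topic].append(name)

    notebooks = []
    for topic in sorted(by_topic):
        names = by_topic[topic]
        for part, start in enumerate(range(0, len(names), cap), start=1):
            notebooks.append((f"{topic} ({part})", names[start:start + cap]))
    return notebooks
```

Pass `cap=300` to plan for the Plus limit instead of the Free one.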
7. Similar tools
| Tool | Strength | Weakness vs NotebookLM |
|---|---|---|
| ChatGPT (with files) | Knowledge + sources | weak citation surface |
| Claude Projects | 1M-token context | no podcast |
| Perplexity Spaces | Web + your sources | no mind map |
| Notion AI | Notes integration | thin RAG |
| Self-built RAG (pgvector + Gemini) | Full control · self-hostable | build/run cost |
8. Tips
- PDF quality — OCR matters. Text-extractable PDFs first; scanned PDFs need separate OCR.
- Split notebooks — one notebook per topic. Mixed topics → noisier answers.
- YouTube captions — many Korean videos have no auto-caption; verify before adding.
- Save Audio Overview transcript — Save to note makes the transcript searchable.
- Try Plus free — 1-month Google One AI Premium trial covers Plus features.
9. Self-hosting alternatives
NotebookLM itself isn't self-hostable. To build a similar workflow:
- Vector DB: pgvector (02-rag-pgvector) or Qdrant
- Embeddings: Gemini text-embedding-004 or OpenAI text-embedding-3-small (05-embeddings-deep)
- LLM: Gemini API (04-gemini-api) or LM Studio (01-local-llm-lmstudio)
- Citations: attach chunk id + page numbers as source metadata → UI links to the original
- Podcast: ElevenLabs / Google TTS + two-persona script generation
Self-built wins for control and private data; NotebookLM wins for speed-to-use on personal study.
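
The citation idea from the list above (chunk id plus page number carried as metadata) can be sketched without any external service. This is an illustration under stated assumptions, not a production design: the `Chunk` fields, `retrieve`, and `format_citation` are hypothetical names, and a bag-of-words cosine score stands in for real embeddings and a vector DB such as pgvector or Qdrant. Returning nothing below `min_score` mimics refusing out-of-source questions.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str   # stable id so the UI can link back to the original
    source: str     # file name or URL of the source document
    page: int       # page number the chunk came from
    text: str


def _vector(text: str) -> Counter:
    # Toy stand-in for an embedding: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(
    query: str, chunks: list[Chunk], top_k: int = 2, min_score: float = 0.1
) -> list[tuple[float, Chunk]]:
    """Return up to `top_k` (score, chunk) pairs scoring at least `min_score`."""
    query_vec = _vector(query)
    scored = [(_cosine(query_vec, _vector(c.text)), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def format_citation(chunk: Chunk) -> str:
    """Render the metadata as a citation string to attach to an answer."""
    return f"[{chunk.source}, p. {chunk.page}, chunk {chunk.chunk_id}]"
```

In a real build, the retrieved chunk texts would go into the LLM prompt and the citation strings would be attached to the answer; an empty result is the signal to reply that the question is not covered by the sources.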