Source PDFs Index — Deep Reference Layer¶
Last updated: 2026-04-27 Tags: sources, PDFs, deep reference, citations
📖 Full Oracle Documentation: OAC Documentation Hub
Summary¶
The raw/pdfs/ folder contains the authoritative Oracle documentation in PDF + searchable text. When wiki pages don't have enough depth, I (Claude) grep these text files at query time and cite the result. This is the deep-reference layer that makes the knowledge graph comprehensive.
Available Source Documents¶
| Text Size | Content | Maps to Wiki | |
|---|---|---|---|
getting-started-oracle-analytics-cloud |
37 KB | Orientation, editions, service mgmt | OAC Overview & Architecture, Subscribe & Provisioning |
whats-new-oracle-analytics-cloud |
262 KB | Release notes (current + historical) | Whats New & Release Updates |
building-semantic-models-oracle-analytics-cloud |
1.1 MB | Semantic Modeler full guide | Semantic Model |
smml-schema-reference-oracle-analytics-cloud |
162 KB | SMML JSON schema reference | Semantic Model, APIs, Embedding & Integration |
connecting-oracle-analytics-cloud-your-data |
589 KB | All data source connections | Data Sources & Connections |
visualizing-data-and-building-reports-oracle-analytics-cloud |
2.7 MB | Workbooks, dashboards, BI Publisher | Workbooks & Visualizations, Classic Dashboards & Analyses, BI Publisher, Maps & Geospatial Analytics |
administering-oracle-analytics-cloud-oracle-cloud-infrastructure-gen-2 |
570 KB | Service admin (Gen 2) | Administration & Service Console, Subscribe & Provisioning |
configuring-oracle-analytics-cloud |
1.1 MB | System configuration deep dive | Administration & Service Console |
OAC_REST_API_Guide |
217 KB | OpenAPI spec for OAC REST API | APIs, Embedding & Integration, OCI REST APIs & CLI for OAC |
Total deep reference: ~6.8 MB of text (~1.5M tokens) — fully searchable.
How to Query Deep Content (For Claude)¶
When a user asks a question:
Step 1: Grep raw/pdfs/*.txt for keywords from the question
Step 2: Read the matching sections (use line offsets)
Step 3: Read the related wiki page for structure
Step 4: Synthesize answer with citations:
- Wiki Page Name
- (Source: <pdf-name>, near line N)
Example¶
Q: "How do I set up an event polling table?"
Process:
1. grep -l "event polling" raw/pdfs/*.txt → finds matches in building-semantic-models.txt and smml-schema-reference.txt
2. Read those sections (~50 lines context)
3. Read Semantic Model for high-level structure
4. Answer with full procedure + cite source PDF + line numbers
Why This Beats Pre-Synthesized Wiki¶
| Approach | Coverage | Accuracy | Effort | Currency |
|---|---|---|---|---|
| Pre-synthesize 6.8 MB into wiki | 100% but lossy | Risk of summarization errors | Days | Stale on next docs update |
| Grep raw text at query time | 100% full fidelity | Direct quotes from Oracle | None — automated | Always current with PDF |
The Karpathy LLM Wiki pattern explicitly endorses this: synthesize structure in the wiki, keep sources raw and citable.
Refreshing Sources¶
When Oracle releases new doc versions:
- Download updated PDFs from docs.oracle.com/en/cloud/paas/analytics-cloud/
- Replace files in
raw/pdfs/ - Re-run
pdftotext: - Update
log.mdwith refresh date - Optionally ask Claude to re-lint the wiki against the new PDFs