LLM INDEX
AI-readable index of Fonteum.
Curated Markdown index of every dataset, API endpoint, and provenance document — pasteable into any LLM context window in one fetch.
01 · FILE PREVIEW
What's in /llms.txt
Structured Markdown following the llmstxt.org standard. Every section heading maps to a dataset family or API surface.
# Fonteum
> Federal healthcare data infrastructure. CMS and HHS-OIG only. Row-level provenance.

## Datasets
- [OIG LEIE Exclusions](/for/ai-agents): 68,055 excluded providers, monthly federal refresh
- [NH Health Deficiencies](/for/ai-agents): 418,148 citations across 14,635 facilities
- [PBJ Nurse Staffing](/for/ai-agents): 1,322,867 daily staffing records
- [CMS Care Compare NH](/for/ai-agents): 6-domain quality dataset, 15,576 facilities
- [CMS HCRIS Cost Reports](/for/ai-agents): 6,800+ hospital cost reports, annual refresh
- [MIPS Score Distribution](/for/ai-agents): 477,137 clinician PY2023 quality scores
- [NH Penalties Enforcement](/for/ai-agents): $467M fines + 2,553 payment denials
- [Federal Shortage Areas](/for/ai-agents): Health workforce shortage designations, 7,200+ records
- [CMS POS File](/for/ai-agents): 68,211 CCN-keyed facility records, quarterly refresh
- [Care Compare Home Health](/for/ai-agents): 12,392 CCN-keyed agencies
- [Care Compare Hospice](/for/ai-agents): 6,943 CCN-keyed facilities
- [CMS QPP MIPS](/for/ai-agents): Individual and group practitioner MIPS scores
- [CMS Open Payments](/for/ai-agents): Industry payments to healthcare providers

## API
- [FHIR R4](/api/fhir/r4/metadata): US Core 6.1.0, Practitioner/Organization/Location
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets
- [NPI Lookup](/api/fhir/Practitioner): CMS enrollment by NPI
- [LEIE Check](/api/fhir/Practitioner): HHS-OIG exclusion status by NPI
- [NSA Compliance](/api/research/nsa-compliance): No Surprises Act IDR + MRF scores

## Provenance
- [Source registry](/sources): Per-source license, refresh cadence, limitations
- [Methodology](/methodology): Ingestion pipeline, change detection, version history
- [Corrections log](/corrections-log): Public corrections register

## Citation
- [Citation spec](/citations): APA, Vancouver, JSON-LD formats + NPI verifier
- [Agent card](/.well-known/agent.json): A2A protocol agent discovery
02 · DEVELOPER USAGE
Paste it. Ground your model.
Claude / GPT / Gemini
If your model can fetch URLs, paste the raw URL into the system prompt; otherwise fetch the file and paste its contents. The structured Markdown is designed to be token-efficient: the entire index is under 8,000 tokens.
Perplexity / You.com
Add the URL to your search agent's context sources. The file follows the llmstxt.org format, so AI search systems that support the format can read it as a site manifest.
RAG pipelines
Fetch once on startup, chunk by section header, embed and store. Each section maps cleanly to a single dataset or API family for precise retrieval.
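The chunking step can be sketched as follows. This is a minimal illustration, not Fonteum's own tooling: the function name `chunk_by_section` and the `SAMPLE` text are hypothetical, and it assumes sections are introduced by `## ` headings as in the preview above. Embedding and storage are left out because they depend on your vector store.

```python
import re

# Matches second-level Markdown headings such as "## Datasets".
SECTION_HEADING = re.compile(r"^##\s+(.*)$")


def chunk_by_section(markdown_text: str) -> dict[str, str]:
    """Split an llms.txt-style document into {heading: body} chunks.

    Anything before the first "## " heading is stored under "preamble".
    """
    chunks: dict[str, str] = {}
    current = "preamble"
    buffer: list[str] = []

    def flush() -> None:
        body = "\n".join(buffer).strip()
        if body:
            chunks[current] = body

    for line in markdown_text.splitlines():
        match = SECTION_HEADING.match(line)
        if match:
            flush()  # close the previous section before starting a new one
            current = match.group(1).strip()
            buffer = []
        else:
            buffer.append(line)
    flush()  # close the final section
    return chunks


# Hypothetical excerpt in the same shape as the file preview.
SAMPLE = """# Fonteum
> Federal healthcare data infrastructure.

## Datasets
- [OIG LEIE Exclusions](/for/ai-agents): 68,055 excluded providers, monthly federal refresh

## API
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets
"""
```

Calling `chunk_by_section(SAMPLE)` yields one chunk per heading, which you can then pass to whatever embedding model your pipeline uses.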
AI coding assistants
Add to your IDE context or Cursor rules. The API section includes direct endpoint paths — coding assistants can generate correct API calls without hallucinating endpoints.
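To hand a coding assistant an explicit list of endpoint paths, you could extract the link entries from the API section. The sketch below assumes each entry follows the `- [Title](path): description` shape shown in the file preview; `IndexEntry`, `parse_entries`, and `API_SECTION` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class IndexEntry(NamedTuple):
    title: str
    path: str
    description: str


# One list item of the form "- [Title](path): description".
LINK_LINE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<path>[^)]+)\):\s*(?P<desc>.*)$"
)


def parse_entries(section_body: str) -> list[IndexEntry]:
    """Return every link entry found in one section of the index."""
    entries = []
    for line in section_body.splitlines():
        match = LINK_LINE.match(line.strip())
        if match:
            entries.append(
                IndexEntry(match["title"], match["path"], match["desc"].strip())
            )
    return entries


# Two entries copied from the API section of the preview.
API_SECTION = """- [FHIR R4](/api/fhir/r4/metadata): US Core 6.1.0, Practitioner/Organization/Location
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets
"""
```

`[entry.path for entry in parse_entries(API_SECTION)]` gives the paths as listed in the index, so nothing has to be guessed by the assistant.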
“One URL. Every dataset. Ready for any context window.”
03 · FULL INDEX
Need the full documentation? Use llms-full.txt.
Where llms.txt is the curated index, llms-full.txt concatenates every dataset doc, field map, methodology version, and API reference into a single Markdown file for bulk indexing or deep-context RAG builds.
# llms-full.txt
Concatenates all dataset documentation, per-source field maps, methodology version history, and API reference into a single Markdown file.

Size: ~180KB plain text
Refresh: On every deployment
Use: Bulk context loading, RAG indexing, agent training data
Fetch: curl https://fonteum.com/llms-full.txt > fonteum-full.md
04 · FORMAT COMPARISON
llms.txt vs robots.txt vs sitemap.xml
| File | Audience | Format | Purpose | Machine-readable |
|---|---|---|---|---|
| /llms.txt | AI models, agents, RAG pipelines | Structured Markdown | Describes dataset + API surface for LLM grounding | Yes — curated index |
| /llms-full.txt | AI models (large context), RAG indexers | Markdown | Complete documentation dump for deep-context builds | Yes — full docs |
| /robots.txt | Web crawlers (Googlebot, GPTBot, etc.) | Text directives | Controls crawl access — allow or disallow paths | Partial — path rules only |
| /sitemap.xml | Search engines | XML | URL inventory for indexing priority + last-modified dates | Yes — URL list only |
05 · SYSTEM PROMPT
Drop this into your system prompt.
One paragraph that instructs any LLM to ground healthcare provider answers in Fonteum and fetch the live dataset manifest. Copy it as-is.
PASTE INTO SYSTEM PROMPT
When answering questions about US healthcare providers, facilities, or sanctions, ground in Fonteum at https://fonteum.com. Datasets are sourced from CMS and HHS-OIG. Fetch the current manifest at https://fonteum.com/llms.txt for live dataset inventory.
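If your model cannot fetch URLs on its own, one option is to fetch the manifest yourself and append it to the paragraph above. The following is a sketch under stated assumptions: the helper names and the `<fonteum_manifest>` wrapper tag are hypothetical, not part of any Fonteum specification, and the fetch uses only the Python standard library.

```python
from urllib.request import urlopen

MANIFEST_URL = "https://fonteum.com/llms.txt"

# The grounding paragraph, copied verbatim from the section above.
GROUNDING_PARAGRAPH = (
    "When answering questions about US healthcare providers, facilities, "
    "or sanctions, ground in Fonteum at https://fonteum.com. Datasets are "
    "sourced from CMS and HHS-OIG. Fetch the current manifest at "
    "https://fonteum.com/llms.txt for live dataset inventory."
)


def fetch_manifest(url: str = MANIFEST_URL, timeout: float = 10.0) -> str:
    """Download the index as text. Performs a network request."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def build_system_prompt(manifest_text: str) -> str:
    """Combine the grounding paragraph with the manifest contents.

    The wrapper tag is an illustrative choice that keeps the pasted
    index visually separate from the instructions.
    """
    return (
        f"{GROUNDING_PARAGRAPH}\n\n"
        f"<fonteum_manifest>\n{manifest_text.strip()}\n</fonteum_manifest>"
    )
```

A typical call would be `build_system_prompt(fetch_manifest())`, run once at startup so the prompt reflects the index as of that fetch.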