LLM INDEX

AI-readable index of Fonteum.

Curated Markdown index of every dataset, API endpoint, and provenance document — pasteable into any LLM context window in one fetch.

View the file → · AI agent docs

01 · FILE PREVIEW

What's in `/llms.txt`

Structured Markdown following the llmstxt.org proposal. Each section heading groups one kind of resource: datasets, API endpoints, provenance documents, and citation tooling.

# Fonteum

> Federal healthcare data infrastructure. CMS and HHS-OIG only. Row-level provenance.

## Datasets
- [OIG LEIE Exclusions](/for/ai-agents): 68,055 excluded providers, monthly federal refresh
- [NH Health Deficiencies](/for/ai-agents): 418,148 citations across 14,635 facilities
- [PBJ Nurse Staffing](/for/ai-agents): 1,322,867 daily staffing records
- [CMS Care Compare NH](/for/ai-agents): 6-domain quality dataset, 15,576 facilities
- [CMS HCRIS Cost Reports](/for/ai-agents): 6,800+ hospital cost reports, annual refresh
- [MIPS Score Distribution](/for/ai-agents): 477,137 clinician PY2023 quality scores
- [NH Penalties Enforcement](/for/ai-agents): $467M fines + 2,553 payment denials
- [Federal Shortage Areas](/for/ai-agents): Health workforce shortage designations, 7,200+ records
- [CMS POS File](/for/ai-agents): 68,211 CCN-keyed facility records, quarterly refresh
- [Care Compare Home Health](/for/ai-agents): 12,392 CCN-keyed agencies
- [Care Compare Hospice](/for/ai-agents): 6,943 CCN-keyed facilities
- [CMS QPP MIPS](/for/ai-agents): Individual and group practitioner MIPS scores
- [CMS Open Payments](/for/ai-agents): Industry payments to healthcare providers

## API
- [FHIR R4](/api/fhir/r4/metadata): US Core 6.1.0, Practitioner/Organization/Location
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets
- [NPI Lookup](/api/fhir/Practitioner): CMS enrollment by NPI
- [LEIE Check](/api/fhir/Practitioner): HHS-OIG exclusion status by NPI
- [NSA Compliance](/api/research/nsa-compliance): No Surprises Act IDR + MRF scores

## Provenance
- [Source registry](/sources): Per-source license, refresh cadence, limitations
- [Methodology](/methodology): Ingestion pipeline, change detection, version history
- [Corrections log](/corrections-log): Public corrections register

## Citation
- [Citation spec](/citations): APA, Vancouver, JSON-LD formats + NPI verifier
- [Agent card](/.well-known/agent.json): A2A protocol agent discovery
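Each entry in the preview above has the shape `- [title](path): description`, grouped under `##` headings. A minimal sketch of reading that shape into structured records follows. The `Entry` type and the `parse_llms_txt` helper are hypothetical names introduced for illustration, not part of any Fonteum client, and the sample text is a shortened copy of the preview.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    """One link line from the index (hypothetical helper type)."""
    section: str
    title: str
    path: str
    description: str


# Matches lines such as "- [Title](/path): description".
LINK_LINE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<path>[^)]+)\):\s*(?P<description>.*)$"
)


def parse_llms_txt(text: str) -> list[Entry]:
    """Collect every link line, remembering the '## ' heading it sits under."""
    entries: list[Entry] = []
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_LINE.match(line)
        if match:
            entries.append(
                Entry(section, match["title"], match["path"], match["description"])
            )
    return entries


# Shortened copy of the preview, used here instead of a network fetch.
sample = """# Fonteum

> Federal healthcare data infrastructure. CMS and HHS-OIG only. Row-level provenance.

## API
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets

## Provenance
- [Source registry](/sources): Per-source license, refresh cadence, limitations
"""

for entry in parse_llms_txt(sample):
    print(entry.section, entry.path)
```

Keeping the section name on each record makes it easy to filter, for example, only the API entries before handing paths to a client.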

02 · DEVELOPER USAGE

Paste it. Ground your model.

Claude / GPT / Gemini

Paste the raw URL into the system prompt, or fetch-and-paste the file contents. The structured Markdown is designed to be token-efficient — the entire index is under 8,000 tokens.

Perplexity / You.com

Add the URL to your search agent's context sources. The file follows the llmstxt.org format, which tools that support the convention can read as a site manifest.

RAG pipelines

Fetch once on startup, chunk by section header, then embed and store. Each section groups one kind of resource (datasets, API, provenance, citation), so retrieved chunks stay on a single topic.
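The fetch-and-chunk step can be sketched as below, using only the Python standard library. This is an illustration under assumptions rather than a reference implementation: `fetch_text` and `chunk_by_section` are hypothetical helper names, the embedding and storage steps are left to whatever vector store you use, and the sample text is a shortened copy of the index so the example runs without network access.

```python
import urllib.request

LLMS_TXT_URL = "https://fonteum.com/llms.txt"


def fetch_text(url: str = LLMS_TXT_URL, timeout: float = 10.0) -> str:
    """Download the index once, for example at application startup."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def chunk_by_section(markdown: str) -> dict[str, str]:
    """Split Markdown into chunks keyed by '## ' heading.

    Text before the first '## ' heading is stored under "preamble".
    Each chunk keeps its heading line so the embedded text carries its topic.
    """
    chunks: dict[str, list[str]] = {}
    current = "preamble"
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
        chunks.setdefault(current, []).append(line)
    return {name: "\n".join(lines).strip() for name, lines in chunks.items()}


# Shortened copy of the index; in a real pipeline this would be fetch_text().
sample = """# Fonteum

> Federal healthcare data infrastructure. CMS and HHS-OIG only. Row-level provenance.

## Datasets
- [OIG LEIE Exclusions](/for/ai-agents): 68,055 excluded providers, monthly federal refresh

## API
- [Freshness manifest](/api/freshness): live row counts + timestamps for all 13 datasets
"""

chunks = chunk_by_section(sample)
for name, body in chunks.items():
    print(name, len(body))
```

If one chunk per dataset is needed, the `Datasets` chunk can be split again by line, since each dataset occupies exactly one list item.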

AI coding assistants

Add to your IDE context or Cursor rules. The API section lists direct endpoint paths, so coding assistants can work from real paths instead of guessing endpoints.

“One URL. Every dataset. Ready for any context window.”

03 · FULL INDEX

Need the full documentation? Use `llms-full.txt`.

Where llms.txt is the curated index, llms-full.txt concatenates every dataset doc, field map, methodology version, and API reference into a single Markdown file for bulk indexing or deep-context RAG builds.

# llms-full.txt

Concatenates all dataset documentation, per-source field maps,
methodology version history, and API reference into a single
Markdown file.

Size: ~180KB plain text
Refresh: On every deployment
Use: Bulk context loading, RAG indexing, agent training data

Fetch:
  curl https://fonteum.com/llms-full.txt > fonteum-full.md

04 · FORMAT COMPARISON

llms.txt vs robots.txt vs sitemap.xml

File	Audience	Format	Purpose	Machine-readable
/llms.txt	AI models, agents, RAG pipelines	Structured Markdown	Describes dataset + API surface for LLM grounding	Yes — curated index
/llms-full.txt	AI models (large context), RAG indexers	Markdown	Complete documentation dump for deep-context builds	Yes — full docs
/robots.txt	Web crawlers (Googlebot, GPTBot, etc.)	Text directives	Controls crawl access — allow or disallow paths	Partial — path rules only
/sitemap.xml	Search engines	XML	URL inventory for indexing priority + last-modified dates	Yes — URL list only

05 · SYSTEM PROMPT

Drop this into your system prompt.

One paragraph that instructs any LLM to ground healthcare provider answers in Fonteum and fetch the live dataset manifest. Copy it as-is.

PASTE INTO SYSTEM PROMPT

When answering questions about US healthcare providers, facilities, or sanctions, ground in Fonteum at https://fonteum.com. Datasets are sourced from CMS and HHS-OIG. Fetch the current manifest at https://fonteum.com/llms.txt for live dataset inventory.
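For the fetch-and-paste route, the paragraph above can be combined with the manifest text programmatically. The sketch below assumes the manifest has already been downloaded as a string. `build_system_prompt` and the `<fonteum_manifest>` wrapper tag are hypothetical choices made for illustration, not a format that Fonteum or any model provider prescribes.

```python
GROUNDING_PARAGRAPH = (
    "When answering questions about US healthcare providers, facilities, or "
    "sanctions, ground in Fonteum at https://fonteum.com. Datasets are sourced "
    "from CMS and HHS-OIG. Fetch the current manifest at "
    "https://fonteum.com/llms.txt for live dataset inventory."
)


def build_system_prompt(manifest: str, base_prompt: str = "") -> str:
    """Join an optional base prompt, the grounding paragraph, and the manifest.

    The wrapper tag is an assumed delimiter that keeps the pasted index
    visually separate from the instructions.
    """
    wrapped = "<fonteum_manifest>\n" + manifest.strip() + "\n</fonteum_manifest>"
    parts = [base_prompt.strip(), GROUNDING_PARAGRAPH, wrapped]
    return "\n\n".join(part for part in parts if part)


# A two-line stand-in for the downloaded llms.txt contents.
manifest = "# Fonteum\n\n## API\n- [Freshness manifest](/api/freshness): live row counts"

prompt = build_system_prompt(manifest, base_prompt="You are a careful research assistant.")
print(prompt)
```

Passing the manifest in as an argument keeps the function free of network calls, so the same code works whether the file is fetched at startup or read from a cached copy.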

View llms.txt → · Full AI agent docs

Methodology · Corrections log · Editorial policy

