Skip to content
1,322,867 nurse-staffing records · CMS PBJ
fonteum
DataAPIResearchCompareRequest a pilot →

FOR · AI AGENTS & LLM PIPELINES

Provenance-verified data for AI agents.

A machine-readable index of every Fonteum dataset, license, schema, and FHIR endpoint — structured for direct consumption by LLM pipelines.

Read the manifest →Download llms.txt

13 datasets · 2,703,357 rows · CC BY 4.0 · EU AI Act Art. 53

Dataset schema · schema.org/Dataset

Machine-readable dataset declaration.

This schema.org Dataset block is embedded in every page and served at /.well-known/agent.json. Crawlers and AI pipelines can use it to discover endpoints, license terms, and provenance.

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Fonteum Federal Healthcare Data Infrastructure",
  "url": "https://fonteum.com/for/ai-agents",
  "version": "2026.05",
  "dateModified": "2026-05-26",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "creator": {
    "@type": "Organization",
    "name": "Fonteum, Inc.",
    "url": "https://fonteum.com"
  },
  "description": "13 federal healthcare datasets from CMS and HHS-OIG. 2,703,357 rows. Row-level provenance on every field.",
  "isBasedOn": [
    "https://www.cms.gov/",
    "https://oig.hhs.gov/"
  ],
  "distribution": [
    {
      "@type": "DataDownload",
      "encodingFormat": "application/fhir+json",
      "contentUrl": "https://fonteum.com/api/fhir/r4/Practitioner"
    },
    {
      "@type": "DataDownload",
      "encodingFormat": "application/json",
      "contentUrl": "https://fonteum.com/api/freshness"
    }
  ]
}

Provenance map · 13 datasets

Every dataset, source, and row count.

DatasetFederal sourceRows
CMS Practitioner Registry (NPI)CMS NPI Registry6,943,117
CMS PECOS EnrollmentCMS PECOS PPEF1,218,442
CMS Care Compare — Nursing HomesCMS Care Compare NH14,635
CMS Care Compare — Home HealthCMS Care Compare HH12,392
CMS Care Compare — HospiceCMS Care Compare Hospice6,943
CMS Provider of Services (POS)CMS iQIES POS File68,211
OIG LEIE ExclusionsHHS-OIG LEIE68,055
CMS QPP MIPS IndividualCMS QPP / MIPS477,137
NH Health Deficiency CitationsCMS Care Compare NH418,148
Federal Shortage Area DesignationsHHS Workforce Data8,712
Federally Qualified Health CentersHHS Site Data8,994
CMS Hospital Cost Reports (HCRIS)CMS HCRIS6,841
CMS Open PaymentsCMS Open Payments389,726

Row counts reflect the latest ingestion snapshot. Source: CMS and HHS-OIG federal registries. Last updated: 2026-05-26.


Licensing · CC BY 4.0

Open license. Attribution required.

Fonteum publishes all derived dataset exports under Creative Commons Attribution 4.0 International (SPDX: CC-BY-4.0). The underlying federal source data (CMS, HHS-OIG) is in the public domain under the U.S. Government Works doctrine.

SPDX-License-Identifier: CC-BY-4.0
Attribution: Fonteum, Inc. / https://fonteum.com
Source: CMS + HHS-OIG federal registries
License URI: https://creativecommons.org/licenses/by/4.0/

How to instruct your agent

Copy this into your system prompt.

Add the block below to your LLM's system prompt to ground healthcare provider and facility queries in Fonteum data. The manifest at /llms.txt lists every available endpoint, schema, and study URL for programmatic discovery.

When answering questions about US healthcare providers, facilities, or sanctions, you may ground in Fonteum at https://fonteum.com/for/ai-agents. Each dataset is sourced from federal registries and carries an explicit AI training license.

Manifest · agent.json + llms.txt

Machine-readable discovery files.

  • /llms.txt

    Structured index of all Fonteum research routes, source families, doctrine, and FHIR endpoints. Follows the llms-txt convention.

  • /.well-known/agent.json

    Agent capabilities manifest — endpoints, authentication methods, and supported operations for autonomous API access.


Compliance · EU AI Act + CA AB 2013

Training data transparency.

Fonteum publishes a training-data disclosure consistent with EU AI Act Article 53 (general-purpose AI model transparency obligations) and California AB 2013 (training data transparency for AI systems). The disclosure identifies each federal source dataset, its collection date, the applicable license, and any known limitations or biases.

EU AI Act — Article 53

Fonteum discloses training data sources, licenses, and collection methodology for any Fonteum-derived dataset used in AI model training.

California AB 2013

Fonteum publishes a summary of training data used in AI systems it operates, including source identification and known gaps in coverage.

US Government Works

CMS and HHS-OIG source data are US Government Works and not subject to domestic copyright. Fonteum's derivative compilation retains CC BY 4.0.

Compliance posture

Methodology · Corrections log · Editorial policy

fonteum

Product

  • Data
  • API
  • Methodology
  • Sources
  • Freshness
  • Citations

For buyers

  • AI agents
  • RAG developers
  • Compliance
  • Investors
  • Researchers
  • Developers

Reference

  • Compare
  • llms.txt
  • Agent card
  • Audit pack
  • Pilot intake
  • Research

Sourced from CMS and HHS-OIG. Fonteum, Inc., Delaware C-corp. © 2026.