Skip to content
1,322,867 nurse-staffing records · CMS PBJ
fonteum
Research
PricingDocs
Request a pilot →
Provenance contract

14 fields. Every response. Every event.

Every Fonteum response payload — MCP tool result, webhook event, v1 API endpoint — carries the same canonical 14-tuple provenance block. The original 8 fields cover source attribution + freshness + methodology + confidence. 6 new fields close specific gaps relative to SOC 2 Type 2 (CC8.1 change management, CC6.1 data classification), HIPAA §164.312 integrity controls, FAIR Data Principles (F1 persistent identifier, R1.1 license), ICMJE academic citation standards, and SLSA Build Level 3 cryptographic provenance.

Cryptographic chain → Identity layer → Webhook events → Semantic search →

Original 8 fields (§sprint2-mcp-server)

Source attribution. Freshness. Methodology. Confidence.

01_sourcestring
Standard: FAIR R1.2 (data is associated with detailed provenance)

Canonical name of the upstream data source. Human-readable; matches the public name the source publishes itself under.

Example: "CMS NPPES NPI Registry (public API)"
02_source_urlstring
Standard: FAIR F4 (resources registered or indexed)

Public URL of the upstream source. Resolves to either the source's own portal or its API endpoint.

Example: "https://npiregistry.cms.hhs.gov/api/"
03_dataset_idstring
Standard: FAIR I1 (knowledge representation)

Fonteum-internal dataset slug. Stable across snapshots; matches data_sources.slug + the source-defaults registry key.

Example: "nppes-npi-registry"
04_snapshotstring (ISO date)
Standard: HIPAA §164.312(c)(1) (integrity controls require dated snapshots)

ISO date YYYY-MM-DD of the snapshot used to produce this response. Stable identifier for the upstream pull.

Example: "2026-05-10"
05_methodologystring
Standard: ICMJE author guidelines (cite the methodology version)

Fonteum methodology version that produced the response. Bumps land in /methodology/changelog with date + summary.

Example: "v2026.05.0"
06_last_checkedstring (ISO timestamp)
Standard: SOC 2 Type 2 CC7.1 (system monitoring)

ISO timestamp Fonteum last re-checked the value against the source. May be more recent than _snapshot when re-checks happen between snapshot pulls.

Example: "2026-05-10T07:00:00.000Z"
07_confidencenumber (0..1)
Standard: ICMJE author guidelines (declare confidence in derived values)

0..1 score. 1.0 = verbatim from source. <1.0 = derived via cross-source matching, name normalization, or other inference.

Example: 1.0
08_data_availabilitystring[]
Standard: FAIR A1 (retrievable by their identifier)

Availability flags. Common values: ["present"] (verbatim), ["pending_refresh"] (snapshot stale), ["archived"] (deprecated source). Multi-flag arrays allowed for compound states.

Example: ["present"]
6 new fields (§sprint3-14-tuple-extension)

Compliance + academic + cryptographic standards.

09_pipeline_versionstring | nullnullable
Standard: SOC 2 Type 2 CC8.1 (change management)

Git commit SHA (7-char short form) of the Fonteum ingestion code that produced the snapshot. Closes the change-management gap: every produced artifact traces back to the exact code revision. Always populated in deployed environments via VERCEL_GIT_COMMIT_SHA; "dev-local" in local dev.

Example: "abc1234"
10_doistring | nullnullable
Standard: ICMJE academic citation + FAIR F1 (persistent identifier)

Reserved for a persistent archival DOI for the methodology version. Currently always null — no DOI is minted at this time.

Example: null
11_licensestring | nullnullable
Standard: FAIR R1.1 (data are released with a clear and accessible data usage license)

SPDX identifier for redistribution rights. Federal sources (CMS, OIG, HRSA, BLS, BEA, Census) use US-Government-Works (public domain per 17 USC §105). Fonteum-derived datasets use CC-BY-4.0. Sources we don't recognize get null — caller can override.

Example: "US-Government-Works"
12_coverage_period_startstring | null (ISO-8601)nullable
Standard: HIPAA §164.312(c)(1) (integrity controls require knowing data range)

ISO-8601 date when the upstream source first started publishing this kind of data. Backstops the snapshot date with the source's own inception.

Example: "2007-09-15"
13_coverage_period_endstring | "ongoing" | nullnullable
Standard: HIPAA §164.312(c)(1)

ISO-8601 end date OR the literal "ongoing" for live sources. Future deprecated sources would set explicit end dates.

Example: "ongoing"
14_slsa_provenance_urlstring | nullnullable
Standard: SLSA Build Level 3 (cryptographic check of artifact origin)

URL to the SLSA Build Level 3 provenance artifact for this snapshot. Phase 1 ships a placeholder pointing at the GitHub Actions workflow run; full SLSA generator wires up in §sprint3-slsa-generator. Once populated, downstream consumers can re-validate the SLSA attestation to confirm the artifact wasn't tampered with.

Example: "https://github.com/fpobuilds/directoryventures-engine/actions/runs/123"
Backward compatibility

Additive only. No renames. No removals.

All 6 new fields are nullable. Subscribers + consumers built against the original 8-field contract continue to receive valid payloads — they just ignore the 6 extra keys per JSON contract. No versioned event_type bump required (per the buildEventPayload freeze rules). Field types are exposed via the central ProvenanceContract interface in src/mcp/types.ts + the ProvenancePayload interface in src/lib/events/types.ts. TypeScript consumers depending on those interfaces will see the 6 new fields as required-nullable; their values are guaranteed non-undefined.

Phase roadmap

Phase 1 ships the contract. Phases 2-3 wire the unwired fields.

  • Phase 1 (this wave): 14-field contract shipped. _pipeline_version + _license + _coverage_period_* always populated. _doi stays null — no DOI minting is active. _slsa_provenance_url stays null until the SLSA generator is wired.
  • §sprint3-slsa-generator (queued): wires the GitHub Actions SLSA Build Level 3 generator workflow + populates _slsa_provenance_url with the artifact URL.
  • §sprint3-15-tuple-witness-signatures (paid payer feature, deferred): 15th field — in-toto multi-party co-sign for high-stakes contexts.

Compliance posture

Methodology · Corrections log · Editorial policy

fonteum

Healthcare provider data, traced to source.


PLATFORM

  • Data platform
  • Pricing
  • FHIR API docs
  • For health-tech

RESEARCH

  • Research hub
  • Nursing homes
  • Methodology
  • Methodology changelog

COMPANY

  • About
  • Press
  • Contact
  • Trust & integrity

LEGAL

  • Privacy policy
  • Editorial policy
  • Corrections log

© 2026 FONTEUM RESEARCH · DATA SNAPSHOT MAY 8, 2026 · BUILT WITH CARE

  • X
  • LINKEDIN
  • PRESS