The Proof of Insight™ Protocol
Version 0.6.2 — Working Draft
Status. This document is a working draft circulated for review by domain experts in cryptographic provenance, regulated artificial intelligence, and standards development. It is not a published standard. References to companion documents (conformance test suite, reference verifier, profile bindings) are forward references; those documents do not yet exist at the version cited.
Versioning. The value "0.6.2" appears
in the version field of every Insight Step (§2.1) and the
manifest_version field of every proof manifest (§2.7).
Document and protocol version are intentionally the same. Version 0.6.2
supersedes v0.6.1 with backward-compatible additions: the splitting of
L4 into L4A (independently attested) and L4R (reproducible); the
addition of an optional verification_basis field to proof
manifests, recording what gate of verification the producer claims is
achievable for a proof; an optional structured form for
conditioned-on edges that records the role and a
declared-relevance hash of the contextual relation; and clarifying
language on the scope of the patent non-assertion covenant in the
interim before publication of a conformance test suite. v0.6.1 L4 claims
map cleanly to v0.6.2 L4R; no producer compliant with v0.6.1 is rendered
non-compliant by v0.6.2. The protocol is pre-1.0 and no production
deployments exist at any version.
Editor's note. This draft prioritizes precision in §1–§3 (the technical core) over completeness in §4–§8. Sections §6 and parts of §7–§8 are structural placeholders pending stabilization of the core.
§0. Copyright, License, and IPR
§0.1 Document. This specification is © 2026 Arclio LLC and is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). The full license text is available at https://creativecommons.org/licenses/by/4.0/legalcode. The canonical version of this document is to be published at https://proofofinsight.org/spec/v0.6.2/; pending DNS configuration of that hostname, the working draft in the repository at https://github.com/proof-of-insight/spec is authoritative.
§0.2 Reference artifacts. The JSON schema of Appendix A is illustrative and is licensed under CC BY 4.0 with the rest of this document. Normative machine-readable schemas, the reference verifier implementation, and the conformance test suite are to be published separately under the Apache License, Version 2.0, including its patent grant provisions.
§0.3 Patent non-assertion. Arclio LLC covenants not to assert any patent claims it owns or controls against any implementation of this specification that conforms to the conformance test suite at the version of this document under which the implementation claims conformance. Until a conformance test suite is published for a given version of this specification, the covenant applies to implementations that implement, in good faith, the normative requirements of §§1–5 at that version. This covenant binds Arclio LLC only; it does not address claims that may be held by third parties.
§0.4 Trademarks. "Proof of Insight" and "PoI" are trademarks of Arclio LLC. Use of these marks to describe an implementation as conformant is governed by the PoI Conformance Mark Policy, published separately. Use of the marks in this specification and in any derivative work is permitted for the purpose of accurate reference to this specification.
§0.5 Editorial process. This specification is a working draft. Editorial control rests with Arclio LLC pending establishment of an independent editorial body. Proposed changes are submitted via the public issue tracker referenced at the canonical URL.
Abstract
This document specifies the Proof of Insight™ (PoI™) protocol, an evidence format for analyses produced by agentic artificial-intelligence systems in regulated contexts. PoI defines a content-addressed, signed directed acyclic graph (DAG) of typed derivation steps. A single verification algorithm, parameterized by step type, establishes that recorded outputs are bound to recorded inputs through recorded operations and attestations, that the binding is tamper-evident, that deterministic operations are replayable, and that non-deterministic operations are recorded under stated conditions with explicit re-execution semantics. PoI extends existing software-supply-chain attestation frameworks (in-toto, SLSA) and provenance data models (W3C PROV) by admitting non-deterministic reasoning operations as first-class evidence steps and by providing a single mechanical verification algorithm over the resulting graph. As with those frameworks, PoI's verification claims concern the provenance and integrity of an analysis, not the semantic correctness of its conclusions; this distinction is constitutive of the protocol's scope and is stated explicitly in §1.3 and §1.4. The protocol is independent of any particular implementation, signing scheme, timestamp authority, or compliance regime; bindings to specific ecosystems are provided as informative profiles.
1. Introduction
1.1 Problem
Agentic artificial-intelligence systems are increasingly used to produce analyses that inform regulated decisions. Examples include readiness assessments for drug-development submissions, model-risk reviews in finance, decision support in clinical care, and reporting under environmental-compliance regimes. Such systems characteristically combine three classes of operation:
- Ingestion of external data (clinical trial records, market data, sensor logs).
- Deterministic computation over that data (database queries, statistical primitives, deterministic transformations).
- Non-deterministic reasoning, typically realized by large language model inference, which interprets, summarizes, or draws conclusions from the results of the previous two classes.
A substantial body of prior work addresses the first two classes. W3C PROV provides a general data model for provenance. The in-toto and SLSA frameworks provide verifiable attestation for software supply chains, with Sigstore providing a deployed signing and transparency-log infrastructure. These frameworks were designed before agentic reasoning was a routine component of regulated workflows, and they assume that the operations they record are deterministic builds. None admits non-deterministic reasoning as a first-class operation, and none provides, over a graph that includes such operations, a verification algorithm precise enough for a regulator to act independently of the producer.
PoI builds on this lineage rather than replacing it. PoI's signature, content-addressing, and timestamping primitives are intended to be instantiated via the existing ecosystems (see profiles in §7). PoI's contribution is the typed-step taxonomy that admits non-deterministic reasoning, the typed edge relations that distinguish input from context from attestation, and the single mechanical verification algorithm parameterized by step type.
The motivating consequence is that analyses produced by agentic systems in regulated contexts currently rely on producer-controlled audit logs, narrative descriptions, or post-hoc reconstructions. None of these is mechanically verifiable. A regulator presented with such an analysis has no protocol-level basis on which to confirm even the structural integrity of the analysis — that the recorded outputs are bound to the recorded inputs through the recorded operations, without tampering, under identifiable attestation. PoI provides that structural basis. It does not, and no provenance protocol can, certify that the recorded analytical conclusions are correct; see §1.4.
1.2 Position
PoI is a verification protocol, not an analytical platform or workflow tool. Platforms that produce analyses — whether for clinical, financial, or other regulated contexts — may emit PoI proofs of those outputs to support independent verification. The protocol's design does not depend on, nor does it specify, the workflow that produced an analysis; it specifies the evidence structure against which that analysis can be mechanically verified.
PoI's central position is that a single primitive — a content-addressed, signed, typed derivation step — is sufficient to express the evidence required for regulated agentic analyses, and that the properties such evidence must possess emerge as theorems over composed steps rather than being designed in as separate features.
A PoI proof is a directed acyclic graph of such steps. Verification is a single algorithm parameterized by step type. Conformance levels are defined as constraints on which step types must appear and which roles must sign which steps.
This position commits the protocol to a small, fixed taxonomy: four
step types (observe, compute,
reason, attest) and three edge relations
(derived-from, conditioned-on,
about). Domain-specific specialization — clinical
prespecification, financial model tiering, sector-specific roles — is
achieved through the profile mechanism (§7.0) without modifying the base
taxonomy. The taxonomy is justified in §2.
1.3 Non-goals
PoI explicitly does not provide:
- Validation of input data. PoI binds an analysis to its claimed inputs through content hashing and attestor signature. It does not certify that the inputs are accurate, complete, current, or fit for purpose. Input validation is the responsibility of the attestor and any external authorities to which the attestor delegates (e.g., source-data integrity standards, vendor-qualification regimes).
- Validation of model outputs. PoI records that a model produced a specific output under recorded conditions. It does not certify that the output is correct, calibrated, or safe.
- Endorsement of analytical conclusions. PoI's verification algorithm establishes that recorded outputs are bound to recorded inputs through recorded operations and attestations, with the binding tamper-evident. For deterministic operations, replay can additionally confirm that the recorded output is the one the recorded function produces on the recorded inputs. For non-deterministic operations, the protocol records the operation's conditions and output but does not establish that the output follows from the inputs in any logical, statistical, or regulatory sense. Whether any recorded conclusion is correct, defensible, or appropriate remains a matter for human review and regulatory judgment, outside the scope of the protocol.
- A consensus mechanism. PoI proofs are produced by a single producer or a coordinated set of producers under a shared trust model. Distributed agreement among independent parties on the contents of a proof is out of scope.
- A retention or archival policy. PoI specifies how evidence is structured and verified. It does not specify how long evidence is kept, by whom, or under what conditions it may be deleted.
- A trust root for identity. PoI requires that every step be signed by an identifiable attestor, but does not specify how identities are issued, verified, or revoked. Identity is a profile concern (§7).
These non-goals are constitutive of the protocol's scope. A reader who expects PoI to provide any of the above is reading the wrong document. The companion scope statement in §1.4 makes the verification-versus-credibility distinction explicit.
1.4 Scope: Verification, Not Credibility
PoI establishes that an analysis was produced as recorded. It is silent on whether the recorded analysis was the right analysis. This distinction is consequential and is stated here as part of the protocol's primary scope, not as a footnote.
A proof that satisfies all conformance checks of §5 establishes:
- That the recorded inputs are bound to the recorded operations through identified attestations
- That the binding is tamper-evident
- That deterministic operations are replayable under their declared regime
- That non-deterministic operations are recorded with stated conditions and explicit re-execution semantics
- That predicates over the typed-step structure (e.g., "every reason step on the path to a regulated conclusion has at least one qualified-reviewer attestation") hold mechanically
A proof does not establish:
- That the recorded inputs are accurate, complete, or fit for purpose
- That the recorded methods are appropriate to the question being asked
- That the recorded conclusions are correct, calibrated, or defensible
- That a reviewer applying domain judgment would reach the same conclusion
- That the analysis is sufficient to support any specific regulatory action
The protocol therefore answers a process-fidelity question
and is silent on the analytical-correctness question. Both
questions matter for regulated decisions, but they are addressed by
different mechanisms: PoI by mechanical verification, analytical
correctness by qualified human review under domain-specific standards.
PoI is structured so that an attest step from a qualified
reviewer (§2.2.4) records the latter and binds it to the former. The
attest step records the review; it does not perform it.
Reproducibility of an analysis under PoI is not equivalent to credibility of that analysis. A reproducibly wrong analysis remains wrong. The protocol's value to a regulator is that it makes the analysis legible and mechanically inspectable, so that human judgment can focus on the questions that require it.
1.5 Conventions
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in BCP 14 (RFC 2119, RFC 8174) when, and only when, they appear in capitals.
Hash function references denote SHA-256 unless otherwise specified by a profile (§7). Where post-quantum cryptographic agility is required, profiles MAY substitute an alternative; canonical encoding rules accommodate this without changes to the protocol.
Canonical encoding follows RFC 8785 (JSON Canonicalization Scheme) unless a profile specifies an alternative. Signature scheme is profile-defined; this document does not mandate a specific scheme.
The term "attestor" denotes the identified party signing a step. The term "verifier" denotes any party — including but not limited to the producer of the proof — running the verification algorithm of §3 against a proof. The term "producer" denotes the system or organization assembling a proof from steps it generates.
2. The Insight Step
A PoI proof is composed of insight steps (hereafter "steps"). This section specifies the step structure, the four step types and their payloads, the three edge relations, the timestamp binding, the canonical encoding, and the rule by which a step's identity is computed.
2.1 Step structure
Every step is a record with the following fields. Field order is fixed by the canonical encoding rule of §2.5; the order in this table is the canonical order.
| # | Field | Type | Required | Description |
|---|---|---|---|---|
| 1 | version | string | MUST | Protocol version. This document specifies "0.6.2". |
| 2 | type | string | MUST | One of "observe", "compute", "reason", "attest". |
| 3 | predecessors | array of edges | MUST | Edges to predecessor steps. MAY be empty only for observe. |
| 4 | payload | object | MUST | Type-specific. Contents specified in §2.2. |
| 5 | attestor | string | MUST | URI identifying the attestor. Resolution is profile-defined. |
| 6 | signature | string | MUST | Signature over the canonical encoding of fields 1–5, by the key bound to attestor. |
| 7 | timestamp | object | MUST | Trusted timestamp over the canonical encoding of fields 1–6 (including signature). Structure specified in §2.4. |
The ordering of signature before timestamp is deliberate. The attestor signs the unsigned step content; a timestamp authority then countersigns the now-signed step. This binding order — sign, then timestamp the signed thing — establishes that the signed evidence existed at the time the timestamp authority attests, not merely that the unsigned content existed.
For canonicalization, three byte strings are defined:
to_sign(step) = canonical_encoding( fields 1..5 of step )
to_timestamp(step) = canonical_encoding( fields 1..6 of step ) // includes signature
full_step(step) = canonical_encoding( fields 1..7 of step )
The attestor signs to_sign(step) to produce
signature. The timestamp authority attests to the existence
of to_timestamp(step) — which by construction includes the
attestor's signature and excludes only the timestamp field
— at the timestamp value, producing timestamp.token. A
step's identity (§2.5) is
SHA-256( full_step(step) ).
A step is verified in three layers, in order: signature validity over
to_sign; timestamp validity over to_timestamp;
identity computation over full_step. A verifier MUST
perform all three.
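The three byte strings and the three verification layers can be sketched as follows. This is an illustrative sketch, not the reference verifier: `canonical_encoding` here is a simplified stand-in for RFC 8785 (sorted keys, no insignificant whitespace) and does not reproduce the full rule of §2.5; `check_signature` and `check_timestamp` are hypothetical callbacks standing in for the profile-defined signature scheme and timestamp authority; the `{"token": ...}` timestamp shape is assumed, since §2.4 defines the real structure.

```python
import hashlib
import json

# Field names in the order given by the table in §2.1.
FIELDS = ["version", "type", "predecessors", "payload",
          "attestor", "signature", "timestamp"]


def canonical_encoding(obj) -> bytes:
    """Simplified stand-in for RFC 8785 (JCS); not a complete implementation."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def _fields(step: dict, n: int) -> dict:
    """Return a record holding only the first n fields of the step."""
    return {name: step[name] for name in FIELDS[:n]}


def to_sign(step: dict) -> bytes:
    return canonical_encoding(_fields(step, 5))


def to_timestamp(step: dict) -> bytes:
    return canonical_encoding(_fields(step, 6))  # includes signature


def full_step(step: dict) -> bytes:
    return canonical_encoding(_fields(step, 7))


def verify_step(step: dict, check_signature, check_timestamp) -> str:
    """Run the three layers in order and return the step identity."""
    # Layer 1: signature validity over to_sign.
    if not check_signature(step["attestor"], to_sign(step), step["signature"]):
        raise ValueError("signature invalid")
    # Layer 2: timestamp validity over to_timestamp.
    if not check_timestamp(to_timestamp(step), step["timestamp"]):
        raise ValueError("timestamp invalid")
    # Layer 3: identity computation over full_step.
    return hashlib.sha256(full_step(step)).hexdigest()


# Hypothetical step with placeholder signature and timestamp values.
example_step = {
    "version": "0.6.2",
    "type": "observe",
    "predecessors": [],
    "payload": {
        "content_hash": hashlib.sha256(b"raw data").hexdigest(),
        "content_type": "text/plain",
        "source": "https://example.org/data",
    },
    "attestor": "https://example.org/attestors/alice",
    "signature": "placeholder-signature",
    "timestamp": {"token": "placeholder-token"},
}
```

Because the signature is inside `to_timestamp` but the timestamp is not, changing either value changes the step identity while leaving `to_sign` untouched.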
2.2 Step types and payloads
PoI defines exactly four step types. The taxonomy is minimal in the
sense that no two types may be collapsed without losing a verification
distinction that this protocol relies on; it is not extensible without a
protocol revision. Domain-specific predicates that might naively appear
to require additional step types are instead expressed via the
attest step's claim-type vocabulary (§2.2.4) under a
profile (§7.0).
2.2.1 observe
An observe step records the ingestion of external data
into the proof.
payload = {
"content_hash": <SHA-256 of observed content, hex-encoded>,
"content_type": <string: MIME type or domain identifier>,
"source": <URI or structured descriptor of origin>,
"provenance": <optional reference to external provenance attestation>
}
An observe step MUST NOT have predecessors. It is the
only step type for which predecessors MAY be the empty
array.
A verifier cannot independently re-derive an observe
step. Verification of an observe step is limited to: (a)
confirming, where the original artifact is separately available, that
its hash matches content_hash; (b) confirming that the
attestor is authorized — under the profile's authority model — to make
observations from the named source; (c) verifying any external
provenance attestation referenced in provenance.
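Check (a) can be illustrated with a short sketch. The helper names `make_observe_payload` and `artifact_matches` are hypothetical and not part of the protocol; checks (b) and (c) depend on the profile's authority model and on external attestation formats, so they are not shown.

```python
import hashlib


def make_observe_payload(content: bytes, content_type: str, source: str) -> dict:
    """Build an observe payload; the optional provenance field is omitted."""
    return {
        "content_hash": hashlib.sha256(content).hexdigest(),
        "content_type": content_type,
        "source": source,
    }


def artifact_matches(payload: dict, artifact: bytes) -> bool:
    """Check (a): the separately available artifact hashes to content_hash."""
    return hashlib.sha256(artifact).hexdigest() == payload["content_hash"]


# Hypothetical observed content and source URI.
observed = b"subject_id,arm,outcome\n001,A,1\n"
observe_payload = make_observe_payload(
    observed, "text/csv", "https://example.org/trials/records.csv")
```

A match confirms only that the artifact is the one that was recorded; it says nothing about whether the source itself is trustworthy.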
The provenance field is the integration point with
external attestation ecosystems (e.g., in-toto attestations describing
the origin of input data). Such external attestations are not part of
the PoI proof but are referenced from it.
The observe step as trust handoff boundary. Data
exists in the analysis as PoI evidence from the moment of the signed
observe step; what occurred upstream — including how the
data was collected, validated, qualified, or amended — is the
responsibility of source-data integrity standards (e.g., 21 CFR Part 11
§11.10 for clinical data, books-and-records rules for financial data)
and any external provenance attestations referenced via the
provenance field. A verifier confirming an
observe step confirms that the recorded content hashes to
the recorded value and that the attestor is authorized to make
observations from the named source; it does not confirm that the source
itself is trustworthy. This boundary is intentional and is the reason
source-data validation is enumerated as a non-goal in §1.3.
2.2.2 compute
A compute step records the application of a
deterministic function to inputs derived from predecessor steps.
payload = {
"function": <URI or content-hash identifier of the function>,
"invocation": <inline canonical invocation object | content-addressed
reference of form { "uri": <URI>, "hash": <SHA-256> }>,
"invocation_hash": <SHA-256 of canonical encoding of the invocation object>,
"output_hash": <SHA-256 of canonical output encoding>,
"output_artifact": <optional inline or by-reference output; required for
tolerance replay regimes; see §3.2>,
"environment": <execution environment descriptor; see below>
}
The canonical invocation object — referenced inline or by
content address in payload.invocation — is the structured
record that fully determines the function's inputs:
invocation = {
"function": <same as payload.function>,
"inputs": [
{
"name": <local name used inside the function>,
"step": <SHA-256 identity of predecessor step>,
"output_hash": <SHA-256 of that predecessor's output; redundant
with the predecessor record but recorded here for
verifier convenience and to bind the linkage>
},
...
],
"parameters": <function parameters not sourced from predecessor outputs>
}
invocation_hash is the SHA-256 of the canonical encoding
(§2.5) of this invocation object and MUST equal the hash computed by the
verifier over the resolved payload.invocation (whether
inline or fetched from the content-addressed reference). Hashing the
full invocation — inputs and parameters together — closes the
linkage between predecessor outputs and the step that consumes them. Two
compute steps with identical parameters but different
predecessor bindings produce different invocation_hash
values.
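The last point can be checked with a small sketch. The function URI, step identities, and parameter names are hypothetical, and `canonical_encoding` is a simplified stand-in for the encoding of §2.5, not a complete RFC 8785 implementation.

```python
import hashlib
import json


def canonical_encoding(obj) -> bytes:
    """Simplified stand-in for RFC 8785 (JCS); not a complete implementation."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def invocation_hash(invocation: dict) -> str:
    """SHA-256 over the canonical encoding of the whole invocation object."""
    return hashlib.sha256(canonical_encoding(invocation)).hexdigest()


def make_invocation(predecessor_id: str) -> dict:
    """Same function and parameters every time; only the binding differs."""
    return {
        "function": "urn:example:fn:rolling-mean",  # hypothetical identifier
        "inputs": [
            {"name": "series", "step": predecessor_id, "output_hash": "ab" * 32},
        ],
        "parameters": {"window": 7},
    }


inv_a = make_invocation("11" * 32)
inv_b = make_invocation("22" * 32)
hash_a = invocation_hash(inv_a)
hash_b = invocation_hash(inv_b)
```

Here `inv_a` and `inv_b` carry identical parameters, yet `hash_a` and `hash_b` differ because the bound predecessor identity is part of the hashed object.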
The verifier MUST be able to obtain the invocation object. If
payload.invocation is inline, it is read directly. If it is
a content-addressed reference, the verifier MUST resolve the reference
(the resolution mechanism is profile-defined) and confirm the fetched
object's canonical encoding hashes to the reference's stated hash.
Either form is acceptable; profiles MAY constrain which is permitted for
their conformance level.
A compute step MUST have at least one predecessor edge,
and every predecessor edge MUST have relation derived-from.
A compute step MUST NOT have predecessor edges of any other
relation. The set of derived-from predecessors MUST equal
the set of step identities appearing in invocation.inputs.
A compute step MUST NOT have any derived-from
predecessor of type attest (see §3.0 on
STEP_RESOLVE).
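These structural constraints lend themselves to a mechanical check. The sketch below assumes an edge shape of `{"step": <id>, "relation": <string>}`, which is hypothetical because the edge structure is not defined in this section; it also assumes an inline invocation and a caller-supplied `type_of` mapping from step identity to step type.

```python
def check_compute_edges(step: dict, type_of: dict) -> list:
    """Return a list of violations of the compute predecessor rules."""
    errors = []
    edges = step["predecessors"]
    if not edges:
        errors.append("compute requires at least one predecessor edge")
    if any(edge["relation"] != "derived-from" for edge in edges):
        errors.append("compute edges must all have relation derived-from")

    derived = {e["step"] for e in edges if e["relation"] == "derived-from"}
    inputs = {i["step"] for i in step["payload"]["invocation"]["inputs"]}
    if derived != inputs:
        errors.append("derived-from predecessors must equal invocation.inputs")

    if any(type_of.get(step_id) == "attest" for step_id in derived):
        errors.append("compute must not derive from an attest step")
    return errors


# Hypothetical step identities, shortened for readability.
type_of = {"obs-1": "observe", "att-1": "attest"}

good_compute = {
    "predecessors": [{"step": "obs-1", "relation": "derived-from"}],
    "payload": {"invocation": {"inputs": [{"name": "x", "step": "obs-1"}]}},
}
bad_compute = {
    "predecessors": [{"step": "att-1", "relation": "derived-from"}],
    "payload": {"invocation": {"inputs": [{"name": "x", "step": "att-1"}]}},
}
```

Returning every violation, rather than stopping at the first, is a design choice of this sketch so that a producer sees all structural problems in one pass.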
The environment descriptor specifies the conditions for
replay. It MUST include a replay_regime field with value
either "bit-identical" or "tolerance":
- bit-identical: SHA-256 of the replayed output MUST equal payload.output_hash. payload.output_artifact MAY be omitted.
- tolerance: replay produces an output that must be equivalent to the recorded output under a profile-defined equivalence predicate. payload.output_artifact MUST be present (inline or by content-addressed reference), and the verifier compares outputs under the equivalence predicate, not hashes. The environment MUST additionally specify the basis for the relaxed requirement (e.g., "non-associative floating-point reductions across BLAS implementations") and the equivalence predicate identifier the profile applies.
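The difference between the two regimes can be sketched as a single comparison function. Everything beyond the field names is an assumption: the output artifact is taken to be an inline JSON string holding a list of numbers, and `close_floats` is a hypothetical equivalence predicate standing in for whatever a profile would define.

```python
import hashlib
import json
import math


def close_floats(recorded: str, replayed: bytes) -> bool:
    """Hypothetical predicate: element-wise closeness of two JSON number lists."""
    a, b = json.loads(recorded), json.loads(replayed)
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=1e-9) for x, y in zip(a, b))


def replay_matches(payload: dict, replayed: bytes, equivalent=None) -> bool:
    """Compare a replayed output against the record under the declared regime."""
    regime = payload["environment"]["replay_regime"]
    if regime == "bit-identical":
        # Hash comparison only; the recorded artifact is not needed.
        return hashlib.sha256(replayed).hexdigest() == payload["output_hash"]
    if regime == "tolerance":
        if "output_artifact" not in payload:
            raise ValueError("tolerance regime requires output_artifact")
        if equivalent is None:
            raise ValueError("tolerance regime requires an equivalence predicate")
        # Outputs are compared under the predicate, not by hash.
        return equivalent(payload["output_artifact"], replayed)
    raise ValueError("unknown replay_regime: " + repr(regime))


recorded = "[0.3]"
replayed = json.dumps([0.1 + 0.2]).encode("utf-8")  # differs in the last bits

strict_payload = {
    "output_hash": hashlib.sha256(recorded.encode("utf-8")).hexdigest(),
    "environment": {"replay_regime": "bit-identical"},
}
tolerant_payload = {
    "output_hash": strict_payload["output_hash"],
    "output_artifact": recorded,
    "environment": {"replay_regime": "tolerance"},
}
```

The same replayed bytes fail the bit-identical comparison and pass the tolerance comparison, which is why the tolerance regime needs the recorded artifact and not just its hash.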
A compute step's verification semantics depend on
whether the function and environment are resolvable to the verifier;
details in §3.2.
2.2.3 reason
A reason step records the application of a
non-deterministic model — characteristically a large language model, but
not exclusively so — to inputs derived from predecessor steps.
payload = {
"model": {
"identifier": <URI or string identifier>,
"weights_hash": <SHA-256 hash of weights, where available>,
"version": <string, where applicable>
},
"replay_class": <"R1" | "R2" | "R3">,
"invocation": <inline canonical reason-invocation object | content-
addressed reference { "uri": <URI>, "hash": <SHA-256> }>,
"invocation_hash": <SHA-256 of canonical encoding of invocation>,
"input_messages": <inline canonical input message sequence | content-
addressed reference { "uri": <URI>, "hash": <SHA-256> }>,
"input_messages_hash": <SHA-256 of canonical encoding of input messages>,
"tool_call_log_hash": <SHA-256 of canonical encoding of tool calls and
responses occurring during inference, if any;
absent for non-tool-calling steps>,
"visible_rationale_hash": <SHA-256 of any model-produced explanation,
rationale, reasoning summary, or analytic
narrative exposed to users or reviewers;
absent if not applicable>,
"finding_type": <"conclusion" | "no-finding" | "insufficient-evidence"
| "negative-result" | <profile-defined>>,
"output_hash": <SHA-256 of canonical output encoding>,
"output_artifact": <optional inline or by-reference output; required
when replay_class = R1 (see below)>,
"sampling": {
"temperature": <number>,
"seed": <integer | null>,
"top_p": <number, optional>,
"top_k": <integer, optional>,
...
},
"redaction_policy": <optional reference to a profile-defined redaction
policy applied before hashing>
}
If finding_type is omitted, the value defaults to
"conclusion". The protocol's structural treatment of
non-conclusion findings is specified in §5.5.
The reason invocation object — referenced inline or by
content address in payload.invocation — is the structured
record that determines the model's input:
reason_invocation = {
"model": <same as payload.model>,
"input_bindings": [
{ "name": <name>, "step": <id>, "output_hash": <hash> },
...
],
"input_messages_hash": <same as payload.input_messages_hash>,
"context_frame": {
"conditioned_on": [ <step id>, ... ]
},
"sampling": <same as payload.sampling>
}
invocation_hash is the SHA-256 of the canonical encoding
of this object. The input_bindings list corresponds to
derived-from predecessors; the
context_frame.conditioned_on list corresponds to
conditioned-on predecessors (see below). As with
compute, the verifier MUST be able to obtain both the
invocation object and the input messages, either inline or by resolving
content-addressed references, and MUST confirm that each resolves to an
object hashing to its declared hash.
A reason step MUST have at least one predecessor edge.
Predecessor edges MAY have relation derived-from (the
predecessor's output is part of the model's input — typically bound by
name into one of the input messages) or conditioned-on (the
predecessor is declared as evidentiary context relevant to the current
step but is not a direct input to the operation). A reason
step MUST NOT have predecessor edges of relation about. A
reason step MUST NOT have any derived-from
predecessor of type attest (see §3.0).
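A structural check for reason steps might look like the sketch below. It makes several assumptions: the edge shape `{"step": <id>, "relation": <string>}` is hypothetical, the invocation is inline, "corresponds to" is read as set equality, and the optional structured form of conditioned-on edges is ignored.

```python
ALLOWED_REASON_RELATIONS = {"derived-from", "conditioned-on"}


def check_reason_edges(step: dict, type_of: dict) -> list:
    """Return a list of violations of the reason predecessor rules."""
    errors = []
    edges = step["predecessors"]
    if not edges:
        errors.append("reason requires at least one predecessor edge")
    for edge in edges:
        if edge["relation"] not in ALLOWED_REASON_RELATIONS:
            errors.append("relation not permitted on reason: " + edge["relation"])

    derived = {e["step"] for e in edges if e["relation"] == "derived-from"}
    context = {e["step"] for e in edges if e["relation"] == "conditioned-on"}
    invocation = step["payload"]["invocation"]

    bound = {b["step"] for b in invocation["input_bindings"]}
    if derived != bound:
        errors.append("derived-from edges do not match input_bindings")
    declared = set(invocation["context_frame"]["conditioned_on"])
    if context != declared:
        errors.append("conditioned-on edges do not match context_frame")

    if any(type_of.get(step_id) == "attest" for step_id in derived):
        errors.append("reason must not derive from an attest step")
    return errors


# Hypothetical step identities, shortened for readability.
reason_types = {"cmp-1": "compute", "obs-2": "observe"}

good_reason = {
    "predecessors": [
        {"step": "cmp-1", "relation": "derived-from"},
        {"step": "obs-2", "relation": "conditioned-on"},
    ],
    "payload": {"invocation": {
        "input_bindings": [{"name": "table", "step": "cmp-1"}],
        "context_frame": {"conditioned_on": ["obs-2"]},
    }},
}
```

Note that the check covers only the declared structure; as the next paragraph explains, whether the conditioned-on steps actually influenced the operation is not mechanically verifiable.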
The verifier reconstructs the input from derived-from
predecessors and confirms the reconstruction hashes to
input_messages_hash (§3.2). The verifier does not
mechanically verify how, or whether, conditioned-on
predecessors affected the operation; their inclusion is an assertion by
the attestor that the listed steps are evidentiary context for the
current operation. This assertion is surfaced to human reviewers and to
compliance predicates (§5), but is not subject to replay verification.
The conditioned-on relation is therefore primarily a
reviewability and regulatability mechanism, not a replay mechanism.
Visible rationale as evidence. The
visible_rationale_hash field records any model-produced
explanation, rationale, reasoning summary, or analytic narrative exposed
to users or reviewers. Unlike output_hash (which records
the model's conclusion) and tool_call_log_hash (which
records mechanical tool interactions),
visible_rationale_hash records the model's
narrative of its own analysis as exposed for review. Recording
the rationale as content-addressed evidence — rather than as a soft
"explanation" field — is one of the central mechanisms by which PoI
extends provenance frameworks into agentic reasoning: the rationale is
hash-bound to the inputs that produced it, the operation that produced
it, and any downstream step that consumes it. A reviewer auditing a
reason step inspects the rationale; a verifier confirms the
rationale hashes correctly; a downstream attest step can be
made about the rationale specifically.
The protocol does not assert that the recorded rationale is a
faithful representation of the model's hidden reasoning processes. Many
systems do not expose their internal reasoning at all, and exposed text
is often a post-hoc summary or generated explanation rather than a trace
of the actual computation. The protocol records what was exposed for
review; it does not claim that what was exposed is the reasoning. The
evidence value of visible_rationale_hash is that it pins
the producer to the explanation it surfaced, not that the explanation is
mechanistically true.
Tool-call granularity. The
tool_call_log_hash records the canonical encoding of the
full sequence of tool calls and their responses during inference as a
single hash. The default granularity is therefore the entire tool-call
sequence as a unit. A profile MAY require expanded representation — for
example, by mandating that each tool call be recorded as its own
compute substep with explicit derived-from
edges, with the reason step then taking the substeps as
predecessors. The base protocol does not specify when this expansion is
required; profiles bind the decision based on their reviewability and
replay requirements. Expanded representation produces a strictly larger
and more inspectable proof at the cost of greater encoding overhead.
Replay classes. A reason step's
replay_class declares the strength of replay claim the
producer is making:
- R1: Recorded only. The model identifier and conditions are recorded, but the producer is not claiming the step can be re-executed by a verifier. output_artifact MUST be present so that reviewers can inspect what was actually produced. No replay is performed; verification consists of structural and cryptographic checks plus inspection of the recorded artifact.
- R2: Re-executable. The model is accessible to the verifier (by API, by hosted endpoint, or by other profile-recognized means) at a version matching model.version. Replay is permitted, but divergence between the replayed output and output_hash is expected to be possible and is not a verification failure (§3.3).
- R3: Reproducible. The model weights are content-addressed by model.weights_hash, the runtime, tokenizer, decoding parameters, and (where applicable) random seed are pinned in sampling and environment, and the producer claims that re-execution by any party with access to the weights will yield bit-identical output. A reason step claiming R3 is held to a stronger verification standard (§3.2) than R1 or R2.
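How the three classes differ in what a replay result means can be sketched as a small decision function. The status strings are hypothetical, and treating a mismatch under R3 as a failure is an assumption drawn from the bit-identical claim above; the normative R3 procedure is in §3.2, which this sketch does not reproduce.

```python
import hashlib
from typing import Optional


def assess_reason_replay(payload: dict, replayed: Optional[bytes] = None) -> str:
    """Interpret a replay attempt according to the declared replay_class."""
    replay_class = payload["replay_class"]

    if replay_class == "R1":
        # Recorded only: no replay; the artifact must be present for inspection.
        if "output_artifact" not in payload:
            return "fail: R1 requires output_artifact"
        return "recorded-only"

    if replay_class not in ("R2", "R3"):
        return "fail: unknown replay_class"
    if replayed is None:
        return "not-replayed"

    if hashlib.sha256(replayed).hexdigest() == payload["output_hash"]:
        return "replay-match"
    if replay_class == "R2":
        # Divergence under R2 is expected to be possible and is not a failure.
        return "replay-divergent"
    # Assumption: a producer claiming R3 has promised bit-identical output.
    return "fail: R3 replay diverged"


recorded_output = b"The endpoint was met."
recorded_hash = hashlib.sha256(recorded_output).hexdigest()

r1_payload = {"replay_class": "R1", "output_hash": recorded_hash,
              "output_artifact": "The endpoint was met."}
r2_payload = {"replay_class": "R2", "output_hash": recorded_hash}
r3_payload = {"replay_class": "R3", "output_hash": recorded_hash}
```

The same divergent replay therefore yields a non-failing status under R2 and a failing one under R3, which is the structural distinction the replay classes record.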
Aspirational status of R3. R3 is currently reachable for open-weight models executed on profile-recognized runtimes. Closed-weight hosted models — including most large language models accessed via commercial APIs as of this draft — cannot satisfy R3 because their weights are not content-addressable to the verifier. Producers building analyses on such models will operate at R2 by necessity. The protocol records the structural distinction so that, as the ecosystem of open-weight models accessible at the necessary scale develops, analyses can migrate from R2 to R3 without protocol changes. R3 is therefore present in the specification as a forward target as much as a present capability. Conformance levels (§5) account for this distinction directly: L4A (independently attested) admits R2 throughout; L4R (reproducible) requires R3 for designated high-stakes outputs.
The promotion of reason to a first-class step type —
distinct from compute, with its own predecessor relations,
its own replay regime, its own visible-rationale evidence field, and its
own verification semantics — is the principal way in which PoI extends
prior provenance frameworks.
2.2.4 attest
An attest step records a signed claim about one
or more predecessor steps. It introduces no new derived output and
produces no output_hash.
payload = {
"claim_type": <URI from registered claim-type vocabulary>,
"role": <attestor role identifier>,
"claim_body": <inline structured claim object | reference>,
"claim_hash": <SHA-256 of canonical encoding of claim_body>
}
An attest step MUST have at least one predecessor edge,
and every predecessor edge MUST have relation about. An
attest step MUST NOT have predecessor edges of any other
relation.
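The structural rules for attest steps can be sketched as follows. The edge shape `{"step": <id>, "relation": <string>}`, the claim-type URI, and the role name are hypothetical; the claim body is assumed to be inline; and `canonical_encoding` is a simplified stand-in for the encoding of §2.5.

```python
import hashlib
import json


def canonical_encoding(obj) -> bytes:
    """Simplified stand-in for RFC 8785 (JCS); not a complete implementation."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def check_attest(step: dict) -> list:
    """Return a list of violations of the attest structural rules."""
    errors = []
    edges = step["predecessors"]
    if not edges:
        errors.append("attest requires at least one predecessor edge")
    if any(edge["relation"] != "about" for edge in edges):
        errors.append("attest edges must all have relation about")

    payload = step["payload"]
    if "output_hash" in payload:
        errors.append("attest produces no output_hash")
    expected = hashlib.sha256(canonical_encoding(payload["claim_body"])).hexdigest()
    if payload["claim_hash"] != expected:
        errors.append("claim_hash does not match claim_body")
    return errors


# Hypothetical review claim about a single reason step.
claim_body = {"decision": "approve", "comment": "Reviewed against the plan."}
attest_step = {
    "predecessors": [{"step": "rsn-1", "relation": "about"}],
    "payload": {
        "claim_type": "https://example.org/claims/review/approve",
        "role": "qualified-reviewer",
        "claim_body": claim_body,
        "claim_hash": hashlib.sha256(canonical_encoding(claim_body)).hexdigest(),
    },
}
```

Whether the role is actually authorized to make the claim depends on the claim-type definition under the applicable profile and is not checked here.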
attest steps unify several concepts that are commonly
designed as separate features. The vocabulary of claim_type
URIs is profile-defined and extensible. The base protocol mandates only
that claim_type resolve to a definition specifying the
structure of claim_body and the roles authorized to make
the claim. Claim types defined or anticipated for profile-level
registration include:
- Review and approval. review/approve, review/conditional, review/reject by qualified-reviewer roles, used to record human review of reason or compute steps.
- Independent validation. validation/replay-confirmed, validation/output-confirmed by independent-validator roles.
- Qualification. qualification/data-quality, qualification/vendor-status by data-provider or vendor-qualification roles.
- Prespecification. prespecification/locked-plan, prespecification/locked-protocol, prespecification/locked-charter by analysis-plan-author, biostatistician, model-owner, or analogous roles. The claim_body identifies a pre-locked external artifact (statistical analysis plan, study protocol, model risk management policy, charter) and includes its content hash, the timestamp at which it was locked, and the signatures of the authorizing parties at lock time. A prespecification attestation records that a downstream compute or reason step was performed against a plan that existed and was locked before the data underlying the step was unblinded or otherwise observable to the analyst. Prespecification is the protocol-level mechanism by which a reviewer can distinguish between "this analysis was prespecified" and "this analysis was constructed to fit the observed result." The protocol provides the binding; the regulatory significance of that binding is a profile concern (§7.4).
- Supersession. supersession/retract and supersession/replace as defined in §5.4.
- Adequacy and conditions. adequacy/finding-confirmed, adequacy/finding-disputed, recording independent review of a reason step's finding_type determination (§5.5).
This list is illustrative. The set of registered claim types is the protocol's primary extensibility surface; profiles add or refine claim types without modifying the base step taxonomy or verification algorithm.
Manifest-level attestations. A producer or external
party MAY wish to attest about a proof manifest as a whole rather than
about any single Insight Step within it. The protocol does not currently
express such attestations as Insight Steps, because an
attest step's about edge must reference a step
identity present in the proof DAG (§2.3, §3.1), and the manifest is
explicitly not an Insight Step (§2.7). Manifest-level attestations are
therefore separately-signed sibling objects to the manifest itself,
structurally analogous to the manifest signature. Their canonical form
is forward work for v0.7; in v0.6.2, a producer requiring manifest-level
attestation SHOULD instead issue an attest step about a
designated output step that uniquely identifies the analysis.
2.3 Edge relations
Every entry in a step's predecessors array is an edge.
The protocol admits two equivalent surface forms for edges. The compact
form is a two-field object:
edge = {
  "step": <SHA-256 identity of predecessor step>,
  "relation": "derived-from" | "conditioned-on" | "about"
}
The extended form, used only with conditioned-on edges,
optionally pins the role and a declared-relevance hash of the contextual
relation:
edge = {
  "step": <SHA-256 identity of predecessor step>,
  "relation": "conditioned-on",
  "context_role": "policy" | "prior-finding" |
                  "review-context" | "analysis-plan" | "other"
                  | <profile-defined string>,
  "declared_relevance_hash": <SHA-256 of a short producer-supplied statement
                              of why the predecessor is relevant context>
}
The extended form is OPTIONAL in the base protocol. Profiles MAY
require it for conditioned-on edges within their regulated
scope. When the extended form is used, the verifier MUST canonically
encode the full edge object (including the extended fields) when
computing the step identity. The extended fields are not subject to
replay verification: they discipline what the producer asserts about
the contextual relation; they do not establish operational
influence.
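A minimal sketch of how a verifier might classify an edge as compact or extended follows. It assumes, beyond what the text states, that either extended field may appear on its own, that the relevance statement is hashed as UTF-8 bytes, and that hashes are hex strings; the function names are hypothetical.

```python
import hashlib

RELATIONS = {"derived-from", "conditioned-on", "about"}
EXTENDED_FIELDS = {"context_role", "declared_relevance_hash"}


def relevance_hash(statement: str) -> str:
    # SHA-256 of the producer-supplied relevance statement (UTF-8 assumed).
    return hashlib.sha256(statement.encode("utf-8")).hexdigest()


def edge_form(edge: dict) -> str:
    """Return "compact" or "extended"; raise ValueError if the edge is malformed."""
    keys = set(edge)
    if not {"step", "relation"} <= keys:
        raise ValueError("edge needs both step and relation")
    if edge["relation"] not in RELATIONS:
        raise ValueError("unknown relation")
    extra = keys - {"step", "relation"}
    if not extra:
        return "compact"
    # Extended fields are admitted only on conditioned-on edges.
    if edge["relation"] != "conditioned-on":
        raise ValueError("extended fields are only allowed on conditioned-on edges")
    if not extra <= EXTENDED_FIELDS:
        raise ValueError("unknown edge field")
    return "extended"


extended_edge = {
    "step": "ab" * 32,
    "relation": "conditioned-on",
    "context_role": "policy",
    "declared_relevance_hash": relevance_hash("governing charter for this analysis"),
}
```

The sketch deliberately does not validate `context_role` against a fixed set, since profiles may define their own role strings.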
The permitted relations per step type are summarized below. A proof in which a step has a predecessor edge with a relation not permitted for its type is malformed and MUST fail verification.
| Step type | derived-from | conditioned-on | about | Min predecessors |
|---|---|---|---|---|
| observe | not permitted | not permitted | not permitted | 0 |
| compute | required | not permitted | not permitted | ≥1 |
| reason | permitted | permitted | not permitted | ≥1 |
| attest | not permitted | not permitted | required | ≥1 |
The semantics of each relation are:
- derived-from: the predecessor's output is part of the input. For a compute step, this means the predecessor's output is consumed by the function. For a reason step, this means the predecessor's output is part of the prompt or system input. A derived-from predecessor MUST be a step type for which STEP_RESOLVE (§3.0) is defined; it MUST NOT be an attest step.
- conditioned-on: the predecessor is declared as evidentiary context relevant to the current step but is not a direct input to the operation. Common cases include a prior reason step whose conclusion frames the current step's question through producer-side workflow logic, or a policy or charter asserted as governing the analysis but not literally bound into the model input. The protocol records conditioned-on predecessors as part of the invocation but does not establish how, or whether, they affected the operation; their inclusion is an attestor assertion that may be surfaced for human review or compliance predicates. Where the extended form is used, the assertion is further pinned by context_role and declared_relevance_hash, giving reviewers a producer-supplied basis for the contextual relation without making the relation replay-verifiable. Distinguishing conditioned-on from derived-from matters because the verifier's replay procedure differs (§3.2).
- about: the current step asserts something regarding the predecessor. The current step does not depend on the predecessor's output for its own derivation; it claims something about it.
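The permitted-relations table can be encoded directly as data. The sketch below is one possible reading, not a normative rule: it interprets "required" as "at least one edge of that relation must be present", and the `RULES` table and `relations_ok` name are hypothetical.

```python
RULES = {
    # step type: (permitted relations, required relations, minimum predecessors)
    "observe": (set(), set(), 0),
    "compute": ({"derived-from"}, {"derived-from"}, 1),
    "reason": ({"derived-from", "conditioned-on"}, set(), 1),
    "attest": ({"about"}, {"about"}, 1),
}


def relations_ok(step_type: str, relations: list) -> bool:
    """Check the relations of a step's predecessor edges against the table."""
    permitted, required, minimum = RULES[step_type]
    present = set(relations)
    return (len(relations) >= minimum      # enough predecessors
            and present <= permitted       # nothing outside the permitted set
            and required <= present)       # every required relation appears
```

For example, `relations_ok("compute", ["conditioned-on"])` is false, because compute steps admit only `derived-from` edges.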
2.4 Timestamp binding
Every step MUST carry a timestamp field of the following
form:
timestamp = {
  "value": <RFC 3339 timestamp>,
  "authority": <URI identifying the timestamp authority>,
  "token": <opaque token from the timestamp authority>
}
The timestamp authority's binding mechanism is profile-defined. A
profile MUST specify how a verifier confirms that token
represents a valid attestation by authority that
to_timestamp(step) — the canonical encoding of fields 1–6,
which includes signature and excludes only the
timestamp field itself — existed at value. The
timestamp therefore covers the signed step content (including the
attestor signature), giving the property that the signed evidence is
what is timestamped, not merely the unsigned content.
PoI does not mandate any single timestamp authority. Acceptable authority types include but are not limited to: RFC 3161 timestamping services, Sigstore Rekor inclusion proofs, public-blockchain anchors, and notarial attestations. Profiles bind specific authority types and specify their verification procedures (§7).
A timestamp on a step MUST NOT precede the timestamps of any of its predecessors. A verifier MUST check this ordering as part of structural validation (§3.1).
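The ordering rule can be sketched as below. The sketch compares only the `value` fields and does not verify the authority token, which is profile-defined. The in-memory `steps` mapping and the function names are assumptions made for illustration, and all timestamps are assumed to carry a UTC offset.

```python
from datetime import datetime


def parse_rfc3339(value: str) -> datetime:
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit offset first.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def timestamp_inversions(steps: dict) -> list:
    """Return (step, predecessor) pairs where the predecessor is timestamped later.

    `steps` maps a step identity to a record carrying `timestamp.value` and a
    `predecessors` list of edge objects (§2.3).
    """
    bad = []
    for sid, step in steps.items():
        t = parse_rfc3339(step["timestamp"]["value"])
        for edge in step["predecessors"]:
            pred = steps[edge["step"]]
            if parse_rfc3339(pred["timestamp"]["value"]) > t:
                bad.append((sid, edge["step"]))
    return bad
```

Equal timestamps are accepted, since the rule only forbids a step from preceding its predecessors.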
2.5 Canonical encoding and identity
A step's canonical encoding is the JSON representation of the step record using RFC 8785 (JSON Canonicalization Scheme). All hashes referenced in this document are computed over the canonical encoding of the relevant object.
A step's identity is computed as:
id(step) = SHA-256( canonical_encoding(step) )
where canonical_encoding(step) includes all fields,
including signature. The identity is determined only after
the attestor has signed the step.
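A minimal sketch of the identity computation follows. The `canonical` helper is only an approximation of RFC 8785 (it does not reproduce JCS number formatting or key ordering for non-ASCII keys), the hex rendering of the digest is an assumption, and the step record shown is a hypothetical, heavily simplified placeholder.

```python
import hashlib
import json


def canonical(obj) -> bytes:
    # Stand-in for RFC 8785 (JCS): sorted keys, no insignificant whitespace.
    # Use a dedicated JCS library in a real verifier.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def step_id(step: dict) -> str:
    # id(step) = SHA-256(canonical_encoding(step)), rendered here as hex.
    return hashlib.sha256(canonical(step)).hexdigest()


# Hypothetical signed step; field values are placeholders, not real data.
step = {
    "version": "0.6.2",
    "type": "observe",
    "predecessors": [],
    "payload": {"content_hash": "00" * 32},
    "signature": "placeholder-signature",
}
reordered = dict(reversed(list(step.items())))   # same fields, different key order
tampered = {**step, "payload": {"content_hash": "11" * 32}}
```

Because the encoding is canonical, `step` and `reordered` share an identity, while `tampered` does not.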
2.6 Construction-time invariants
A step is well-formed if and only if:
1. Its type is one of the four defined values.
2. Its predecessors array satisfies the relation constraints of §2.3 for its type, including the prohibition on attest steps appearing as derived-from predecessors. Edges of relation conditioned-on MAY use the extended form (§2.3); edges of other relations MUST use the compact form.
3. Its payload matches the schema for its type (§2.2), including the presence of invocation (and, for reason, input_messages) either inline or as a content-addressed reference.
4. Its timestamp is structurally valid and does not precede any predecessor's timestamp.
5. Its signature verifies against the public key bound to attestor.
6. Each predecessor referenced exists in the proof, and the proof is acyclic when traversed via predecessors edges.
A proof is well-formed if and only if every step in it is well-formed and condition (6) holds globally. Well-formedness is necessary but not sufficient for verification; verification additionally applies type-specific checks and conformance-level checks (§3, §5).
2.7 The proof manifest
A PoI proof is not transmitted or verified as a bare collection of steps. It is carried alongside a proof manifest, a top-level object that names the steps comprising the proof, identifies the output steps whose outputs are claimed as conclusions, declares the conformance level being asserted, and binds the proof to one or more profiles under which it is to be verified.
manifest = {
  "manifest_version": "0.6.2",
  "proof_id": <UUID or content-derived identifier>,
  "steps": [ <step identity>, ... ],
  "outputs": [ <step identity>, ... ],
  "conformance_claim": "L1" | "L2" | "L3" | "L4A" | "L4R",
  "verification_basis": <optional> "replay-verifiable" |
                                   "linkage-verifiable-only" |
                                   "resolution-limited",
  "profiles": [ <profile URI>, ... ],
  "manifest_attestor": <URI identifying the producer claiming this proof>,
  "manifest_signature": <signature over canonical encoding of preceding fields>
}
The manifest's outputs field replaces the parameter
O in §3.1's structural-validation algorithm. Each identity
in outputs MUST identify a step of type
compute or reason unless a profile explicitly
permits otherwise. observe steps carry external data rather
than derived output; attest steps carry claims rather than
output (§3.0). Permitting them as outputs without profile authorization
would create ambiguity in conformance evaluation.
The manifest is signed by the producer asserting the conformance
claim. The manifest signature is structurally identical to a step
signature but is not itself an Insight Step — it carries no
step type, no predecessor edges, and does not appear in the proof DAG.
This separation resolves an issue that would otherwise arise: a
conformance claim made via an attest step inside the proof
would require the proof to contain attest steps even when
claiming a level (L1) that prohibits them. The manifest sits outside the
typed-step DAG specifically to avoid this circularity.
The handling of attestations about the manifest (rather than about individual steps) is specified in §2.2.4 under "Manifest-level attestations."
Verification basis. The OPTIONAL
verification_basis field records the producer's claim about
what gate of verification the proof supports:
- replay-verifiable: the producer asserts that every compute step is replayable under its declared regime and every reason step is verifiable under its declared replay class, given the profile-defined resolution services. A verifier with full resolution access SHOULD be able to perform gates 1–5 (Appendix A).
- linkage-verifiable-only: the producer asserts that the proof supports gates 1–3 only (schema, structural, cryptographic), and does not undertake to make functions, models, or weights resolvable to the verifier. Replay is not claimed.
- resolution-limited: the producer asserts a mixed state (some steps support replay, others do not) and acknowledges that the achievable verification basis depends on the verifier's resolution capabilities and on which artifacts are supplied alongside the proof.
The field is informational at the base protocol level. Profiles MAY
require a specific verification_basis value as a
precondition for use of the proof in their regulated scope. A verifier
MUST report, alongside its accept/reject decision, the verification
basis it actually achieved (which may be weaker than the claimed basis
if resolution services are unavailable). When the achieved basis is
weaker than the claimed basis, the verifier MUST identify the steps for
which the gap arose.
The absence of verification_basis is permitted for
v0.6.2 compatibility with v0.6.1 proofs; in that case the verifier
treats the basis as unspecified and reports its achieved basis without
comparison.
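The reporting obligation might be sketched as follows. Two things here are assumptions rather than protocol text: the strength ordering (linkage-verifiable-only below resolution-limited below replay-verifiable) and the rule that a mixed per-step result is reported as resolution-limited. The function names and the per-step boolean input are hypothetical.

```python
# Assumed ordering from weakest to strongest; the draft does not define one.
STRENGTH = {
    "linkage-verifiable-only": 0,
    "resolution-limited": 1,
    "replay-verifiable": 2,
}


def achieved_basis(replayed: dict) -> str:
    """`replayed` maps step identity -> True if replay verification succeeded."""
    if all(replayed.values()):
        return "replay-verifiable"
    if not any(replayed.values()):
        return "linkage-verifiable-only"
    return "resolution-limited"


def basis_report(claimed, replayed: dict) -> dict:
    """Report the achieved basis and, if it is weaker than claimed, the gap steps."""
    achieved = achieved_basis(replayed)
    report = {"claimed": claimed, "achieved": achieved, "gap_steps": []}
    # With no claim (v0.6.1 proofs), report the achieved basis without comparison.
    if claimed is not None and STRENGTH[achieved] < STRENGTH[claimed]:
        report["gap_steps"] = sorted(s for s, ok in replayed.items() if not ok)
    return report
```

A real report would also carry the per-step reason for each gap, for example the diagnostics recorded during type-specific verification (§3.2).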
3. Verification
This section specifies the verification algorithm. The algorithm is a
single procedure parameterized by the conformance level being checked.
For a fixed proof, fixed manifest, fixed profile set, fixed external
artifact set, fixed trust roots, and fixed replay configuration,
compliant verifier implementations MUST produce the same accept/reject
decision; verifiers MAY differ in the diagnostics they emit on
rejection. Where replay configurations differ (for example, where one
verifier has access to a model that another does not), the resulting
decisions MAY differ in well-defined ways; the protocol distinguishes
between failures that arise from the proof itself and failures that
arise from verifier-side resolution limits, and each FAIL
diagnostic identifies its source.
A verifier MUST report, alongside its accept/reject decision, the
verification basis (§2.7) it actually achieved. Where the manifest
carries a verification_basis claim and the achieved basis
is weaker, the verifier MUST identify the gap (which steps did not reach
replay-verifiable status, and why).
3.0 The STEP_RESOLVE operation
The verification algorithm uses an internal operation
STEP_RESOLVE(s), returning the "output" of a step
s. Its behavior is defined per step type:
- observe: returns the observed artifact, identified by payload.content_hash. If the artifact itself is not supplied to the verifier, STEP_RESOLVE returns content_hash as an opaque reference; downstream linkage checks confirm hash equality without requiring artifact retrieval. The verifier MUST track which form was returned for downstream diagnostics.
- compute: returns the output, identified by payload.output_hash. The output bytes are retrievable from payload.output_artifact if present; otherwise from replay if replay is enabled and succeeds; otherwise STEP_RESOLVE returns output_hash as an opaque reference.
- reason: returns the output, identified by payload.output_hash. The output bytes are retrievable from payload.output_artifact if present (REQUIRED at R1, RECOMMENDED at R2 and R3 when the proof is intended to be reviewable without replay). Otherwise STEP_RESOLVE returns output_hash as an opaque reference.
- attest: STEP_RESOLVE is undefined. attest steps record claims, not derived outputs. An attest step MUST NOT appear as a derived-from predecessor of any step (§2.3, §2.6). A proof violating this constraint MUST fail well-formedness.
A step "resolves to bytes" if STEP_RESOLVE returns
concrete content rather than an opaque hash reference. Replay
verification (§3.2) requires that all derived-from
predecessors of the step under examination resolve to bytes; where any
of them resolves only to an opaque reference, the step is verifiable
only in the weaker sense of structural, signature, and linkage validity,
without replay. The verifier records which sense applied per step for
the verification-basis report (§2.7).
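A simplified sketch of STEP_RESOLVE follows. It omits the replay fallback for compute steps and models supplied artifacts as a hypothetical mapping from hex digest to bytes. The `Resolved` type and the `artifacts` store are illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resolved:
    hash: str
    content: Optional[bytes]  # None means an opaque hash reference

    @property
    def resolves_to_bytes(self) -> bool:
        return self.content is not None


def step_resolve(step: dict, artifacts: dict) -> Resolved:
    """Resolve a step's output, or its hash alone if no artifact is supplied."""
    kind = step["type"]
    if kind == "attest":
        raise ValueError("STEP_RESOLVE is undefined for attest steps")
    # observe steps are identified by content_hash, compute/reason by output_hash.
    field = "content_hash" if kind == "observe" else "output_hash"
    digest = step["payload"][field]
    content = artifacts.get(digest)
    if content is not None and hashlib.sha256(content).hexdigest() != digest:
        raise ValueError("supplied artifact does not match recorded hash")
    return Resolved(digest, content)
```

Returning a single type with an optional `content` field lets the verifier track, per step, which form was returned.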
3.1 Structural validation
Given a proof manifest M (§2.7) and the set of steps
P it references, the verifier first validates the manifest
itself and then the proof DAG:
STRUCTURAL_VALIDATE(M, P):
  0. Verify M.manifest_signature against the public key bound to
     M.manifest_attestor. On failure: FAIL("manifest signature invalid").
     Confirm that the set of step identities in M.steps equals the
     set of identities computed for P. On mismatch:
     FAIL("manifest does not describe proof").
     Confirm that every identity in M.outputs is present in M.steps.
     On mismatch: FAIL("output not in proof").
     Confirm that every output step is of type `compute` or `reason`,
     except where the named profiles explicitly permit other types as
     outputs. On violation: FAIL("output of impermissible type").
  1. For each step s ∈ P:
       if not WELL_FORMED(s): return FAIL("step ill-formed", s)
  2. Build the predecessor graph G over P. If G contains a cycle:
       return FAIL("proof contains cycle")
  3. For each edge (s → s') in G:
       if s' ∉ P: return FAIL("dangling predecessor", s, s')
       if timestamp(s') > timestamp(s):
         return FAIL("timestamp inversion", s, s')
       if relation = "derived-from" and type(s') = "attest":
         return FAIL("attest cannot be derived-from",
                     s, s')
  4. Let O = M.outputs.
  5. Compute the structural ancestor closure A = transitive closure of
     predecessors of O. This closure includes all steps regardless of
     supersession status; the protocol does not erase the historical
     derivation of any output. Steps in P \ A are "unreached." A
     verifier MAY emit a warning for these but MUST NOT reject on
     this basis.
  6. Compute the effective ancestor closure A* = A minus any step
     superseded by an attest of claim_type "supersession/retract" or
     "supersession/replace" (§5.4). A* is the closure over which
     conformance checks of §5 are evaluated.
  7. For each output step o ∈ O: if any step in the structural
     ancestor closure of o is superseded but o itself is not
     superseded, return FAIL("output derived from superseded ancestor
     not itself superseded", o). A producer correcting an analysis
     MUST supersede both the retracted predecessor and any downstream
     output that depends on it (or replace the output with a non-
     superseded successor); see §5.4.
  8. Return PASS.
WELL_FORMED(s) is the conjunction of conditions 1–5 of
§2.6 (condition 6 is checked globally in steps 2–3 above).
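Steps 2 and 5 of the algorithm above can be sketched in Python as follows. The sketch assumes dangling predecessors have already been rejected, represents the graph as a hypothetical mapping from step identity to a list of predecessor identities, and returns strict ancestors only (the output steps themselves are not included in the closure, which is one reading of step 5).

```python
from collections import deque


def topological_order(preds: dict):
    """Kahn's algorithm. Returns an order with predecessors first,
    or None if the predecessor graph contains a cycle."""
    remaining = {s: len(ps) for s, ps in preds.items()}  # unprocessed predecessors
    successors = {s: [] for s in preds}
    for s, ps in preds.items():
        for p in ps:
            successors[p].append(s)
    queue = deque(s for s, count in remaining.items() if count == 0)
    order = []
    while queue:
        s = queue.popleft()
        order.append(s)
        for nxt in successors[s]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                queue.append(nxt)
    # Any step never released from the queue lies on or behind a cycle.
    return order if len(order) == len(preds) else None


def ancestor_closure(preds: dict, outputs: list) -> set:
    """Strict ancestors of the output steps, by backward traversal."""
    seen, stack = set(), list(outputs)
    while stack:
        s = stack.pop()
        for p in preds[s]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen
```

The returned order is also the traversal order that type-specific verification (§3.2) needs, so a verifier can compute it once and reuse it.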
3.2 Type-specific verification
Given a structurally valid proof, the verifier walks the steps in topological order (predecessors before successors) and applies type-specific checks.
TYPE_VERIFY(s):
  case s.type of:
    observe:
      a. If an external artifact for s is supplied, recompute its SHA-256
         and compare with payload.content_hash. On mismatch: FAIL.
      b. Resolve s.attestor under the profile's authority model. If the
         attestor is not authorized to make observations from
         payload.source: FAIL.
      c. If payload.provenance references an external attestation,
         verify it using the relevant external verifier. On failure: FAIL.
    compute:
      a. Obtain the canonical invocation object: either from
         s.payload.invocation inline, or by resolving s.payload.invocation
         as a content-addressed reference (profile-defined resolution).
         Compute SHA-256 of its canonical encoding. If the result does
         not equal s.payload.invocation_hash: FAIL.
      b. Confirm that the invocation object's `inputs` list, when matched
         by step identity, equals the set of `derived-from` predecessors
         of s. For each entry (name, step, output_hash) in inputs,
         confirm that output_hash matches the predecessor's recorded
         output_hash. On any mismatch: FAIL.
      c. Resolve s.payload.function. If the function is not resolvable,
         replay is impossible and the step is verifiable in a weaker
         sense only: it remains structurally, signature-, and linkage-
         verified, but no replay claim is made. The verifier records
         this with diagnostic `compute: function-unresolvable` and
         accounts for the step under linkage-verifiable-only basis.
      d. If replay is enabled, the function is resolvable, and every
         `derived-from` predecessor resolves to bytes:
         - If environment.replay_regime = "bit-identical":
             re-execute the function on the resolved inputs in an
             environment matching s.payload.environment. Compute
             SHA-256 of the canonical replayed output. If it does not
             equal s.payload.output_hash: FAIL.
         - If environment.replay_regime = "tolerance":
             s.payload.output_artifact MUST be present. Re-execute
             the function on the resolved inputs. Apply the
             equivalence predicate named in s.payload.environment to
             the pair (replayed output, recorded output_artifact).
             If the predicate returns false: FAIL. The verifier MAY
             additionally hash the recorded output_artifact and
             confirm it matches output_hash; this confirms the
             artifact has not been tampered with relative to the
             signed step but does not by itself verify replay.
    reason:
      a. Obtain the canonical reason-invocation object from
         s.payload.invocation (inline or by resolving the reference).
         Compute SHA-256 of its canonical encoding. If it does not
         equal s.payload.invocation_hash: FAIL.
      b. Confirm that the invocation object's input_bindings list, when
         matched by step identity, equals the set of `derived-from`
         predecessors of s, and that its context_frame.conditioned_on
         list equals the set of `conditioned-on` predecessors of s. For
         each input binding (name, step, output_hash), confirm that
         output_hash matches the predecessor's recorded output_hash.
         On any mismatch: FAIL.
      c. Obtain the canonical input message sequence from
         s.payload.input_messages (inline or by resolving the
         reference). Compute SHA-256 of its canonical encoding. If it
         does not equal s.payload.input_messages_hash: FAIL.
      d. (Optional, recommended.) Reconstruct the input messages from
         the resolved `derived-from` predecessor outputs using the
         binding names recorded in the invocation, and confirm the
         reconstruction is consistent with the resolved
         input_messages. The base protocol does not normatively
         specify a single reconstruction rule because the templating
         and message-assembly conventions are model- and producer-
         specific; profiles MAY require a specific reconstruction
         predicate.
      e. Dispatch on s.payload.replay_class:
         - R1 (recorded only):
             s.payload.output_artifact MUST be present. Hash the
             artifact and confirm it matches s.payload.output_hash.
             No re-execution is performed. Record the step as
             `reason-class: R1, replay: not-attempted` and
             contributing linkage-verifiable basis only.
         - R2 (re-executable):
             If replay is enabled and s.payload.model is resolvable
             at the recorded version, re-execute under
             s.payload.sampling. Compute SHA-256 of the canonical
             replayed output and compare to s.payload.output_hash:
             - Match: record `reason-class: R2, replay: stable`.
             - Mismatch: record `reason-class: R2, replay:
               divergent` together with the replayed output hash.
               This is NOT a verification failure (§3.3).
             If model is unresolvable, record `reason-class: R2,
             replay: model-unavailable` and contribute linkage-
             verifiable basis only.
         - R3 (reproducible):
             s.payload.model.weights_hash MUST be present. The
             verifier MUST resolve weights by hash; if unable, FAIL
             with `reason-class: R3, replay: weights-unavailable`.
             Re-execute under the fully pinned conditions. Compute
             SHA-256 of the canonical replayed output. If it does
             not equal s.payload.output_hash: FAIL. (At R3,
             divergence IS a verification failure, because R3
             asserts reproducibility.)
    attest:
      a. Resolve all `about` predecessors. Confirm each is present in
         the proof. (Ill-formed proofs would have failed at step 1
         of STRUCTURAL_VALIDATE; this is a defensive check.)
      b. Resolve s.attestor and confirm the attestor's role
         (s.payload.role) is authorized, under the profile's
         authority model, to make claims of type s.payload.claim_type
         about steps of the predecessor's type. On unauthorized: FAIL.
      c. Confirm s.payload.claim_hash matches the canonical encoding
         of s.payload.claim_body. On mismatch: FAIL.
A note on the semantics of "hash match." For SHA-256 digests, comparison is exact: digests match or they do not. There is no notion of hash equality "within tolerance." Where tolerance replay is required (numerical equivalence under non-deterministic floating-point reductions, for example), the protocol carries the recorded output as an artifact and applies a profile-defined equivalence predicate to the outputs, not to the hashes. This separation — exact comparison for hashes, predicate comparison for outputs — is the protocol's way of admitting numerical practicality without compromising the integrity guarantees of content-addressing.
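The separation between exact hash comparison and predicate comparison can be illustrated with a small Python sketch. The `within_tolerance` predicate, its relative tolerance, and the use of `repr` as an output encoding are hypothetical stand-ins for whatever a profile would actually define.

```python
import hashlib
import math


def digests_match(a: bytes, b: bytes) -> bool:
    # Exact comparison: SHA-256 digests are equal or they are not.
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()


def within_tolerance(replayed: list, recorded: list, rel_tol: float = 1e-9) -> bool:
    # Hypothetical equivalence predicate applied to outputs, never to hashes.
    return len(replayed) == len(recorded) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(replayed, recorded))


recorded = [0.1 + 0.2, 1.0]   # carries floating-point rounding error
replayed = [0.3, 1.0]         # numerically equivalent, different bits
```

The two outputs encode to different bytes, so their digests differ, yet the predicate accepts them as equivalent. This is why tolerance replay needs the recorded output artifact rather than only its hash.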
3.3 The semantics of reason divergence
A reason step that re-executes to a different output
than its recorded output_hash is divergent. Divergence is
information, not a verdict. The protocol commits to recording
divergence; what to do with the record is delegated to the conformance
regime and to human reviewers.
This delegation is deliberate. Possible meanings of divergence include:
- Stochasticity within tolerance. The model is non-deterministic by design and the divergent output expresses substantively the same conclusion. Whether two outputs express the same conclusion is generally not mechanically decidable.
- Material change in conclusion. The divergent output expresses a conclusion that, if it had been recorded originally, would have led to a different downstream analysis. This is a finding for human review and may trigger a supersession (§2.2.4, §5.4).
- Model drift or unavailability. The model identifier resolves to a model whose behavior has changed since the original step was produced. This is a finding for the producer's model-management process.
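The asymmetry between R2 and R3 can be sketched as a small dispatch function. It assumes the earlier checks of §3.2 have already passed and treats a missing replayed hash as "model or weights unavailable". The R1 and R2 diagnostic strings and the R3 weights-unavailable string follow §3.2; the other R3 strings and the function name are hypothetical.

```python
def replay_outcome(replay_class: str, recorded_hash: str, replayed_hash=None):
    """Return (verdict, diagnostic) for a reason step.

    `replayed_hash` is None when no re-execution took place.
    """
    if replay_class == "R1":
        return "PASS", "reason-class: R1, replay: not-attempted"
    if replay_class == "R2":
        if replayed_hash is None:
            return "PASS", "reason-class: R2, replay: model-unavailable"
        status = "stable" if replayed_hash == recorded_hash else "divergent"
        # Divergence at R2 is recorded as information, not as a failure.
        return "PASS", f"reason-class: R2, replay: {status}"
    if replay_class == "R3":
        if replayed_hash is None:
            return "FAIL", "reason-class: R3, replay: weights-unavailable"
        if replayed_hash != recorded_hash:
            # R3 asserts reproducibility, so divergence is a failure.
            return "FAIL", "reason-class: R3, replay: divergent"  # hypothetical string
        return "PASS", "reason-class: R3, replay: reproduced"     # hypothetical string
    raise ValueError(f"unknown replay class: {replay_class}")
```

Deciding which of the meanings listed above applies to a recorded R2 divergence remains a matter for the conformance regime and human reviewers, not for this function.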
A future version of this protocol may specify a richer semantics for
divergence — for example, by typing each reason step's
output_hash under one or more semirings whose
composition rules determine which downstream conclusions remain valid
under which divergence conditions. Specifying this semantics is out of
scope for v0.6.2.
3.4 Complexity
For a proof with n steps and e predecessor edges, the graph-theoretic
portion of verification (well-formedness checks, cycle detection by
Kahn's algorithm, ancestor closure computation, supersession analysis)
runs in O(n + e) time and space, which is linear in the size of the
proof. Total verification cost is dominated by the additive cost of
cryptographic operations (signature and timestamp verification, whose
cost per step does not depend on n), external resolution (function and
model retrieval, content-addressed reference resolution, external
provenance verification), artifact retrieval, replay execution itself
(which for reason steps may be the dominant term), and profile
predicate evaluation. The structural claim is that the graph-traversal
cost is linear and tractable; the operational cost is dominated by terms
outside the graph itself, which scale with the producer's resource
decisions rather than with proof size.
This complexity bound is the basis for the claim that the graph-theoretic portion of PoI verification is decidable and tractable for proofs of practical size.
4. Properties (Derived)
This section derives the properties commonly demanded of regulated evidence — traceability, reproducibility, reviewability, robustness, regulatability — as consequences of the protocol specified in §2 and §3. Each property is stated as a proposition followed by a brief justification. None of these properties is a separate protocol feature; each follows from the construction of the Insight Step and the verification algorithm.
4.1 Traceability
For every step s not of type observe,
every input to s can be traced, by transitive traversal of
predecessors edges, back to one or more observe
steps.
This holds by induction on the predecessor graph: every
non-observe step has at least one predecessor (§2.3), and
the graph is finite and acyclic (§2.6), so every backward path
terminates at a step with no predecessors, which can only be an
observe step. Traceability is therefore not an external
property; it is a structural invariant of any well-formed proof.
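As a sketch of how this invariant can be exercised, the function below walks predecessor edges backward from a step and collects the observe steps it reaches. The simplified `steps` mapping (identity to type and predecessor identities) and the function name are assumptions for illustration.

```python
def observe_roots(steps: dict, start: str) -> set:
    """Collect the observe steps reachable backward from `start`.

    `steps` maps a step identity to {"type": ..., "predecessors": [ids]}.
    """
    roots, seen, stack = set(), {start}, [start]
    while stack:
        s = stack.pop()
        if steps[s]["type"] == "observe":
            roots.add(s)
        for p in steps[s]["predecessors"]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return roots


steps = {
    "o1": {"type": "observe", "predecessors": []},
    "o2": {"type": "observe", "predecessors": []},
    "c": {"type": "compute", "predecessors": ["o1"]},
    "r": {"type": "reason", "predecessors": ["c", "o2"]},
}
```

In a well-formed proof the result is non-empty for every step; an empty result for a non-observe step would indicate a violation of §2.3 or §2.6.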
4.2 Reproducibility
Every compute step is replayable under its declared
replay regime — bit-identical, or output-equivalent under a
profile-defined predicate. Every reason step is verifiable
under its declared replay class — R1 (recorded only), R2 (re-executable
with possible divergence), or R3 (reproducible bit-identically against
pinned weights).
This follows directly from §2.2.2 and §2.2.3. The protocol distinguishes deterministic and non-deterministic operations by step type, and within each, distinguishes the strength of the reproducibility claim being made. Reproducibility is not a single property but a regime-and-class lattice; the conformance levels of §5 select which regions of the lattice are admissible for which classes of analysis. Per §2.2.3, R3 is currently aspirational for closed-weight hosted models; the lattice is structured so that practical analyses operate at R2 under L4A without prejudice to migration to L4R as the open-weight ecosystem develops.
4.3 Reviewability
Every step's payload is independently inspectable given its predecessors and the profile-defined resolution mechanism for its identifiers.
Content-addressing (§2.5) ensures that any inspector with the proof and access to the profile's resolution services can recover the data described by every payload field. No proprietary tooling is required. Visible rationale (§2.2.3) is independently inspectable in the same way.
4.4 Robustness (tamper-evidence)
Any modification to any field of any step changes that step's identity. Any modification to a step's identity invalidates every descendant that references it.
This follows from the definition of step identity as a hash over the
canonical encoding (§2.5) and the use of identities (not positions or
pointers) in predecessors edges (§2.3). The cost of forging
a modification is the cost of producing a SHA-256 collision and of
forging or coercing every signature on every descendant step.
4.5 Regulatability
Compliance rules expressible as predicates over typed steps and signed roles in the proof DAG can be mechanically checked against any well-formed proof.
The combination of typed steps (§2.2), typed edge relations (§2.3),
and role-bearing attest steps (§2.2.4) is sufficient to
express predicates such as: "every reason step on the path
to a regulated conclusion is the target of an about
edge from at least one attest step with role
qualified-reviewer," or "every compute step
performing a confirmatory analysis is preceded by a
prespecification/locked-plan attestation with timestamp
earlier than its earliest input data lock." A profile MAY supply a
library of such predicates corresponding to a specific compliance regime
(e.g., 21 CFR Part 11, SR 11-7, GMLP).
This is the precise form of the protocol's claim to support compliance: not that PoI implements any specific regulation, but that PoI's structure permits expressing compliance rules as decidable predicates over the proof. The mapping between PoI mechanisms and specific regulatory frameworks is provided informatively in Appendix C.
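The first example predicate might be checked as in the following sketch. The step representation (a `role` copied up from the attest payload, and edges as `(predecessor, relation)` pairs) is a hypothetical simplification, and `scope` stands for the set of steps on the path to the regulated conclusion, computed elsewhere.

```python
def unreviewed_reason_steps(steps: dict, scope: set) -> set:
    """Return reason steps in `scope` lacking a qualified-reviewer attestation.

    `steps` maps identity -> {"type": ..., "edges": [(pred_id, relation)],
    and, for attest steps, "role": ...}.
    """
    reviewed = set()
    for step in steps.values():
        if step["type"] == "attest" and step.get("role") == "qualified-reviewer":
            # The attest step points at its subject through an "about" edge.
            reviewed.update(p for p, rel in step["edges"] if rel == "about")
    return {s for s in scope
            if steps[s]["type"] == "reason" and s not in reviewed}


steps = {
    "o": {"type": "observe", "edges": []},
    "r1": {"type": "reason", "edges": [("o", "derived-from")]},
    "r2": {"type": "reason", "edges": [("r1", "derived-from")]},
    "a": {"type": "attest", "role": "qualified-reviewer", "edges": [("r1", "about")]},
    "b": {"type": "attest", "role": "data-provider", "edges": [("r2", "about")]},
}
```

The predicate holds when the returned set is empty; here `r2` is reported because its only attestation comes from a role outside the reviewer set.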
Editor's note (§4). Each of these subsections currently states the property and gives a one-paragraph justification. In the published version, each will be elevated to a numbered proposition with a sketch of proof. The proofs are short — none requires more than half a page — but their correctness depends on the precise wording of §2 and §3, which is why they are deferred from this draft.
5. Conformance Levels
PoI defines five conformance levels: L1, L2, L3, L4A, and L4R. L4 in v0.6.1 has been split, in v0.6.2, into L4A (independently attested) and L4R (reproducible), to accommodate the reality that hosted-model workflows can satisfy strong independent-attestation requirements while remaining structurally unable to satisfy R3 reproducibility. v0.6.1 L4 claims correspond to v0.6.2 L4R; producers needing R2 throughout under independent attestation use L4A.
Each level is defined as a set of constraints on which step types
must appear in a proof, which replay classes are admissible for
reason steps that are ancestors of output steps, which
roles must sign which steps, and which checks the verifier must
perform.
A producer claims conformance by setting the
conformance_claim field in the proof manifest (§2.7) to the
asserted level. The manifest is signed by the producer; the conformance
claim is therefore signed evidence in itself, but it is not an
Insight Step and does not appear in the proof DAG. A verifier confirms
or refutes the conformance claim by running the verification algorithm
of §3 under the constraints of the claimed level.
5.1 Level definitions
L1 — Provenance. The proof DAG contains only steps
of type observe and compute. All steps are
signed. All compute steps declare a
replay_regime (bit-identical or tolerance) and meet the
corresponding replay requirements when verified by a party with access
to the function and environment. The proof DAG contains no
reason and no attest steps. At L1,
attestor resolution MAY be limited to cryptographic key
resolution sufficient to verify the signature; the bound key is not
required to be linked to a verified organizational or individual
identity. Profiles MAY permit self-declared or locally-issued keys at
L1. Sufficient for low-risk computational analyses where no reasoning
operations contribute to conclusions, no in-proof attestations are
claimed as part of the evidence, and identity binding is not required by
the consuming process. (External attestations, including the manifest's
own signature, are unaffected by this restriction.)
L2 — Identity. L1 plus: at L2, the key resolved for
every attestor MUST be bound, under the profile's authority
model, to a verified organizational or individual identity and role. All
timestamps resolve to a profile-recognized timestamp authority.
Sufficient for regulated computational analyses without reasoning
components.
L3 — Reasoning. L2 plus: reason steps
MAY appear. Every reason step that is an ancestor of an
output step MUST declare replay_class of at least R2 — that
is, R1 (recorded-only) reason steps are not permitted as ancestors of
output steps at L3. The model identifier MUST be resolvable under the
profile's resolution mechanism at the recorded version.
attest steps MAY appear and follow the rules of §2.2.4.
Sufficient for agentic analyses contributing to regulated conclusions
where re-execution access (but not full reproducibility) is
available.
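The L3 replay-class constraint can be read as a reachability query over the proof DAG. The following is a minimal sketch under stated assumptions, not the normative algorithm of §3: the `Step` record, the function names, and the choice to follow only derived-from and conditioned-on edges (and to include the output step itself in its closure) are illustrative assumptions, since the closure is defined in §3.1 rather than here.

```python
from dataclasses import dataclass, field
from typing import Optional

REPLAY_RANK = {"R1": 1, "R2": 2, "R3": 3}
# Assumption: only these relations contribute to the ancestor closure.
CLOSURE_RELATIONS = {"derived-from", "conditioned-on"}

@dataclass
class Step:  # hypothetical in-memory record, not the wire format of §2.1
    id: str
    type: str  # "observe" | "compute" | "reason" | "attest"
    predecessors: list = field(default_factory=list)  # [(step_id, relation), ...]
    replay_class: Optional[str] = None

def closure(steps, output_id):
    """Return output_id plus every step reachable over CLOSURE_RELATIONS edges."""
    seen, stack = {output_id}, [output_id]
    while stack:
        for pred_id, relation in steps[stack.pop()].predecessors:
            if relation in CLOSURE_RELATIONS and pred_id not in seen:
                seen.add(pred_id)
                stack.append(pred_id)
    return seen

def replay_class_violations(steps, output_ids, minimum="R2"):
    """Ids of reason steps feeding an output whose replay_class is below `minimum`."""
    bad = set()
    for out in output_ids:
        for sid in closure(steps, out):
            s = steps[sid]
            if s.type == "reason" and REPLAY_RANK.get(s.replay_class, 0) < REPLAY_RANK[minimum]:
                bad.add(sid)
    return sorted(bad)

steps = {s.id: s for s in [
    Step("obs", "observe"),
    Step("r1", "reason", [("obs", "derived-from")], "R1"),
    Step("r2", "reason", [("r1", "derived-from")], "R2"),
    Step("side", "reason", [("obs", "derived-from")], "R1"),  # feeds no output
]}
print(replay_class_violations(steps, ["r2"]))  # ['r1']
```

The R1 step `side` is not reported because it is not an ancestor of any output; only R1 steps that reach an output violate the constraint.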
L4A — Independently Attested. L3 plus the following two requirements:
- Independent qualified review. For every output step s of type reason, there MUST exist an attest step a in the proof such that:
  - a has an about edge pointing to s,
  - a.payload.role is in the profile-defined set of qualified review roles,
  - a.payload.claim_type is in the profile-defined set of approval claim types,
  - a.attestor resolves to an identity distinct from s.attestor.

  (Note that a is a successor or sibling of s in the DAG, not a predecessor. The relationship is established by the about edge from a to s, not by any edge from s to a.)
- Prespecification for confirmatory analyses. Profiles designating a compute or reason step as a "confirmatory analysis" output MUST require that an attest step of claim_type prespecification/locked-plan (or a profile-specific variant) exists, with an about edge pointing to the analysis step, whose claim_body references an external locked plan with a timestamp earlier than the earliest observe step in the analysis step's ancestor closure that corresponds to data unblinded after the lock. The base protocol provides the structural mechanism; each profile binds the determination of "confirmatory analysis" and "unblinded data" to its domain.
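The independent qualified review requirement above is a conjunction of four conditions on an attest step, which a verifier could evaluate roughly as follows. This is a sketch under assumptions: the `Step` record, the `resolve_identity` callable (standing in for the profile's identity resolution), and the claim type `approval/reviewed` are hypothetical; the protocol leaves role and claim-type vocabularies to profiles.

```python
from dataclasses import dataclass, field

@dataclass
class Step:  # hypothetical in-memory record
    id: str
    type: str
    attestor: str
    predecessors: list = field(default_factory=list)  # [(step_id, relation), ...]
    payload: dict = field(default_factory=dict)

def has_independent_review(s, steps, review_roles, approval_claims, resolve_identity):
    """True if some attest step satisfies all four conditions with respect to s."""
    target_identity = resolve_identity(s.attestor)
    for a in steps:
        if (a.type == "attest"
                and (s.id, "about") in a.predecessors
                and a.payload.get("role") in review_roles
                and a.payload.get("claim_type") in approval_claims
                and resolve_identity(a.attestor) != target_identity):
            return True
    return False

def unreviewed_reason_outputs(steps, output_ids, review_roles, approval_claims, resolve_identity):
    """Output steps of type reason that lack an independent qualified review."""
    by_id = {s.id: s for s in steps}
    return [o for o in output_ids
            if by_id[o].type == "reason"
            and not has_independent_review(by_id[o], steps, review_roles,
                                           approval_claims, resolve_identity)]

# Two keys that resolve to the same person do not count as independent.
IDENTITIES = {"key:alice-laptop": "alice", "key:alice-hsm": "alice", "key:bob": "bob"}
REVIEW = {"role": "qualified-reviewer", "claim_type": "approval/reviewed"}
steps = [
    Step("out", "reason", "key:alice-laptop"),
    Step("self-review", "attest", "key:alice-hsm", [("out", "about")], dict(REVIEW)),
]
args = ({"qualified-reviewer"}, {"approval/reviewed"}, IDENTITIES.get)
print(unreviewed_reason_outputs(steps, ["out"], *args))  # ['out']
steps.append(Step("peer-review", "attest", "key:bob", [("out", "about")], dict(REVIEW)))
print(unreviewed_reason_outputs(steps, ["out"], *args))  # []
```

The comparison is made on resolved identities rather than on attestor URIs, which is why the self-review signed with a second key does not satisfy the requirement. The sketch does not consider superseded attestations.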
L4A admits reason steps with replay_class
R2 throughout, including for outputs that downstream regulatory regimes
may consider high-stakes. The strength of L4A rests on the combination
of identity binding (L2), re-executability with model resolution (L3,
R2), independent qualified review, and prespecification rather than on
bit-identical reproducibility. L4A is the level appropriate to regulated
agentic workflows on closed-weight hosted models with strong
governance.
L4R — Reproducible. L4A plus the following requirement:
- Reproducibility for high-stakes reasoning. Profiles MAY designate certain output classes as "high-stakes." For any reason step that is an ancestor of a high-stakes output step, replay_class MUST be R3 (reproducible). For other output steps, R2 remains permitted. Profiles SHOULD acknowledge the aspirational status of R3 per §2.2.3 in defining their high-stakes-output set.
L4R is the strongest conformance claim and is sufficient for regulated analyses where bit-identical reproduction of high-stakes reasoning outputs is both required and achievable — characteristically open-weight model deployments under controlled runtimes. L4R subsumes L4A: a proof satisfying L4R necessarily satisfies L4A.
Choosing between L4A and L4R. Producers SHOULD claim the higher of the two levels their analysis can support. Profiles MAY restrict their regulated scope to one of the two — for example, a profile addressing a regulatory regime that demands bit-identical reproducibility for primary endpoints would require L4R; a profile addressing a regime that demands independent qualified review and prespecification but accepts non-deterministic reasoning would accept L4A. The protocol does not by itself decide which level a given regulatory regime should require; this is a profile and regulator concern.
5.2 Decision procedure
For each level, the additional checks beyond L1's are mechanically
expressible as predicates over the manifest, the proof DAG, and per-step
attributes. The verifier of §3 takes the level as a parameter (read from
M.conformance_claim) and applies the corresponding
predicate set. No level requires the verifier to interpret
natural-language content; all checks are over types, hashes, identities,
replay classes, edge relations, roles, and timestamps.
A useful framing of the level definitions is as a graph query: at each level, the set of output-reaching steps must satisfy a level-specific predicate set. The predicate set grows monotonically with level (L1 ⊂ L2 ⊂ L3 ⊂ L4A ⊂ L4R), except that L3 lifts the L1 restriction to observe and compute steps. Conformance is checked against the claimed level, not against the highest level the proof structurally satisfies: the L3 predicate set does not treat missing review attestations as failures, so a proof claiming L3 passes without them while the same proof claiming L4A fails, and a proof claiming L1 is never checked for them.
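The cumulative structure of the predicate sets can be sketched as a lookup keyed by the claimed level. The check names below are hypothetical labels for the requirements of §5.1, not identifiers defined by the protocol, and the sketch assumes the one non-cumulative rule is the L1/L2 prohibition on reason and attest steps, which L3 lifts.

```python
LEVEL_ORDER = ["L1", "L2", "L3", "L4A", "L4R"]

# Checks each level adds on top of the level below it (hypothetical labels).
ADDED_CHECKS = {
    "L1": ["all_steps_signed", "compute_replay_regime_met"],
    "L2": ["attestor_identity_bound", "timestamp_authority_recognized"],
    "L3": ["reason_ancestors_at_least_R2", "model_identifier_resolvable"],
    "L4A": ["independent_qualified_review", "prespecification_for_confirmatory"],
    "L4R": ["high_stakes_reason_ancestors_R3"],
}

# The step-type restriction applies at L1 and L2 only; L3 permits reason and attest.
ONLY_BELOW_L3 = ["no_reason_or_attest_steps"]

def checks_for(claimed_level):
    """Return the predicate labels a verifier applies for the claimed level."""
    idx = LEVEL_ORDER.index(claimed_level)  # raises ValueError for unknown levels
    checks = [c for level in LEVEL_ORDER[:idx + 1] for c in ADDED_CHECKS[level]]
    if idx < LEVEL_ORDER.index("L3"):
        checks = ONLY_BELOW_L3 + checks
    return checks

print("independent_qualified_review" in checks_for("L3"))   # False
print("independent_qualified_review" in checks_for("L4A"))  # True
```

A verifier built this way reads the level from M.conformance_claim and never infers a higher level from the structure of the proof.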
5.3 The handling of reason divergence at L3, L4A, and L4R
At L3 and L4A, a reason step with
replay_class R2 that diverges on re-execution is recorded
as divergent (§3.2 reason.e) but is not a verification failure.
The proof's conformance status is unaffected by R2 divergence at the
protocol level. Profiles MAY require that divergence above a threshold
trigger a supersession (§5.4) before the proof is admissible to the
regulated process the profile addresses. The protocol provides the
mechanism; the threshold and trigger are profile-level decisions.
At L4R, a reason step in the path of a high-stakes
output step MUST be R3 and therefore MUST replay bit-identically.
Divergence in this case IS a verification failure (§3.2 reason.e, R3
branch). This is the central reason for the R3 class: it admits the
strongest, regulator-grade claim that recorded outputs can be
reproduced. The split between L4A and L4R isolates this claim from the
independent-attestation claims, so that regulated workflows whose
reasoning is structurally non-reproducible can still earn the
independent-attestation guarantees of L4A without overclaiming
reproducibility.
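The divergence rules above reduce to a small decision function. The sketch below is illustrative only: the `Outcome` enum and the `high_stakes_at_l4r` flag are hypothetical names, and it assumes that R1 (recorded-only) steps are never re-executed and that R3 divergence fails at any level under the R3 branch of §3.2 reason.e.

```python
from enum import Enum

class Outcome(Enum):
    PASS = "pass"
    DIVERGENT = "divergent (recorded, not a failure)"
    FAIL = "fail"

def reason_replay_outcome(replay_class, outputs_match, high_stakes_at_l4r=False):
    """Classify the result of re-executing one reason step."""
    # At L4R, a reason step on the path of a high-stakes output must be R3.
    if high_stakes_at_l4r and replay_class != "R3":
        return Outcome.FAIL
    if replay_class == "R1":
        raise ValueError("R1 steps are recorded-only and are not re-executed")
    if outputs_match:
        return Outcome.PASS
    # R3 must replay bit-identically; R2 divergence is recorded but tolerated.
    return Outcome.FAIL if replay_class == "R3" else Outcome.DIVERGENT

print(reason_replay_outcome("R2", outputs_match=False))  # Outcome.DIVERGENT
print(reason_replay_outcome("R3", outputs_match=False))  # Outcome.FAIL
```

A profile-level divergence threshold would sit outside this function, consuming the recorded DIVERGENT outcomes and deciding whether supersession is required.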
5.4 Supersession and amendment
The protocol does not permit modification of a step once signed. Amendment is expressed by appending new steps:
- An attest step of claim_type: supersession/retract about the original step records the retraction.
- A new step (typically of the same type as the original) records the corrected derivation.
- An attest step of claim_type: supersession/replace about both the original and the replacement records the binding between them.
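Verifiers MUST treat a step as superseded if there exists in the
proof an attest step with claim_type: supersession/retract about it, or
an attest step with claim_type: supersession/replace about it as the
original step (the replacement named by the same attestation is not
thereby superseded). As specified in §3.1, the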
structural ancestor closure preserves the historical derivation of every
output; the effective ancestor closure used for conformance evaluation
excludes superseded steps. An output that depends on a superseded
predecessor and is not itself superseded fails structural validation
(§3.1 step 7); a producer correcting an analysis MUST therefore
supersede the affected output as well, or replace it with a
non-superseded successor.
This pattern preserves the immutability of the chain while admitting the practical reality that regulated analyses are revised, and prevents the inadvertent retention of conclusions whose derivation has been retracted.
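The rule that an output may not silently outlive a retracted ancestor can be sketched as a set intersection. This is a simplified illustration, not the procedure of §3.1: it handles only supersession/retract attestations, uses plain dictionaries as a hypothetical step representation, and assumes about edges do not contribute to the structural ancestor closure.

```python
RETRACT = "supersession/retract"

def superseded_ids(steps):
    """Ids targeted by the about edge of a supersession/retract attest step."""
    return {pid
            for s in steps.values()
            if s["type"] == "attest" and s.get("payload", {}).get("claim_type") == RETRACT
            for pid, relation in s["predecessors"] if relation == "about"}

def structural_ancestors(steps, step_id):
    """All steps reachable from step_id, ignoring about edges (an assumption)."""
    seen, stack = set(), [step_id]
    while stack:
        for pid, relation in steps[stack.pop()]["predecessors"]:
            if relation != "about" and pid not in seen:
                seen.add(pid)
                stack.append(pid)
    return seen

def stale_outputs(steps, output_ids):
    """Outputs that depend on a superseded step but are not themselves superseded."""
    superseded = superseded_ids(steps)
    return sorted(o for o in output_ids
                  if o not in superseded and structural_ancestors(steps, o) & superseded)

steps = {
    "obs": {"type": "observe", "predecessors": []},
    "calc": {"type": "compute", "predecessors": [("obs", "derived-from")]},
    "retract-obs": {"type": "attest", "predecessors": [("obs", "about")],
                    "payload": {"claim_type": RETRACT}},
}
print(stale_outputs(steps, ["calc"]))  # ['calc']
```

Appending a second retraction about `calc` empties the result, which corresponds to the producer superseding the affected output as required.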
5.5 Insufficient-evidence findings
A reason step MAY declare its finding_type
as no-finding, insufficient-evidence,
negative-result, or a profile-defined variant (§2.2.3).
Such steps are first-class evidence: they record what the analysis
examined and what it could not conclude, and they are subject to the
same well-formedness, signature, timestamp, and replay requirements as
any other reason step. They are not the absence of
analysis; they are the recorded result of an analysis that did not
produce a positive finding.
The protocol's treatment of finding_type is structural.
A finding_type of insufficient-evidence does
not exempt the step from any conformance check; the step's input
bindings, conditioned-on relations, replay class, and (at L4A or L4R)
qualified review are checked as usual. The protocol does not by itself
decide whether an insufficient-evidence determination is correct; that
determination is the substance of qualified review, and may be the
subject of an adequacy/finding-confirmed or
adequacy/finding-disputed attestation (§2.2.4).
The motivation for first-class treatment is regulatory. In
pre-submission contexts and adverse-event review, the analyses that
found nothing are as important as the analyses that found something. A
proof that records its no-finding outcomes alongside its conclusions
gives a regulator a complete view of what the analysis examined; a
proof that silently omits them does not. Profiles SHOULD require
explicit finding_type for reason steps in
regulated contexts rather than relying on the default value,
conclusion.
Editor's note (§5). The level definitions are committed; the precise role vocabularies, approval claim-type vocabularies, high-stakes output classifications, and confirmatory-analysis designations that L4A and L4R require are profile-specific and not specified here. Reference vocabularies for the clinical-trials and finance regimes are sketched in §7.4 and §7.5 respectively.
6. Threat Model
Editor's note (§6). This section is a placeholder pending stabilization of §2–§3. Its intended contents are outlined below.
The threat model section will enumerate, for each of the following adversaries, what PoI provides and what it does not:
1. Tampering producer. A producer attempting to alter a proof after the fact. (Defended by hash chaining, signatures, and timestamping.)
2. Compromised attestor key. A producer or external party in possession of an attestor's signing key. (Not defended at the protocol level; identity revocation is a profile concern.)
3. Colluding attestors. Multiple attestors cooperating to produce a fraudulent but internally consistent proof. (Not defended at the protocol level for L1–L3; partially mitigated at L4A and L4R by requirement of independent attestation roles.)
4. Source-data fabrication. An attestor presenting fabricated data as an observe. (Out of scope per §1.3 and §2.2.1; delegated to source-data integrity standards.)
5. Model substitution. A reason step whose recorded model identifier does not correspond to the model actually used. (Defended where weights hashes are recorded and the profile permits resolution of the hash.)
6. Replay-time model drift. A model whose weights have changed between original execution and verification. (Defended where weights are hashed; not defended where weights are referenced only by version string.)
7. Timestamp authority compromise. A timestamp authority issuing tokens for backdated content. (Defended only to the extent the profile's authority is itself trustworthy; PoI inherits the trust assumptions of its timestamp profile.)
8. Post-hoc prespecification fabrication. A producer constructing a "prespecification" attestation after observing the data. (Defended where the prespecification attestation's own timestamp predates the earliest unblinded observe step in the analysis ancestor closure, and the timestamp authority is itself trustworthy; see threat 7.)
9. Silent ancestor retraction. A producer retracting a predecessor without retracting the output that depends on it, attempting to retain a conclusion whose derivation has been disavowed. (Defended by §3.1 step 7: an output with a superseded structural ancestor that is not itself superseded fails verification.)
10. Verification-basis overclaiming. A producer claiming verification_basis: replay-verifiable for a proof whose reason steps are not actually re-executable to verifiers. (Detected when the verifier reports the achieved basis alongside the accept/reject decision; profiles requiring a specific basis as a precondition for use will reject the proof at the profile gate.)
The section will end with an explicit list of attacks PoI does not defend against, mirroring the non-goals of §1.3 in adversarial form.
7. Profiles (Informative)
7.0 The Profile Mechanism
A PoI profile is a specification document that binds the abstract protocol of §§2–3 to a concrete cryptographic ecosystem, identity model, and predicate vocabulary. Profiles are the protocol's mechanism for domain and ecosystem specialization without revising the base protocol.
The profile mechanism specified in this subsection is normative. The individual profile sketches in §7.1–§7.5 are informative placeholders; they become normative only if separately published as formal profile specifications.
A profile MAY specify:
- The signature scheme and key infrastructure (e.g., Sigstore keyless via OIDC)
- The timestamp authority and its verification procedure (e.g., Rekor inclusion proofs, RFC 3161 tokens)
- The canonical encoding algorithm if other than JCS
- The function-identifier resolution mechanism for compute steps
- The model-identifier and weights resolution mechanism for reason steps
- The resolution mechanism for content-addressed invocation and input_messages references
- The registered vocabulary of claim_type URIs and the authorized roles for each
- Whether the extended form of conditioned-on edges (§2.3) is required, and if so the registered context_role vocabulary
- The interpretation of "high-stakes output" and "confirmatory analysis" at L4A and L4R (§5.1)
- The required verification_basis (§2.7) for proofs within the profile's regulated scope
- The equivalence predicate vocabulary for tolerance-regime compute replay
- The finding_type vocabulary beyond the base set (§5.5)
- The input-message reconstruction predicate for reason step verification (§3.2 reason.d)
- Domain-specific predicates expressed as additional attest claim types
- Whether output steps may be of types other than compute and reason (§2.7)
A profile MUST NOT modify the four-step taxonomy, the three edge relations, or the verification algorithm of §3. Extension is by predicate vocabulary and resolution mechanism, not by new step types or new verification semantics. This constraint preserves the property that a single verifier implementation can verify a proof under any profile, given the profile's resolution services.
A proof manifest's profiles field (§2.7) lists the
profile(s) under which the proof is to be verified; the verifier MUST
apply each named profile's resolution rules and predicate set in
addition to the base protocol's requirements.
7.1 Sigstore profile
Binds signature to Sigstore's keyless signing scheme.
Binds timestamp to Rekor inclusion proofs. Specifies
attestor URI form using OIDC subject identifiers. Specifies
the authority model for §3.2's observe and
attest checks via OIDC claim verification.
7.2 in-toto interoperability profile
Specifies how an observe step's provenance
field references an in-toto attestation, and how a verifier consumes the
in-toto attestation as part of the observe check. Provides
a reverse mapping: a sequence of in-toto attestations representing a
build pipeline can be lifted into an L1 PoI proof.
7.3 RFC 3161 timestamping profile
Specifies token format and authority resolution for environments where Sigstore Rekor is not appropriate.
7.4 Clinical Trials Profile (Informative, Forward-Reference)
A profile targeting clinical-trial readiness analyses and AI-assisted clinical decision support. Anticipated specializations include:
- Identity model. Industry-standard signing scheme (e.g., Sigstore via institutional OIDC) with role bindings for principal-investigator, biostatistician, qualified-reviewer, independent-validator, sponsor-medical-monitor, and data-monitoring-committee-member.
- Source-data integrity bindings. observe step provenance references MUST resolve to attestations from systems operating under 21 CFR Part 11 §11.10 (clinical data management systems with audit trails meeting predicate-regulation requirements), or to equivalent attestations under EU GMP Annex 11 (EudraLex Volume 4) or ICH E6(R3) §5.5.
- Prespecification predicates. Required claim_type vocabulary including prespecification/locked-sap, prespecification/locked-protocol, prespecification/locked-charter, each binding to a pre-DCO content-hashed external artifact.
- Subgroup analysis predicates. claim_type vocabulary including subgroup-analysis/prespecified and subgroup-analysis/post-hoc, allowing structural distinction between confirmatory and exploratory subgroup findings.
- Insufficient-evidence handling. finding_type vocabulary aligned with ICH E9(R1) estimand framework, distinguishing intercurrent-event treatment and missingness handling. Required for reason steps contributing to safety or efficacy determinations.
- Bias and fairness predicates. claim_type vocabulary for bias-assessment/subgroup-performance, bias-assessment/representativeness, supporting structural distinction between assessed and unassessed bias dimensions.
- Conformance level for hosted-model workflows. Anticipated to accept L4A for analyses on closed-weight hosted models with the required independent attestations, prespecification, and identity governance; L4R required for analyses where bit-identical reproducibility of primary endpoint reasoning is achievable and demanded by the regulatory regime.
- High-stakes output classification at L4R. Any reason or compute step contributing to a primary or co-primary efficacy endpoint, or to a safety determination at the protocol's prespecified threshold.
7.5 Finance Profile (Informative, Forward-Reference)
A profile targeting model risk management under SR 11-7 and analogous regulatory regimes. Anticipated specializations include:
- Identity model. Institutional signing with role bindings for model-developer, model-validator (independent per SR 11-7), model-owner, model-risk-officer.
- Source-data integrity bindings. observe step provenance MUST resolve to attestations from systems operating under firm-level books-and-records controls.
- Tiering predicates. claim_type vocabulary aligned with the firm's model tiering (typically three or four tiers); high-stakes classification under L4R at this profile is mapped to the firm's tier-1 or tier-2 designations.
- Effective challenge predicates. claim_type vocabulary for model-validator attestations recording independent replication, conceptual soundness review, and ongoing-monitoring sign-off.
- Lifecycle predicates. claim_type vocabulary for production-deployment, periodic-revalidation, retirement, supporting the lifecycle obligations of SR 11-7.
Profile sketches §7.1–§7.5 are informative with respect to the base protocol — non-normative until separately published as formal profile specifications — but a producer MUST identify the profile(s) under which a proof is to be verified, by reference, in the proof manifest (§2.7).
Appendix A. JSON Schema for the Insight Step (Illustrative)
Editor's note. This appendix gives a JSON Schema (Draft 2020-12) for the Insight Step. The schema is illustrative, not normative; the normative definition is the prose of §2. JSON Schema validation is the weakest of five distinct validity gates a proof must pass.
The five gates, in order of increasing strength:
1. Schema-valid. The step's JSON encoding satisfies the schema below. This catches typos, malformed fields, missing required keys, and most type errors. It does not check hash linkage, signature validity, role authorization, topological ordering, acyclicity, or conformance-level predicates.
2. Structurally valid. The step (or proof of steps) satisfies §2.6 and, at the proof level, §3.1. This adds acyclicity, predecessor existence, timestamp ordering, the manifest-to-proof correspondence, and the prohibition on attest-as-derived-from.
3. Cryptographically valid. Signatures over to_sign verify; timestamps over to_timestamp verify; the step identity matches SHA-256(full_step). This adds the integrity guarantees of the underlying cryptographic primitives.
4. Type-verification-valid. §3.2's type-specific checks pass for every step. For compute and reason, this includes invocation reconstruction and (where applicable) replay.
5. Conformance-valid. §5's level-specific predicates hold for the manifest's claimed level, evaluated over the effective ancestor closure A* (§3.1).
A claim of "this proof is valid" must specify which gate is meant.
Most commonly the relevant claim is gate 5 (conformance-valid against a
specific claimed level under a specific profile). The schema below is
gate 1. Note that the schema below does not enforce all conditional
requirements stated in prose (for example, output_artifact
required at R1 or under tolerance,
weights_hash required at R3); these are checked at higher
gates. The verification_basis field (§2.7) reports producer
claims about which gates the proof is expected to reach.
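Because the gates are ordered by strength, a verifier report can name the highest gate a proof reached rather than a bare valid/invalid flag. The sketch below is an illustration under assumptions: the `Gate` enum and `highest_gate_reached` are hypothetical names, and the gates are treated as cumulative, so a failure at one gate means no later gate is counted.

```python
from enum import IntEnum

class Gate(IntEnum):
    SCHEMA = 1
    STRUCTURAL = 2
    CRYPTOGRAPHIC = 3
    TYPE_VERIFICATION = 4
    CONFORMANCE = 5

def highest_gate_reached(results):
    """results maps Gate -> bool. Returns the last gate passed before the
    first failure (or missing result), or None if gate 1 did not pass."""
    reached = None
    for gate in Gate:  # iterates in definition order, weakest first
        if not results.get(gate, False):
            break
        reached = gate
    return reached

report = {
    Gate.SCHEMA: True,
    Gate.STRUCTURAL: True,
    Gate.CRYPTOGRAPHIC: False,
    Gate.TYPE_VERIFICATION: True,  # not counted: an earlier gate failed
}
print(highest_gate_reached(report).name)  # STRUCTURAL
```

Reporting the gate by name keeps a statement such as "this proof is valid" tied to a specific gate, as the paragraph above requires.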
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://proofofinsight.org/schemas/v0.6.2/step.json",
  "title": "PoI Insight Step",
  "type": "object",
  "additionalProperties": false,
  "required": ["version", "type", "predecessors", "payload",
               "attestor", "signature", "timestamp"],
  "properties": {
    "version": { "const": "0.6.2" },
    "type": { "enum": ["observe", "compute", "reason", "attest"] },
    "predecessors": {
      "type": "array",
      "items": {
        "oneOf": [
          {
            "type": "object",
            "additionalProperties": false,
            "required": ["step", "relation"],
            "properties": {
              "step": { "type": "string", "pattern": "^[0-9a-f]{64}$" },
              "relation": { "enum": ["derived-from", "conditioned-on", "about"] }
            }
          },
          {
            "type": "object",
            "additionalProperties": false,
            "required": ["step", "relation", "context_role",
                         "declared_relevance_hash"],
            "properties": {
              "step": { "type": "string", "pattern": "^[0-9a-f]{64}$" },
              "relation": { "const": "conditioned-on" },
              "context_role": { "type": "string" },
              "declared_relevance_hash": { "$ref": "#/$defs/sha256" }
            }
          }
        ]
      }
    },
    "payload": { "type": "object" },
    "attestor": { "type": "string", "format": "uri" },
    "signature": { "type": "string", "minLength": 1 },
    "timestamp": {
      "type": "object",
      "additionalProperties": false,
      "required": ["value", "authority", "token"],
      "properties": {
        "value": { "type": "string", "format": "date-time" },
        "authority": { "type": "string", "format": "uri" },
        "token": { "type": "string", "minLength": 1 }
      }
    }
  },
  "$defs": {
    "sha256": { "type": "string", "pattern": "^[0-9a-f]{64}$" },
    "contentRef": {
      "type": "object",
      "additionalProperties": false,
      "required": ["uri", "hash"],
      "properties": {
        "uri": { "type": "string", "format": "uri" },
        "hash": { "$ref": "#/$defs/sha256" }
      }
    },
    "inlineOrRef": {
      "anyOf": [
        { "type": "object" },
        { "$ref": "#/$defs/contentRef" }
      ]
    }
  },
  "allOf": [
    {
      "if": { "properties": { "type": { "const": "observe" } } },
      "then": {
        "properties": {
          "predecessors": { "maxItems": 0 },
          "payload": {
            "type": "object",
            "additionalProperties": false,
            "required": ["content_hash", "content_type", "source"],
            "properties": {
              "content_hash": { "$ref": "#/$defs/sha256" },
              "content_type": { "type": "string" },
              "source": { "type": ["string", "object"] },
              "provenance": { "type": ["string", "object"] }
            }
          }
        }
      }
    },
    {
      "if": { "properties": { "type": { "const": "compute" } } },
      "then": {
        "properties": {
          "predecessors": {
            "minItems": 1,
            "items": { "properties": { "relation": { "const": "derived-from" } } }
          },
          "payload": {
            "type": "object",
            "additionalProperties": false,
            "required": ["function", "invocation", "invocation_hash",
                         "output_hash", "environment"],
            "properties": {
              "function": { "type": "string" },
              "invocation": { "$ref": "#/$defs/inlineOrRef" },
              "invocation_hash": { "$ref": "#/$defs/sha256" },
              "output_hash": { "$ref": "#/$defs/sha256" },
              "output_artifact": { "type": ["string", "object"] },
              "environment": {
                "type": "object",
                "required": ["replay_regime"],
                "properties": {
                  "replay_regime": { "enum": ["bit-identical", "tolerance"] }
                }
              }
            }
          }
        }
      }
    },
    {
      "if": { "properties": { "type": { "const": "reason" } } },
      "then": {
        "properties": {
          "predecessors": {
            "minItems": 1,
            "items": {
              "properties": {
                "relation": { "enum": ["derived-from", "conditioned-on"] }
              }
            }
          },
          "payload": {
            "type": "object",
            "additionalProperties": false,
            "required": ["model", "replay_class",
                         "invocation", "invocation_hash",
                         "input_messages", "input_messages_hash",
                         "output_hash", "sampling"],
            "properties": {
              "model": {
                "type": "object",
                "additionalProperties": false,
                "required": ["identifier"],
                "properties": {
                  "identifier": { "type": "string" },
                  "weights_hash": { "$ref": "#/$defs/sha256" },
                  "version": { "type": "string" }
                }
              },
              "replay_class": { "enum": ["R1", "R2", "R3"] },
              "invocation": { "$ref": "#/$defs/inlineOrRef" },
              "invocation_hash": { "$ref": "#/$defs/sha256" },
              "input_messages": { "$ref": "#/$defs/inlineOrRef" },
              "input_messages_hash": { "$ref": "#/$defs/sha256" },
              "tool_call_log_hash": { "$ref": "#/$defs/sha256" },
              "visible_rationale_hash": { "$ref": "#/$defs/sha256" },
              "finding_type": {
                "anyOf": [
                  { "enum": ["conclusion", "no-finding",
                             "insufficient-evidence", "negative-result"] },
                  { "type": "string", "pattern": "^[a-z][a-z0-9\\-/]*$" }
                ]
              },
              "output_hash": { "$ref": "#/$defs/sha256" },
              "output_artifact": { "type": ["string", "object"] },
              "sampling": { "type": "object" },
              "redaction_policy": { "type": ["string", "object"] }
            }
          }
        }
      }
    },
    {
      "if": { "properties": { "type": { "const": "attest" } } },
      "then": {
        "properties": {
          "predecessors": {
            "minItems": 1,
            "items": { "properties": { "relation": { "const": "about" } } }
          },
          "payload": {
            "type": "object",
            "additionalProperties": false,
            "required": ["claim_type", "role", "claim_body", "claim_hash"],
            "properties": {
              "claim_type": { "type": "string", "format": "uri" },
              "role": { "type": "string" },
              "claim_body": { "type": ["object", "string"] },
              "claim_hash": { "$ref": "#/$defs/sha256" }
            }
          }
        }
      }
    }
  ]
}
Appendix B. Conformance Test DAGs (Forward Reference)
Editor's note. A companion document, the PoI Conformance Test Suite v0.6.2, will provide a set of concrete proof DAGs labeled with their expected verification outcome at each conformance level (L1, L2, L3, L4A, L4R). The test suite is the operational specification of conformance: an implementation is conformant iff it agrees with the test suite on every test case. Drafting the test suite is the next concrete deliverable after this draft stabilizes.
A second forward-reference companion is the PoI Core Test Profile — a minimal executable profile providing concrete bindings (Ed25519 signatures, RFC 3161 or mock timestamps, inline-only invocations and artifacts, SHA-256 throughout, deterministic toy compute functions, fixed claim-type vocabulary, no external model replay) sufficient to make §3 testable end-to-end without depending on a production identity or model-resolution infrastructure. The Core Test Profile is intended to expose remaining ambiguities in the base protocol before the regulatory profiles (§7.4, §7.5) are formalized.
Appendix C. Regulatory Mapping (Informative)
This appendix maps PoI features to commonly referenced regulatory and standards frameworks. The mapping is informative: PoI does not implement any specific regulation, and conformance to PoI does not by itself establish compliance with any regulation. The mapping is provided to facilitate use of PoI as evidence within compliance regimes.
C.1 21 CFR Part 11 (Electronic Records, Electronic Signatures)
| Requirement | PoI mechanism |
|---|---|
| §11.10(a) Validation of systems | Not addressed; system validation is upstream of PoI |
| §11.10(b) Generation of accurate copies | Supports fidelity checking for recorded artifacts via content-addressing (§2.5); generation of human-readable accurate copies is implementation-specific |
| §11.10(c) Protection of records | Supports tamper-evident protection of records (§4.4); operational protection measures are profile- and deployment-defined |
| §11.10(e) Secure, computer-generated audit trails | Provides a hash-chained typed-step audit trail with signatures and timestamps (§2); operational deployment under §11.10(e) is profile-defined |
| §11.10(g) Authority checks | Supports profile-defined attestor role authorization (§2.2.4, §3.2 attest.b); enforcement is profile- and deployment-defined |
| §11.50 Signature manifestations | Provides signed attest steps with attestor role recorded (§2.2.4); display semantics are implementation-specific |
| §11.70 Signature/record linking | Cryptographically links signature to record via canonical encoding of step content (§2.5) |
C.2 SR 11-7 (Model Risk Management)
| Principle | PoI mechanism |
|---|---|
| Model development documentation | Supports documentation through compute and reason steps with full invocation hashing |
| Effective challenge | Supports the structural form via L4A/L4R independent qualified review via attest steps; the substance of challenge is human-supplied |
| Ongoing monitoring | Supports lifecycle predicates defined in §7.5 finance profile |
| Outcomes analysis | Supports reason steps with insufficient-evidence findings (§5.5) |
| Model inventory | Supports manifest registry across proofs (profile-level) |
C.3 FDA AI/ML Draft Guidance (Seven-Step Credibility Framework)
| Step | PoI mechanism |
|---|---|
| Define the question of interest | Supports binding via prespecification attestations to study protocol (§7.4) |
| Define the context of use | Supports profile-level "high-stakes output" classification (§5.1) |
| Assess model risk | Supports producer-level threat-model documentation (§6); risk tiering is profile-defined |
| Develop the model | Supports model development records via compute/reason steps with invocation hashing |
| Verify and validate the model | Supports verification via replay regimes (§2.2.2, §2.2.3) and L4A/L4R independent validation; validation substance is human-supplied |
| Document the model | The proof itself, manifest, and profile bindings provide the documentation structure |
| Maintain the model | Supports maintenance via supersession patterns (§5.4) and lifecycle predicates |
C.4 NIST AI Risk Management Framework (AI RMF 1.0)
| Function | PoI mechanism |
|---|---|
| GOVERN | Supports governance via conformance levels, manifest signing, and editorial process |
| MAP | Supports mapping via profile binding and attestor role definitions |
| MEASURE | Supports measurement via replay regimes, divergence semantics (§3.3), and insufficient-evidence findings (§5.5) |
| MANAGE | Supports management via supersession (§5.4) and lifecycle predicates in profiles |
C.5 ISO/IEC 42001 (AI Management System)
| Clause area | PoI mechanism |
|---|---|
| AI system lifecycle records | Supports lifecycle records via hash-chained proof DAG |
| Data quality controls | Supports referencing of upstream data quality via observe step provenance (delegated per §2.2.1) |
| Validation and verification | Supports validation/verification via replay regimes and attest claim types |
| Continual improvement | Supports continual improvement via supersession and lifecycle predicates in profiles |
C.6 ICH E6(R3) (Good Clinical Practice) and ICH E9(R1) (Statistical Principles)
| Principle | PoI mechanism |
|---|---|
| Quality by design | Supports quality-by-design via prespecification predicates (§7.4) |
| Risk-based approach | Supports risk-based scoping via profile-level high-stakes classification at L4A/L4R |
| Data integrity (ALCOA+) | Supports ALCOA+ properties via tamper-evidence, attestor identification, and timestamping |
| Investigator oversight | Supports oversight via qualified-reviewer attestations at L4A/L4R |
| Estimand framework | Supports estimand-aligned finding_type vocabulary in clinical profile (§7.4) |
C.7 What These Mappings Do Not Establish
Each row above identifies a structural correspondence between a PoI mechanism and a regulatory requirement. None establishes that a proof conforming to PoI at any level constitutes compliance with the regulation. Compliance is determined by the regulator under the regulation's own evidentiary standards; PoI's role is to provide structured, mechanically verifiable evidence against which the producer's compliance claim can be assessed. PoI conformance is a necessary structural property for mechanical verification, not a sufficient compliance determination. The verification-versus-credibility distinction of §1.4 applies in full to every mapping in this appendix.
References
Editor's note. Reference list to be expanded. Anchor references already identified:
- RFC 2119, RFC 8174 — requirement keyword conventions.
- RFC 3161 — Time-Stamp Protocol.
- RFC 3339 — Date and Time on the Internet.
- RFC 6962 — Certificate Transparency.
- RFC 8785 — JSON Canonicalization Scheme.
- in-toto specification, latest version.
- SLSA specification, v1.0 or current.
- Sigstore project documentation, latest version.
- W3C, PROV-DM: The PROV Data Model.
- Souza et al., PROV-AGENT (full citation pending).
- Ojewale et al., Audit Trails for Generative-AI Workflows, FAccT 2026 (full citation pending).
- FDA, Artificial Intelligence and Machine Learning in Software as a Medical Device, draft guidance.
- FDA, Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products, draft guidance.
- ASME V&V40, Assessing Credibility of Computational Modeling through Verification and Validation.
- ICH E6(R3) — Good Clinical Practice.
- ICH E9(R1) — Statistical Principles for Clinical Trials, Addendum on Estimands.
- 21 CFR Part 11.
- Federal Reserve SR 11-7, Guidance on Model Risk Management.
- NIST AI Risk Management Framework (AI RMF) 1.0.
- ISO/IEC 42001 — Information technology — Artificial intelligence — Management system.
- ProvCaRe project documentation (clinical research provenance).
Appendix D. Changes from v0.6.1 (Informative)
This appendix summarizes the substantive changes in v0.6.2 relative to v0.6.1. The changes are backward-compatible additions and clarifications; no producer compliant with v0.6.1 is rendered non-compliant by v0.6.2.
L4 split (level-architecture addition)
- L4 (Attested) in v0.6.1 has been split into L4A (Independently Attested) and L4R (Reproducible). L4A requires identity, prespecification, and independent qualified review but admits R2 reasoning throughout. L4R adds the R3-for-high-stakes requirement that was the principal extra obligation of v0.6.1 L4.
- v0.6.1 L4 claims correspond to v0.6.2 L4R; producers needing R2 throughout under independent attestation use L4A.
- The split makes the protocol usable for regulated agentic workflows on closed-weight hosted models — which can satisfy L4A but cannot structurally satisfy R3 — without overclaiming reproducibility.
Verification basis on the manifest (additive, optional)
- New OPTIONAL verification_basis field on the manifest (§2.7), with values replay-verifiable, linkage-verifiable-only, or resolution-limited.
- The field records the producer's claim about what verification gate the proof is expected to support. A verifier MUST report the achieved basis alongside the accept/reject decision and identify any gap to a stronger claimed basis.
- Profiles MAY require a specific verification_basis as a precondition for use of the proof in their regulated scope.
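The reporting obligation above can be sketched as follows. This is a minimal illustration, not the reference verifier: the `VerificationReport` shape and the `report_basis` helper are hypothetical names, and the strength ordering of the three values (resolution-limited weakest, replay-verifiable strongest) is an assumption that this changelog does not itself state.

```python
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: strength ordering, weakest to strongest. The changelog names the
# three values but does not rank them; adjust to the normative text of §2.7.
BASIS_STRENGTH = {
    "resolution-limited": 0,
    "linkage-verifiable-only": 1,
    "replay-verifiable": 2,
}


@dataclass(frozen=True)
class VerificationReport:
    """Hypothetical report shape: decision plus achieved and claimed basis."""
    accepted: bool
    achieved_basis: str
    claimed_basis: Optional[str]  # None when the optional field is absent
    basis_gap: bool               # True when the claim is stronger than achieved


def report_basis(accepted: bool, achieved: str,
                 claimed: Optional[str] = None) -> VerificationReport:
    """Report the achieved basis alongside the decision and flag any gap."""
    if achieved not in BASIS_STRENGTH:
        raise ValueError(f"unknown achieved basis: {achieved!r}")
    if claimed is not None and claimed not in BASIS_STRENGTH:
        raise ValueError(f"unknown claimed basis: {claimed!r}")
    gap = (claimed is not None
           and BASIS_STRENGTH[claimed] > BASIS_STRENGTH[achieved])
    return VerificationReport(accepted, achieved, claimed, gap)
```

Under these assumptions, `report_basis(True, "linkage-verifiable-only", "replay-verifiable")` yields a report with `basis_gap` set, which is the condition a profile could treat as disqualifying within its regulated scope.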
Conditioned-on extended form (additive, optional)
- The compact { step, relation } edge form is preserved as the default. An extended form for conditioned-on edges adds OPTIONAL context_role and declared_relevance_hash fields, pinning the producer's assertion about why the predecessor is contextually relevant without making the relation replay-verifiable.
- Profiles MAY require the extended form for conditioned-on edges within their regulated scope.
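The two edge forms can be illustrated with a short sketch. Only the field names `step`, `relation`, `context_role`, and `declared_relevance_hash` come from the text above. The helper names, the use of a plain string as step identifier, the free-text relevance statement, and hashing its UTF-8 bytes with SHA-256 into a hex string are all assumptions made for illustration; the normative encoding is whatever §2 and the applicable profile define.

```python
import hashlib


def compact_edge(step: str, relation: str) -> dict:
    """Default compact form: { step, relation }."""
    return {"step": step, "relation": relation}


def extended_conditioned_on_edge(step: str, context_role: str,
                                 relevance_statement: str) -> dict:
    """Extended form for conditioned-on edges.

    ASSUMPTION: the declared-relevance hash is taken here over the UTF-8
    bytes of a producer-written statement and hex-encoded. This pins what
    the producer asserted; it does not make the relation replay-verifiable.
    """
    digest = hashlib.sha256(relevance_statement.encode("utf-8")).hexdigest()
    edge = compact_edge(step, "conditioned-on")
    edge["context_role"] = context_role
    edge["declared_relevance_hash"] = digest
    return edge


# Hypothetical identifiers and role value, for illustration only.
edge = extended_conditioned_on_edge(
    step="step-0007",
    context_role="background",
    relevance_statement="Prior cohort definition constrains this analysis.",
)
```

A verifier holding the same relevance statement can recompute the hash and confirm that the producer's declared rationale has not changed since signing, which is the limited guarantee the extended form is meant to provide.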
Patent covenant interim language
- §0.3 now states that, until a conformance test suite is published for a given version, the patent non-assertion covenant applies to good-faith implementations of the normative requirements of §§1–5. This closes the prior gap between the covenant's reliance on a published test suite and the absence of such a suite in pre-release versions.
Positioning clarification (§1.2)
- §1.2 now opens with an explicit infrastructure-vs-application distinction: PoI is a verification protocol, not an analytical platform or workflow tool. Producers of regulated analyses emit PoI proofs of their outputs; the protocol does not depend on, nor specify, the workflow that produced the analysis. This makes explicit a scope assumption that was previously implicit in §§1.3–1.4 and adds no new normative requirement.
Forward-reference additions
- Appendix B now references a planned PoI Core Test Profile — a minimal executable profile providing concrete bindings (Ed25519 signatures, RFC 3161 or mock timestamps, inline-only invocations, SHA-256, deterministic toy compute functions, fixed claim-type vocabulary, no external model replay) intended to make §3 testable end-to-end without depending on production identity or model-resolution infrastructure.
Threat model addition
- §6 adds threat 10 (verification-basis overclaiming) to the planned enumeration: a producer claiming replay-verifiable for a proof whose reasoning is not actually re-executable. The overclaim is detected when the verifier reports the achieved basis.
Deferred to v0.7
- Structured digest object ({alg, value}) for hash agility. v0.6.2 continues to commit to SHA-256 with profile substitution permitted in principle but not specified.
- Structured signature object ({alg, key_id, value}) carrying explicit signature-scheme metadata. Currently pushed entirely to profiles.
- Manifest-level attestations as a separately-signed sibling object type. Currently a producer requiring manifest-level attestation issues an attest step about a designated output step (§2.2.4).
- Companion publication of the PoI Core Test Profile (Appendix B) and a first version of the conformance test suite.
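To make the deferred hash-agility item concrete, the sketch below shows one possible shape for a structured digest object. It is speculative by construction, since the object is not specified in v0.6.2. Only the field names `alg` and `value` come from the list above; the algorithm labels, the hex encoding, and the helper names `make_digest` and `digest_matches` are assumptions.

```python
import hashlib

# ASSUMPTION: algorithm labels and the supported set are illustrative only.
SUPPORTED_ALGS = {
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
}


def make_digest(data: bytes, alg: str = "sha256") -> dict:
    """Build a {alg, value} digest object with a hex-encoded value."""
    if alg not in SUPPORTED_ALGS:
        raise ValueError(f"unsupported digest algorithm: {alg!r}")
    return {"alg": alg, "value": SUPPORTED_ALGS[alg](data).hexdigest()}


def digest_matches(data: bytes, digest: dict) -> bool:
    """Recompute the digest named by `alg` and compare it with `value`.

    An unknown algorithm is treated as a mismatch rather than an error, so
    a verifier rejects what it cannot check.
    """
    hasher = SUPPORTED_ALGS.get(digest.get("alg"))
    if hasher is None:
        return False
    return hasher(data).hexdigest() == digest.get("value")
```

Carrying the algorithm name next to the value is what would let a later version add or retire hash functions without changing the surrounding step or manifest structure.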
End of v0.6.2 working draft.