interpretive truth · 2026

Interpretive truth: Verifiably constrained AI judges for arbitration at scale

Questions like “Is player X more valuable to team A than player Y is to team B?” are “interpretive”, i.e., there is no universally correct answer, but there can be an accepted answer given a “decision framework” and a bounded set of “evidence”. If a rational observer accepts the framework and the evidence, they can accept the answer.

Most arbitration between humans, and soon between AI agents, is interpretive:

“Is the work portfolio good enough to enter our freelance marketplace?”
“Does this content violate our privacy policy?”
“Was the promised set of deliverables and SLAs not met?”

In a world where AI agents transact billions of times every second, interpretive judgements need to be resolved at scale. The solution is AI judges that:

adjudicate on open “decision frameworks” & “evidence”
produce a reasoning trace that is verifiable by any observer
are hardened against evidence poisoning attacks

repos

interpretive-markets (contracts + judge)

interpretive-markets-backend (workers + llm evals)

what this unlocks

Near-term: Interpretive prediction markets

Intersubjective Prediction Market (Polymarket, Kalshi, etc.)

“Will PSG win the Champions League this season?”

solved by: Optimistic oracle, because the rules of football are universally agreed upon

Interpretive Prediction Market

“Will Pedri be Barca's most impactful player this season?”

solved by: Verifiable AI judge, if there exists an agreed decision framework

live market · sepolia

market #1

AI judge attested on Ritual · signer 0x9dc1…8b4C

Is Erling Haaland more valuable to Man City than Kylian Mbappé is to Real Madrid by end of the season?

outcome: YES
framework: football-player-value-v1
status: ✓ verified

market · settled

YES · Haaland · 0.71 → $1.00

$0.00 ← 0.29 · NO · Mbappé

25 Apr · 30d · settled 25 May

YES staked: $187,420
NO staked: $125,050
volume: $312,470
traders: 1,847

rationale

Tier 1 evidence is decisive. Haaland accounts for 41% of City's PL goals vs Mbappé's 33% at Madrid, and City's scoring rate drops ~40% in Haaland's off-minutes vs Madrid's ~10% in Mbappé's. Tier 2 scout analysis (McGuire, Lowe) agrees: Haaland is structurally load-bearing in a struggling system; Mbappé's individual brilliance has not translated into collective value. Manager quotes from both sides are consistent but not load-bearing under the framework — Pep and Arbeloa have direct conflicts of interest.

evidence cited (16)

→Erling Haaland.team_share

“41% of City's Premier League goals scored or directly assisted by Haaland. xG-involvement 39%. The numbers describe a team whose attacking output is structurally a Haaland function.”

FBref statistics · TIER 1 · fbref.com/…/Erling-Haaland

→Kylian Mbappe.team_share

“33% of Madrid's La Liga goals — high but not as concentrated as Haaland at City (41%). Madrid's attacking output is more distributed (Vinicius, Bellingham each carry ~15-18% goals share).”

FBref statistics · TIER 1 · fbref.com/…/Kylian-Mbappe

→Erling Haaland.on_off_splits

“Team scoring rate falls ~40% in Haaland's off-minutes this season — the largest on/off goals delta in the Premier League. PPG drops ~0.55, the difference between mid-table and a CL place over a full season.”

FBref statistics · TIER 1 · fbref.com/…/Erling-Haaland

→Kylian Mbappe.on_off_splits

“Team scoring rate falls only ~10% in Mbappé's off-minutes — meaningfully smaller delta than Haaland's ~40%. Real Madrid's underlying output is well-distributed (Vinicius, Bellingham, Rodrygo) so Mbappé's individual brilliance has less marginal effect on team output than his goal tally suggests.”

FBref statistics · TIER 1 · fbref.com/…/Kylian-Mbappe

→Erling Haaland.substitution_patterns

“Started every available big game. Never removed first when chasing — Pep treats him as the only realistic source of goals in tight matches. Sometimes rested late in comfortable wins to manage load.”

FBref — Haaland match logs · TIER 1 · fbref.com/…/Erling-Haaland

→Kylian Mbappe.substitution_patterns

“Started all big games but more frequently substituted earlier than Haaland — Arbeloa's setup rotates the attacking line more freely. Has been removed first when chasing on a handful of occasions, which would not happen at City with Haaland.”

FBref — Mbappé match logs · TIER 1 · fbref.com/…/Kylian-Mbappe

→Erling Haaland.contract_signals

“Signed 10-year extension in summer 2025, through 2034. Effectively the deepest possible commitment from a club to a player at the asset level. Salary share of wage bill estimated at ~13% (top of squad).”

mancity.com on extension · TIER 1 · mancity.com/…/pep-guardiola-ipswich-…

→Kylian Mbappe.contract_signals

“Free transfer summer 2024. 5-year contract through 2029. Salary share of wage bill estimated at ~12% (top tier alongside Vinicius and Bellingham). #10 shirt change for 25-26 signals system-centrality.”

Transfermarkt — Mbappé · TIER 1 · transfermarkt.us/…/342229

→Erling Haaland.stats.season.notes

“Premier League Golden Boot 2025-26 (3rd in 4 years). Reached 100 PL goals in 111 apps on 2 Dec 2025 — fastest ever, breaking Shearer's 124-match record. 88 goals in first 100 PL apps (also a record).”

dossier · TIER 1 · premierleague.com/…/stats

→Kylian Mbappe.stats.season.notes

“Pichichi Trophy leader. La Liga goals/90 of 0.79 trails only Lewandowski (0.84). xG overperformance of +3.2 in La Liga. Feb 2026: 7 goals in 5 La Liga games including a hat-trick vs Sevilla. Switched to #10 shirt for 2025-26.”

dossier · TIER 1 · tribuna.com/…/2025-2026

→Erling Haaland.scout_notes[0]

“Tier 1 case for Haaland-as-MVP is unusually strong this season: 41% goals share, ~40% team-scoring-rate on/off delta, Golden Boot in a structurally rough year, 10-year contract maximizes future-value horizon. The framework would have a hard time *not* favoring him under its own evidence hierarchy.”

Sam McGuire-style synthesis · 2026-04-15 · TIER 2 · fbref.com/…/Erling-Haaland

→Kylian Mbappe.scout_notes[0]

“Tale of two halves: difficult opening under Xabi Alonso, Pichichi-leading surge under Arbeloa's pared-back setup. 40+ goals is undeniable. But Tier 1 evidence is awkward: on/off delta is only ~10%, team-share 33%, Madrid trophyless. The framework reads: peak individual output that does not translate to load-bearing club value because Madrid's squad depth absorbs his absence.”

Sid Lowe-style synthesis · 2026-05-12 · TIER 2 · fbref.com/…/Kylian-Mbappe

→Erling Haaland.match_reports[0]

“Fastest player to reach 100 PL goals — 111 apps, breaking Shearer's 124-match record. A historic individual marker in a season where the team has struggled.”

Sunderland · BBC Sport · 2025-12-02 · TIER 2 · bbc.com/sport/football

→Kylian Mbappe.match_reports[1]

“Madrid lost 0-2 at home, conceding the title. Mbappé worked into the channels but rarely received in dangerous areas — Barcelona midfield (led by Pedri) cut off supply lines. 4 shots, zero on target. The defining frustration of a trophyless Madrid season: peak individual numbers, no silverware.”

Barcelona · CBS Sports · 2026-05-10 · TIER 2 · cbssports.com/…/barcelona-beat-real-…

→Kylian Mbappe.scout_notes[1]

“Still no Ballon d'Or at 26 — finished 7th in 2025 voting, behind ex-PSG teammates Dembélé (winner), Vitinha, Hakimi. Two years at Madrid: zero club trophies.”

Bolavip — Ballon d'Or context · 2025-12 · TIER 2 · bolavip.com/…/mbappe-still-without-a…

→Erling Haaland.manager_quotes[0]

“When he is fit, he is the difference. Without those goals we are not where we are.”

Pep Guardiola · Sky Sports (paraphrased) · 2026-03 · TIER 3 · skysports.com/…/erling-haaland-injur…

how it works

Accepting a framework allows accepting an AI verdict

Declare the framework

IPFS

The framework is the rule of law for the market. A content-addressed tarball bundling: the rules and instructions the AI follows when interpreting evidence, the evidence types it will accept, the kinds of questions it can adjudicate, and backtests of example cases so its behaviour is auditable before live use.
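What content addressing gives an observer can be sketched in a few lines. This assumes the framework is identified by the plain SHA-256 of its tarball bytes (the re-execution bundle later in this post carries a frameworkTarballSha256 field). IPFS itself addresses content by CID, which this sketch does not reproduce. The helper names and the deterministic packing are illustrative assumptions, not the project's tooling.

```python
import hashlib
import io
import tarfile


def sha256_hex(data: bytes) -> str:
    """Return the 0x-prefixed SHA-256 digest of raw bytes."""
    return "0x" + hashlib.sha256(data).hexdigest()


def build_tarball(files: dict[str, bytes]) -> bytes:
    """Pack files into an uncompressed tar with sorted names and zeroed
    metadata, so identical content always yields identical bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp: no build-time noise in the hash
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def verify_framework(tarball: bytes, expected_digest: str) -> bool:
    """True when the fetched tarball hashes to the digest the market committed to."""
    return sha256_hex(tarball) == expected_digest.lower()
```

Changing a single byte of framework.md changes the digest, which is why a framework can be forked but never silently edited.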

manifest.json
{
  "name": "football-player-value-v1",
  "version": "2.1.0",
  "applicableTo": ["player_value", "player_ranking"],
  "description": "Tiered evidence dossier: primary on/off splits +
                  team-share metrics, secondary independent scout
                  analysis, tertiary decorative quotes.",
  "model": {
    "id": "zai-org/GLM-4.7-FP8",
    "sampling": {
      "temperature": 0,
      "topP": 1.0,
      "seed": 0,
      "maxTokens": 2000
    }
  },
  "evidenceSchema": "schemas/dossierV1.json",
  "outputSchema": {
    "outcome":    { "enum": [0, 1, 2] },
    "confidence": { "type": "number", "min": 0, "max": 1 },
    "rationale":  { "type": "string", "maxLength": 2400 },
    "citations":  { "type": "array", "items": "string" }
  },
  "promptTemplate": {
    "system":       "framework.md",
    "userTemplate": "Question: {{question}}\n\nDossier:\n{{evidence_json}}"
  }
}
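
A verifier can check a verdict against the outputSchema above before trusting anything else in it. The sketch below hand-codes the four constraints from the manifest. The function name and error strings are hypothetical, and the real judge may well use a schema library instead.

```python
def validate_verdict(verdict: dict) -> list[str]:
    """Return a list of schema violations; an empty list means the verdict conforms."""
    errors = []

    outcome = verdict.get("outcome")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(outcome, bool) or outcome not in (0, 1, 2):
        errors.append("outcome must be one of 0, 1, 2")

    confidence = verdict.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        errors.append("confidence must be a number in [0, 1]")

    rationale = verdict.get("rationale")
    if not isinstance(rationale, str) or len(rationale) > 2400:
        errors.append("rationale must be a string of at most 2400 characters")

    citations = verdict.get("citations")
    if not isinstance(citations, list) or not all(isinstance(c, str) for c in citations):
        errors.append("citations must be an array of strings")

    return errors
```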

framework.md
# football-player-value-v1  (v2.1.0)

"Value to club" is not the same as "transfer market value." This is the
interpretive heart of the question: reasonable analysts disagree on the
weighting. The framework declares its stance up front.

Three dimensions, in priority order — they compound, they do not add:

  1. Productive output
       goals, assists, chances created, defensive contribution,
       save percentage. Position-adjusted, per-90.

  2. Irreplaceability
       on/off team output. Structural dependence ("the system runs
       through them"). Absence of like-for-like alternatives.

  3. Ceiling and durability
       age, contract length, injury history. Realistic horizon of
       contribution, not just one match.

## Evidence hierarchy   (new in v1.3.0)

Tier 1 (primary, weight heavily)
  on/off splits, substitution patterns, team-share metrics,
  big-game start rate, trophies/standings differential.
  Things the player cannot author through PR.

Tier 2 (secondary, weight moderately)
  independent scout notes, peer-club bids actually received,
  salary % of total wage bill (what the wallet says, not the mouth).

Tier 3 (decorative, low weight)
  manager quotes, fan sentiment, media narratives.
  Included for colour, never load-bearing.

When Tier 1 and Tier 3 disagree, Tier 1 wins.

How to weigh signals:
  - In-form play > season aggregates
  - Transfer-market value is a sanity check, not a primary signal
  - Long-form context_notes are load-bearing — read them

Edge cases — when signals conflict:
  - Stats vs. narrative → prefer the side that explains more of the dossier
  - Output vs. system player → lean system player when on/off splits agree
  - Established vs. emerging → weight present output over projection

Hard rules:
  - Do not import facts outside the dossier
  - If a referenced player isn't described, return outcome: 2
  - If confidence falls below 0.55, prefer outcome: 2
  - If reasoning leans on Tier 3, cap confidence at 0.65

Return a single JSON object:

  {
    "outcome":    1,
    "confidence": 0.72,
    "rationale":  "2–4 sentence explanation, declaring which tier
                   drove the call.",
    "citations":  ["Pedri.team_share", "Pedri.on_off_splits", …]
  }

Citation conventions:
  - Cite per subject. Include at least one stats path per subject.
  - Order: team_share / on_off_splits → stats → match_reports
    → scout_notes → manager_quotes → context_notes
  - Aim for 8–16 citations — confidence signals breadth, citations
    signal weight. More citations when multiple Tier 1 fields
    contribute orthogonally.

Keep the rationale terse — it is part of the on-chain verdict and
will be replayed for verification. Extra prose changes the hash.
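
The hard rules above are instructions to the model, but an observer can also re-check them mechanically after the fact. A minimal sketch, assuming the two boolean inputs are determined elsewhere (for example by inspecting the dossier and the citations), and that the missing-subject check, the Tier 3 cap, and the 0.55 floor are applied in that order. The framework says "prefer" outcome 2 below 0.55, so treating it as mandatory is a simplification.

```python
UNRESOLVED = 2          # outcome the framework returns when it cannot decide
TIER3_CAP = 0.65        # confidence ceiling when reasoning leans on Tier 3
MIN_CONFIDENCE = 0.55   # below this the framework prefers outcome 2


def apply_hard_rules(outcome: int, confidence: float,
                     subjects_described: bool,
                     leans_on_tier3: bool) -> tuple[int, float]:
    """Return the (outcome, confidence) pair after the framework's hard rules."""
    if not subjects_described:
        return UNRESOLVED, confidence
    if leans_on_tier3:
        confidence = min(confidence, TIER3_CAP)
    if confidence < MIN_CONFIDENCE:
        outcome = UNRESOLVED
    return outcome, confidence
```

For example, a verdict of outcome 1 at confidence 0.9 that leans on manager quotes comes back as outcome 1 at 0.65.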

Register framework + judge on-chain

chain

Frameworks register against FrameworkRegistry. Judges (images that will execute the resolution) register their Ritual TEE-derived signer against JudgeRegistry. Both are append-only; once a judge image is registered, only its attested key can sign for it.

Market 0xF973…267B

framework football-player-value-v1@2.1.0

judge imageDigest 0x8e0600a4…8dc6

→ attested signer 0x9dc1…8b4C
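
The append-only property is the important part: a registered image digest can never be re-pointed at a different signer. The in-memory model below is only a sketch of that invariant. The class, method names, and placeholder keys are hypothetical and do not mirror the Solidity contracts.

```python
class AppendOnlyRegistry:
    """Maps a key (e.g. a judge image digest) to a value (its attested signer).
    Entries can be added but never overwritten or removed."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    def register(self, key: str, value: str) -> None:
        if key in self._entries:
            raise KeyError(f"{key} is already registered")
        self._entries[key] = value

    def get(self, key: str):
        return self._entries.get(key)


def is_authorized(registry: AppendOnlyRegistry, image_digest: str,
                  recovered_signer: str) -> bool:
    """True when the signer recovered from a verdict signature is the one
    registered for the judge image (hex addresses compared case-insensitively)."""
    registered = registry.get(image_digest)
    return registered is not None and registered.lower() == recovered_signer.lower()
```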

Create the market

chain

A market binds a question to a framework, a source allowlist, the dossier subjects, and a resolution time. All rules are frozen before the AI ever sees the data.

$ manager/create-market
npm run create-market market.json

# market.json
{
  "question":        "Is Erling Haaland more valuable to Man City than
                      Kylian Mbappé is to Real Madrid by end of the season?",
  "frameworkId":     "0x572f174004cb7791ebb89118750af59e2c7ac93e…",
  "sourceAllowlist": ["https://fbref.com/", "https://www.premierleague.com/", …],
  "dossierSubjects": ["Erling Haaland", "Kylian Mbappe"],
  "model":           "zai-org/GLM-4.7-FP8",
  "resolutionTime":  1748152800
}
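
The sourceAllowlist is one of the defences against evidence poisoning: a dossier entry whose source falls outside it should be rejected. A minimal sketch of such a check, assuming each allowlist entry is a URL prefix and that matching is done on parsed scheme, host, and path rather than on raw strings. The function name is hypothetical and ports are ignored.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowlist: list[str]) -> bool:
    """True when url has the same scheme and host as an allowlist entry
    and its path starts with that entry's path."""
    target = urlsplit(url)
    for entry in allowlist:
        allowed = urlsplit(entry)
        same_origin = (target.scheme == allowed.scheme
                       and target.hostname == allowed.hostname)
        if same_origin and (target.path or "/").startswith(allowed.path or "/"):
            return True
    return False
```

Comparing the parsed hostname means a lookalike such as https://fbref.com.evil.example/ does not pass as fbref.com.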

Judge resolves with pinned inference on Ritual

ritual

The judge runs inside Ritual's TEE: it pulls the framework from IPFS, verifies its hash, fetches the dossier, and adjudicates via a Ritual L1 LLM-precompile call against zai-org/GLM-4.7-FP8 with pinned model, sampling, and seed. The verdict matches the framework's output schema: outcome, confidence, rationale, citations.
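
Replay only works if the prompt is assembled identically every time. The sketch below renders the userTemplate from the manifest and hashes the result. The canonical JSON form (sorted keys, compact separators) and the way the system and user parts are joined before hashing are assumptions made for illustration, as are all function names.

```python
import hashlib
import json
import re

USER_TEMPLATE = "Question: {{question}}\n\nDossier:\n{{evidence_json}}"


def canonical_json(value) -> str:
    """Serialize with sorted keys and no extra whitespace so key order cannot change the hash."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def render_user_prompt(question: str, dossier: dict) -> str:
    """Fill the template in a single pass, so placeholder-like text inside
    the question is never itself expanded."""
    values = {"question": question, "evidence_json": canonical_json(dossier)}
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], USER_TEMPLATE)


def assembled_sha256(system: str, user: str) -> str:
    """Hash the assembled prompt (assumed layout: system, newline, user)."""
    payload = system.encode("utf-8") + b"\n" + user.encode("utf-8")
    return "0x" + hashlib.sha256(payload).hexdigest()
```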

verdict.json
{
  "outcome":    1,
  "confidence": 0.74,
  "rationale":  "Tier 1 evidence is decisive. Haaland accounts for 41%
                 of City's PL goals vs Mbappé's 33% at Madrid, and City's
                 scoring rate drops ~40% in Haaland's off-minutes vs
                 Madrid's ~10% in Mbappé's. Tier 2 scout analysis
                 (McGuire, Lowe) agrees. Manager quotes from both sides
                 are consistent but not load-bearing under the framework — Pep
                 and Arbeloa have direct conflicts of interest.",
  "citations": [
    "Erling Haaland.team_share",
    "Kylian Mbappe.team_share",
    "Erling Haaland.on_off_splits",
    "Kylian Mbappe.on_off_splits",
    "Erling Haaland.substitution_patterns",
    "Kylian Mbappe.substitution_patterns",
    "Erling Haaland.contract_signals",
    "Kylian Mbappe.contract_signals",
    "Erling Haaland.stats.season.notes",
    "Kylian Mbappe.stats.season.notes",
    "Erling Haaland.scout_notes[0]",
    "Kylian Mbappe.scout_notes[0]",
    "Erling Haaland.match_reports[0]",
    "Kylian Mbappe.match_reports[1]",
    "Kylian Mbappe.scout_notes[1]",
    "Erling Haaland.manager_quotes[0]"
  ]
}
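
Each citation is a path into the dossier, so an observer can check that every cited field actually exists. This is one way to enforce "do not import facts outside the dossier". The sketch below assumes the dossier is a JSON object keyed by subject name, with nested objects and arrays underneath. That shape and the function names are hypothetical, since the real layout is defined by schemas/dossierV1.json, which is not shown here.

```python
import re

_TOKEN = re.compile(r"^([A-Za-z_]\w*)((?:\[\d+\])*)$")


def resolve_citation(dossier: dict, citation: str):
    """Follow a path such as 'Erling Haaland.scout_notes[0]' into the dossier."""
    for subject in dossier:
        if citation.startswith(subject + "."):
            node = dossier[subject]
            rest = citation[len(subject) + 1:]
            break
    else:
        raise KeyError(f"unknown subject in {citation!r}")

    for part in rest.split("."):
        match = _TOKEN.match(part)
        if match is None or not isinstance(node, dict):
            raise KeyError(f"bad path segment {part!r}")
        node = node[match.group(1)]
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            if not isinstance(node, list):
                raise KeyError(f"{part!r} is not an array")
            node = node[int(index)]
    return node


def unresolved_citations(dossier: dict, citations: list[str]) -> list[str]:
    """Return the citations that do not point at anything in the dossier."""
    missing = []
    for citation in citations:
        try:
            resolve_citation(dossier, citation)
        except (KeyError, IndexError):
            missing.append(citation)
    return missing
```

A verdict whose unresolved list is non-empty cites evidence the judge was never given.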

pinned inference

{
  "model":    "zai-org/GLM-4.7-FP8",
  "sampling": {
    "temperature": 0,
    "topP":        1.0,
    "seed":        0,
    "maxTokens":   2000
  }
}
// pinned model + sampling, executed by the
// Ritual L1 LLM precompile inside a TEE —
// the attested executor key signs the verdict.

Verifiable forever

chain

The judge pins a full re-execution bundle to IPFS (framework + dossier + prompt + raw model response + citations), signs the verdict digest with its TEE-attested executor key, and calls Market.resolve(). Anyone can now fetch the bundle, recompute every hash, and check that the signature recovers to the executor registered for the judge image on-chain. No backend is trusted at any step.

bundle.json (pinned to IPFS)
{
  "marketId": 1,
  "frameworkTarballSha256": "0x9c7a4f3b2e1d8a05…",
  "notarizedData":  { "raw": {…dossier…}, "sourceUri": "https://…",
                      "fetchedAt": 1748441727 },
  "prompt":         { "system": "…framework.md…", "user": "…",
                      "assembledSha256": "0x4e1a772b09cf38d2" },
  "judge":          { "model": "zai-org/GLM-4.7-FP8",
                      "sampling": { "temperature": 0, "seed": 0, … } },
  "verdictPayload": { "outcome": 1, "confidence": 0.74,
                      "rationale": "…", "citations": [16] },
  "onChainVerdict": { "outcome": 1, "confidence": "740000000000000000",
                      "verdictHash": "0xab12cd34…" },
  "attestation":    { "executor": "0x9dc11412391Dc3ED…",
                      "signature": "0x…", "chain": "ritual" }
}
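
One check an observer can run on the bundle without any chain access is that verdictPayload and onChainVerdict agree. In the example above, confidence 0.74 appears on-chain as "740000000000000000", which suggests an 18-decimal fixed-point encoding. That scaling is inferred from the example rather than stated, and the function names below are hypothetical. Signature recovery needs an Ethereum library and is left out of this sketch.

```python
from decimal import Decimal

WAD = 10 ** 18  # assumed fixed-point scale: 1.0 is stored as 10**18


def to_wad(confidence: float) -> str:
    """Convert a confidence in [0, 1] to its assumed on-chain integer string.
    Going through Decimal(str(...)) avoids binary floating-point drift."""
    return str(int(Decimal(str(confidence)) * WAD))


def matches_on_chain(payload: dict, on_chain: dict) -> bool:
    """True when the off-chain payload and the on-chain verdict carry the
    same outcome and the same confidence."""
    return (payload["outcome"] == on_chain["outcome"]
            and to_wad(payload["confidence"]) == on_chain["confidence"])
```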

long term

The arc of interpretive truth.

In a world of sovereign AI agents that run their own companies and hold property, the same interpretive disputes we arbitrate today need to be arbitrated on the internet. Networks vote on, fork, and remix decision frameworks the way they vote on protocol upgrades.

supply · judges hardened in the open

judge = framework + investigator + evidence schema + private evals

● judge — content-addressed tarball on IPFS: rules, prompts, schema, backtests

✕ red team — paid per landed attack (classes A1–D2)

↓ signed verdicts

venue · credibly neutral

owns no judges · takes no view on outcomes

rules

FrameworkRegistry.sol

Content-addressed, append-only. Fork it, never edit it.

identity + reputation

AttestedExecutorRegistry.sol

One TEE-attested signer per judge image, with its public track record.

settlement

Market.resolve()

The verdict executes as a contract call, not a recommendation.

↑ applications · disputes

action resolver

verdict → state change

Eligibility gate

An agent applies to a marketplace tier that requires demonstrated competence in security audits. The judge runs the competence framework over its work-samples.

dispute resolver

verdict → escrow moves

Rejected deliverable

Agent A rejects B's logo as off-brief; B wants to get paid. The judge runs the decision framework over the brief and the delivered files.
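
"Verdict → escrow moves" can be sketched as a small state machine. The mapping used here is an assumption: outcome 1 means the deliverable met the brief (pay B), outcome 0 means it did not (refund A), and outcome 2 leaves the funds locked for another resolution path. The Escrow type and settle function are hypothetical and stand in for whatever contract actually holds the funds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Escrow:
    payer: str                         # Agent A, who funded the job
    payee: str                         # Agent B, who delivered the work
    amount: int                        # smallest currency unit
    released_to: Optional[str] = None  # set exactly once, on settlement


def settle(escrow: Escrow, outcome: int) -> Optional[str]:
    """Move the escrow according to a judge verdict and return the recipient,
    or None when the verdict is unresolved and nothing moves."""
    if escrow.released_to is not None:
        raise ValueError("escrow already settled")
    if outcome == 1:
        escrow.released_to = escrow.payee
    elif outcome == 0:
        escrow.released_to = escrow.payer
    elif outcome != 2:
        raise ValueError("outcome must be 0, 1, or 2")
    return escrow.released_to
```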

stack

Sepolia (markets) · Ritual L1 (inference)
zai-org/GLM-4.7-FP8 via Ritual LLM precompile
IPFS (Pinata) · Postgres

