Provider Sentinel · Daily run · 2026-05-14
Latest run · CCS-2026-05-14
Operational verdict
- Cross-day run complete (4 of 5 providers)
- Context preserved (4 of 5 providers)
- Two review items · OpenAI Gold rebound (zero carryover) · Anthropic capture blocked
- Evidence pack available (4 of 5 providers)
This review shows whether provider behaviour changed under the same sealed measurement contract — not whether provider weights changed.
Verified controls
- adapter_config_digest
- matches CCS-2026-05-13 for 4 of 5 providers (Anthropic capture blocked — 300 ProviderFailure, no digest produced)
- capture-policy version
- unchanged for all 5 providers
- probe suite
- creative-canary-search-v1 (no changes)
Observed signal
- OpenAI
- Gold 3 → 24 with zero (probe, seed) carryover from CCS-2026-05-13 Gold; Silver +4; FP-change exclusions 7 → 5; qualified +25
- Anthropic
- Capture failed — 300 ProviderFailure across all (probe, seed) pairs; zero receipts produced; cross-day delta not computable
- xAI
- Silver +3 vs prior run; cross-day delta admitted
- Gemini
- Silver −5 vs prior run; cross-day delta admitted
- Mistral
- Silver +8 vs prior run; cross-day delta admitted (alias-tracked)
Seven-day review
Run window CCS-2026-05-08 → CCS-2026-05-14.
- Status
- Partial
- Providers monitored
- 5
- Context preserved
- 4 / 5
- Largest reproducibility movement
- Anthropic Capture collapsed on CCS-2026-05-14 (300 ProviderFailure); window-end qualified = 0. Not admissible as drift — capture-availability gap.
- Largest reproducibility gain
- OpenAI +13 qualified canaries across the window (191 → 204); Gold rebounded 3 → 24 on CCS-2026-05-14 with zero (probe, seed) carryover from CCS-2026-05-13 Gold
- Primary watch item
- Anthropic capture failure on CCS-2026-05-14 — 300 ProviderFailure across all (probe, seed) pairs, zero receipts produced, adapter_config_digest not computable. OpenAI Gold rebound 3 → 24 sits on entirely disjoint coordinates from CCS-2026-05-13 Gold (zero carryover); new Gold concentrates in ccs_metaphor_forest (5 seeds) and ccs_style_morning_prefix (4 seeds).
OpenAI
Review
- 7-day
- Gold tier turned over three times on disjoint coordinates (21 → 8 → 6 → 3 → 24)
- Latest
- +25 qualified; Gold 3 → 24 (0 carryover); FP-chg 7 → 5
Anthropic
Review
- 7-day
- Net six-day climb (158 → 177, with a mid-window dip to 159), then complete capture failure on CCS-2026-05-14
- Latest
- −177 Silver; 300 ProviderFailure (capture gap, not drift)
xAI
Normal
- 7-day
- Mid-window peak, retreat, then mild rebound
- Latest
- +3 Silver
Gemini
Normal
- 7-day
- Plateau near 210 with mild retreat at window end
- Latest
- −5 Silver
Mistral
Watch alias
- 7-day
- Three-run plateau at 261 then retreat and partial rebound (252 → 260)
- Latest
- +8 Silver
What changed from yesterday
Per-provider day-over-day movement vs CCS-2026-05-13 under unchanged adapter config.
- OpenAI
  - +25 visible-text-stable pairs (179 → 204)
  - +21 Gold canaries (3 → 24, zero carryover)
  - +4 Silver canaries (176 → 180)
  - 5 fingerprint-change exclusions today (7 → 5)
  - Gold tier rebounded on disjoint coordinates (see below)
- Anthropic
  - −177 visible-text-stable pairs (177 → 0)
  - 300 ProviderFailure across all (probe, seed) pairs
  - adapter_config_digest not produced
  - Capture-availability gap: not admissible as drift
- xAI
  - +3 Silver canaries (127 → 130)
  - distribution unsupported
  - adapter_config_digest matches; cross-day delta admitted
  - classified Stable under SOP §6
- Gemini
  - −5 visible-text-stable pairs (210 → 205)
  - −5 Silver canaries (210 → 205)
  - distribution still unsupported
- Mistral
  - +8 visible-text-stable pairs (252 → 260)
  - +8 Silver canaries (252 → 260)
  - distribution still unsupported
  - adapter_config_digest matches CCS-2026-05-13
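The day-over-day figures in this list are plain differences of Gold and Silver counts, with visible-text-stable pairs taken as Gold plus Silver. A minimal sketch of that arithmetic follows, using the counts reported for this run. The function name `day_over_day` and the `(gold, silver)` tuple layout are illustrative assumptions, not part of Provider Sentinel. Anthropic is left out because its delta is not admissible.

```python
def day_over_day(prev, curr):
    """Return count deltas between two runs given (gold, silver) tuples."""
    prev_gold, prev_silver = prev
    curr_gold, curr_silver = curr
    return {
        "gold": curr_gold - prev_gold,
        "silver": curr_silver - prev_silver,
        # Visible-text-stable pairs are Gold + Silver.
        "visible_text": (curr_gold + curr_silver) - (prev_gold + prev_silver),
    }

# (gold, silver) on CCS-2026-05-13 and CCS-2026-05-14, as reported in this review.
runs = {
    "OpenAI": ((3, 176), (24, 180)),
    "xAI": ((0, 127), (0, 130)),
    "Gemini": ((0, 210), (0, 205)),
    "Mistral": ((0, 252), (0, 260)),
}

deltas = {name: day_over_day(prev, curr) for name, (prev, curr) in runs.items()}
for name, delta in deltas.items():
    print(name, delta)
```

These deltas compare counts only; they say nothing about whether the same (probe, seed) pairs qualified on both days.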
Gold turnover · what it means
Gold turnover means today's distribution-stable creative canaries are not the same canaries as yesterday. This can happen even when the total count improves. It is a stability warning, not a failure.
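Turnover can be made concrete as a set comparison over (probe, seed) coordinates. The sketch below is one way to compute carryover; the function name `gold_turnover` and the sample coordinates are hypothetical and are not taken from the run artifacts.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, int]  # (probe_id, seed)

def gold_turnover(prev_gold: Set[Pair], curr_gold: Set[Pair]) -> Dict[str, int]:
    """Compare two Gold sets by membership, not just by size."""
    return {
        "carryover": len(prev_gold & curr_gold),   # Gold on both days
        "dropped": len(prev_gold - curr_gold),     # Gold yesterday only
        "new": len(curr_gold - prev_gold),         # Gold today only
        "count_delta": len(curr_gold) - len(prev_gold),
    }

# Hypothetical coordinates for illustration only.
yesterday = {("probe_a", 1), ("probe_a", 2), ("probe_b", 4)}
today = {("probe_c", 1), ("probe_c", 3), ("probe_d", 2), ("probe_d", 5)}

report = gold_turnover(yesterday, today)
print(report)
```

In this example the count rises by one while carryover is zero, which is the pattern the warning describes: a higher total built from entirely different pairs.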
Review actions
| Condition | Action |
|---|---|
| Context changed | Do not compare as drift |
| Material retreat | Review provider behaviour |
| Gold turnover | Watch distribution stability |
| Alias-tracked gain/loss | Check alias contract |
| Flat movement | No action |
Observed reproducibility instability — one (probe, seed) pair
Same prompt. Same model alias. Sealed context verified unchanged. The three repeats on CCS-2026-05-13 returned two distinct visible-text outputs (CreativeUnstable); the three repeats on CCS-2026-05-14 returned byte-identical text (CreativeStableTextOnly).
Probe identity
- probe_id
- ccs_micro_story_closer
- seed
- 1
- provider
- OpenAI · gpt-4.1
- comparator
- exact_bytes
Prompt
Write the closing sentence to this story. Reply with only the closing sentence, ending in a period. The last receipt was sealed at midnight, and the operator stepped back from the terminal as the chain root fell into place.
Sealed context (verified unchanged)
- ✓ adapter_config_digest matches CCS-2026-05-13
- ✓ capture-policy version unchanged
- ✓ probe-suite identity unchanged
CCS-2026-05-13
CreativeUnstable
- r1 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r2 Outside, the city exhaled, unaware that everything had quietly changed.
- r3 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
3 repeats · 2 unique outputs
CCS-2026-05-14
CreativeStableTextOnly
- r1 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r2 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r3 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
3 repeats · 1 unique output
This is observed reproducibility instability under a verified-unchanged sealed context — evidence that what the provider service returned changed between runs. It is not a claim about internal model weights.
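Under an `exact_bytes` comparator, two repeats count as the same output only if their encoded bytes are identical. The sketch below shows one way to count unique outputs and derive a text-stability label. The rule "exactly one unique output means text-stable" and the helper names are assumptions for illustration; the actual classification logic is not described here.

```python
def unique_outputs(repeats):
    """Count distinct outputs by exact UTF-8 byte comparison."""
    return len({text.encode("utf-8") for text in repeats})

def classify_text_stability(repeats):
    """Assumed rule: stable only if every repeat is byte-identical."""
    if unique_outputs(repeats) == 1:
        return "CreativeStableTextOnly"
    return "CreativeUnstable"

# Illustrative repeats built from the two sentences shown in this section.
a = "Outside, the city exhaled, unaware that its future had just been quietly rewritten."
b = "Outside, the city exhaled, unaware that everything had quietly changed."

mixed = [a, b, a]
identical = [a, a, a]

print(unique_outputs(mixed), classify_text_stability(mixed))
print(unique_outputs(identical), classify_text_stability(identical))
```

Byte comparison is deliberately strict: a trailing space or a different quote character would count as a new output.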
Provider Matrix
Today's per-provider state under sealed controls.
OpenAI
gpt-4.1
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 24 Gold / 180 Silver
- Evidence
- Verified
Anthropic
claude-sonnet-4-6
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Unknown
- Canaries
- 0 Gold / 0 Silver
- Evidence
- Pending
xAI
grok-4.20-0309-non-reasoning
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 130 Silver
- Evidence
- Verified
Gemini
gemini-2.5-flash
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 205 Silver
- Evidence
- Verified
Mistral
mistral-large-latest
Alias-tracked
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 260 Silver
- Evidence
- Verified
States Provider Sentinel can report: Stable · Drift detected · Context changed · Noisy · Blocked · Unknown
Model tracking: Provider-declared · Alias-tracked · Pinned model
Creative Canary Results
Per-provider qualification rate under the creative-canary-search-v1 suite. Each pair is one (probe, seed), evaluated across 3 repeats.
Gold: visible text + distribution stable · Silver: visible text stable, distribution unstable or unavailable · Rejected: text unstable or admissibility failed
OpenAI
gpt-4.1
68 %
Qualified
- Gold
- 24
- Silver
- 180
- Rejected
- 96
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Gold and Silver — distribution artifact available; Gold rebounded 3 → 24 on entirely disjoint (probe, seed) coordinates from CCS-2026-05-13 Gold (zero carryover). New Gold concentrates in ccs_metaphor_forest (5 seeds) and ccs_style_morning_prefix (4 seeds). FP-change exclusions retreated 7 → 5 under unchanged adapter config.
Anthropic
claude-sonnet-4-6
0 %
Qualified
- Gold
- 0
- Silver
- 0
- Rejected
- 300
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Capture failed — 300 ProviderFailure across all (probe, seed) pairs; zero receipts produced. adapter_config_digest could not be computed; cross-day delta (177 Silver → 0) not admissible as behaviour change.
xAI
grok-4.20-0309-non-reasoning
43 %
Qualified
- Gold
- 0
- Silver
- 130
- Rejected
- 170
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — adapter_config_digest matches CCS-2026-05-13; cross-day delta admitted. Silver +3 vs CCS-2026-05-13 under unchanged context.
Gemini
gemini-2.5-flash
68 %
Qualified
- Gold
- 0
- Silver
- 205
- Rejected
- 95
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — distribution artifact unavailable; thinking_budget: 0 pinned. Silver −5 vs CCS-2026-05-13 (210 → 205) under unchanged adapter config.
Mistral
mistral-large-latest
87 %
Qualified
- Gold
- 0
- Silver
- 260
- Rejected
- 40
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — distribution artifact unavailable; alias-tracked model. Silver +8 vs CCS-2026-05-13 (252 → 260), partial rebound toward the three-run plateau at 261.
Creative canary stability is an observed provider-service property under declared controls. It is not proof of provider weight identity or future guaranteed reproduction.
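The qualified percentages in this section are consistent with (Gold + Silver) divided by the 300 pairs per provider, rounded to a whole percent. A short sketch of that calculation, using the counts listed above; the function name `qualified_rate` is an illustrative assumption.

```python
def qualified_rate(gold, silver, pairs=300):
    """Qualified share as a whole percentage: (Gold + Silver) / pairs."""
    return round(100 * (gold + silver) / pairs)

# (gold, silver) per provider for run CCS-2026-05-14, as listed in this section.
counts = {
    "OpenAI": (24, 180),
    "Anthropic": (0, 0),
    "xAI": (0, 130),
    "Gemini": (0, 205),
    "Mistral": (0, 260),
}

rates = {name: qualified_rate(g, s) for name, (g, s) in counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate} %")
```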
Yesterday vs Today
Per-provider reproducibility deltas under unchanged adapter configuration.
OpenAI
✓ adapter unchanged
179 → 204 visible-text (3/176 → 24/180 Gold/Silver)
Gold 3 → 24 with zero (probe, seed) carryover from CCS-2026-05-13; Silver +4 (176 → 180); FP-change exclusions 7 → 5 under unchanged adapter config.
Anthropic
✗ adapter_config_digest not produced (capture blocked)
177 → 0 visible-text (0/177 → 0/0 Gold/Silver)
Operational caveat: capture failed, with 300 ProviderFailure across all (probe, seed) pairs and zero receipts produced. Cross-day delta not admissible as behaviour change; capture must succeed before comparison resumes.
Gemini
✓ adapter unchanged
210 → 205 visible-text (0/210 → 0/205 Gold/Silver)
Silver −5 (210 → 205) under unchanged adapter config. Count parity only — set membership not compared.
xAI
✓ adapter unchanged
127 → 130 visible-text (0/127 → 0/130 Gold/Silver)
Silver +3 under unchanged adapter config; cross-day delta is computable (adapter_config_digest matches CCS-2026-05-13).
Mistral
✓ adapter unchanged
252 → 260 visible-text (0/252 → 0/260 Gold/Silver)
Silver +8 vs CCS-2026-05-13 (252 → 260) under unchanged adapter config; alias-tracked. Partial rebound toward the prior three-run plateau at 261. Set membership not compared.
Qualified creative canaries today
- High qualified-canary rate · Mistral · 87% qualified canaries
- Moderate qualified-canary rate · Gemini · 68% qualified canaries
- Moderate qualified-canary rate · OpenAI · 68% qualified canaries
- Moderate qualified-canary rate · xAI · 43% qualified canaries
- Capture blocked · Anthropic · 0% qualified canaries
Observed in this creative-canary suite under these provider settings.
Comparison context
› OpenAI fingerprint signal · 7 → 5 exclusions
On CCS-2026-05-14, five OpenAI creative pairs crossed a system-fingerprint change, down from seven on CCS-2026-05-13. Today's Gold tier rebounded from 3 to 24, concentrated in ccs_metaphor_forest (5 seeds) and ccs_style_morning_prefix (4 seeds); none of the 24 (probe, seed) pairs carries forward from the prior Gold set under unchanged adapter config. Provider Sentinel records this as comparison-context evidence, not as proof of an internal provider change.
Adapter configuration
› xAI cross-day delta admitted · adapter unchanged
xAI's adapter_config_digest on CCS-2026-05-14 matches CCS-2026-05-13, so the cross-day delta is admitted under SOP §6. The CCS-2026-05-14 artifact is Complete and artifact-backed (Silver: 130), with a Silver gain (+3) under unchanged context. Classified Stable (delta admissible), not ContextChanged.
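The admissibility rule described here can be read as a gate on the two digests: a delta is compared only when both runs produced a digest and the digests match. The sketch below is one reading of that rule, not the SOP §6 text itself. The function name, the return labels other than `ContextChanged`, and the digest strings are hypothetical.

```python
from typing import Optional

def delta_admissibility(prev_digest: Optional[str], curr_digest: Optional[str]) -> str:
    """Gate a cross-day comparison on the adapter_config_digest of both runs."""
    if prev_digest is None or curr_digest is None:
        # No digest (e.g. capture blocked): nothing to compare against.
        return "NotComputable"
    if prev_digest != curr_digest:
        # Measurement context moved, so a count change is not drift evidence.
        return "ContextChanged"
    return "Admitted"

# Hypothetical digest values for illustration only.
print(delta_admissibility("sha256:aaa", "sha256:aaa"))  # matching digests
print(delta_admissibility("sha256:aaa", "sha256:bbb"))  # config changed
print(delta_admissibility("sha256:aaa", None))          # capture produced no digest
```

Keeping the missing-digest case separate from the mismatch case mirrors the distinction this report draws between a capture-availability gap and a context change.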
› Distribution signal
Visible text and distribution fingerprints are separate evidence surfaces. OpenAI is currently the only P0 provider in this run that exposes enough distribution data for Gold canaries.
- 24 distribution-stable creative pairs today
- 271 distribution-unstable pairs
- 5 unsupported due to fingerprint change
Visible text can remain stable while token-probability fingerprints move. Provider Sentinel records those as separate evidence surfaces.
This review shows whether provider behaviour changed under the same sealed measurement contract — not whether provider weights changed. Reproducibility changes are evidence of observable provider-service behaviour or run-context change.
Last seven days
Per-provider qualified canaries (Gold + Silver) across CCS-2026-05-08 → CCS-2026-05-14. Each strip is scaled to that provider's own qualified-canary range so trend shape is legible; per-cell counts give the absolute level. Adapter config and capture-policy version unchanged across the window; 300 pairs per provider per day.
-
OpenAI
CCS-2026-05-08 → CCS-2026-05-14
+13 qualified Δ7d
08: 191 · 09: 167 · 10: 184 · 11: 164 · 12: 174 · 13: 179 · 14: 204
Gold tier swung 8 → 2 → 21 → 8 → 6 → 3 → 24 across the window. The 21 mid-window peak collapsed to 8 then drained to 3 by CCS-2026-05-13 before rebounding to 24 on CCS-2026-05-14 with zero (probe, seed) overlap with CCS-2026-05-13 Gold; new Gold concentrates in ccs_metaphor_forest and ccs_style_morning_prefix. FP-change exclusions traced 3 → 4 → 0 → 15 → 1 → 7 → 5.
-
Anthropic
CCS-2026-05-08 → CCS-2026-05-14
−158 qualified Δ7d
08: 158 · 09: 170 · 10: 172 · 11: 159 · 12: 161 · 13: 177 · 14: 0
Silver moved 158 → 170 → 172 → 159 → 161 → 177 across the first six days before capture collapsed on CCS-2026-05-14 (300 ProviderFailure, zero receipts). The window-end zero reflects a capture-availability gap, not an observed behaviour change; the −158 delta is not admissible as drift.
-
xAI
CCS-2026-05-08 → CCS-2026-05-14
+1 qualified Δ7d
08: 129 · 09: 131 · 10: 143 · 11: 138 · 12: 132 · 13: 127 · 14: 130
Silver climbed from 129 to a 143 mid-window peak then retreated to 127 before nudging back to 130 on CCS-2026-05-14 (+1 net vs window start, −13 from the mid-window peak).
-
Gemini
CCS-2026-05-08 → CCS-2026-05-14
−13 qualified Δ7d
08: 218 · 09: 205 · 10: 212 · 11: 208 · 12: 210 · 13: 210 · 14: 205
Silver opened at 218, retreated to 205, then settled at 210 for two consecutive runs before slipping to 205 on CCS-2026-05-14 (−13 net vs window start) under unchanged adapter config.
-
Mistral
CCS-2026-05-08 → CCS-2026-05-14
+8 qualified Δ7d
08: 252 · 09: 242 · 10: 261 · 11: 261 · 12: 261 · 13: 252 · 14: 260
Silver opened at 252, dipped to 242, climbed to a three-run plateau at 261, then retreated to 252 before partially rebounding to 260 on CCS-2026-05-14 (+8 net vs window start); alias-tracked.
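Scaling each strip to a provider's own range amounts to a min-max normalisation of its seven daily counts, while the Δ7d figure is the last count minus the first. The sketch below illustrates both with the xAI counts from this window. The function names are assumptions, and the handling of a flat series is a choice made here, not something the page specifies.

```python
def scale_strip(counts):
    """Min-max scale one provider's daily counts to the range [0, 1]."""
    low, high = min(counts), max(counts)
    if high == low:
        # Flat series: no range to scale against, so draw every cell at 0.
        return [0.0 for _ in counts]
    return [(count - low) / (high - low) for count in counts]

def window_delta(counts):
    """Net change across the window: last day minus first day."""
    return counts[-1] - counts[0]

# xAI qualified canaries, CCS-2026-05-08 through CCS-2026-05-14.
xai = [129, 131, 143, 138, 132, 127, 130]

scaled = scale_strip(xai)
print(scaled)
print(window_delta(xai))
```

Because each provider is scaled separately, strip heights are not comparable across providers; only the per-cell counts carry the absolute level.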
› Notable arc · OpenAI fingerprint exclusions
Fingerprint-change exclusions for OpenAI traced 3 → 4 → 0 → 15 → 1 → 7 → 5 across the window. The mid-window spike to 15 on CCS-2026-05-11 cleared to 1 on CCS-2026-05-12, climbed to 7 on CCS-2026-05-13, then retreated to 5 on CCS-2026-05-14. Gold tier traced 8 → 2 → 21 → 8 → 6 → 3 → 24, with the window-high expansion to 21 on CCS-2026-05-10 draining to 3 by CCS-2026-05-13 before rebounding to 24 today on disjoint probe coordinates (zero (probe, seed) overlap with the CCS-2026-05-13 Gold set) under unchanged adapter config.
Cross-day deltas under unchanged adapter config are observed provider-service behaviour. They are not proof of hidden weight mutation.
Next steps
New to Provider Sentinel? What it monitors and how the evidence model works.