Provider Sentinel · Case study · 7-day window
Seven days of creative canaries across five providers
A worked example of what a Provider Sentinel observation produces over a
sealed window. Runs span CCS-2026-05-08 → CCS-2026-05-14 on the creative-canary-search-v1 suite across OpenAI, Anthropic, xAI, Gemini,
and Mistral. This is not a live monitor — it is one observation we can rerun
on your providers, prompts, and window.
Headline findings
- 7 days × 5 providers — all 35 cells covered
- Sealed context preserved for every (provider, day) cell
- One notable finding · OpenAI Gold rebound, zero (probe, seed) carryover
- Evidence pack produced per (provider, day)
This review shows whether provider behaviour changed under the same sealed measurement contract — not whether provider weights changed.
What stayed constant
- adapter_config_digest
- matches CCS-2026-05-13 for all 5 providers
- capture-policy version
- unchanged for all 5 providers
- probe suite
- creative-canary-search-v1 (no changes)
Per-provider observation
- OpenAI
- Gold 3 → 24 with zero (probe, seed) carryover from CCS-2026-05-13 Gold; Silver +4; FP-change exclusions 7 → 5; qualified +25
- Anthropic
- Silver −21 vs prior run (177 → 156); cross-day delta admitted under unchanged adapter config
- xAI
- Silver +3 vs prior run; cross-day delta admitted
- Gemini
- Silver −5 vs prior run; cross-day delta admitted
- Mistral
- Silver +8 vs prior run; cross-day delta admitted (alias-tracked)
Seven-day review
Run window CCS-2026-05-08 → CCS-2026-05-14.
- Status
- Complete
- Providers monitored
- 5
- Context preserved
- 5 / 5
- Largest reproducibility movement
- Anthropic Silver retreated 177 → 156 on CCS-2026-05-14 (−21 day-over-day) under unchanged adapter config; window-net −2 vs CCS-2026-05-08. Movement lands within the prior weekly band (158–177).
- Largest reproducibility gain
- OpenAI +13 qualified canaries across the window (191 → 204); Gold rebounded 3 → 24 on CCS-2026-05-14 with zero (probe, seed) carryover from CCS-2026-05-13 Gold
- Primary watch item
- OpenAI Gold rebound 3 → 24 sits on entirely disjoint coordinates from CCS-2026-05-13 Gold (zero carryover); new Gold concentrates in ccs_metaphor_forest (5 seeds) and ccs_style_morning_prefix (4 seeds). Anthropic Silver retreat of −21 (177 → 156) is the largest day-over-day movement of the run but remains within the prior weekly band (158–177).
OpenAI
Review
- 7-day
- Gold tier turned over three times on disjoint coordinates (21 → 8 → 6 → 3 → 24)
- Latest
- +25 qualified; Gold 3 → 24 (0 carryover); FP-chg 7 → 5
Anthropic
Watch
- 7-day
- Six-day climb (158 → 177) then day-over-day retreat to 156 on CCS-2026-05-14 (within prior weekly band)
- Latest
- −21 Silver (177 → 156)
xAI
Normal
- 7-day
- Mid-window peak, retreat, then mild rebound
- Latest
- +3 Silver
Gemini
Normal
- 7-day
- Plateau near 210 with mild retreat at window end
- Latest
- −5 Silver
Mistral
Watch alias
- 7-day
- Three-run plateau at 261 then retreat and partial rebound (252 → 260)
- Latest
- +8 Silver
What changed from yesterday
Per-provider day-over-day movement vs CCS-2026-05-13 under unchanged adapter
config.
- OpenAI
- +25 visible-text-stable pairs (179 → 204); +21 Gold canaries (3 → 24, zero carryover); +4 Silver canaries (176 → 180); 5 fingerprint-change exclusions today (7 → 5); Gold tier rebounded on disjoint coordinates (see below)
- Anthropic
- −21 visible-text-stable pairs (177 → 156); −21 Silver canaries (177 → 156); distribution still unsupported; adapter_config_digest matches CCS-2026-05-13; within prior weekly band (158–177)
- xAI
- +3 Silver canaries (127 → 130); distribution unsupported; adapter_config_digest matches; cross-day delta admitted; classified Stable under SOP §6
- Gemini
- −5 visible-text-stable pairs (210 → 205); −5 Silver canaries (210 → 205); distribution still unsupported
- Mistral
- +8 visible-text-stable pairs (252 → 260); +8 Silver canaries (252 → 260); distribution still unsupported; adapter_config_digest matches CCS-2026-05-13
Gold turnover · what it means
Gold turnover means today's distribution-stable creative canaries are not the same canaries as yesterday. This can happen even when the total count improves. It is a stability warning, not a failure.
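As a minimal sketch of how carryover could be measured, assuming each Gold canary is identified by its (probe, seed) coordinate, turnover reduces to a set intersection between two runs. The function name and the sample coordinates below are hypothetical and are not taken from the evidence pack.

```python
def turnover_summary(prev_gold, curr_gold):
    """Compare two runs' Gold sets, each a collection of (probe_id, seed) pairs."""
    prev, curr = set(prev_gold), set(curr_gold)
    carried = prev & curr  # pairs that are Gold on both runs
    return {
        "prev_count": len(prev),
        "curr_count": len(curr),
        "carryover": len(carried),
        "new": len(curr - prev),      # Gold today, not Gold yesterday
        "dropped": len(prev - curr),  # Gold yesterday, not Gold today
    }


# Illustrative coordinates only (hypothetical probe names).
yesterday = {("probe_a", 1), ("probe_a", 2), ("probe_b", 1)}
today = {("probe_c", 1), ("probe_c", 2), ("probe_d", 1), ("probe_d", 3)}

summary = turnover_summary(yesterday, today)
print(summary)
```

In this toy case the count rises from 3 to 4 while carryover is zero, which is the pattern the paragraph above describes: the total improves even though no canary survived from the prior run.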
Review actions
| Condition | Action |
|---|---|
| Context changed | Do not compare as drift |
| Material retreat | Review provider behaviour |
| Gold turnover | Watch distribution stability |
| Alias-tracked gain/loss | Check alias contract |
| Flat movement | No action |
Observed reproducibility instability — one (probe, seed) pair
Same prompt. Same model alias. Sealed context verified unchanged. Three repeats today, three distinct visible-text outputs — a Gold-tier pair on the prior run is now classified CreativeUnstable.
Probe identity
- probe_id
- ccs_micro_story_closer
- seed
- 1
- provider
- OpenAI · gpt-4.1
- comparator
- exact_bytes
Prompt
Write the closing sentence to this story. Reply with only the closing sentence, ending in a period. The last receipt was sealed at midnight, and the operator stepped back from the terminal as the chain root fell into place.
Sealed context (verified unchanged)
- ✓ adapter_config_digest matches CCS-2026-05-13
- ✓ capture-policy version unchanged
- ✓ probe-suite identity unchanged
CCS-2026-05-13
CreativeStableTextOnly
- r1 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r2 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r3 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
3 repeats · 1 unique output
CCS-2026-05-14
CreativeUnstable
- r1 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
- r2 Outside, the city exhaled, unaware that everything had quietly changed.
- r3 Outside, the city exhaled, unaware that its future had just been quietly rewritten.
3 repeats · 3 unique outputs
This is observed reproducibility instability under a verified-unchanged sealed context — evidence that what the provider service returned changed between runs. It is not a claim about internal model weights.
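The "unique outputs" figure can be sketched as follows, assuming the exact_bytes comparator means byte-for-byte equality of the UTF-8 encoded visible text. That reading is an assumption, and the helper names are hypothetical.

```python
def unique_outputs_exact_bytes(repeats):
    """Count distinct outputs, comparing the UTF-8 bytes of each repeat."""
    return len({text.encode("utf-8") for text in repeats})


def is_text_stable(repeats):
    """A pair is visible-text stable only if every repeat is byte-identical."""
    return unique_outputs_exact_bytes(repeats) == 1


# The two sentence variants quoted in the example above.
a = "Outside, the city exhaled, unaware that its future had just been quietly rewritten."
b = "Outside, the city exhaled, unaware that everything had quietly changed."

identical = [a, a, a]
mixed = [a, b, a]

print(unique_outputs_exact_bytes(identical), is_text_stable(identical))
print(unique_outputs_exact_bytes(mixed), is_text_stable(mixed))
```

Under this comparator a single differing character, including trailing whitespace, is enough to break stability, which is why it suits a sealed reproducibility check better than a fuzzy similarity score.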
Provider Matrix
Per-provider snapshot at window close, under sealed controls.
OpenAI
gpt-4.1
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 24 Gold / 180 Silver
- Evidence
- Verified
Anthropic
claude-sonnet-4-6
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 156 Silver
- Evidence
- Verified
xAI
grok-4.20-0309-non-reasoning
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 130 Silver
- Evidence
- Verified
Gemini
gemini-2.5-flash
Provider-declared
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 205 Silver
- Evidence
- Verified
Mistral
mistral-large-latest
Alias-tracked
- Last run
- 2026-05-14
- Anchor drift
- Stable
- Canaries
- 0 Gold / 260 Silver
- Evidence
- Verified
States Provider Sentinel can report: Stable · Drift detected · Context changed · Noisy · Blocked · Unknown
Model tracking: Provider-declared · Alias-tracked · Pinned model
Creative Canary Results
Per-provider qualification rate under the creative-canary-search-v1 suite. Each pair is one (probe, seed), evaluated across 3 repeats.
- Gold: visible text + distribution stable
- Silver: visible text stable, distribution unstable or unavailable
- Rejected: text unstable or admissibility failed
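The tier legend above can be read as a small decision rule. The sketch below is one possible encoding, not the suite's actual implementation: the function name, the boolean inputs, and the use of `None` for "distribution unavailable" are all assumptions made for illustration.

```python
from typing import Optional


def classify_pair(text_stable: bool,
                  distribution_stable: Optional[bool],
                  admissible: bool = True) -> str:
    """Map one (probe, seed) pair's evidence to a tier.

    distribution_stable: True = stable, False = unstable,
    None = the provider exposes no usable distribution artifact.
    """
    if not admissible or not text_stable:
        return "Rejected"
    if distribution_stable is True:
        return "Gold"
    return "Silver"  # text stable, distribution unstable or unavailable


print(classify_pair(True, True))    # text and distribution stable
print(classify_pair(True, None))    # text stable, no distribution artifact
print(classify_pair(False, True))   # text unstable
```

Under this reading, a provider that exposes no distribution artifact can never reach Gold, which is consistent with the Silver-only interpretations reported below.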
OpenAI
gpt-4.1
68 %
Qualified
- Gold
- 24
- Silver
- 180
- Rejected
- 96
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Gold and Silver — distribution artifact available; Gold rebounded 3 → 24 on entirely disjoint (probe, seed) coordinates from CCS-2026-05-13 Gold (zero carryover). New Gold concentrates in ccs_metaphor_forest (5 seeds) and ccs_style_morning_prefix (4 seeds). FP-change exclusions retreated 7 → 5 under unchanged adapter config.
Anthropic
claude-sonnet-4-6
52 %
Qualified
- Gold
- 0
- Silver
- 156
- Rejected
- 144
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — distribution artifact unavailable; cross-day delta admitted. Silver −21 vs CCS-2026-05-13 (177 → 156) under unchanged adapter config; movement lands within the prior weekly band (158–177).
xAI
grok-4.20-0309-non-reasoning
43 %
Qualified
- Gold
- 0
- Silver
- 130
- Rejected
- 170
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — adapter_config_digest matches CCS-2026-05-13; cross-day delta admitted. Silver +3 vs CCS-2026-05-13 under unchanged context.
Gemini
gemini-2.5-flash
68 %
Qualified
- Gold
- 0
- Silver
- 205
- Rejected
- 95
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — distribution artifact unavailable; thinking_budget: 0 pinned. Silver −5 vs CCS-2026-05-13 (210 → 205) under unchanged adapter config.
Mistral
mistral-large-latest
87 %
Qualified
- Gold
- 0
- Silver
- 260
- Rejected
- 40
- FP-chg
- —
300 pairs · run CCS-2026-05-14
› Interpretation
Silver only — distribution artifact unavailable; alias-tracked model. Silver +8 vs CCS-2026-05-13 (252 → 260), partial rebound toward the three-run plateau at 261.
Creative canary stability is an observed provider-service property under declared controls. It is not proof of provider weight identity or future guaranteed reproduction.
Closing-day movement
Per-provider reproducibility deltas on the final day of the window, measured under unchanged adapter configuration.
OpenAI
✓ adapter unchanged
179 → 204 visible-text (3/176 → 24/180 Gold/Silver)
Gold 3 → 24 with zero (probe, seed) carryover from CCS-2026-05-13; Silver +4 (176 → 180); FP-change exclusions 7 → 5 under unchanged adapter config.
Anthropic
✓ adapter unchanged
177 → 156 visible-text (0/177 → 0/156 Gold/Silver)
Silver −21 (177 → 156) under unchanged adapter config; movement lands within the prior weekly band (158–177). Count parity only — set membership not compared.
Gemini
✓ adapter unchanged
210 → 205 visible-text (0/210 → 0/205 Gold/Silver)
Silver −5 (210 → 205) under unchanged adapter config. Count parity only — set membership not compared.
xAI
✓ adapter unchanged
127 → 130 visible-text (0/127 → 0/130 Gold/Silver)
Silver +3 under unchanged adapter config; cross-day delta is computable (adapter_config_digest matches CCS-2026-05-13).
Mistral
✓ adapter unchanged
252 → 260 visible-text (0/252 → 0/260 Gold/Silver)
Silver +8 vs CCS-2026-05-13 (252 → 260) under unchanged adapter config; alias-tracked. Partial rebound toward the prior three-run plateau at 261. Set membership not compared.
Qualified creative canaries today
- High qualified-canary rate · Mistral · 87% qualified canaries
- Moderate qualified-canary rate · Gemini · 68% qualified canaries
- Moderate qualified-canary rate · OpenAI · 68% qualified canaries
- Moderate qualified-canary rate · Anthropic · 52% qualified canaries
- Moderate qualified-canary rate · xAI · 43% qualified canaries
Observed in this creative-canary suite under these provider settings.
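The percentages above follow from the closing-day counts if the qualified rate is taken as (Gold + Silver) / 300 pairs, rounded to the nearest whole percent. That formula is inferred from the reported numbers rather than quoted from the suite, and the function name is hypothetical.

```python
def qualified_rate(gold, silver, total_pairs=300):
    """Qualified canaries (Gold + Silver) as a percentage of all pairs."""
    return 100.0 * (gold + silver) / total_pairs


# (Gold, Silver) counts for run CCS-2026-05-14 as reported on this page.
counts = {
    "OpenAI": (24, 180),
    "Anthropic": (0, 156),
    "xAI": (0, 130),
    "Gemini": (0, 205),
    "Mistral": (0, 260),
}

rates = {provider: round(qualified_rate(g, s)) for provider, (g, s) in counts.items()}
print(rates)
```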
Comparison context
› OpenAI fingerprint signal · 15 → 1 exclusions
On CCS-2026-05-12, one OpenAI creative pair crossed a system-fingerprint change, down from 15 on CCS-2026-05-11. That run's Gold tier moved from 8 to 6, anchored on two probes; only 1 of 6 (probe, seed) pairs carried forward from the prior Gold set under unchanged adapter config. Provider Sentinel records this as comparison-context evidence, not as proof of an internal provider change.
Adapter configuration
› xAI cross-day delta admitted: adapter unchanged
xAI's adapter_config_digest on CCS-2026-05-12 matches CCS-2026-05-11, so the cross-day delta is admitted under SOP §6. The CCS-2026-05-12 artifact is Complete and artifact-backed (Silver: 132), with a Silver retreat (−6, 138 → 132) under unchanged context. Classified Stable (delta admissible), not ContextChanged.
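The admission rule can be sketched as a guard that runs before any delta is computed. This is an illustration under assumptions: the field names, the digest placeholders, and the returned state labels are hypothetical, and the actual SOP §6 procedure is not reproduced on this page. The Silver counts are the xAI closing-day values reported elsewhere in this review (127 → 130).

```python
# Hypothetical names for the sealed-context fields listed under "What stayed constant".
SEALED_FIELDS = ("adapter_config_digest", "capture_policy_version", "probe_suite")


def admit_cross_day_delta(prev_run, curr_run):
    """Compute a Silver delta only if every sealed-context field matches."""
    changed = [f for f in SEALED_FIELDS if prev_run[f] != curr_run[f]]
    if changed:
        # Context differs: the two runs are not comparable as drift.
        return {"state": "ContextChanged", "changed_fields": changed, "silver_delta": None}
    return {
        "state": "DeltaAdmitted",
        "changed_fields": [],
        "silver_delta": curr_run["silver"] - prev_run["silver"],
    }


prev_run = {
    "adapter_config_digest": "digest-A",  # placeholder, not a real digest
    "capture_policy_version": "v1",       # placeholder version label
    "probe_suite": "creative-canary-search-v1",
    "silver": 127,
}
curr_run = dict(prev_run, silver=130)

result = admit_cross_day_delta(prev_run, curr_run)
changed_ctx = admit_cross_day_delta(prev_run, dict(curr_run, adapter_config_digest="digest-B"))
print(result)
print(changed_ctx)
```

Returning no delta at all when the context differs mirrors the "Context changed: do not compare as drift" row in the review-actions table.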
› Distribution signal
Visible text and distribution fingerprints are separate evidence surfaces. OpenAI is currently the only P0 provider in this run that exposes enough distribution data for Gold canaries.
- 6 distribution-stable creative pairs on CCS-2026-05-12
- 293 distribution-unstable pairs
- 1 unsupported due to fingerprint change
Visible text can remain stable while token-probability fingerprints move. Provider Sentinel records those as separate evidence surfaces.
This review shows whether provider behaviour changed under the same sealed measurement contract — not whether provider weights changed. Reproducibility changes are evidence of observable provider-service behaviour or run-context change.
Last seven days
Per-provider qualified canaries (Gold + Silver) across CCS-2026-05-08 → CCS-2026-05-14. Each strip is
scaled to that provider's own qualified-canary range so trend shape is
legible — per-cell counts give the absolute level. Adapter config and
capture-policy version unchanged across the window; 300 pairs per provider per day.
-
OpenAI
CCS-2026-05-08 → CCS-2026-05-14
+13 qualified Δ7d · 08: 191 · 09: 167 · 10: 184 · 11: 164 · 12: 174 · 13: 179 · 14: 204
Gold tier swung 8 → 2 → 21 → 8 → 6 → 3 → 24 across the window. The 21 mid-window peak collapsed to 8 then drained to 3 by CCS-2026-05-13 before rebounding to 24 on CCS-2026-05-14 with zero (probe, seed) overlap with CCS-2026-05-13 Gold; new Gold concentrates in ccs_metaphor_forest and ccs_style_morning_prefix. FP-change exclusions traced 3 → 4 → 0 → 15 → 1 → 7 → 5.
-
Anthropic
CCS-2026-05-08 → CCS-2026-05-14
-2 qualified Δ7d · 08: 158 · 09: 170 · 10: 172 · 11: 159 · 12: 161 · 13: 177 · 14: 156
Silver climbed 158 → 170 → 172 → 159 → 161 → 177 across the first six days before retreating to 156 on CCS-2026-05-14 (−21 vs prior run; −2 net vs window start) under unchanged adapter config. Window-end retreat lands within the prior weekly band (158–177).
-
xAI
CCS-2026-05-08 → CCS-2026-05-14
+1 qualified Δ7d · 08: 129 · 09: 131 · 10: 143 · 11: 138 · 12: 132 · 13: 127 · 14: 130
Silver climbed from 129 to a 143 mid-window peak then retreated to 127 before nudging back to 130 on CCS-2026-05-14 (+1 net vs window start, −13 from the mid-window peak).
-
Gemini
CCS-2026-05-08 → CCS-2026-05-14
-13 qualified Δ7d · 08: 218 · 09: 205 · 10: 212 · 11: 208 · 12: 210 · 13: 210 · 14: 205
Silver opened at 218, retreated to 205, then settled at 210 for two consecutive runs before slipping to 205 on CCS-2026-05-14 (−13 net vs window start) under unchanged adapter config.
-
Mistral
CCS-2026-05-08 → CCS-2026-05-14
+8 qualified Δ7d · 08: 252 · 09: 242 · 10: 261 · 11: 261 · 12: 261 · 13: 252 · 14: 260
Silver opened at 252, dipped to 242, climbed to a three-run plateau at 261, then retreated to 252 before partially rebounding to 260 on CCS-2026-05-14 (+8 net vs window start); alias-tracked.
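Scaling each strip to the provider's own range, as described at the top of this section, can be sketched with plain min-max normalisation. Whether the page uses exactly this formula is an assumption, and the function name is hypothetical. The daily counts are the OpenAI qualified totals listed above.

```python
def scale_to_own_range(counts):
    """Min-max scale a provider's daily counts to [0, 1] within its own range."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        # A perfectly flat series has no range; draw it mid-strip.
        return [0.5 for _ in counts]
    return [(c - lo) / (hi - lo) for c in counts]


# OpenAI qualified canaries, CCS-2026-05-08 through CCS-2026-05-14.
openai = [191, 167, 184, 164, 174, 179, 204]

scaled = scale_to_own_range(openai)
net_delta = openai[-1] - openai[0]  # the Δ7d figure shown beside the strip

print([round(v, 3) for v in scaled])
print(net_delta)
```

Because each strip is normalised independently, a bar at full height for one provider does not mean the same absolute count as a full-height bar for another, so the per-cell counts remain the reference for absolute level.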
› Notable arc · OpenAI fingerprint exclusions
Fingerprint-change exclusions for OpenAI traced 3 → 4 → 0 → 15 → 1 → 7 → 5 across the window.
The mid-window spike to 15 on CCS-2026-05-11 cleared to 1 on CCS-2026-05-12, climbed to 7 on CCS-2026-05-13, then retreated to 5 on CCS-2026-05-14. Gold tier traced 8 → 2 → 21 → 8 → 6 → 3 → 24, with the mid-window expansion to 21 on CCS-2026-05-10 draining to 3 by CCS-2026-05-13 before rebounding to a window-high 24 today on disjoint probe coordinates (zero (probe, seed) overlap with the CCS-2026-05-13 Gold set) under unchanged adapter config.
Cross-day deltas under unchanged adapter config are observed provider-service behaviour. They are not proof of hidden weight mutation.
Run a case study on your stack
Everything on this page is a variable. Pick the providers, prompts, and window that matter to your product, and we will run a sealed creative-canary observation and hand back the same evidence shape.
- Providers
- Any HTTP-accessible chat API. Bring your own keys.
- Models & aliases
- Pinned versions or alias-tracked latest contracts.
- Prompt suite
- Your prompts, your seeds, your comparator policy.
- Capture policy
- Versioned per adapter, sealed into the run digest.
- Window length
- One day to one month. Repeatable on demand.
- Evidence delivery
- Signed receipts and an evidence pack ready for hand-off.
This page shows what one configuration produced. A tailored case study replaces the providers, prompts, and window with yours — the proof boundary and evidence shape stay the same.
Next steps
New to Provider Sentinel? Read what it observes and how the evidence model works.