The harness protected the science: no clean shared slot, no memory score.
v0.63 attempted to resume the unscored v0.62 domains, but the chat-slot idle barrier returned 40 consecutive fast 503 busy responses before the first benchmark row. That is a capacity/isolation result only. It does not weaken the v0.62 memory finding; it says institutional evals need an isolated lane or a lease/queue protocol.
What This Proves
- The v0.63 runner correctly refused to score recall rows when the shared single-flight chat lane was unavailable.
- /health is not a sufficient readiness signal for a benchmark or demo; health can be OK while the chat slot is occupied.
- The experiment avoided contaminating memory-quality results with another service's endpoint usage.
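The readiness point above can be sketched as a gate that requires both signals. This is a minimal illustration, not the runner's code: `Readiness` and `is_benchmark_ready` are hypothetical names, and it assumes that "health OK" and a free chat slot both show up as HTTP 200.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Readiness:
    """Hypothetical pair of signals collected before a benchmark row."""
    health_status: int      # HTTP status from the health endpoint (assumed 200 = OK)
    chat_probe_status: int  # HTTP status from a tiny chat request (assumed 200 = slot free)


def is_benchmark_ready(r: Readiness) -> bool:
    # Process health alone is not enough: the chat slot must also answer.
    return r.health_status == 200 and r.chat_probe_status == 200


# The v0.63 situation: health OK, chat slot busy.
observed = Readiness(health_status=200, chat_probe_status=503)
print(is_benchmark_ready(observed))  # False
```

A gate like this keeps a healthy-but-occupied endpoint from being treated as ready for scoring.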
What This Does Not Prove
- It does not prove the memory system failed.
- It does not prove the endpoint backend is broken.
- It does not score story, agent-loop, relationship, or psychology recall under v0.63 because no row ran.
Last Positive Memory Evidence
v0.62 remains the latest live memory-quality finding. It scored the research-update domain under 1024 and 2048 pressure with both tail-output variants.
| v0.62 fact | Value |
|---|---|
| Scored rows passed strict + semantic | 4 / 4 |
| Prompt tokens on successful rows | 247,962 |
| Pressure bands scored | 1024 and 2048 |
| 1024 row latency | about 77-79 seconds |
| 2048 row latency | about 195-196 seconds |
| Deck-safe claim | Research-update recall survived heavy pressure with exact current-fact IDs and provenance handles on scored rows. |
v0.63 Capacity Trace
The idle barrier sent a tiny chat request, “Return exactly OK.”, before allowing any benchmark row. The lane never cleared during the configured wait budget.
| Probe slice | Status | Elapsed | Body |
|---|---|---|---|
| Attempt 1 | 503 | 0.1566s | Backend busy: 1 request(s) already in flight (limit: 1). Try again in 10s. |
| Attempt 40 | 503 | 0.1552s | Backend busy: 1 request(s) already in flight (limit: 1). Try again in 10s. |
| Status count | 503 x 40 | configured 30s interval | No chat-slot idle success before stop. |
| Stop reason | partial | pre-row | idle_probe_failed_before_first_row |
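The barrier behavior in this trace can be approximated with the sketch below. It is an illustration under stated assumptions, not the v0.63 runner: `wait_for_idle`, `parse_busy`, and the injected `probe` callable are hypothetical names, and the defaults of 40 attempts at a 30s interval are taken from the trace above. The probe is injected so the loop can be exercised without touching a live endpoint.

```python
import re
import time
from typing import Callable, Optional, Tuple

# Matches the busy body seen in the trace, e.g.
# "Backend busy: 1 request(s) already in flight (limit: 1). Try again in 10s."
BUSY_RE = re.compile(
    r"Backend busy: (\d+) request\(s\) already in flight "
    r"\(limit: (\d+)\)\. Try again in (\d+)s\."
)


def parse_busy(body: str) -> Optional[Tuple[int, ...]]:
    """Return (in_flight, limit, retry_after_s) if the body is a busy message."""
    m = BUSY_RE.search(body)
    return tuple(int(g) for g in m.groups()) if m else None


def wait_for_idle(
    probe: Callable[[], Tuple[int, str]],
    max_attempts: int = 40,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Probe until one attempt returns HTTP 200 or the attempt budget runs out."""
    busy = 0
    for attempt in range(1, max_attempts + 1):
        status, body = probe()
        if status == 200:
            return {"idle": True, "attempts": attempt, "busy": busy}
        if status == 503 and parse_busy(body) is not None:
            busy += 1  # slot occupied by another request, not a backend fault
        if attempt < max_attempts:
            sleep(interval_s)
    return {
        "idle": False,
        "attempts": max_attempts,
        "busy": busy,
        "stop_reason": "idle_probe_failed_before_first_row",
    }


def always_busy() -> Tuple[int, str]:
    # Stand-in probe that replays the body recorded in the trace.
    return 503, "Backend busy: 1 request(s) already in flight (limit: 1). Try again in 10s."


result = wait_for_idle(always_busy, sleep=lambda s: None)  # skip real waiting
print(result["stop_reason"])  # idle_probe_failed_before_first_row
```

Counting parsed busy responses separately from other failures is one way to keep "slot occupied" distinct from "backend did not recover" in the idle summary.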
Institutional Frame
Hypernym Infinite Memory is a memory control plane for model fleets: per-tenant memory stores, controller-curated recall, exact provenance handles, and a lower-cost path to long-memory inference for small local models.
Capability Thesis
Use cheap, local, mobile-class models with external memory/control logic so the serving cost does not scale like naive long-context attention for every user and every turn.
What We Have Seen
Under controlled rows, exact current-fact IDs and memory-key/provenance handles can survive high pressure; under shared serving, queue discipline becomes the first operational bottleneck.
Hyperscaler Question
Not merely “can it accept more tokens?” The real question is whether it reduces the cost and reliability penalty of long-memory inference across many users and tenants.
Decision Path
v0.62 scored memory
Research-update rows passed at 1024/2048 pressure. This is the quality evidence to carry forward.
v0.62 hit a busy shared slot
Later domains returned fast 503s. Those rows were not scored as memory failures.
v0.63 added idle barrier
The runner waited before the first resumed row and required a clean chat probe.
No clean lane appeared
40/40 probes were busy. The next institutional run needs lane reservation or endpoint isolation.
CTO Optimization Findings
- Add a chat-slot lease or reservation API so evals can distinguish “busy because another tenant is using it” from “backend did not recover.”
- Expose a readiness endpoint for chat-slot availability, not just process health.
- Keep pre-row and post-long-row idle barriers in every institutional harness.
- Report memory quality only on HTTP 200 scored rows; report shared-capacity windows separately.
- For large-model/hyperscaler scenarios, treat this as memory-plane scheduling plus provenance, not just bigger context.
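The reporting rule in the fourth finding can be sketched as a small summarizer that keeps memory quality and shared-capacity events in separate buckets. The row schema (`http_status`, `strict_pass`, `semantic_pass`) and the function name `summarize` are assumptions made for illustration, and the sample rows are invented, not results from any run.

```python
from collections import Counter
from typing import Iterable, Mapping


def summarize(rows: Iterable[Mapping]) -> dict:
    """Split rows into memory-quality results and capacity events."""
    rows = list(rows)
    # Only HTTP 200 rows count toward memory quality.
    scored = [r for r in rows if r["http_status"] == 200]
    unscored = [r for r in rows if r["http_status"] != 200]
    passed = sum(1 for r in scored if r["strict_pass"] and r["semantic_pass"])
    return {
        "memory_quality": {"scored": len(scored), "passed": passed},
        # Non-200 rows are reported by status code, never as recall failures.
        "capacity": dict(Counter(r["http_status"] for r in unscored)),
    }


# Invented sample rows for illustration only.
sample = [
    {"http_status": 200, "strict_pass": True, "semantic_pass": True},
    {"http_status": 200, "strict_pass": True, "semantic_pass": True},
    {"http_status": 503},
]
report = summarize(sample)
print(report)
```

With this split, a busy window changes the capacity bucket without changing the pass rate, so a 503 cannot be misread as a memory failure.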
Next Clean Run
- Run v0.63 again on an isolated lane or after a confirmed service window.
- Do not change the row matrix; the failed condition was access, not prompt design.
- Expected rows: story-canon, agent-loop, relationship, and psychology domains.
- Success criteria: at least one full domain row group completes with strict/semantic scoring and no idle-probe contamination.
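One way to make the success criterion checkable is sketched below. It reads "completes with strict/semantic scoring" as: every row in the domain group returned HTTP 200, carries both scores, and was preceded by a clean idle probe. That reading, the row fields, the function name `clean_run_succeeded`, and the `rows_per_domain` value used in the example are all assumptions; the real group size would come from the run manifest.

```python
from collections import defaultdict
from typing import Iterable, Mapping


def clean_run_succeeded(rows: Iterable[Mapping], rows_per_domain: int) -> bool:
    """True if at least one domain has a full, scored, uncontaminated row group."""
    by_domain = defaultdict(list)
    for r in rows:
        by_domain[r["domain"]].append(r)
    for domain_rows in by_domain.values():
        complete = len(domain_rows) == rows_per_domain
        all_scored = all(
            r["http_status"] == 200
            and r.get("strict_pass") is not None
            and r.get("semantic_pass") is not None
            and r.get("idle_clean", False)
            for r in domain_rows
        )
        if complete and all_scored:
            return True
    return False


def row(domain: str, status: int = 200) -> dict:
    # Helper that builds an invented row for the example.
    scored = status == 200
    return {
        "domain": domain,
        "http_status": status,
        "strict_pass": True if scored else None,
        "semantic_pass": True if scored else None,
        "idle_clean": scored,
    }


good = [row("story-canon"), row("story-canon"), row("story-canon")]
partial = [row("agent-loop"), row("agent-loop"), row("agent-loop", status=503)]
print(clean_run_succeeded(good, rows_per_domain=3))     # True
print(clean_run_succeeded(partial, rows_per_domain=3))  # False
```

The check deliberately tests whether rows were scored, not whether they passed, since the criterion concerns a clean run rather than a particular recall outcome.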
Data Trace
Every claim on this page points to a local artifact that a CTO, auditor, or later agent can inspect directly.
| Evidence | Path | Use |
|---|---|---|
| v0.63 live scores | research/tracks/hypernym-infinite-mim/results/v0.63-unscored-domain-drain-resume/20260610T_unscored_domain_drain_resume_live_codex_v1/scores.json | Aggregate zero-row result, stop reason, and idle summary. |
| v0.63 idle attempts | research/tracks/hypernym-infinite-mim/results/v0.63-unscored-domain-drain-resume/20260610T_unscored_domain_drain_resume_live_codex_v1/idle-probe-before-first-row-attempts.json | All 40 fast 503 busy probes. |
| v0.63 manifest | research/tracks/hypernym-infinite-mim/results/v0.63-unscored-domain-drain-resume/20260610T_unscored_domain_drain_resume_live_codex_v1/run-manifest.json | 12-row planned matrix and idle-probe configuration. |
| v0.63 preflight health | research/tracks/hypernym-infinite-mim/results/v0.63-unscored-domain-drain-resume/20260610T_unscored_domain_drain_resume_live_codex_v1/preflight-health.json | Health OK before failed chat-idle window. |
| v0.63 final health | research/tracks/hypernym-infinite-mim/results/v0.63-unscored-domain-drain-resume/20260610T_unscored_domain_drain_resume_live_codex_v1/final-health.json | Health OK after failed chat-idle window. |
| v0.63 snapshot | .forge/artifacts/cxdb-hypernym-infinite-mim-post-v063-snapshot-20260610T073942Z.md | Durable context handoff for future sessions. |
| v0.63 RL trace | .forge/artifacts/rl-traces-HYPERNYM_INFINITE_MIM_v063_20260610T073942Z.jsonl | Machine-readable research-policy update. |
| v0.62 live scores | research/tracks/hypernym-infinite-mim/results/v0.62-tail-contract-cross-domain-pressure/20260610T_tail_contract_cross_domain_pressure_live_codex_v1/scores.json | Last positive memory-quality evidence. |
| Working memory | research/tracks/hypernym-infinite-mim/WORKING_MEMORY.md | Human-readable current state and resume instructions. |
Compound Research Chain
| Artifact | Pointer |
|---|---|
| Current public board | https://hypernym-infinite-memory-v09.pages.dev/ |
| Previous immutable v0.62 board | https://087eddb2.hypernym-infinite-memory-v09.pages.dev/ |
| v0.62 local board | .forge/artifacts/hypernym-infinite-mim-v0.62-cto-board.html |
| v0.63 local board | .forge/artifacts/hypernym-infinite-mim-v0.63-cto-board.html |
| v0.63 strategy | research/tracks/hypernym-infinite-mim/v0.63-unscored-domain-drain-resume-strategy.md |
| Compound visualization standard | research/tracks/hypernym-infinite-mim/compound-research-visualization-standard.md |