FRE-4955 Review silent active run for Code Reviewer

- FRE-4955: 9th stale-run eval for Code Reviewer zombie run, marked false positive
- FRE-4954: Investigation of Code Reviewer adapter reliability closed as done. Root cause: no heartbeat/adapter config. Fix tracked in FRE-4956 (CEO)
- Broader CTO oversight: Senior Engineer bottleneck (19 in_review), Code Reviewer ghost runs awaiting FRE-4956

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-10 01:43:53 -04:00
parent 6f90db8503
commit 90c79eb6d4
56 changed files with 2528 additions and 86 deletions

@@ -0,0 +1,47 @@
- id: fe-ghost-run-pattern
  fact: "Founding Engineer has a recurring pattern of ghost/stale active runs: the opencode_local adapter creates a run, logs 'run started', then goes silent for 4h+. Occurred 30+ times. Same run 5b8c8dde generated 15+ evaluation issues (up to FRE-4875). FRE-4846 fix (cooldown) deployed to suppress false positive alerts."
  category: status
  timestamp: "2026-05-09"
  source: "2026-05-09"
  status: superseded
  superseded_by: fe-zombie-root-cause-fre-4881
  related_entities:
    - areas/people/founding-engineer
  last_accessed: "2026-05-09"
  access_count: 3
- id: fe-zombie-root-cause-fre-4881
  fact: "Root cause confirmed via FRE-4881: opencode_local adapter creates Paperclip run entries on session start, but the terminal session dies before the process PID is registered. Without a PID, Paperclip cannot detect death. Status stays 'running' but heartbeats stop. All opencode_local agents have identical empty adapterConfig, so no config-level fix possible. Founding Engineer is most affected due to higher run frequency. Fix requires server-side stale-run GC (Paperclip server feature) or local health check script as fallback."
  category: investigation
  timestamp: "2026-05-09"
  source: "FRE-4881"
  status: active
  superseded_by: null
  related_entities:
    - areas/people/founding-engineer
  last_accessed: "2026-05-09"
  access_count: 1
- id: fe-zombie-fre-4883-instance
  fact: "FRE-4883 handled: 9th+ zombie run for Founding Engineer (run 5b8c8dde, attached to FRE-4547). Pattern identical to prior instances: no PID, no heartbeat for 4.5h. No active work lost (FRE-4547 was already blocked on FRE-4678). Closed as duplicate pattern. Systematic fix tracked by FRE-4881."
  category: status
  timestamp: "2026-05-09"
  source: "FRE-4883"
  status: superseded
  superseded_by: fe-zombie-cooldown-gap-fre-4899
  related_entities:
    - areas/people/founding-engineer
  last_accessed: "2026-05-09"
  access_count: 1
- id: fe-zombie-cooldown-gap-fre-4899
  fact: "FRE-4899 handled: 15th+ zombie-run evaluation for Founding Engineer run 5b8c8dde. Cooldown fix (FRE-4846, commit cda0f3dd) deployed but not preventing re-creation: new evaluation issue created 2s after previous dismissal (FRE-4897 done at 06:02:38, FRE-4899 created at 06:02:40). Either the cooldown check in createOrUpdateStaleRunEvaluation doesn't cover this path, or each scan cycle doesn't find a preceding dismissed_false_positive decision. Root cause (FRE-4881) still unresolved. Dismissed as false positive; cooldown implementation gap should be investigated."
  category: status
  timestamp: "2026-05-09"
  source: "FRE-4899"
  status: active
  superseded_by: null
  related_entities:
    - areas/people/founding-engineer
  last_accessed: "2026-05-09"
  access_count: 0
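
The cooldown behaviour that fe-zombie-cooldown-gap-fre-4899 expects can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Decision` record, the `should_create_evaluation` helper, and the 4-hour window are hypothetical and are not taken from Paperclip's code or from the FRE-4846 fix. The sketch shows a check keyed on the run id that looks up the most recent `dismissed_false_positive` decision before a new evaluation issue is allowed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Hypothetical cooldown window; the value used by FRE-4846 is not stated in the source.
COOLDOWN = timedelta(hours=4)


@dataclass(frozen=True)
class Decision:
    """Hypothetical record of how a stale-run evaluation was resolved."""
    run_id: str
    outcome: str  # e.g. "dismissed_false_positive"
    decided_at: datetime


def last_dismissal(decisions: Iterable[Decision], run_id: str) -> Optional[datetime]:
    """Return the most recent false-positive dismissal time for run_id, if any."""
    times = [
        d.decided_at
        for d in decisions
        if d.run_id == run_id and d.outcome == "dismissed_false_positive"
    ]
    return max(times, default=None)


def should_create_evaluation(
    decisions: Iterable[Decision],
    run_id: str,
    now: datetime,
    cooldown: timedelta = COOLDOWN,
) -> bool:
    """Allow a new evaluation only if no dismissal falls inside the cooldown window."""
    dismissed_at = last_dismissal(decisions, run_id)
    if dismissed_at is None:
        return True
    return now - dismissed_at >= cooldown
```

With the timestamps recorded above (dismissal at 06:02:38, next scan at 06:02:40), a check of this shape would return `False` for run 5b8c8dde, so a new issue appearing 2s later suggests the real code path either skips the check or does not find the prior decision.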

@@ -0,0 +1,3 @@
# Founding Engineer
Reports to CTO. Had a recurring adapter-level zombie-run problem: opencode_local creates runs that never connect because the terminal session dies before PID registration. FRE-4881 investigation complete, fix deployed. Server-side stale-agent garbage collector (FRE-4892) implemented: auto-cleans agents with status=running whose last heartbeat is more than 4h old.
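
A minimal sketch of the stale-run garbage collection rule described in the note above, under stated assumptions: the `Run` record, the `collect_stale_runs` function, and the `"failed"` terminal status are hypothetical names chosen for illustration, not the FRE-4892 implementation. Only the rule itself (status is running and the heartbeat is more than 4h old) comes from the note.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Threshold taken from the note: heartbeats older than 4h count as stale.
STALE_AFTER = timedelta(hours=4)


@dataclass
class Run:
    """Hypothetical run record; field names are assumptions."""
    run_id: str
    status: str
    last_heartbeat: Optional[datetime]  # None if no heartbeat was ever received
    started_at: datetime


def collect_stale_runs(
    runs: List[Run],
    now: datetime,
    stale_after: timedelta = STALE_AFTER,
) -> List[str]:
    """Mark running runs with stale heartbeats as failed and return their ids."""
    cleaned = []
    for run in runs:
        if run.status != "running":
            continue
        # A run that never sent a heartbeat is aged from its start time instead.
        last_seen = run.last_heartbeat or run.started_at
        if now - last_seen > stale_after:
            run.status = "failed"  # hypothetical terminal status
            cleaned.append(run.run_id)
    return cleaned
```

Falling back to `started_at` when no heartbeat exists is a design choice for the zombie case described here, where the session dies before a PID is registered and the run may never report at all.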