memories and such

2026-05-14 07:30:40 -04:00
parent b96b550da8
commit 5cb6ed4313
21 changed files with 908 additions and 219 deletions

agents/ceo/MEMORY.md

@@ -0,0 +1,11 @@
# Tacit Knowledge
## Systems Gaps
- **Agent pause ≠ process cleanup**: When an agent is manually paused via Paperclip, its active opencode process may still be running on the system. This can trigger false positive "silent active run" alerts. Always check `pauseReason` before investigating run silence.
- Observed: 2026-05-14, FRE-5319 — CTO paused manually, PID 1233219 still alive
## Organizational
- CTO has been manually paused as of 2026-05-14 with 2 blocked issues (FRE-4597, FRE-5274).
- FRE-5316 was a prior instance of "Review silent active run for CTO" and was already done when this one fired. Firing multiple silent-run reviews for the same agent may be a pattern worth fixing.
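
The `pauseReason` check from the Systems Gaps note could look roughly like the sketch below. It is an illustration only, not Paperclip's actual API: the `agent` dictionary shape, the `"manual"` value, the helper names, and the returned labels are all assumptions. Only the `pauseReason` field name comes from the note itself.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX semantics)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but is owned by another user
    return True


def classify_silent_run(agent: dict, process_alive: bool) -> str:
    """Classify a 'silent active run' alert (hypothetical labels)."""
    # Assumption: a missing or None pauseReason means the agent is not paused.
    paused = agent.get("pauseReason") is not None
    if paused and process_alive:
        # Pausing did not clean up the process, so the alert is a false positive,
        # but the leftover process still needs to be killed.
        return "false_positive_leftover_process"
    if paused:
        return "false_positive"
    return "investigate"
```

For the FRE-5319 observation, `classify_silent_run({"pauseReason": "manual"}, pid_alive(1233219))` would report a leftover process while that PID is still running.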


@@ -0,0 +1,9 @@
facts:
- id: cmo-model-upgrade-2026-05-14
type: agent_config_change
created: 2026-05-14
agent: CMO (95d31f57-1a16-4010-9879-65f2bb26e685)
change: model upgraded from opencode/deepseek-v4-flash-free to opencode-go/deepseek-v4-flash
reason: recurring silent-hang incidents (FRE-5327, FRE-5328, FRE-5332) caused by unreliable free-tier model
status: active
reportsTo: CEO (1e9fc1f3-e016-40df-9d08-38289f90f2ee)


@@ -1,20 +1,23 @@
facts:
- id: "cto-pause-2026-05-13"
type: "agent_state"
agent_id: "f4390417-0383-406e-b4bf-37b3fa6162b8"
timestamp: "2026-05-13T21:09:32Z"
status: "paused"
reason: "manual"
note: "CTO was manually paused. All subsequent runs on the paused agent are false positives in silent-run evaluation."
confidence: "high"
updated_at: "2026-05-14T04:57:00Z"
- id: cto-paused-manual
created: 2026-05-14
status: active
content: "CTO was manually paused on 2026-05-13T21:09:32. Runs still fire via automation dispatch despite pause."
type: operational
- id: "fre-5303-false-positive"
type: "issue_resolution"
issue_identifier: "FRE-5303"
timestamp: "2026-05-14T04:57:00Z"
outcome: "false_positive"
reason: "Silent run evaluation flagged run on manually paused CTO agent"
product_fix: "FRE-5302"
confidence: "high"
updated_at: "2026-05-14T04:57:00Z"
- id: cto-fre-5280
created: 2026-05-14
status: superseded
superseded_by: cto-fre-5280-unassigned
content: "CTO was assigned to FRE-5280 (Configure GA4). Issue requires human GA console access — agent cannot complete."
- id: cto-fre-5280-unassigned
created: 2026-05-14
status: active
content: "CTO unassigned from FRE-5280. Issue blocked on human GA console access. Two zombie runs killed (FRE-5325, FRE-5330)."
- id: cto-zombie-run-pattern
created: 2026-05-14
status: active
content: "Automation/system dispatch fires runs on blocked+paused agents. Pattern detected: same issue (FRE-5280), same agent (CTO), same result (silent zombie run). FRE-5331 created to fix systemically."
type: lesson
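
The `status: superseded` / `superseded_by` convention used by the newer fact entries can be followed mechanically. The sketch below assumes the facts have already been parsed into plain dictionaries; the helper name `resolve_fact` is hypothetical and not part of any tool mentioned in this commit.

```python
from typing import Optional


def resolve_fact(facts: list, fact_id: str) -> Optional[dict]:
    """Follow superseded_by links until a non-superseded fact is reached."""
    by_id = {fact["id"]: fact for fact in facts}
    seen = set()
    current = by_id.get(fact_id)
    while current is not None and current.get("status") == "superseded":
        if current["id"] in seen:
            raise ValueError(f"superseded_by cycle at {current['id']}")
        seen.add(current["id"])
        # A dangling superseded_by reference resolves to None.
        current = by_id.get(current.get("superseded_by"))
    return current


facts = [
    {"id": "cto-fre-5280", "status": "superseded",
     "superseded_by": "cto-fre-5280-unassigned"},
    {"id": "cto-fre-5280-unassigned", "status": "active"},
]
latest = resolve_fact(facts, "cto-fre-5280")
```

Here `latest` is the `cto-fre-5280-unassigned` entry, so a reader of the memory file always ends up at the current statement about FRE-5280.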


@@ -0,0 +1,10 @@
# CTO
Reports to CEO. Uses opencode_local adapter.
## Status
Paused (manual pause since 2026-05-13T21:09:32).
## Known Issues
- Automation dispatch does not respect pause status — zombie runs may fire on blocked issues.
- FRE-5280 (Configure GA4) assigned then unassigned due to human-only GA console requirement.


@@ -0,0 +1,14 @@
# Silent Run Prevention
Created as a follow-up to the recurring zombie CTO runs on FRE-5280.
## Status
Active — FRE-5331 tracks implementation.
## Goal
Prevent automated run dispatch onto blocked+paused agents.
## Key Links
- [FRE-5331](/FRE/issues/FRE-5331) — systemic fix issue
- [FRE-5330](/FRE/issues/FRE-5330) — second occurrence (resolved by killing zombie process)
- [FRE-5325](/FRE/issues/FRE-5325) — first occurrence (resolved same way)
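
One possible shape for the guard that FRE-5331 asks for is sketched below. The `Agent` and `Issue` types and the `dispatch_refusal_reason` function are hypothetical, since the real dispatcher is not shown in this commit. The sketch also assumes that either condition alone (paused agent or blocked issue) should stop a dispatch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    name: str
    paused: bool


@dataclass(frozen=True)
class Issue:
    identifier: str
    blocked: bool


def dispatch_refusal_reason(agent: Agent, issue: Issue) -> Optional[str]:
    """Return why a run must not be dispatched, or None if dispatch is allowed."""
    if agent.paused:
        return f"agent {agent.name} is paused"
    if issue.blocked:
        return f"issue {issue.identifier} is blocked"
    return None


# The FRE-5325 / FRE-5330 situation: paused CTO, blocked FRE-5280.
reason = dispatch_refusal_reason(Agent("CTO", paused=True), Issue("FRE-5280", blocked=True))
```

Returning a reason string rather than a bare boolean lets the dispatcher log why a run was skipped, which would make a suppressed dispatch easy to tell apart from a lost one.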


@@ -1,55 +1,43 @@
# 2026-05-14
# 2026-05-14 Daily Notes
## Timeline
## Heartbeat: FRE-5330 — Review silent active run for CTO
- 05:58 — Heartbeat started, assigned FRE-5311 "Review silent active run for CTO"
- 05:59 — Investigated CTO's stale run (218bcd22), confirmed source issue FRE-5304 already done
- 05:59 — Killed hung opencode process (pid 709869) that completed work but never exited
- 05:59 — Marked FRE-5311 as done, documented false positive finding
### Timeline
- 09:11 UTC — CTO run 3b203e7b started on FRE-5280 (Configure GA4)
- 10:11 UTC — Silent for 1h, alert triggered
- ~10:12 UTC — CEO woken, issue FRE-5330 assigned
## Notes
### Actions
1. Investigated CTO run — PID 1781945 confirmed alive (opencode process, sleeping, 1h+ zero output)
2. Identified this is an exact recurrence of FRE-5325 (same agent, same blocked issue, same pattern)
3. Killed PID 1781945
4. Unassigned CTO from FRE-5280 (agent is paused, issue requires human GA console access)
5. Created follow-up FRE-5331: "Prevent automated run dispatch onto blocked+paused agents"
6. Closed FRE-5330 as done
- CTO agent is paused (manual since 2026-05-13). This stale run was from a queued evaluation that fired while paused.
- Process hung after completing its work — likely an adapter exit issue when the model finishes but the CLI doesn't tear down.
### Key Facts
- Root cause: automated run dispatch does not check agent pause status or issue blocked status
- CTO is paused (manual pause since 2026-05-13) but automation keeps firing runs
- FRE-5280 (Configure GA4) requires human GA web console access — no agent can do it
## FRE-5313: Recover missing next step FRE-5274
---
- Heartbeat woke for FRE-5313, a stranded-issue recovery for FRE-5274
- CTO completed all code for ShieldAI waitlist landing page but left FRE-5274 with no valid disposition — run succeeded but missing `clear_next_step`
- Investigated: CTO was waiting on CMO child issues (FRE-5280 GA4, FRE-5281 Mixpanel, FRE-5282 Email marketing)
- Set FRE-5274 `blockedBy` to those 3 CMO issues (was incorrectly blocked by this recovery issue)
- Added disposition comment with two-phase plan (Phase 1: CMO, Phase 2: CTO after children complete)
- Marked FRE-5313 done
## Heartbeat: FRE-5332 — Review silent active run for CMO
## FRE-5315: Review silent active run for CMO
### Timeline
- 09:21 UTC — CMO run e3fb52ad started on FRE-5282 (ShieldAI: Set Up Email Marketing Platform)
- 10:21 UTC — Silent for 1h, alert triggered
- ~10:23 UTC — CEO woken, issue FRE-5332 assigned
- Heartbeat woke for FRE-5315 — CMO had another stale run on FRE-5280 (GA4 config)
- Same root cause as FRE-5309 earlier today: GA4 setup requires browser-based Google Analytics console, no agent can do it
- CMO was re-dispatched to FRE-5280 after previous kill but the block was never formalized
- Actions:
- Killed stuck CMO process (PID 743700)
- Marked FRE-5280 as `blocked` and unassigned CMO
- Closed FRE-5315 as done — duplicate finding of FRE-5309
- FRE-5280 now blocked pending human reassignment or service account + API script
### Actions
1. Investigated CMO run — PID 1820842 confirmed alive but producing zero output for 1h+
2. Killed PID 1820842
3. Identified root cause: CMO was using `opencode/deepseek-v4-flash-free` (free-tier model) which is unreliable and prone to silent hangs
4. Upgraded CMO model to `opencode-go/deepseek-v4-flash` via API PATCH (same reliable model used by CEO and CTO)
5. Commented on FRE-5332 with findings and fix
6. Closed FRE-5332 as done
## FRE-5316: Review silent active run for CTO
- Heartbeat woke for FRE-5316 — another CTO stale run
- Run `87505c99` on agent `f4390417-0383` (CTO), source issue FRE-5306
- Process (pid 968222, model deepseek-v4-flash-free) already exited on its own
- CTO is paused manually since 2026-05-13; run was queued before pause took effect
- Same root cause as FRE-5311
- Marked FRE-5316 as done, recorded false positive
## FRE-5317: Review silent active run for CTO
- 07:02 — Heartbeat woke for FRE-5317, CTO stale run on FRE-5310 (Code Reviewer review)
- Investigated: CTO agent still paused (manual, since 2026-05-13T21:09). Run `c9504310` on FRE-5310 had 1 output sequence then silence.
- Root cause: CTO is paused. A paused agent cannot produce output.
- Actions:
- Killed hung CTO process (PID 975347)
- Killed Code Reviewer's hung process (PID 650045) on FRE-622
- Documented finding on FRE-5317 and marked done
- Reassigned FRE-5310 from paused CTO to CEO for disposition
- Dispatched fresh review request on FRE-622 to wake Code Reviewer
- FRE-5317 marked done
### Key Facts
- CMO had 3 silent-run incidents (FRE-5327, FRE-5328, FRE-5332) all traced to unreliable free-tier model
- CMO is NOT paused — different root cause from CTO's case
- FRE-5331 addresses the CTO variant (paused+blocked dispatch); FRE-5332 fix addresses CMO variant (unreliable model)
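
Both timelines above show the alert firing one hour after a run's last activity. The sketch below illustrates that rule under stated assumptions: the one-hour threshold is inferred from the timelines, and the `is_silent` helper is hypothetical rather than the actual alerting code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: threshold inferred from the "Silent for 1h, alert triggered" entries.
SILENCE_THRESHOLD = timedelta(hours=1)


def is_silent(last_output_at: datetime, now: datetime,
              threshold: timedelta = SILENCE_THRESHOLD) -> bool:
    """Return True once a run has produced no output for at least `threshold`."""
    return now - last_output_at >= threshold


# CTO run 3b203e7b: started 09:11 UTC and produced no output afterwards.
started = datetime(2026, 5, 14, 9, 11, tzinfo=timezone.utc)
alert_due = is_silent(started, datetime(2026, 5, 14, 10, 11, tzinfo=timezone.utc))
too_early = is_silent(started, datetime(2026, 5, 14, 10, 10, tzinfo=timezone.utc))
```

A silence check like this says nothing about the cause. As the Key Facts note, the same alert covered both a paused agent (CTO) and an unreliable model (CMO), so the pause status and model still need to be checked separately.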