feat(coder/modules/agentapi): add state persistence#736
feat(coder/modules/agentapi): add state persistence#736
Conversation
| log "Sending SIGUSR1 to AgentAPI (pid $agentapi_pid) to save state" | ||
| kill -USR1 "$agentapi_pid" || true | ||
| # Allow time for state save to complete before proceeding. | ||
| sleep 1 |
There was a problem hiding this comment.
If we know the path at which the state file should be written, we can sleep a bit longer and validate that it was written.
There was a problem hiding this comment.
It could be an old one so we'd have to check timestamp.
But really, we don't care about confirming that it was written here as the TERM + wait covers that. It's more of an early invoke in case the workspace gets killed before we complete all the work here. The sleep is there to allow the serialization to initiate before we grab the messages via API call below.
636bc8c to
955441d
Compare
001c89d to
8a32b14
Compare
|
|
||
| # Source shared utilities (written by the coder_script wrapper). | ||
| # shellcheck source=lib.sh | ||
| source /tmp/agentapi-lib.sh |
There was a problem hiding this comment.
Review: Following existing pattern here but why don't we write all scripts to the provided module_dir?
| agentapi_pid=$(cat "$PID_FILE_PATH" 2> /dev/null || echo "") | ||
| fi | ||
|
|
||
| # State persistence is only enabled when the binary supports it (>= v0.12.0). |
There was a problem hiding this comment.
Review: We'll have to confirm this ends up being the version (cc: @35C4n0r)
AgentAPI can save and restore conversation state across workspace restarts. The base module exports env vars (AGENTAPI_STATE_FILE, AGENTAPI_SAVE_STATE, AGENTAPI_LOAD_STATE, AGENTAPI_PID_FILE) that the binary reads directly. No consumer module start scripts need changes. New variables: - enable_state_persistence (bool, default true) - state_file_path (string, defaults to $HOME/<module_dir_name>/state.json) - pid_file_path (string, defaults to $HOME/<module_dir_name>/agentapi.pid) State persistence requires agentapi >= v0.12.0. A shared version_at_least function in scripts/lib.sh gates both the env var exports in main.sh and SIGUSR1 in the shutdown script. The version is queried from the real binary (agentapi --version) rather than the Terraform variable, so it works correctly when install_agentapi is false. Shutdown script now performs a three-phase shutdown: 1. SIGUSR1 to trigger state save (gated on version + persistence enabled) 2. Log snapshot capture (existing behavior, now fault-tolerant via subshell) 3. SIGTERM for graceful termination with wait loop Also bumps agentapi module version to 2.2.0. Refs: internal#1257, internal#1256, registry#696
8a32b14 to
1435939
Compare
DevelopmentCats
left a comment
There was a problem hiding this comment.
LGTM aside from what has already been pointed out.
@35C4n0r You might have some better insight on this since agentapi.
AgentAPI can save and restore conversation state across workspace restarts.
The base module exports env vars (AGENTAPI_STATE_FILE, AGENTAPI_SAVE_STATE,
AGENTAPI_LOAD_STATE, AGENTAPI_PID_FILE) that the binary reads directly.
No consumer module start scripts need changes.
New variables:
State persistence requires agentapi >= v0.12.0. A shared version_at_least
function in scripts/lib.sh gates both the env var exports in main.sh and
SIGUSR1 in the shutdown script. Old binaries get a warning and graceful
skip instead of breakage.
Shutdown script now performs a three-phase shutdown:
Also bumps agentapi module version to 2.2.0 and claude-code to 4.7.6.
Closes coder/internal#1257
Refs coder/internal#1256
Refs #696