fix: simplify Hermes recreation lifecycle #413

Merged
OisinKyne merged 2 commits into main from fix/hermes-recreation-lifecycle
May 4, 2026
@bussyjd (Collaborator) commented May 4, 2026

Summary

Supersedes the remaining actionable parts of #411 with a narrower fix:

  • bump Hermes agent image to nousresearch/hermes-agent:v2026.4.30
  • stop cloning/rebuilding Hermes into the persistent volume during pod init
  • validate the image-provided /opt/hermes/.venv/bin/hermes and required extras before the gateway starts
  • refresh the Hermes wallet-metadata ConfigMap during wallet restore, before restarting remote-signer
  • make generic obol agent wallet backup/restore help text runtime-neutral

Why

PR #411 contained several useful changes, but most are already on main and the remaining branch is conflicted. The highest-value unresolved fix is the Hermes recreation path: the pod should not mutate a PVC by cloning a git repo and building a venv on every cold start. That made recreation fragile and left read-only git/venv artifacts in volumes.

This PR keeps the simple image-based runtime contract while preserving safety: if the published image ever lacks the binary or extras we need, the init container fails clearly instead of booting a broken agent.
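
As a rough illustration of the fail-fast check described above, here is a minimal sketch of what such an init-container validation could look like. It is not the script from this PR: the function name `validate_hermes_runtime` and the `HERMES_BIN`, `HERMES_PY`, and `HERMES_EXTRAS` variables are hypothetical, and only the default paths and module names are taken from the `docker run` check listed under Validation.

```shell
#!/bin/sh
# Hypothetical sketch of an init-container runtime check (not the PR's actual script).
# Default paths and module names come from the docker run check in this PR description.
HERMES_BIN="${HERMES_BIN:-/opt/hermes/.venv/bin/hermes}"
HERMES_PY="${HERMES_PY:-/opt/hermes/.venv/bin/python3}"
HERMES_EXTRAS="${HERMES_EXTRAS:-fastapi uvicorn telegram mcp ptyprocess simple_term_menu googleapiclient}"

validate_hermes_runtime() {
  # The image must ship an executable hermes binary; nothing is cloned or built here.
  if [ ! -x "$HERMES_BIN" ]; then
    echo "hermes binary missing or not executable: $HERMES_BIN" >&2
    return 1
  fi

  # Each required extra must be importable with the image's own interpreter.
  for mod in $HERMES_EXTRAS; do
    if ! "$HERMES_PY" -c "import $mod" >/dev/null 2>&1; then
      echo "required extra not importable: $mod" >&2
      return 1
    fi
  done

  echo "hermes runtime ok"
}
```

An init container could end with `validate_hermes_runtime || exit 1`, so that a missing binary or extra stops the pod with a named cause before the gateway starts, and the check never writes to the PVC.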

Validation

  • git diff --check
  • go test ./cmd/obol ./internal/hermes ./internal/stack -count=1
  • go test ./... -count=1
  • docker manifest inspect nousresearch/hermes-agent:v2026.4.30
  • docker run --rm --entrypoint sh nousresearch/hermes-agent:v2026.4.30 -ec 'test -x /opt/hermes/.venv/bin/hermes && /opt/hermes/.venv/bin/python3 -c "import fastapi, uvicorn, telegram, mcp, ptyprocess, simple_term_menu, googleapiclient"'

Notes

I intentionally did not bring over #411's litellm-config delete/recreate behavior. main already has the managed-fields mitigation, and replacing it with ConfigMap deletion should be a separate change, made only if we can reproduce a stack recreation failure on current main.

@bussyjd mentioned this pull request May 4, 2026
@OisinKyne marked this pull request as ready for review May 4, 2026 13:36
@OisinKyne merged commit ea341c7 into main May 4, 2026
6 checks passed
@OisinKyne deleted the fix/hermes-recreation-lifecycle branch May 4, 2026 13:36