Detach a session from SubscriptionManager on close so fanout stops enqueueing into dead outboxes #73

@nficano

Description

Category: bug
Severity: major
Location: lib/arcp/runtime/session_actor.rb:345-359 (secondary: lib/arcp/runtime/subscription_manager.rb:42-45)
Spec:

What

close_session deregisters the session from Runtime#@sessions and stops the writer and heartbeat tasks, but it never removes the session's subscription entries from SubscriptionManager. Because jobs outlive their sessions (spec §6.7), a job that is still running after its submitter (or an observer) disconnects keeps that session's @outbox queue in @subs[job_id]. fanout keeps calling q.enqueue(envelope) on that queue, which nothing drains once the writer task has stopped. Events therefore accumulate without bound in the orphaned Async::Queue for the remainder of the job's lifetime: a per-job memory leak that grows with event volume. (Distinct from #58, which concerns the EventLog ring, and #63, which concerns rebind scan cost.)

Evidence

# lib/arcp/runtime/session_actor.rb:345-359 — no @subscription_manager.detach for this session
def close_session
  return if @closed
  @closed = true
  @heartbeat_task&.stop
  @writer_task&.stop
  @outbox.enqueue(:__arcp_close__)
  @transport.close
  @runtime.deregister_session(@session_id) if @session_id
  # ... resume bookkeeping; subscriptions for @session_id are left in place ...
end

# lib/arcp/runtime/subscription_manager.rb:42-45 — keeps enqueueing into the orphaned queue
def fanout(job_id, envelope)
  targets = @mutex.synchronize { @subs[job_id].dup }
  targets.each { |_s, _p, q| q.enqueue(envelope) }
end
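
The two excerpts above combine into the leak roughly as follows. This is a minimal stand-alone sketch, not the project's code: LeakyOutbox is a hypothetical stand-in for Async::Queue, and the [session_id, pattern, queue] row shape is an assumption inferred from the block arguments in fanout.

```ruby
# Hypothetical stand-in for Async::Queue: it only records what is enqueued.
class LeakyOutbox
  def initialize
    @items = []
  end

  def enqueue(item)
    @items << item
  end

  def size
    @items.size
  end
end

# Assumed shape of @subs: job_id => [[session_id, pattern, queue], ...]
subs = Hash.new { |hash, job_id| hash[job_id] = [] }
outbox = LeakyOutbox.new
subs["job-1"] << ["session-a", "*", outbox]

# At this point the session closes: its writer task is stopped and the runtime
# forgets the session, but the row in subs["job-1"] is left in place.

# The job keeps running and emitting events; each one lands in the dead outbox.
1_000.times do |seq|
  subs["job-1"].dup.each { |_s, _p, q| q.enqueue({ seq: seq }) }
end

puts outbox.size # 1000 entries that nothing will ever dequeue
```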

Proposed fix

On close_session, remove this session's subscription rows across all jobs, for example by adding SubscriptionManager#detach_session(session_id), mirroring rebind_session, and calling it from close_session. Note the interaction with resume: a resuming session re-binds via rebind_session, so there are two options. Either detach fires only on genuine teardown (not when a resume is expected), or the resume window is relied on to re-attach a fresh outbox and detach reclaims the entries of sessions that never resume. Add a test asserting that, after a submitter disconnects mid-job, the job's subscriber list no longer references the closed outbox.
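
One possible shape for detach_session, as a sketch under stated assumptions rather than the actual SubscriptionManager: the class name, subscribe, subscribers, and SketchOutbox are hypothetical, and rows are assumed to be [session_id, pattern, queue] as the fanout excerpt suggests. When close_session should call it relative to the resume window is deliberately left open here.

```ruby
# Hypothetical stand-in for Async::Queue.
class SketchOutbox
  attr_reader :items

  def initialize
    @items = []
  end

  def enqueue(item)
    @items << item
  end
end

# Hypothetical stand-in for SubscriptionManager; only detach_session is the proposal.
class SketchSubscriptionManager
  def initialize
    @mutex = Mutex.new
    # Assumed row shape: [session_id, pattern, queue]
    @subs = Hash.new { |hash, job_id| hash[job_id] = [] }
  end

  def subscribe(job_id, session_id, pattern, queue)
    @mutex.synchronize { @subs[job_id] << [session_id, pattern, queue] }
  end

  # Remove every row owned by session_id, across all jobs.
  # Returns the number of rows removed.
  def detach_session(session_id)
    @mutex.synchronize do
      removed = 0
      @subs.each_value do |rows|
        before = rows.size
        rows.reject! { |sid, _pattern, _queue| sid == session_id }
        removed += before - rows.size
      end
      # Drop jobs that no longer have any subscribers.
      @subs.delete_if { |_job_id, rows| rows.empty? }
      removed
    end
  end

  def fanout(job_id, envelope)
    # fetch avoids creating an empty entry for unknown jobs.
    targets = @mutex.synchronize { @subs.fetch(job_id, []).dup }
    targets.each { |_s, _p, q| q.enqueue(envelope) }
  end

  # Test helper: session ids currently subscribed to job_id.
  def subscribers(job_id)
    @mutex.synchronize { @subs.fetch(job_id, []).map(&:first) }
  end
end

manager = SketchSubscriptionManager.new
closed_outbox = SketchOutbox.new
live_outbox = SketchOutbox.new
manager.subscribe("job-1", "session-a", "*", closed_outbox)
manager.subscribe("job-1", "session-b", "*", live_outbox)

manager.detach_session("session-a") # what close_session would call
manager.fanout("job-1", :event)

puts closed_outbox.items.size # 0
puts live_outbox.items.size   # 1
```

Doing the scan under the same mutex that fanout takes keeps the snapshot in fanout consistent: an in-flight fanout may still deliver one event to the closing session, but no later call will.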

Acceptance criteria

  • After a session closes, its outbox is removed from every @subs[job_id].
  • fanout for a still-running job does not enqueue into a closed session's queue.
  • Resume still re-attaches the new actor's outbox correctly.
