Skip to content

fix(transport): forward connection-resilience kwargs to the PEL reclaimer#243

Draft
tsiaeggai wants to merge 1 commit into
mainfrom
fix/reclaimer-connection-resilience
Draft

fix(transport): forward connection-resilience kwargs to the PEL reclaimer#243
tsiaeggai wants to merge 1 commit into
mainfrom
fix/reclaimer-connection-resilience

Conversation

@tsiaeggai

@tsiaeggai tsiaeggai commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

The PEL reclaimer runs its own background Redis client, created with
aioredis.from_url(url, decode_responses=False) and nothing else — so it
ignored whatever connection settings the transport was given. Even when the
broker was configured with socket_timeout / socket_keepalive /
health_check_interval, the reclaimer's client had none of them. When the
underlying connection was silently dropped (cloud Redis failover, an
idle-connection reaper, a network blip), the reclaimer's blocking reads had no
socket timeout or keepalive to break them and hung indefinitely, with no path
to recovery short of a process restart.

This PR forwards the transport's connection-resilience kwargs to the reclaimer's
client so both connections recover the same way. The forwarded keys are a
whitelist of redis-py connection/resilience options valid for
aioredis.from_url; broker/FastStream-only kwargs (decoder, middlewares,
asyncapi_*, …) are intentionally excluded. decode_responses stays pinned to
False for binary passthrough and cannot be overridden by callers.

The change is opt-in and fully backward compatible: when no connection kwargs
are passed (or a pre-built broker= is supplied), the reclaimer behaves exactly
as before.

Closes #244

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement
  • CI/CD update
  • Dependency update

Related Issue

Fixes #244

Changes Made

  • RedisTransport.__init__ captures connection-resilience kwargs from **kwargs into self._connection_kwargs via a _RECLAIMER_CONNECTION_KEYS whitelist (socket_timeout, socket_connect_timeout, socket_keepalive, socket_keepalive_options, health_check_interval, retry_on_timeout, retry_on_error, max_connections, ssl_*). Copied, not popped — the broker still receives them too.
  • _setup_reclaimer passes connection_kwargs=self._connection_kwargs to PendingReclaimerManager.
  • PendingReclaimerManager.__init__ accepts connection_kwargs; start() applies them in from_url, with decode_responses=False pinned last so callers cannot break binary passthrough.
  • Added test_connection_kwargs_propagate_to_reclaimer (transport → manager propagation, broker-only kwargs excluded) and test_reclaimer_start_applies_connection_kwargs (kwargs reach from_url; decode_responses forced to False).
  • Updated sdk/CHANGELOG.md under [Unreleased] → Fixed.

Migration

None — fully backward compatible. No on-wire or API changes; the reclaimer only
gains the connection settings already passed to the transport.

Testing

  • Existing tests pass locally (make test)
  • Added new tests for new functionality

Local run against a Redis 7 container: 48 passed including the 2 new tests. Two
pre-existing failures (test_backoff_*) are unrelated to this change — a local
faststream-version mismatch (BinaryMessageFormatV1.encode not awaitable) that
also fails on the clean base commit.

Changelog

  • I have updated sdk/CHANGELOG.md under [Unreleased] section (required for code changes)

Checklist

  • My code follows the project's code style (make lint passes)
  • My code is properly formatted (make format applied)
  • I have added/updated tests that prove my fix/feature works
  • I have added/updated documentation as needed
  • All tests pass locally
  • I have reviewed my own code
  • My changes generate no new warnings
  • My commit messages follow the Conventional Commits standard

…imer

The reclaimer created its own Redis client with no socket timeout or
keepalive regardless of the settings passed to the transport, so a
silently dropped connection (cloud Redis failover, idle-connection
reaper) left its blocking reads hung indefinitely with no way to
recover.

Forward the broker's connection-resilience kwargs (socket_timeout,
socket_keepalive, health_check_interval, retry_on_timeout, ssl_*, …) to
PendingReclaimerManager so its independent client recovers the same way
the broker does. decode_responses stays pinned to False for binary
passthrough and cannot be overridden by callers.
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

QualOps Code Quality Analysis

Status: ✅ PASSED - No issues found

Summary

  • Total Issues: 0
  • Critical: 0 🔴
  • High: 0 🟠
  • Medium: 0 🟡
  • Low: 0 🟢
  • Files Analyzed: 0

No issues found in the analyzed code.

📊 Full Report

View detailed report


Powered by QualOps

@tsiaeggai tsiaeggai marked this pull request as draft June 10, 2026 05:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Redis transport: PEL reclaimer ignores connection-resilience settings and hangs on a silently dropped connection

2 participants