Skip to content

feat(streaming): add max_errors error-recovery parameter to streaming helpers (#b854ab)#45

Draft
KhaledSalhab-Develeap wants to merge 3 commits into
feat/6bdf93-async-streaming-helpersfrom
feat/b854ab-streaming-max-errors
Draft

feat(streaming): add max_errors error-recovery parameter to streaming helpers (#b854ab)#45
KhaledSalhab-Develeap wants to merge 3 commits into
feat/6bdf93-async-streaming-helpersfrom
feat/b854ab-streaming-max-errors

Conversation

@KhaledSalhab-Develeap

@KhaledSalhab-Develeap KhaledSalhab-Develeap commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds a max_errors parameter to stream_alerts and stream_incident_updates that tolerates N consecutive poll-cycle failures (with exponential backoff) before re-raising. Baseline and seen-set state are preserved across transient outages. Auth and rate-limit errors bypass tolerance and re-raise immediately. Implements Approach A from SPIKE-2d10f5.

What changed

File Change
src/hyperping/_protocols.py Added retry_config: RetryConfig attribute declaration to _AsyncClientProtocol (TYPE_CHECKING import to avoid circular)
src/hyperping/_async_streaming_mixin.py Both generators: `max_errors: int
tests/unit/test_streaming.py 9 new tests (T1-T9) in TestStreamErrorRecovery; updated exception imports

Design reasoning

The streaming max_errors layer sits above _request's per-request retry budget: _request handles transient HTTP failures; max_errors tolerates repeated poll-cycle failures after that budget is exhausted. Backoff reuses retry_config.initial_delay, backoff_factor, and max_delay so the behavior stays consistent with existing retry configuration. Inlining the try/except in each generator (rather than a shared helper) keeps the control flow readable without awkward callbacks for a ~12-line block.

Verification matrix

Check Result
uv run pytest tests/unit/test_streaming.py -v 26 passed
uv run pytest tests/ --ignore=tests/unit/test_otel.py -q 589 passed, 94.10% coverage
uv run mypy src/hyperping/_async_streaming_mixin.py src/hyperping/_protocols.py Success: no issues found
uv run ruff check src/ tests/unit/test_streaming.py All checks passed
uv run ruff format --check src/ tests/unit/test_streaming.py All formatted
Import smoke test ok
Pre-existing OTel failure (test_otel.py::TestSyncClientSpans::test_request_creates_span) Pre-existing on base branch; unrelated to this change

Acceptance criteria

  • max_errors parameter added to stream_alerts and stream_incident_updates with default 3
  • Consecutive poll-cycle failures within budget are tolerated (baseline/seen state preserved)
  • Generator raises after exceeding the error budget
  • HyperpingAuthError and HyperpingRateLimitError bypass tolerance and re-raise immediately
  • max_errors=None provides infinite tolerance
  • Error counter resets after a successful poll
  • Exponential backoff uses retry_config parameters; delays are capped at max_delay
  • All 9 TDD tests pass; full suite 589 passed

Follow-up items

  1. stream_incident_updates auth/rate-limit bypass tests (medium). The except (HyperpingAuthError, HyperpingRateLimitError): raise block inside stream_incident_updates (_async_streaming_mixin.py:176-177) has zero test coverage (only uncovered line in the file). Add test_stream_incident_updates_reraises_auth_error_immediately and test_stream_incident_updates_reraises_rate_limit_error_immediately to TestStreamErrorRecovery, symmetric to the existing stream_alerts tests.

  2. Symmetric error-recovery tests for stream_incident_updates (low). The max_errors=None infinite-tolerance, counter-reset-on-success, and backoff-progression tests exist only for stream_alerts. The two generators use identical error-handling code, so the risk is low, but coverage should be symmetric to guard against future divergence.

  3. HyperpingNotFoundError absorption after first poll (low). HyperpingNotFoundError is a subclass of HyperpingAPIError and is therefore absorbed by the error counter in stream_incident_updates. If an incident is deleted after streaming starts, the caller does not learn of the deletion until max_errors cycles later. The docstring implies HyperpingNotFoundError is only raised "on the first poll," but this is not enforced. Document the behavior explicitly or add it to the bypass list in a follow-up.

…ter (#b854ab)

Covers: budget tolerance, budget exhaustion, auth/rate-limit bypass,
counter reset on success, infinite tolerance (max_errors=None), and
exponential backoff verification for both stream_alerts and
stream_incident_updates.
…tream_incident_updates (#b854ab)

Both generators now accept max_errors (default 3) to tolerate N consecutive
poll-cycle failures with exponential backoff before re-raising. Backoff uses
retry_config.initial_delay/backoff_factor/max_delay for consistency with the
per-request retry layer. HyperpingAuthError and HyperpingRateLimitError
bypass tolerance and re-raise immediately. Adds retry_config attribute
declaration to _AsyncClientProtocol so mypy can verify access.
@ksalhab89

Copy link
Copy Markdown

Reviewer context (not a merge request):

Adds a max_errors parameter (default 3) to AsyncStreamingMixin.stream_alerts and stream_incident_updates for poll-cycle error recovery above the per-request retry budget.

Where to focus review: src/hyperping/_async_streaming_mixin.py. The new try/except wraps each poll: HyperpingAuthError and HyperpingRateLimitError re-raise immediately; HyperpingAPIError increments consecutive_errors and backs off using retry_config.initial_delay * backoff_factor ** (n-1) capped at max_delay. Verify the counter resets to 0 only after a fully successful cycle, and that the continue correctly skips the trailing poll_interval sleep. Also check _protocols.py now declares retry_config: RetryConfig on _AsyncClientProtocol.

Risks / verify: Backoff reuses request-level retry config, so total delay compounds with _request's own retries; confirm that is intended. max_errors=None means unlimited tolerance (silent infinite retry). Tests in test_streaming.py cover budget, auth, and rate-limit bypass.

CI status: No checks triggered (base is feat/b854ab..., not main). Not a failure.

Notes: Stacked on #39. Looks safe and well-tested.

Brings in main (via the async-streaming base) and the removal of the premature
OTel files. This PR is scoped to the streaming max_errors helper; OTel is
delivered solely by the PY-11 PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants