fix: tight loop in RPC mode event stream reconnect#1789
fix: tight loop in RPC mode event stream reconnect#1789
Conversation
* the in-process sync stream has a spec-compliant delay of maxStreamBackoff, but the RPC does not * this backoff is crucial to prevent tight loops sometimes cauesd by immediate error responses, often from proxies * this adds the same backoff to RPC, plus tests, and adds seme naming improvements to existing tests Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
There was a problem hiding this comment.
Code Review
This pull request implements a non-blocking backoff strategy for the gRPC event stream in RpcResolver to mitigate tight retry loops during failures. It introduces a ScheduledExecutorService for managing delayed retries and updates the stream observation logic to handle interruptions. The review feedback recommends utilizing the new scheduler for the initial stream observation to ensure consistent thread management and proper cleanup during shutdown.
| Thread listener = new Thread(this::observeEventStream); | ||
|
|
||
| listener.setDaemon(true); | ||
| listener.start(); |
There was a problem hiding this comment.
Instead of creating a manual thread, consider using the existing retryScheduler to execute the initial observeEventStream call. This ensures that the thread is managed by the scheduler and will be properly interrupted when shutdownNow() is called in the shutdown() method, adhering to the general rule for worker threads blocking on queue operations.
| Thread listener = new Thread(this::observeEventStream); | |
| listener.setDaemon(true); | |
| listener.start(); | |
| retryScheduler.execute(this::observeEventStream); |
References
- To shut down a worker thread that blocks on a queue operation (e.g., queue.take()), use thread interruption. The blocking call will throw an InterruptedException, allowing for a more robust and self-contained shutdown logic compared to using a simple boolean flag.
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
FYI: a lot of the diff in non-test code is whitespace