[debugger] Add compliance tests for probe when condition errors#7018
Open
watson wants to merge 1 commit into
Open
[debugger] Add compliance tests for probe when condition errors#7018watson wants to merge 1 commit into
when condition errors#7018watson wants to merge 1 commit into
Conversation
Contributor
|
|
Contributor
|
P403n1x87
approved these changes
May 26, 2026
c2ee329 to
3717e1d
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3717e1dc86
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
3717e1d to
81ac26c
Compare
Add system tests that pin down the contract for two distinct failure paths when a dynamic-instrumentation probe's `when` condition can't be evaluated: * Invalid DSL at install time: the probe must be rejected and the tracer must emit a `status=ERROR` diagnostic so the developer knows the probe was dropped. No snapshot may be produced. * Structurally valid `when` that raises at runtime: the probe must install successfully and, when the condition raises, the tracer must emit a snapshot with `evaluationErrors[]` populated (mentioning the failing symbol) and NO captured data. These eval-error snapshots must also be rate-limited so a hot probe doesn't spam the user. The tests assert the ideal behavior; existing per-tracer deviations are recorded via manifest `bug` / `missing_feature` / `irrelevant` markers. Co-authored-by: Cursor <cursoragent@cursor.com>
81ac26c to
f7e3186
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Motivation
Tracers currently diverge in how they handle a probe whose
whencondition can't be evaluated - some silently drop the probe, some install it and then emit a snapshot with no eval error, some have no rate-limiting, etc.This PR pins down the contract we want every tracer to follow and gives us a clear baseline to drive each tracer toward conformance.
There is no existing RFC for the eval-error rate-limit policy. This PR uses the system-test as the cross-tracer commitment: at most 1 eval-error snapshot per probe per 5 minutes. The rate matches
dd-trace-py's implementation; the choice is deliberate (one snapshot is enough to alert the developer that a probe is misconfigured, anything more frequent is noise). Tracers that emit more frequently are tracked as bugs and will need to either adopt this rate or push back on the test.Changes
Two new compliance test classes in
tests/debugger/test_debugger_condition_errors.py:Test_Debugger_Invalid_Condition_DSL(1 test)test_invalid_condition_dsl_probe_rejected- a probe whosewhenDSL cannot be parsed must be rejected with astatus=ERRORdiagnostic and must not emit any snapshot.Test_Debugger_Runtime_Condition_Error(3 tests)test_runtime_condition_error_probe_installs- a structurally validwhenmust install successfully (the runtime failure happens later).test_runtime_condition_error_emits_error_only_snapshot- when the condition raises at eval time, the tracer must emit a snapshot withevaluationErrors[]populated (mentioning the unbound symbol) and an emptycapturesfield (no leaking of captured probe data).test_runtime_condition_error_rate_limited- two probe hits >1s apart must produce at most 1 eval-error snapshot per probe (i.e. the 1/5min commitment). The >1s spacing ensures each hit clears any general per-second snapshot sampler, so anything dropping the second snapshot must be the eval-error-specific rate limiter doing its job.Supporting changes:
probe_invalid_condition_dsl.json/probe_runtime_condition_error.json.method_and_language_to_line_number(tests/debugger/utils.py) registering the line number for the Node.jsLogProbeweblog method. The in-test conversion from method-probe to line-probe for Node.js is kept strictly local to this test file (with a docstring explaining why it's unsafe to generalize: method-probe -> line-probe is not a generally equivalent rewrite, and these tests only get away with it because they assert on diagnostics, not on captured data).SchemaBug(...)entries intests/schemas/test_schemas.pycovering a pre-existing schema-compliance gap indd-trace-pythat this PR's new tests surfaced for the first time (Python emitsstack: nullin eval-error snapshots; the schema requires an array). The bug existed but had never been triggered because no prior system-test exercised the eval-error code path. Tracked under the Python bug ticket.Manifest updates (compliance markers, not characterization):
agent.yml-irrelevantfor Go DI ondatadog-agent < 7.77.1(thewhenclause requires that agent version on Go). Scoped viaweblog_declarationto the Go weblogs only so the constraint doesn't accidentally apply to non-Go runs against an old agent (addressed in review).golang.yml,java.yml,nodejs.yml,php.yml,python.yml,ruby.yml,dotnet.yml- per-tracerbugormissing_featuremarkers for tests that don't yet conform. Bug tickets are placeholders to be replaced by each tracer team. Note: under the 1/5min rate-limit commitment above, all ofjava,nodejs,ruby,php,dotnetneed a rate-limitbugmarker (their current implementations are at 1/s or per-probe-configurable, not 1/5min).Workflow
🚀 Once your PR is reviewed and the CI green, you can merge it!
🛟 #apm-shared-testing 🛟
Reviewer checklist
tests/ormanifests/is modified ? I have the approval from R&P teamutils/-adjacent change is a single new entry inmethod_and_language_to_line_numberintests/debugger/utils.py(purely test-helper code).build-XXX-imagelabel is present