Add FFE first remote config request test #7023

Draft
leoromanovsky wants to merge 5 commits into main from leo.romanovsky/ffl-2339-ffe-first-rc-system-tests

leoromanovsky (Contributor) commented May 27, 2026

Motivation

A customer's Agent can run for a long time before a tracer with Feature Flags enabled starts. In that situation the Agent may not have an FFE_FLAGS cache yet, so the tracer needs to advertise FFE on its first Remote Config request; the Agent can then use the new-client backend fetch path immediately. If FFE_FLAGS only appears on a later poll, the application can keep serving default flag values during startup.

There is a second startup shape we also want covered: the app starts while the Agent/RC endpoint is unavailable, then the Agent comes online after the tracer has already made its first request. In that case the provider must recover without an app restart once FFE_FLAGS is delivered.

We observed different behavior in the Java client and corrected it in DataDog/dd-trace-java#11465. The Go tracer is the reference shape. It subscribes to FFE during tracer Remote Config startup: https://github.com/DataDog/dd-trace-go/blob/3ded6653e44aeb0d27bd5944e1e8033775473768/ddtrace/tracer/remote_config.go#L508-L512. Its OpenFeature bridge registers FFE_FLAGS with the FFE capability: https://github.com/DataDog/dd-trace-go/blob/3ded6653e44aeb0d27bd5944e1e8033775473768/internal/openfeature/rc_subscription.go#L42-L71.

Changes

This adds shared FFE system-test coverage under the existing FEATURE_FLAGGING_AND_EXPERIMENTATION scenario.

Test_FFE_First_Remote_Config_Request checks the first tracer /v0.7/config request captured by the library interface. It requires the client product list to include FFE_FLAGS, and it requires the advertised capabilities to include FFE_FLAG_CONFIGURATION_RULES. The test intentionally has no setup call to /ffe, so it checks the startup Remote Config subscription, not a subscription that appears only after the first flag-evaluation endpoint is hit.
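
As a rough illustration of the assertion described above, the check can be sketched in Python. This is a minimal sketch and not the code in tests/ffe/test_dynamic_evaluation.py. It assumes the request body carries client.products as a list of strings and client.capabilities as a base64-encoded big-endian bitfield. The function names and the bit index are hypothetical placeholders.

```python
import base64

# Hypothetical bit index, for illustration only. The real index of
# FFE_FLAG_CONFIGURATION_RULES is defined by the Remote Config capability list.
ASSUMED_FFE_CAPABILITY_BIT = 46


def encode_capabilities(*bits: int) -> str:
    """Build a base64 capabilities string with the given bit indices set."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    length = max(1, (value.bit_length() + 7) // 8)
    return base64.b64encode(value.to_bytes(length, "big")).decode("ascii")


def first_request_advertises_ffe(payload: dict, capability_bit: int) -> bool:
    """Return True if the payload lists FFE_FLAGS and sets the capability bit."""
    client = payload.get("client", {})
    if "FFE_FLAGS" not in client.get("products", []):
        return False
    raw = client.get("capabilities", "")
    # Assumption: capabilities is a base64 string of a big-endian bitfield.
    value = int.from_bytes(base64.b64decode(raw), "big")
    return bool((value >> capability_bit) & 1)


# Example first request, built by hand rather than captured from a tracer.
first_request = {
    "client": {
        "products": ["APM_TRACING", "FFE_FLAGS"],
        "capabilities": encode_capabilities(ASSUMED_FFE_CAPABILITY_BIT),
    }
}
assert first_request_advertises_ffe(first_request, ASSUMED_FFE_CAPABILITY_BIT)
```

Under these assumptions, a payload whose products omit FFE_FLAGS, or whose bitfield lacks the capability bit, fails the check. That is the condition the test is meant to catch on the first request rather than on a later poll.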

Test_FFE_RC_Down_Then_Up makes the recovery case obvious. The proxy first returns 503 for /v0.7/config; the test waits until the tracer sees that failure and verifies a flag evaluation returns the in-code default. Then the test restores Remote Config by publishing an FFE_FLAGS config and evaluates the same flag again. A recovered provider returns the delivered value, on; a provider stuck in the startup error state keeps returning the default and fails the test.
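
The down-then-up sequence can be modeled with a small self-contained simulation. This is a sketch of the expected behavior only: FakeRemoteConfig and ToyProvider are hypothetical stand-ins invented for illustration, not the system-tests proxy or any SDK provider, and the flag key and values are made up.

```python
from http import HTTPStatus


class FakeRemoteConfig:
    """Hypothetical stand-in for the proxied /v0.7/config endpoint."""

    def __init__(self):
        self.flags = None  # None models the outage (503 responses)

    def poll(self):
        if self.flags is None:
            return HTTPStatus.SERVICE_UNAVAILABLE, {}
        return HTTPStatus.OK, dict(self.flags)


class ToyProvider:
    """Hypothetical provider that keeps polling and recovers when config arrives."""

    def __init__(self, rc):
        self.rc = rc
        self.flags = {}

    def poll(self):
        status, flags = self.rc.poll()
        if status == HTTPStatus.OK:
            self.flags = flags  # a failed startup poll must not block later updates
        return status

    def evaluate(self, key, default):
        # With no delivered config, fall back to the in-code default.
        return self.flags.get(key, default)


rc = FakeRemoteConfig()
provider = ToyProvider(rc)

# Phase 1: Remote Config is down, so evaluation returns the in-code default.
assert provider.poll() == HTTPStatus.SERVICE_UNAVAILABLE
assert provider.evaluate("example-flag", "off") == "off"

# Phase 2: config is published, and the same provider instance recovers.
rc.flags = {"example-flag": "on"}
assert provider.poll() == HTTPStatus.OK
assert provider.evaluate("example-flag", "off") == "on"
```

The point of the model is that both phases run against the same provider instance. A provider that latched its startup error would still return "off" in phase 2, which is the failure the test is written to detect.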

The manifests now keep the new assertions active where the behavior is implemented and mark observed gaps explicitly. Java is gated to v1.63.0-SNAPSHOT. Python recovery is gated to v4.11.0-dev, while Python first-request subscription remains marked as bug (FFL-2339) until the dd-trace-py fix lands. Dotnet recovery is also marked as bug (FFL-2339) because the SDK currently returns null instead of the in-code default while RC is unavailable.

Decisions

This is intentionally a cross-language system test, not a Java-only assertion. The bug shape is about the tracer/provider Remote Config contract: FFE must be subscribed early enough, and a provider that starts before config is available must still recover when config arrives later.

The recovery test simulates the Agent/RC outage through the system-tests proxy instead of physically stopping and starting the Agent container. That keeps the test focused on the behavior SDKs can control: first RC fetch fails, a later RC payload succeeds, and evaluations switch from defaults to the delivered flag value without restarting the app.

github-actions Bot (Contributor) commented May 27, 2026

CODEOWNERS have been resolved as:

manifests/dotnet.yml                                                    @DataDog/apm-dotnet @DataDog/asm-dotnet
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java
manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
tests/ffe/test_dynamic_evaluation.py                                    @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core

datadog-prod-us1-5 Bot commented May 27, 2026

⚠️ Warnings

🚦 2 Pipeline jobs failed

Testing the test | System Tests (java, dev) / End-to-end #1 / spring-boot 1

1 failed test due to AssertionError: Default evaluation failed: None

🧪 1 Test failed

tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_Then_Up.test_ffe_rc_down_then_up_recovers[spring-boot] from system_tests_suite
AssertionError: Default evaluation failed: None
assert None == 200
 +  where None = HttpResponse(status_code:None, headers:{}, text:None).status_code
 +    where HttpResponse(status_code:None, headers:{}, text:None) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_Then_Up object at 0x7f76a4b23320>.default_eval

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_Then_Up object at 0x7f76a4b23320>

    def test_ffe_rc_down_then_up_recovers(self):
        assert self.config_request_data is not None, "No /v0.7/config 503 response was captured"
        assert self.config_request_data["response"]["status_code"] == HTTPStatus.SERVICE_UNAVAILABLE, (
...

Testing the test | all-jobs-are-green

CI checks failed during execution of end-to-end tests for spring-boot.

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

Commit SHA: 022c28a

aarsilv left a comment

💪
