Add FFE exposure emission #3910
Benchmarks [ tracer ]
Benchmark execution time: 2026-05-24 04:03:53
Comparing candidate commit 8eedb0e in PR branch
Found 0 performance improvements and 7 performance regressions! Performance is the same for 187 metrics, 0 unstable metrics.
- scenario:MessagePackSerializationBench/benchMessagePackSerialization
- scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache
- scenario:SamplingRuleMatchingBench/benchRegexMatching1
- scenario:SamplingRuleMatchingBench/benchRegexMatching2
- scenario:SamplingRuleMatchingBench/benchRegexMatching3
- scenario:SamplingRuleMatchingBench/benchRegexMatching4
- scenario:TraceSerializationBench/benchSerializeTrace
The canonical fixture PHPT explicitly enumerates the FFE classes it requires before instantiating the Datadog FeatureFlags client. The shared evaluation-completed envelope/hook added on this branch made Client::createWithDependencies() reference NoopEvaluationCompletedHook, EvaluationCompletedHook, and EvaluationCompleted, but the test helper had not been updated, so the packaged/extension PHPT failed with "Class DDTrace\FeatureFlags\Internal\NoopEvaluationCompletedHook not found" before any fixture case ran. Add the three new files to require_feature_flag_api so the PHPT matches the runtime class graph used by Client::evaluate().
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 92ef9a34dc
Adds Mermaid sources and rendered PNGs for the hook (this) PR plus a README documenting the regeneration workflow.
- `docs/php-ffe-stack/stack-pr3909.mmd` + `.png`: 4-PR stack with this PR highlighted (M1 done; EVP and metrics as siblings to come).
- `docs/php-ffe-stack/system-pr3909.mmd` + `.png`: target system architecture; this PR contributes the EvaluationCompletedHook + OpenFeature provider hook surface. All downstream nodes (writers, sidecar FFI, sidecar process, backends) marked future.
- `docs/php-ffe-stack/README.md`: npx invocation for regenerating PNGs locally; PR-by-PR diagram table; architectural rule note.
The architectural rule encoded in the system diagram (all I/O via the libdatadog sidecar) is the same rule Bob applied to PR #3910. See DataDog/libdatadog#2026 for the sidecar-side support.
Per Bob's PR review (2026-05-22), the tracer extension must perform no I/O outside the sidecar. Replaces the raw-socket `AgentExposureTransport` with `SidecarExposureTransport`, which forwards exposure batches to the libdatadog sidecar via a new native PHP function `\DDTrace\send_ffe_exposures` that calls the `ddog_sidecar_send_ffe_exposures` FFI added in DataDog/libdatadog#2026.
PHP side:
- Delete `Internal/Exposure/AgentExposureTransport.php` (raw socket POST to the Agent EVP proxy).
- Add `Internal/Exposure/SidecarExposureTransport.php` that JSON-encodes the batch and calls `\DDTrace\send_ffe_exposures()`. Fire-and-forget; the sidecar handles retries.
- Update `ExposureWriter::createDefault()` to instantiate the sidecar transport.
- Drop the obsolete `testAgentTransportBuildsAgentEvpRequest` PHPUnit test (HTTP construction now lives in libdatadog, covered by `cargo test -p datadog-sidecar ffe_flusher`).
- Add `Internal/DefaultEvaluationCompletedHook` and `Internal/CompositeEvaluationCompletedHook` so production callers go through a composite hook factory. In this PR the composite contains only `ExposureHook`; the metrics PR (#3911) contributes `EvaluationMetricHook` and the file conflict at merge resolves by combining both. Update `Client::create()` to call `DefaultEvaluationCompletedHook::create()`.
C/Rust bridge:
- Declare `ddog_ByteSlice` (and underlying `ddog_Slice_U8`) in `components-rs/common.h` for the metrics path; declare both `ddog_sidecar_send_ffe_exposures` and `ddog_sidecar_send_ffe_metrics` in `components-rs/sidecar.h`.
- Add C wrappers `ddtrace_sidecar_send_ffe_exposures(zend_string *)` and `ddtrace_sidecar_send_ffe_metrics(zend_string *endpoint, zend_string *payload_bytes)` in `ext/sidecar.{h,c}` that call the FFI with the current sidecar transport + instance id + queue id.
- Declare native PHP functions `\DDTrace\send_ffe_exposures(string): bool` and `\DDTrace\send_ffe_metrics(string, string): bool` in `ext/ddtrace.stub.php`; add corresponding arginfo entries and `ZEND_FUNCTION` registrations in `ext/ddtrace_arginfo.h`; implement `PHP_FUNCTION(DDTrace_send_ffe_exposures)` and `PHP_FUNCTION(DDTrace_send_ffe_metrics)` in `ext/ddtrace.c`.
- Bump `libdatadog` submodule to FFE branch tip `29762335c` (which provides both FFIs). The submodule will be bumped to the libdatadog main commit once #2026 merges.
Docs:
- Add `docs/php-ffe-stack/{stack,system}-pr3910.{mmd,png}` for this PR.
Validation:
- `php vendor/bin/phpunit --config phpunit.xml tests/api/Unit/FeatureFlags` → 41 tests, 174 assertions, OK.
- libdatadog sidecar tests (`cargo test -p datadog-sidecar ffe_flusher`) → 3 passed, on the pinned submodule commit.
- Mermaid PNGs regenerate via `npx @mermaid-js/mermaid-cli`.
`make test_featureflags` and `make test_c TESTS=tests/ext/ffe/...` will run in CI; running them locally requires rebuilding the extension, which is gated behind libdatadog #2026 merging.
Adds the M3 evaluation-metrics layer on top of the hook PR (#3909) as a sibling of the EVP exposures PR (#3910). Records `feature_flag.evaluations` for both PHP 7 (DD Client hook) and PHP 8 (OpenFeature SDK hook); both paths share `EvaluationMetricHook::sharedWriter()` for unified aggregation. OTLP/protobuf payloads are encoded in PHP via the existing `OtlpMetricEncoder` and delivered to the user-configured OTLP HTTP metrics intake through the libdatadog sidecar (`ddog_sidecar_send_ffe_metrics` FFI added in DataDog/libdatadog#2026).
This branch is force-pushed (user-authorized one-time exception to the no-force-push rule, 2026-05-23) to restructure history away from being linearly stacked on the M2 exposures PR (#3910). The PR now stacks directly on the hook PR (#3909) as a sibling of the EVP PR.
PHP side:
- Add `Internal/Metric/EvaluationMetricWriter` with bounded series aggregation, drop accounting, and shutdown flush.
- Add `Internal/Metric/EvaluationMetricHook` (DD Client hook) and `OtlpMetricEncoder` (PHP 7-safe protobuf encoding).
- Add `Internal/Metric/SidecarOtlpMetricsTransport` that calls `\DDTrace\send_ffe_metrics()` (FFI declared in #3910). Endpoint resolution: `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT`, falling back to `OTEL_EXPORTER_OTLP_ENDPOINT + /v1/metrics`, default `http://localhost:4318/v1/metrics`.
- Add `DDTrace\OpenFeature\EvalMetricsHook` implementing `OpenFeature\interfaces\hooks\Hook` (after + error stages), registered on `DataDogProvider` via `setHooks()`.
- `DataDogProvider` constructs its internal DD `Client` with `DefaultEvaluationCompletedHook::createWithoutMetric()` so the OpenFeature path records the metric via the OpenFeature hook (PR 3911 scope) and NOT via the DD Client hook, preventing double-counting. PHP 7 path keeps recording via the DD Client hook.
- Add `Internal/CompositeEvaluationCompletedHook` and `Internal/DefaultEvaluationCompletedHook` (metric-only composite). This is the merge-conflict point with PR #3910's `[ExposureHook]` composite; the second merge resolves by combining both hooks.
- Update `Client::create()` to call `DefaultEvaluationCompletedHook::create()`.
- Drop the obsolete `testOtlpTransportBuildsHttpProtobufRequest` PHPUnit test (HTTP construction now lives in libdatadog, covered by `cargo test -p datadog-sidecar ffe_metrics_flusher`).
- Add `_files_openfeature.php` entry for `EvalMetricsHook.php`.
C/Rust bridge: the `\DDTrace\send_ffe_metrics()` native function, its C wrapper `ddtrace_sidecar_send_ffe_metrics()`, and the `ddog_sidecar_send_ffe_metrics` FFI declaration in `components-rs/sidecar.h` were already added in #3910. This PR's branch picks up those changes once #3910 merges (or via the same libdatadog submodule pin during review). For local development the libdatadog submodule is pinned to the FFE branch tip (`29762335c`).
Docs:
- Add `docs/php-ffe-stack/{stack,system}-pr3911.{mmd,png}` per the 4-PR documentation convention.
Validation:
- `php vendor/bin/phpunit --config phpunit.xml tests/api/Unit/FeatureFlags` → 40 tests, 160 assertions, OK.
- Mermaid PNGs regenerate via `npx @mermaid-js/mermaid-cli`.
`make test_featureflags`, OpenFeature PHPUnit, and ffe-dogfooding end-to-end validation will run in CI / are validated separately by FOLLOW-05 Steps 4–5.
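The endpoint resolution order named in this commit message (`OTEL_EXPORTER_OTLP_METRICS_ENDPOINT`, then `OTEL_EXPORTER_OTLP_ENDPOINT` plus `/v1/metrics`, then the localhost default) can be sketched as follows. This is an illustrative Python sketch, not the PHP code in `SidecarOtlpMetricsTransport`; the function name, the whitespace trimming, and the trailing-slash handling are assumptions that the commit message does not confirm.

```python
from typing import Mapping

# Default taken from the commit message.
DEFAULT_METRICS_ENDPOINT = "http://localhost:4318/v1/metrics"


def resolve_metrics_endpoint(env: Mapping[str, str]) -> str:
    """Pick the OTLP HTTP metrics endpoint from environment-style settings.

    Hypothetical helper: the name and the trimming rules are assumptions.
    """
    # 1. The signal-specific variable wins and is used verbatim.
    specific = env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT", "").strip()
    if specific:
        return specific

    # 2. Otherwise the generic base endpoint gets the metrics path appended.
    #    Stripping a trailing slash first is an assumption to avoid "//v1".
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip()
    if base:
        return base.rstrip("/") + "/v1/metrics"

    # 3. Nothing configured: fall back to the local collector default.
    return DEFAULT_METRICS_ENDPOINT
```

Passing the environment as a mapping (for example `os.environ`) keeps the helper easy to exercise without touching process state.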
The C/Rust bridge for the new native PHP functions `\DDTrace\send_ffe_exposures()` and `\DDTrace\send_ffe_metrics()` lives on the M2 EVP exposures PR (#3910) because that PR introduced the bridge when refactoring the exposure transport. PR #3911 (this PR) needs the same bridge for its OTLP metrics transport; without it the `SidecarOtlpMetricsTransport` silently drops batches because `function_exists('\\DDTrace\\send_ffe_metrics')` is false.
Adds the same bridge files here so the M3 branch is independently compilable. At merge time the two PRs will conflict at the file level on these bridge files; resolution is deduplication (the bridge is identical in both PRs by design).
Files added/modified:
- `components-rs/sidecar.h`: declares `ddog_sidecar_send_ffe_exposures` and `ddog_sidecar_send_ffe_metrics` FFIs.
- `components-rs/common.h`: declares `ddog_ByteSlice` typedef for the metrics payload.
- `ext/sidecar.h`, `ext/sidecar.c`: C wrappers `ddtrace_sidecar_send_ffe_exposures()` and `ddtrace_sidecar_send_ffe_metrics()`.
- `ext/ddtrace.stub.php`, `ext/ddtrace_arginfo.h`, `ext/ddtrace.c`: declares the native PHP functions and the `PHP_FUNCTION` implementations.
Pulls in libdatadog commit `875ec8f0e` ("fix(sidecar): dispatch FFE
actions before application-entry check"). Without this fix, the
`SidecarOtlpMetricsTransport::send()` call from PHP would silently
no-op for short-lived processes: the sidecar received the
`FfeMetrics` action but dropped it because the `Entry::Occupied`
gate on the application metadata had not yet fired.
This unblocks the parametric system-test
`Test_Feature_Flag_Parametric_Evaluation_Metrics::test_php_ffe_evaluation_metric`
which exercises the full PHP -> sidecar -> OTLP-HTTP-intake path
end-to-end. Local result: 26/27 FFE-scoped parametric tests pass
(remaining failure is the EVP exposure test, which lives on the
M2 PR #3910 branch).
Pulls in libdatadog commit `875ec8f0e` ("fix(sidecar): dispatch FFE
actions before application-entry check") so the EVP exposure batch sent
via `ddog_sidecar_send_ffe_exposures` is no longer silently dropped
when the PHP runtime hasn't yet registered the application against the
sidecar's `QueueId`. Same fix is on the sibling PR #3911 (`2a48c4987`)
for the OTLP metric path.
Without this submodule bump, `Test_Feature_Flag_Parametric_Exposures::test_php_ffe_exposure_event`
sees zero EVP POSTs at the test-agent because the sidecar's
`enqueue_actions` handler discards the `FfeExposures` action under
the `Entry::Occupied` gate.
…macOS+colima
`build-debug-artifact` produces a `/output` bind mount via `-v
${TMP_OUT}:/output`. On macOS+colima only paths under $HOME are
mounted into the Linux VM; the macOS default `/var/folders/...` temp
dir is not, so writes from inside the container to `/output` land in
the VM and never propagate back to the host. The script exits without
error and the user's binaries dir is left with whatever stale artifact
was there before.
Pin TMP_OUT/TMP_PKG under $OUTPUT_DIR (which the user just passed as
an absolute, usable destination) so the bind mount is always on a
host-visible path. Inherits cleanup via the existing EXIT trap.
The same fix is independently on the M3 branch (#3911) as part of
e74b050; bringing it to the M2 branch (#3910) so both branches are
buildable on macOS without a `TMPDIR=$HOME/...` override.
…gh rez
Same three fixes as the parallel commit on the M3 (PR #3911) branch:
1. Quote the YAML `title:` so the `#PR-number` is not parsed as a comment (previous render had the title truncated at "current =").
2. `flowchart LR` → `flowchart TD` on system diagrams so vertical PHP-process → host → backend lanes stack vertically instead of getting squeezed into one wide row.
3. Render with `-w 2400 -H 2400 --scale 3 -b white` (~1900×2000 stack, ~3000×4600 system) instead of ~600px default.
…gh rez
Same three fixes as on the M2 (#3910) and M3 (#3911) sibling branches:
1. Quote the YAML `title:` so the `#PR-number` survives parsing (otherwise YAML treats the `#` as a comment and the title renders as "PHP FFE 4-PR stack — current =" with the rest missing).
2. `flowchart LR` → `flowchart TD` on the system diagram so the PHP-process / host-sidecar / backend lanes stack vertically.
3. Render at 2400×2400 `--scale 3` instead of ~600px default.
Brings the PHP FFE diagram convention to the M1 PR. Each subsequent PR in the stack (#3909, #3910, #3911) already carried its own stack + system diagram; #3906 was missing them.
Mirrors the format used by the rest of the stack:
- `stack-pr3906.mmd`: the 4-PR stack with #3906 badged as current and the downstream layers shown as "future".
- `system-pr3906.mmd`: the target end-to-end architecture with M1's scope (UserCode, OpenFeature Client, DataDogProvider, DDTrace FeatureFlags Client, NativeEvaluator, Remote Config client) highlighted, and everything from the Hook layer onward dashed.
All conventions match the other branches: quoted YAML titles (to keep `#PR-number` out of the YAML comment parser), `flowchart TD` orientation, rendered with `-w 2400 -H 2400 --scale 3 -b white`.
Until the FFE self-telemetry channel lands, exposure batch drops are counted in an in-memory `$dropped` field that is only observable via `droppedCount()` (used by tests) and not surfaced anywhere in production. That's silent-data-loss territory. Add a one-time `error_log()` breadcrumb on the first drop in a process plus TODO(FFE-self-telemetry) markers at both drop sites (buffer overflow + transport flush failure). The `maybeWarnFirstDrop` helper and the warning go away when the real drop-counter metric is wired up.
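The warn-once behavior described in this commit message can be sketched as follows. This is a minimal Python illustration rather than the PHP `maybeWarnFirstDrop` helper itself: the class name, method names, and log wording are hypothetical, and the injected `log` callable stands in for PHP's `error_log()`.

```python
import sys
from typing import Callable


def _stderr_log(message: str) -> None:
    # Stand-in for PHP's error_log(); writes one line to stderr.
    print(message, file=sys.stderr)


class DropCounter:
    """Counts dropped events and logs a breadcrumb only on the first drop."""

    def __init__(self, log: Callable[[str], None] = _stderr_log) -> None:
        self._log = log
        self._warned = False   # one flag per process, never reset
        self.dropped = 0

    def add(self, count: int, reason: str) -> None:
        """Record `count` dropped events from one drop site."""
        if count <= 0:
            return
        self.dropped += count
        if not self._warned:
            self._warned = True
            # Hypothetical wording; the real message is not in the source.
            self._log(f"FFE exposure events dropped ({reason}); "
                      "further drops in this process are not logged")
```

Both drop sites named in the commit (buffer overflow and transport flush failure) would call `add()`, so the counter keeps growing while the log stays at a single line.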
…silently drop
When the series map reaches `seriesLimit` (default 1000 unique attribute-sets), the writer was returning false and incrementing `dropped` for every new key. In PHP-FPM/Apache that was harmless because `register_shutdown_function` fires per request and the map empties at the end of each request. In long-running PHP runtimes (Swoole, RoadRunner, FrankenPHP/Octane, CLI worker loops) the shutdown function only fires when the worker process exits, so the series map filled once and then every new unique attribute-set was silently dropped for the rest of the worker's lifetime (hours or days).
Flush inline when the cap is hit so the new key fits. Same fix as the parallel commit on the M2 (PR #3910) ExposureWriter.
Test `testSeriesOverflowDropsNewSeriesButKeepsExistingSeries` renamed and rewritten to assert the new auto-flush behavior: `droppedCount() === 0` after three records that would have dropped one under the old code; transport receives three batches instead of one.
Caught by Codex review on the parallel M2 commit.
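The flush-at-cap behavior for the series map can be sketched as follows. This is a simplified Python illustration, not the PHP `EvaluationMetricWriter`: the class and method names are hypothetical, series are reduced to plain counters, and counting a failed batch as dropped is an assumption borrowed from the exposure writer's semantics.

```python
from typing import Callable, Dict, FrozenSet, Tuple

SeriesKey = FrozenSet[Tuple[str, str]]     # one unique attribute-set
Batch = Dict[SeriesKey, int]               # attribute-set -> evaluation count


class SeriesAggregator:
    """Bounded aggregation that flushes inline instead of dropping new keys."""

    def __init__(self, transport: Callable[[Batch], bool], series_limit: int = 1000) -> None:
        self._transport = transport        # returns True on success
        self._series_limit = series_limit
        self._series: Batch = {}
        self.dropped = 0

    def record(self, attributes: Dict[str, str]) -> None:
        key = frozenset(attributes.items())
        # Only a NEW key can exceed the cap; existing keys just aggregate.
        if key not in self._series and len(self._series) >= self._series_limit:
            self.flush()                   # make room so the new key fits
        self._series[key] = self._series.get(key, 0) + 1

    def flush(self) -> bool:
        if not self._series:
            return True
        batch, self._series = self._series, {}   # map empties either way
        ok = self._transport(batch)
        if not ok:
            self.dropped += len(batch)     # assumption: failed series count as dropped
        return ok
```

In a long-running worker the shutdown flush may not run for hours, so the inline flush is what keeps the map from staying full.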
… drop
The default `ExposureHook` only registers a `register_shutdown_function` flush. In PHP-FPM/Apache that fires at the end of each request, so the buffer empties between requests and the `bufferLimit` cap is effectively per-request. In long-running PHP runtimes (Swoole, RoadRunner, FrankenPHP/Octane, CLI worker loops) the shutdown callback only fires when the worker process exits. The buffer fills once and every subsequent unique exposure is silently dropped for the rest of the worker's lifetime (hours or days).
Flush inline when the buffer hits `bufferLimit`. The flush call clears the buffer regardless of transport success, so the append below always finds capacity. If the inline flush itself fails (transport error), the already-buffered events are counted as dropped (same semantics as an explicit failed flush) and the new event is buffered for the next flush attempt.
Test `testFullBufferDropsWithoutPoisoningDedupCache` renamed and rewritten to assert the new behavior: `droppedCount() === 0` after a buffer-full record() that previously would have dropped. New test `testFullBufferAutoFlushPropagatesTransportFailure` covers the case where the inline flush itself fails.
Caught by Codex (P2) review.
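The inline-flush semantics described in this commit can be sketched as follows. This is a minimal Python illustration, not the PHP `ExposureWriter`: the class and method names are hypothetical, and dedup is left out to keep the buffer logic visible.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


class BufferedExposureWriter:
    """Bounded buffer that flushes inline when full instead of dropping."""

    def __init__(self, transport: Callable[[List[Event]], bool], buffer_limit: int) -> None:
        self._transport = transport        # returns True on success, False on failure
        self._buffer_limit = buffer_limit
        self._buffer: List[Event] = []
        self.dropped = 0

    def record(self, event: Event) -> None:
        # Flush first when the buffer is full, so the append always has room.
        if len(self._buffer) >= self._buffer_limit:
            self.flush()
        self._buffer.append(event)

    def flush(self) -> bool:
        if not self._buffer:
            return True
        # The buffer is cleared regardless of whether the transport succeeds.
        batch, self._buffer = self._buffer, []
        ok = self._transport(batch)
        if not ok:
            self.dropped += len(batch)     # failed batch counts as dropped
        return ok

    def buffered(self) -> int:
        return len(self._buffer)
```

With a failing transport the events already in the buffer are lost and counted, while the event that triggered the flush is still held for the next attempt.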
Motivation
This PR emits server-side EVP exposures for PHP feature-flag evaluations when the evaluator marks the result with `doLog=true`. It adds batching, deduplication, and sidecar enqueueing on top of the hook from #3909.
PHP still performs no HTTP I/O. Exposure delivery goes through the libdatadog sidecar and then the Datadog Agent EVP proxy.
Planning/reference doc: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0
Decisions
- `doLog=true` is the exposure gate.
- `/evp_proxy/v2/api/v2/exposures`; FFE does not support direct/agentless EVP exposure delivery.
- `droppedCount()` covers PHP-side encode/enqueue/transport failure only. It does not observe sidecar-to-Agent HTTP delivery failure after enqueue succeeds.
- `subject.id=""`, matching the runtime evaluation empty-targeting-key behavior.
- `error_log` breadcrumb is emitted on the first PHP-side exposure drop until real FFE self-telemetry lands.
Where this PR fits in the stack
This is the EVP exposures layer. It is a sibling of #3911 (metrics) under the hook PR #3909.
Where this PR fits in the target system
This PR contributes the `ExposureWriter`, the `ddog_sidecar_send_ffe_exposures` FFI path through the libdatadog sidecar, and the Agent EVP proxy leg. The metric path is handled by #3911.
Changes
- `ExposureWriter` with bounded batching, LRU dedup keyed by `(flag key, subject id)`, drop accounting, inline flush on full buffer, and shutdown flush.
- `SidecarExposureTransport`, which JSON-encodes batches and forwards them through `\DDTrace\send_ffe_exposures()`.
- `Client::create()` and the default OpenFeature provider through `DefaultEvaluationCompletedHook::create()`, returning a composite of `[ExposureHook]`.
- `libdatadog` submodule to `875ec8f0e`, the FFE branch commit that contains both sidecar FFIs and the dispatch-before-application-gate fix.
- `doLog`, payload shape, empty targeting key, dedup construction, buffer overflow/flush behavior, transport failure drop accounting, and shutdown flush.
Dependencies
Questions for reviewers
- Is `error_log` on first exposure drop acceptable until FFE self-telemetry lands, or should this be debug/self-telemetry-only?