Skip to content

Add Sourcepoint first-party CMP integration#625

Open
ChristianPavilonis wants to merge 18 commits intomainfrom
feature/sourcepoint-integration
Open

Add Sourcepoint first-party CMP integration#625
ChristianPavilonis wants to merge 18 commits intomainfrom
feature/sourcepoint-integration

Conversation

@ChristianPavilonis
Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis commented Apr 8, 2026

Summary

  • Adds a new Sourcepoint CMP (Consent Management Platform) first-party proxy integration that routes cdn.privacy-mgmt.com and geo.privacymanager.io traffic through Trusted Server, eliminating third-party requests for consent management assets.
  • Implements four rewriting layers (HTML attributes, JS response bodies, runtime config trap, client-side DOM guard) to ensure comprehensive URL coverage across static markup, dynamically loaded scripts, and Next.js hydration payloads.
  • Ships disabled by default with full TOML configuration, integration documentation, 14 Rust unit tests, and 6 TypeScript tests.

Changes

File Change
crates/trusted-server-core/src/integrations/sourcepoint.rs New ~860-line integration module: route registration, CDN/geo proxy handlers, HTML attribute rewriter, JS body rewriter (rewrite_script_content), head injector (window._sp_ property trap), Accept-Encoding scoping, backend selection, and 14 unit tests
crates/trusted-server-core/src/integrations/mod.rs Register sourcepoint module and wire it into the integration registry
crates/js/lib/src/integrations/sourcepoint/index.ts TypeScript entry point that bootstraps the client-side script guard
crates/js/lib/src/integrations/sourcepoint/script_guard.ts Client-side DOM insertion dispatcher handler that intercepts dynamically inserted <script>/<link> elements pointing at Sourcepoint CDN and rewrites them to first-party paths
crates/js/lib/test/integrations/sourcepoint/script_guard.test.ts 6 Vitest tests covering the script guard (rewrites, skips, and teardown)
docs/guide/integrations/sourcepoint.md Integration documentation: overview, setup steps, config reference, architecture
docs/guide/integrations-overview.md Updated overview with Sourcepoint in the Mermaid flowchart, performance characteristics table, and environment variables section
trusted-server.toml Added [integrations.sourcepoint] config stanza (disabled by default)

Closes

Closes #145
Closes #344
Closes #345

Test plan

  • cargo test --workspace
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • JS tests: cd crates/js/lib && npx vitest run
  • JS format: cd crates/js/lib && npm run format
  • Docs format: cd docs && npm run format
  • WASM build: cargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1
  • Manual testing via fastly compute serve

Checklist

  • Changes follow CLAUDE.md conventions
  • No unwrap() in production code — use expect("should ...")
  • Uses log macros (not println!)
  • New code has tests
  • No secrets or credentials committed

@ChristianPavilonis ChristianPavilonis self-assigned this Apr 8, 2026
Comment thread crates/js/lib/src/integrations/sourcepoint/script_guard.ts Fixed
Comment thread crates/js/lib/src/integrations/sourcepoint/script_guard.ts Fixed
Comment thread crates/js/lib/src/integrations/sourcepoint/script_guard.ts Fixed
@aram356 aram356 requested review from aram356 and prk-Jr April 9, 2026 16:36
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Adds a well-structured Sourcepoint CMP first-party proxy with four rewriting layers (HTML attributes, JS body, runtime config trap, client-side DOM guard). The implementation follows existing integration patterns closely, has 14 Rust + 6 JS tests, and ships disabled by default. A few items need attention before merge.

Blocking

🔧 wrench

  • take_body_str() panics on non-UTF-8 upstream responses: response.take_body_str() at line 400 will panic if the upstream CDN returns invalid UTF-8 with a JS Content-Type. Read bytes and attempt conversion with a fallback instead. (sourcepoint.rs:400)
  • CI failure — format-typescript: Prettier check is failing on test/integrations/sourcepoint/script_guard.test.ts. Must be fixed before merge.

❓ question

  • Redirect Location headers not rewritten: The JS body rewrite is gated on status == 200. If the CDN returns a 3xx redirect with a Location pointing to cdn.privacy-mgmt.com, the browser follows it back to the third-party CDN, defeating the proxy. (sourcepoint.rs:394)

Non-blocking

🤔 thinking

  • Upstream response headers forwarded verbatim: Set-Cookie and other headers from cdn.privacy-mgmt.com pass through to the client on the non-rewritten path. Same as Didomi, but worth confirming this is intended. (sourcepoint.rs:418)
  • Regex matches mismatched quotes: The CDN URL pattern can match 'url" (single open, double close). Extremely unlikely in minified JS but could produce malformed output. (sourcepoint.rs:65)

♻️ refactor

  • Redundant rewrite_sdk guard: Checked in both handles_attribute() and rewrite() — the second is unreachable. (sourcepoint.rs:437)
  • Head injector config property names: Hardcoded Sourcepoint config keys could go stale. A test asserting known property names appear in the generated script would catch omissions. (sourcepoint.rs:467)

🌱 seedling

  • No body size limit: The body is read entirely into memory with no upper bound. A Content-Length check would prevent OOM from unexpected large responses. (sourcepoint.rs:400)
  • Missing validation test for invalid cdn_origin: No test proves that register() fails when cdn_origin is not a valid URL. (sourcepoint.rs:153)

⛏ nitpick

  • Manual [sourcepoint] log prefix: Other integrations rely on the log crate's module path rather than a manual bracketed prefix. (sourcepoint.rs:355)

CI Status

  • cargo fmt: PASS
  • cargo test: PASS
  • vitest: PASS
  • format-typescript: FAIL (Prettier issue in script_guard.test.ts)
  • format-docs: PASS
  • browser integration tests: PASS
  • integration tests: PASS
  • CodeQL: FAIL (likely infra)

Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed commit a211eb0. Three blocking issues require fixes before merge; the rest are non-blocking.


Blocking (must fix before merge)

B1 — CI: format-typescript still failing on current HEAD

The latest commit (a211eb02) re-introduced a Prettier violation in crates/js/lib/test/integrations/sourcepoint/script_guard.test.ts. CI's format-typescript check will fail on this commit. Fix with:

cd crates/js/lib && npm run format -- --write

then commit the result.

B2 — SSRF via cdn_origin (inline comment on line 96)

See inline comment. A syntactically valid but host-unrestricted cdn_origin allows proxying to any IP including 169.254.169.254 (cloud metadata). Needs a custom host validator.

B3 — PUT/PATCH unreachable (inline comment on line 337)

routes() only registers GET and POST; the PUT/PATCH arms in handle() are dead code. Either register the methods or remove the arms.


Important (non-blocking)

I1 — Single-quoted '/unified/' missed by SP_ORIGIN_UNIFIED_PATTERN (inline on line 81) — fix is a one-character change to the regex character class.

I2 — BackendConfig::from_url per request (inline on line 357) — pre-compute in new() following the pattern in aps.rs / prebid.rs.

I3 — PR checklist says "Uses tracing macros" but codebase uses log crate

The checklist item reads "Uses tracing macros (not println!)" but CLAUDE.md specifies log. The source correctly uses log::info! throughout. The checklist entry should read "Uses log macros (not println!)" to avoid confusion for future contributors.

I4 — JS-rewrite path drops all upstream headers (inline on line 403) — Response::new() discards Vary, CORS, and security headers. Start from response.clone_without_body() instead.

I5 — No test for build_target_url with a path-bearing cdn_origin (inline on line 337) — set_path silently drops any path prefix in cdn_origin. Should be tested and either fixed or documented as an explicit constraint.


Suggestions (non-blocking)

S1 — URL normalisation logic duplicated across Rust and TypeScript

parse_sourcepoint_url in sourcepoint.rs and normalizeSourcepointUrl in script_guard.ts implement identical protocol-relative URL normalisation in parallel. Not a defect, but a cross-reference comment in each file would prevent future fixes being applied to only one side.

S2 — Redundant rewrite_sdk guard in rewrite() (inline on line 437) — unreachable because handles_attribute already gates on it. Either remove with a comment or document the intentional belt-and-suspenders.

S3 — No test for single-quoted CDN URLs (inline on line 81)

S4 — Unchecked WASM build in PR checklist

The test plan has an unchecked - [ ] WASM build item, but CI's WASM build passes. The checkbox should be checked to accurately reflect build status.

S5 — register missing # Examples doc section (inline on line 314) — CLAUDE.md requires this for all public API functions.

Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
ChristianPavilonis and others added 3 commits April 10, 2026 10:48
- Replace #[validate(url)] with a custom validator that restricts
  cdn_origin to *.privacy-mgmt.com hosts, preventing SSRF via
  arbitrary origins (e.g. cloud metadata endpoints).
- Remove unreachable PUT/PATCH arms from the request body match
  since routes() only registers GET and POST.
- Fix Prettier formatting in script_guard.test.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace take_body_str() with take_body_bytes() + String::from_utf8()
  to avoid panicking on non-UTF-8 upstream responses.
- Rewrite Location headers on 3xx redirects that point to
  cdn.privacy-mgmt.com so browsers stay on the first-party proxy.
- Preserve upstream CORS headers on the JS-rewrite path instead of
  discarding them when building a fresh Response.
- Extend SP_ORIGIN_UNIFIED_PATTERN to match both single- and
  double-quoted "/unified/" chunk paths, preserving the original
  quote character in the replacement.
- Normalise log prefixes from [sourcepoint] to Sourcepoint: for
  consistency with APS/Prebid style.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove redundant rewrite_sdk check from rewrite() since
  handles_attribute() already gates on it; update test to verify
  the guard at the handles_attribute level.
- Add # Examples section to register() per documentation standards.
- Add tests for cdn_origin validation (rejects non-privacy-mgmt.com
  hosts, accepts valid origins).
- Add test for single-quoted origin+'/unified/' rewrite pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ChristianPavilonis
Copy link
Copy Markdown
Collaborator Author

Review feedback addressed

All blocking and important findings have been fixed across three commits. Replies posted on each inline thread.

Blocking (fixed)

  • SSRF guard — Custom validate_cdn_origin() restricts host to *.privacy-mgmt.com
  • PUT/PATCH dead code — Removed unreachable arms
  • Prettier — Formatting fixed

Important non-blocking (fixed)

  • UTF-8 safetytake_body_bytes() + String::from_utf8() with graceful fallback
  • Redirect rewriting — 3xx Location headers containing CDN host are rewritten
  • CORS headers — JS-rewrite path now preserves upstream CORS headers
  • Single-quote regexSP_ORIGIN_UNIFIED_PATTERN handles both ' and " quotes
  • Log prefix — Normalised to Sourcepoint: style

Suggestions (fixed)

  • Removed redundant rewrite_sdk guard, added invariant comment
  • Added # Examples to register() docs
  • Added validation tests for cdn_origin
  • Added test for single-quoted unified path

Intentionally skipped (with rationale in thread replies)

  • Regex mismatched quotes — documented limitation, no real-world impact
  • BackendConfig::from_url per-request — matches APS/Prebid pattern
  • Body size limit — future enhancement
  • Config property name test — covered by existing head injector test

All CI checks pass locally: cargo fmt, clippy, cargo test, vitest, Prettier.

Refactor normalizeSourcepointUrl to remove the bare-domain startsWith
check that triggered CodeQL "Incomplete URL substring sanitization"
alerts. The host === exact match was already the security boundary;
now the normalization layer no longer references the CDN hostname at
all, eliminating the static analysis finding.

Add a Content-Length guard (5 MB) before reading upstream response
bodies into memory for JavaScript rewriting, preventing unbounded
memory consumption from unexpectedly large responses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@aram356 aram356 linked an issue Apr 13, 2026 that may be closed by this pull request
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Well-executed revision — every prior blocker is addressed and the four-layer rewriting architecture (HTML attrs / JS body / window._sp_ trap / client DOM guard) is cleanly separated. Two remaining correctness issues in the upstream response handling need fixes before merge: the body-size guard is bypassed when Content-Length is absent, and the UTF-8 fallback path silently drops upstream headers. A few non-blocking refinements around redirect rewriting, config scope, and test coverage follow.

Blocking

🔧 wrench

  • Body-size guard bypassed on chunked/unknown-length responsesMAX_REWRITE_BODY_SIZE only applies when Content-Length is present; otherwise take_body_bytes() reads unbounded (sourcepoint.rs:456-470)
  • UTF-8 fallback drops upstream headers — the fallback constructs a fresh Response with only status + Content-Type, discarding Set-Cookie, CORS, Cache-Control, Vary, etc., diverging from the passthrough path (sourcepoint.rs:480-488)

Non-blocking

♻️ refactor

  • Redirect Location rewrite is a literal String::replace — misses protocol-relative //cdn.privacy-mgmt.com/... values (sourcepoint.rs:430-438)
  • validate_cdn_origin does not check schemeftp://cdn.privacy-mgmt.com passes validation and only fails at the first proxied request (sourcepoint.rs:135-150)

🤔 thinking

  • Config/rewriter host scope mismatchvalidate_cdn_origin accepts *.privacy-mgmt.com but rewriters hardcode cdn.privacy-mgmt.com; configuring a different subdomain silently half-enables the integration (sourcepoint.rs:143, 69, 197, 598)
  • Rewritten-JS path hard-codes Cache-Control — diverges from the passthrough path's apply_cache_headers policy without explanation (sourcepoint.rs:498-501)
  • Property trap only catches reassignment of window._sp_ — nested mutation (window._sp_.config.X = ...) is not patched; also s.replace() in the JS helper is first-occurrence only (sourcepoint.rs:568-600)

🌱 seedling

  • copy_headers forwards Authorization — credential-leak footgun if publishers reuse bearer tokens across origins (sourcepoint.rs:222-233)
  • No direct test for handle() — redirect rewriting, UTF-8 fallback, Content-Length guard, Content-Type gating all lack unit coverage (sourcepoint.rs:372-522)

👍 praise

  • Solid response to the prior review — every prior 🔧/❓ addressed with well-chosen fixes; four-layer rewriting architecture is cleanly separated and well documented

CI Status

  • cargo fmt: PASS
  • cargo clippy: PASS
  • cargo test: PASS
  • vitest: PASS
  • format-typescript: PASS
  • format-docs: PASS
  • browser integration tests: PASS
  • CodeQL: PASS

Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
…t rewrite

- Skip JS rewrite when Content-Length is absent (chunked/unknown) to
  prevent unbounded memory reads
- Reuse original response on UTF-8 fallback to preserve upstream headers
- Extract redirect Location rewriting into a URL-parsing helper that
  handles absolute, protocol-relative, and relative redirects
- Tighten cdn_origin validation to exactly cdn.privacy-mgmt.com and
  reject non-HTTP(S) schemes
- Drop Authorization from forwarded headers to prevent credential leakage
- Document Cache-Control divergence and property trap limitations
- Add 7 new tests for validation and redirect rewriting
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Round-3 blockers are all correctly addressed — UTF-8 fallback now reuses the original response (preserving headers), Content-Length-missing short-circuits the rewrite before take_body_bytes, rewrite_redirect_location handles absolute / protocol-relative / relative via Url::join, validate_cdn_origin checks scheme and pins the host, and Authorization is excluded from forwarded headers. No remaining 🔧/❓ findings. Submitting as change-request per author request; the items below are refinements (🤔/♻️/🌱/⛏) rather than correctness blockers — each can be resolved or deferred to a follow-up at the author's discretion.

Non-blocking

♻️ refactor

  • apply_cache_headers invoked at five return sites — redirect branch, CL-too-large branch, CL-missing branch, UTF-8-fallback branch, and the fallthrough (sourcepoint.rs:476, 505, 513, 528, 567). Restructuring handle() around a single exit point (or wrapping the proxied Response in a post-handle adapter) would remove the per-branch bookkeeping. Low priority.

🌱 seedling

  • No direct test for handle() (sourcepoint.rs:414-569) — redirect rewriting, UTF-8 fallback, CL-missing skip, CL-too-large skip, and non-JS passthrough branches all lack unit coverage. Acknowledged in the round-3 response; tracking here for follow-up.
  • CSP interaction not documented (docs/guide/integrations/sourcepoint.md) — publishers with Content-Security-Policy: script-src https://cdn.privacy-mgmt.com will see rewritten first-party scripts blocked the moment they enable the integration. A note under "HTML Rewriting" calling out that CSP must allow the first-party host would save an incident report.

CI Status

  • cargo fmt: PASS
  • cargo clippy: PASS
  • cargo test: PASS
  • vitest: PASS
  • format-typescript: PASS
  • format-docs: PASS
  • browser integration tests: PASS
  • CodeQL: PASS

Comment thread crates/js/lib/src/integrations/sourcepoint/index.ts
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs Outdated
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/js/lib/src/integrations/sourcepoint/script_guard.ts
@ChristianPavilonis ChristianPavilonis removed the request for review from prk-Jr April 21, 2026 13:59
@ChristianPavilonis
Copy link
Copy Markdown
Collaborator Author

Implemented Sourcepoint follow-up updates and pushed commit.

What I changed:

  • Added __tsjs_sourcepoint runtime config bootstrap in Sourcepoint head inject and gated client-side guard installation with it.
  • Sourcepoint head injection now always emits config script; runtime trap is only included when rewrite_sdk is enabled.
  • Added strict cdn_origin validation to reject URL paths.
  • Preserved response headers on JS rewrite success path while normalizing encoding headers and content type.
  • Added regression coverage for the above behaviors.

I also resolved remaining open review threads and re-requested review from @prk-Jr.

Validation:

  • cargo test --workspace
  • cargo clippy --package trusted-server-core --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • cd crates/js/lib && npx vitest run
  • cd crates/js/lib && npm run -s format -- --check

Commit: e2fdb0c

Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

The Sourcepoint integration covers the expected rewrite layers, but I found two blocking issues around cookie semantics plus one follow-up consistency concern in the JS guard.

Blocking

🔧 wrench

  • Sourcepoint cookie state never makes the round trip: the proxy does not forward any Cookie header upstream, so Sourcepoint cookies previously set on the first-party endpoint are never sent back to Sourcepoint. That breaks the main baseEndpoint / CNAME flow and authenticated-consent-style cookie-backed state. (crates/trusted-server-core/src/integrations/sourcepoint.rs:256)

  • Cookie-setting Sourcepoint responses are still marked public-cacheable: both the passthrough path and the rewritten-JS path can return Cache-Control: public, max-age=... even when upstream sends Set-Cookie. If Sourcepoint uses server-set cookies on the first-party endpoint, that risks caching user-specific consent state. (crates/trusted-server-core/src/integrations/sourcepoint.rs:269, crates/trusted-server-core/src/integrations/sourcepoint.rs:377)

Non-blocking

🤔 thinking

  • JS and Rust host checks diverge on explicit ports: the client-side DOM guard compares URL.host, while the Rust rewriter compares hostname-only. That creates inconsistent behavior for URLs like https://cdn.privacy-mgmt.com:443/.... (crates/js/lib/src/integrations/sourcepoint/script_guard.ts:35, crates/trusted-server-core/src/integrations/sourcepoint.rs:228)

CI Status

  • fmt: PASS
  • clippy: FAIL
  • rust tests: FAIL
  • js tests: PASS
  • integration artifact prep: FAIL

Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/trusted-server-core/src/integrations/sourcepoint.rs
Comment thread crates/js/lib/src/integrations/sourcepoint/script_guard.ts Outdated
@ChristianPavilonis
Copy link
Copy Markdown
Collaborator Author

Addressed the latest review round in commit 1a40868 and pushed to feature/sourcepoint-integration.

What changed:

  • Forwarded a narrowly scoped Sourcepoint cookie allowlist upstream instead of dropping the Cookie header
  • Added optional auth_cookie_name support for authenticated-consent deployments
  • Forced cookie-bearing Sourcepoint responses to Cache-Control: private, no-store on both passthrough and rewritten-JS paths
  • Aligned the client-side Sourcepoint guard with Rust by switching from URL.host to URL.hostname
  • Added Rust + Vitest regression coverage and updated Sourcepoint docs/sample config

Validation:

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings
  • cargo test --workspace
  • cd crates/js/lib && npx vitest run

I replied to and resolved the remaining review threads and re-requested review from @prk-Jr and @aram356.

Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Four-layer rewriting architecture (HTML attrs / JS body regex / window._sp_ trap / client DOM guard) is well-structured, and most prior 🔧 / ❓ findings have been addressed across rounds 1–4 (SSRF guard, UTF-8 safety, redirect rewriting, cookie allowlist, no-store on cookie responses, hostname-only host checks, scheme/host/path/query/fragment validation). However, a stale merge from main introduced a hard build break that fails three CI jobs.

Blocking

🔧 wrench

  • Build break: IntegrationProxy::handle signature drifted from the trait#581 added services: &RuntimeServices to the trait; the merge from main did not update Sourcepoint's impl. cargo fmt, cargo test, and prepare integration artifacts all fail with E0050. (sourcepoint.rs:540-549)
  • Use platform-agnostic client_ip from RuntimeServicescopy_headers calls Fastly-specific original_req.get_client_ip_addr(). With RuntimeServices now plumbed through, switch to services.client_info().client_ip to match Didomi's pattern and keep trusted-server-core portable. (sourcepoint.rs:269-272)

Non-blocking

🤔 thinking

  • Property-trap config field allowlist is hardcoded and silently goes stale — only baseEndpoint, mmsDomain, wrapperAPIOrigin, cmpOrigin, and metricUrl are patched. New Sourcepoint fields would silently leak. A snapshot test asserting the exact patched set would force review when the list changes. (sourcepoint.rs:735-742)
  • auth_cookie_name is unvalidated — empty string, whitespace, or ; / = characters are accepted. Empty string especially: a malformed cookie pair =value parses to name "", which auth_cookie_name = "" would forward. Add #[validate] (non-empty trimmed, length ≤ 64, [A-Za-z0-9_-]). (sourcepoint.rs:124-129)

♻️ refactor

  • is_likely_javascript_path matches non-JS API paths/wrapper/v2/* is JSON but is treated as JS-likely, forcing Accept-Encoding: identity for every CMP API hit. Tightening the heuristic (e.g. only .js and /unified/) preserves the optimisation only where applicable. (sourcepoint.rs:431-433)

📌 out of scope (carryover from earlier rounds, still open)

  • No direct unit test for handle() — redirect rewriting, UTF-8 fallback, Content-Length missing/too-large branches still lack coverage.
  • apply_cache_headers invoked at five return sites in handle() — restructuring around a single exit point would remove the per-branch bookkeeping.
  • CSP interaction undocumented — publishers with Content-Security-Policy: script-src https://cdn.privacy-mgmt.com will see rewritten first-party scripts blocked the moment they enable rewriting. A note in docs/guide/integrations/sourcepoint.md under "HTML Rewriting" would save an incident report.

CI Status

  • cargo fmt: FAIL (caused by trait-signature drift)
  • cargo test: FAIL (caused by trait-signature drift)
  • prepare integration artifacts: FAIL (caused by trait-signature drift)
  • vitest: PASS
  • format-typescript: PASS
  • format-docs: PASS
  • CodeQL: PASS
  • Analyze (rust / actions / javascript-typescript): PASS

let method = req.get_method().clone();
let target_path = Self::strip_cdn_prefix(&path).ok_or_else(|| {
Report::new(Self::error(format!("Unknown Sourcepoint route: {path}")))
})?;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 wrench — Build break: IntegrationProxy::handle signature drifted from the trait.

#581 (Add PlatformHttpClient and PlatformBackend traits) landed on main while this PR was open and added a services: &RuntimeServices parameter to IntegrationProxy::handle. The merge from main on e296d627 did not update Sourcepoint's impl, leaving 3 params where 4 are required.

error[E0050]: method `handle` has 3 parameters but the declaration in trait
              `IntegrationProxy::handle` has 4
   --> crates/trusted-server-core/src/integrations/sourcepoint.rs:541:9

Three CI jobs (cargo fmt, cargo test, prepare integration artifacts) all fail on this. Fix:

async fn handle(
    &self,
    _settings: &Settings,
    services: &RuntimeServices,
    req: Request,
) -> Result<Response, Report<TrustedServerError>> {

and use crate::platform::RuntimeServices; at the top.

fn copy_headers(&self, original_req: &Request, proxy_req: &mut Request) {
if let Some(client_ip) = original_req.get_client_ip_addr() {
proxy_req.set_header("X-Forwarded-For", client_ip.to_string());
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 wrench — Use platform-agnostic client_ip from RuntimeServices (companion to the trait-signature fix).

copy_headers calls original_req.get_client_ip_addr(), which is a Fastly-specific API. Now that RuntimeServices is plumbed through handle, switch to services.client_info().client_ip and pass it as a parameter to copy_headers, matching the pattern Didomi adopted in didomi.rs:200-226. This keeps trusted-server-core portable across platform backends.

// In handle():
self.copy_headers(services.client_info().client_ip, &req, &mut proxy_req);

// In copy_headers signature:
fn copy_headers(&self, client_ip: Option<IpAddr>, original_req: &Request, proxy_req: &mut Request) {
    if let Some(ip) = client_ip {
        proxy_req.set_header("X-Forwarded-For", ip.to_string());
    }
    // ...
}

"if(typeof o.config.cmpOrigin===\"string\")o.config.cmpOrigin=r(o.config.cmpOrigin);",
"}}",
"if(typeof o.metricUrl===\"string\")o.metricUrl=r(o.metricUrl);",
"return o",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 thinking — Property-trap config field allowlist is hardcoded and silently goes stale.

The p() patcher rewrites only config.{baseEndpoint, mmsDomain, wrapperAPIOrigin, cmpOrigin} and metricUrl. If Sourcepoint adds a new URL field (e.g. events.endpoint, a cdnRoot), the rewrite silently misses it and the page leaks third-party requests. Given that this reaches into a vendor SDK we don't control, a snapshot test asserting the exact set of patched properties would force a code review whenever the list changes — re-raising aram356's earlier point. Consider also documenting how a publisher reports a missed field.

/// Sourcepoint's standard cookie set is allowlisted automatically.
/// Configure this only when the CMP uses a custom `authCookie` name and
/// that cookie must round-trip through the first-party proxy.
pub auth_cookie_name: Option<String>,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 thinkingauth_cookie_name is unvalidated.

Empty string, whitespace, or ; / = characters in auth_cookie_name would all be accepted at config load. Empty string in particular is problematic: a malformed cookie pair like =value parses to a name of "", which should_forward_sourcepoint_cookie("") would forward when auth_cookie_name = "". Suggest adding a #[validate] rule (non-empty after trim, length ≤ 64, [A-Za-z0-9_-]). Low likelihood but cheap to lock down.

/// `Accept-Encoding: identity` optimisation for that request.
fn is_likely_javascript_path(path: &str) -> bool {
path.ends_with(".js") || path.starts_with("/unified/") || path.starts_with("/wrapper/")
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ refactoris_likely_javascript_path matches non-JS API paths.

/wrapper/v2/messages is a JSON API but is treated as JS-likely, causing Accept-Encoding: identity for the upstream request. The runtime check (is_javascript_response based on Content-Type) correctly skips the rewrite, so this is not a correctness bug — but uncompressed JSON for /wrapper/v2/* is a mild perf cost on every CMP hit. Tightening the heuristic (e.g. only .js and /unified/) preserves the optimisation only where it can be applied.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants