From 30ed7e7ff4bd0fa3a7d0ee82a06a43198bf927bc Mon Sep 17 00:00:00 2001 From: Eli Reisman Date: Sat, 27 Jun 2026 13:32:19 -0700 Subject: [PATCH 1/3] docs(capture): document capture_mode, capture_compression, and changeset Adds the Sampo changeset for the opt-in capture_mode (v1 ingestion protocol) and capture_compression, plus an AGENTS.md section mapping capture_mode/capture_compression and their env vars to the modules and routing that implement them, the v1 invariants to preserve, and the sync_mode blocking-retry behavior. User-facing usage stays in the official docs per the README convention. --- .sampo/changesets/capture-v1-mode.md | 7 +++++++ AGENTS.md | 20 ++++++++++++++++++++ 2 files changed, 27 insertions(+) create mode 100644 .sampo/changesets/capture-v1-mode.md diff --git a/.sampo/changesets/capture-v1-mode.md b/.sampo/changesets/capture-v1-mode.md new file mode 100644 index 00000000..4c438759 --- /dev/null +++ b/.sampo/changesets/capture-v1-mode.md @@ -0,0 +1,7 @@ +--- +pypi/posthog: minor +--- + +Add an opt-in `capture_mode` for the Capture V1 ingestion protocol (`POST /i/v1/analytics/events`). Set `capture_mode="v1"` on the client (or the `POSTHOG_CAPTURE_MODE=v1` environment variable) to use Bearer auth, per-event results, and partial retry. Defaults to `"v0"` (the legacy `/batch/` endpoint), so existing setups are unaffected. + +When using `capture_mode="v1"`, request bodies can be compressed via `capture_compression` (or `POSTHOG_CAPTURE_COMPRESSION`): `"gzip"`, `"deflate"`, or `"none"` (default). The legacy `gzip=True` flag is honored as a fallback. diff --git a/AGENTS.md b/AGENTS.md index 8a53f11d..1a9b1959 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -9,6 +9,26 @@ Guidance for coding agents working in `posthog-python`. - The project uses `uv` for local development. See `CONTRIBUTING.md` for setup. - Keep edits targeted and follow existing patterns. Prefer adding or updating tests near the behavior you change. +## Capture protocol (`capture_mode`) + +The client supports two ingestion wire protocols, selected by `capture_mode` (precedence: explicit `Client(capture_mode=...)` kwarg > `POSTHOG_CAPTURE_MODE` env var > default). + +- `"v0"` (default) — legacy `POST /batch/`. Upgrades stay transparent; existing callers are unaffected. +- `"v1"` — `POST /i/v1/analytics/events`: Bearer auth, a typed event `options` object, per-event results, and partial retry. + +v1 request bodies can additionally be compressed via `capture_compression` (precedence: explicit `Client(capture_compression=...)` kwarg > `POSTHOG_CAPTURE_COMPRESSION` env var > the legacy `gzip` flag > none). Supported values are `"none"`, `"gzip"`, and `"deflate"` (zlib-wrapped, RFC 1950, to match the server's decoder and the Go/Rust SDKs). v0 keeps using its own `gzip` flag; `capture_compression` is v1-only. + +Where the pieces live: + +- `posthog/capture_mode.py` — the `CaptureMode` enum and `resolve_capture_mode()` precedence logic. +- `posthog/capture_compression.py` — the `CaptureCompression` enum and `resolve_capture_compression()` precedence logic (with `gzip` fallback). +- `posthog/capture_v1.py` — pure transforms (`to_v1_event`, `build_v1_batch_body`) and transport (`post_v1`, `_compress_v1`, `parse_v1_response`, `send_v1_batch`, `CaptureV1Error`). +- Routing: `Consumer._send_analytics` (async) and `Client._enqueue` (sync) pick the analytics submitter by `capture_mode`. The dedicated `$ai_*` endpoint has no v1 form and always uses the legacy submitter. + +v1-specific behavior to preserve when editing: sentinel `$`-properties are lifted into `options` (coerced to native JSON types or omitted — a wrong type 400s the whole batch); top-level `$set`/`$set_once` are relocated into `properties`; only events the server tags `retry` are resent (stable `PostHog-Request-Id`/`created_at`, incrementing `PostHog-Attempt`); `429` is terminal. + +Retry blocking matches v0: in the default async mode retries happen on the background consumer thread, but with `sync_mode=True` the partial-retry loop (including its backoff sleeps) runs inline on the calling thread, so a slow/erroring endpoint blocks the caller until retries are exhausted. + ## Validation Useful checks: From bde3228abbc0472ab9f8948528d102aabcdc65ca Mon Sep 17 00:00:00 2001 From: Eli Reisman Date: Fri, 3 Jul 2026 09:59:59 -0700 Subject: [PATCH 2/3] docs(capture): note v1 drop surfacing and Retry-After minimum Document the two parity behaviors added to the v1 transport: server `drop` verdicts are accumulated across attempts and surfaced via CaptureV1Error/ on_error even on a 2xx, and Retry-After acts as a minimum (max of configured backoff and Retry-After, hard-capped) rather than a replacement. --- .sampo/changesets/capture-v1-mode.md | 2 ++ AGENTS.md | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/.sampo/changesets/capture-v1-mode.md b/.sampo/changesets/capture-v1-mode.md index 4c438759..54cc4b40 100644 --- a/.sampo/changesets/capture-v1-mode.md +++ b/.sampo/changesets/capture-v1-mode.md @@ -5,3 +5,5 @@ pypi/posthog: minor Add an opt-in `capture_mode` for the Capture V1 ingestion protocol (`POST /i/v1/analytics/events`). Set `capture_mode="v1"` on the client (or the `POSTHOG_CAPTURE_MODE=v1` environment variable) to use Bearer auth, per-event results, and partial retry. Defaults to `"v0"` (the legacy `/batch/` endpoint), so existing setups are unaffected. When using `capture_mode="v1"`, request bodies can be compressed via `capture_compression` (or `POSTHOG_CAPTURE_COMPRESSION`): `"gzip"`, `"deflate"`, or `"none"` (default). The legacy `gzip=True` flag is honored as a fallback. + +Per-event server verdicts are surfaced through the existing `on_error` handler: events the backend explicitly drops, or fails to accept after retries, raise a `CaptureV1Error` carrying the affected event UUIDs — so a rejection is never silently lost, even when the HTTP request itself succeeded. diff --git a/AGENTS.md b/AGENTS.md index 1a9b1959..be47642c 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -25,7 +25,7 @@ Where the pieces live: - `posthog/capture_v1.py` — pure transforms (`to_v1_event`, `build_v1_batch_body`) and transport (`post_v1`, `_compress_v1`, `parse_v1_response`, `send_v1_batch`, `CaptureV1Error`). - Routing: `Consumer._send_analytics` (async) and `Client._enqueue` (sync) pick the analytics submitter by `capture_mode`. The dedicated `$ai_*` endpoint has no v1 form and always uses the legacy submitter. -v1-specific behavior to preserve when editing: sentinel `$`-properties are lifted into `options` (coerced to native JSON types or omitted — a wrong type 400s the whole batch); top-level `$set`/`$set_once` are relocated into `properties`; only events the server tags `retry` are resent (stable `PostHog-Request-Id`/`created_at`, incrementing `PostHog-Attempt`); `429` is terminal. +v1-specific behavior to preserve when editing: sentinel `$`-properties are lifted into `options` (coerced to native JSON types or omitted — a wrong type 400s the whole batch); top-level `$set`/`$set_once` are relocated into `properties`; only events the server tags `retry` are resent (stable `PostHog-Request-Id`/`created_at`, incrementing `PostHog-Attempt`); a server `drop` is a terminal per-event rejection — drops are accumulated across attempts and surfaced via `CaptureV1Error`/`on_error` even on a 2xx with no retries (a success status is not full delivery); `Retry-After` is a *minimum* (the client waits `max(configured_backoff, Retry-After)`, hard-capped) not a replacement; `429` is terminal. Retry blocking matches v0: in the default async mode retries happen on the background consumer thread, but with `sync_mode=True` the partial-retry loop (including its backoff sleeps) runs inline on the calling thread, so a slow/erroring endpoint blocks the caller until retries are exhausted. From e1f90ae3aca6f8f67098fb0a2810d62fe60f02e3 Mon Sep 17 00:00:00 2001 From: Eli Reisman Date: Fri, 3 Jul 2026 12:05:05 -0700 Subject: [PATCH 3/3] docs(capture): note the unified 30s retry backoff ceiling MAX_BACKOFF_SECONDS (30s) caps both the exponential backoff and the Retry-After clamp; update the v1 invariants note accordingly. --- AGENTS.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AGENTS.md b/AGENTS.md index be47642c..3a4102d9 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -25,7 +25,7 @@ Where the pieces live: - `posthog/capture_v1.py` — pure transforms (`to_v1_event`, `build_v1_batch_body`) and transport (`post_v1`, `_compress_v1`, `parse_v1_response`, `send_v1_batch`, `CaptureV1Error`). - Routing: `Consumer._send_analytics` (async) and `Client._enqueue` (sync) pick the analytics submitter by `capture_mode`. The dedicated `$ai_*` endpoint has no v1 form and always uses the legacy submitter. -v1-specific behavior to preserve when editing: sentinel `$`-properties are lifted into `options` (coerced to native JSON types or omitted — a wrong type 400s the whole batch); top-level `$set`/`$set_once` are relocated into `properties`; only events the server tags `retry` are resent (stable `PostHog-Request-Id`/`created_at`, incrementing `PostHog-Attempt`); a server `drop` is a terminal per-event rejection — drops are accumulated across attempts and surfaced via `CaptureV1Error`/`on_error` even on a 2xx with no retries (a success status is not full delivery); `Retry-After` is a *minimum* (the client waits `max(configured_backoff, Retry-After)`, hard-capped) not a replacement; `429` is terminal. +v1-specific behavior to preserve when editing: sentinel `$`-properties are lifted into `options` (coerced to native JSON types or omitted — a wrong type 400s the whole batch); top-level `$set`/`$set_once` are relocated into `properties`; only events the server tags `retry` are resent (stable `PostHog-Request-Id`/`created_at`, incrementing `PostHog-Attempt`); a server `drop` is a terminal per-event rejection — drops are accumulated across attempts and surfaced via `CaptureV1Error`/`on_error` even on a 2xx with no retries (a success status is not full delivery); `Retry-After` is a *minimum*, not a replacement (the client waits `max(configured_backoff, min(Retry-After, MAX_BACKOFF_SECONDS))`); `MAX_BACKOFF_SECONDS` (30s) is the single ceiling for both the exponential backoff and the `Retry-After` clamp; `429` is terminal. Retry blocking matches v0: in the default async mode retries happen on the background consumer thread, but with `sync_mode=True` the partial-retry loop (including its backoff sleeps) runs inline on the calling thread, so a slow/erroring endpoint blocks the caller until retries are exhausted.