Skip to content

AF 2 to 3 migration 0049 (remove_pickled_data_from_xcom_table) aborts on U+0000 NUL escape in XCom values during bytea to JSONB cast #69063

Description

@seanmuth

Apache Airflow version

3.2.2

What happened?

Upgrading the metadata DB from 2.x to 3.x fails during migration 0049_3_0_0_remove_pickled_data_from_xcom_table (revision eed27faa34e3) when any historical XCom value contains a Unicode NUL escape (\u0000). PostgreSQL's json/jsonb type cannot represent \u0000, so the bulk ALTER COLUMN value TYPE JSONB USING CAST(CONVERT_FROM(value, 'UTF8') AS JSONB) aborts the entire migration:

sqlalchemy.exc.DataError: (psycopg2.errors.UntranslatableCharacter) unsupported Unicode escape sequence
DETAIL: \u0000 cannot be converted to text.
CONTEXT: JSON data, line 1: …": "Foo\u0000…

This is the same class of bug already fixed for NaN/Infinity/-Infinity in this migration (#57893): values that are legal in the pickled representation but illegal in strict JSON/JSONB. The existing regexp_replace sanitization step handles the non-finite-float tokens but does not strip \u0000, so any NUL byte in a stored XCom takes the whole upgrade down.

\u0000 lands in XCom values legitimately: a task returns a Python object whose string fields contain \x00 (common with data sourced from fixed-width exports, C buffers, Mongo, or bad UTF-16→UTF-8 conversions). On 2.x this serialized fine into the bytea/LargeBinary value column as the 6-char escape \u0000; the 3.x JSONB conversion then rejects it.

The sibling dag_run.conf migration 0055_3_0_0_remove_pickled_data_from_dagrun_table has the same gap, but fails differently: its per-row try/except (added alongside the NaN handling) swallows the bad row and increments err_count, so a NUL-bearing conf is silently dropped rather than crashing. Same root cause (json.dumps emits \u0000, JSONB rejects it), different (quieter, lossy) symptom. Worth fixing both together.

What you think should happen instead?

The migration should sanitize \u0000 (and any raw NUL byte) the same way it already sanitizes NaN/Infinity, so the upgrade completes instead of aborting (0049) or silently dropping data (0055).

\u0000 cannot be preserved in JSONB at all (unlike NaN, which can be quoted), so the only options are to strip it or replace it with a sentinel codepoint (e.g. U+FFFD). Stripping is the least surprising for metadata.

Suggested in-place amendment to the PostgreSQL branch of 0049 (these migration files are amended in place for bugfixes, since an already-applied migration can't be re-run):

UPDATE xcom
SET value = convert_to(
    regexp_replace(
        replace(convert_from(value, 'UTF8'), '\u0000', ''),  -- strip NUL escape (illegal in jsonb)
        '([:,\[]\s*|^)(NaN|-?Infinity)(?=\s*[,}\]]|$)',
        '\1"\2"',
        'g'
    ),
    'UTF8'
)
WHERE value IS NOT NULL AND get_byte(value, 0) != 128

(The MySQL and SQLite branches need the equivalent REPLACE(..., '\u0000', ''), and 0055's Python path should strip \u0000 from the deserialized data before json.dumps.) A PR follows.

How to reproduce

On a 2.x deployment, push an XCom whose value contains a NUL byte, then upgrade to 3.x. Minimal SQL repro of the failing cast (the stored bytea holds the 6-char \u0000 escape exactly as json.dumps would emit it):

-- '\x7b226b223a2022465c75303030306f6f227d' decodes to: {"k": "F\u0000oo"}
SELECT CAST(CONVERT_FROM('\x7b226b223a2022465c75303030306f6f227d'::bytea, 'UTF8') AS JSONB);
-- ERROR:  unsupported Unicode escape sequence
-- DETAIL: \u0000 cannot be converted to text.

Operating System

Debian (official Airflow image)

Versions of Apache Airflow Providers

No response

Deployment

Astronomer

Deployment details

Upgrade from 2.11.2 (Runtime) to Airflow 3.2.2 / Runtime 3.2-5, PostgreSQL metadata DB.

Anything else?

Related: #57893 (the NaN/Infinity sanitization for this same migration). This is the \u0000 follow-on for the same conversion step. The fix may also need to cover raw 0x00 bytes in addition to the \u0000 escape sequence.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct


Filed with Claude Code (Claude Opus 4.8).

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions