
Security (2/5): LZW + ASCIIHex limits — CVE-2026-28804, CVE-2025-62708, CVE-2025-66019 #2

Merged
icanhasmath merged 2 commits into 1.28.6.x from 1.28.6-sec-filters-lzw-hex on Jun 23, 2026

@icanhasmath


Part 2 of 5 of the PyPDF2 1.28.6 security backport, targeting 1.28.6.x. Scoped to PyPDF2/filters.py LZW/ASCIIHex paths.

| CVE | Severity | Fix |
|---|---|---|
| CVE-2026-28804 | Moderate | ASCIIHexDecode O(n²) → linear bulk decode |
| CVE-2025-62708 | Moderate | Cap LZWDecode output |
| CVE-2025-66019 | Moderate | LZW output cap tightened to 75 MB |

Backported from upstream pypdf 6.7.5 / 6.1.3 / 6.4.0; Py2.7-safe. The two LZW CVEs are fixed by a single change, because 1.28.6 had no output cap at all. New tests: Tests/test_security_lzw_hex.py (6 tests). Validated under Python 2.7.18: no new failures vs. baseline.

🤖 Generated with Claude Code

icanhasmath and others added 2 commits June 18, 2026 11:21
ASCIIHexDecode.decode accumulated output one character at a time
(retval += chr(...), hex_pair += char) which is O(n^2); a large
/ASCIIHexDecode stream caused excessive CPU time (CWE-407).

Locate the EOD marker once, strip whitespace, and bulk-decode via
binascii.unhexlify. Also handle a trailing odd hex digit per ISO 32000
§7.4.2 (assumed followed by "0") instead of the previous AssertionError.
Mirrors upstream pypdf 6.7.5 (PR py-pdf#3666).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
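
For readers who want to see the shape of the bulk decode described in the commit message above, here is a minimal sketch. It is not the code from PyPDF2/filters.py: `ascii_hex_decode` is a hypothetical helper, the choice to decode the whole input when no EOD marker is present is an assumption, and `split()` strips only ASCII whitespace, which may differ slightly from the PDF whitespace set.

```python
import binascii


def ascii_hex_decode(data):
    """Hypothetical helper (not the PyPDF2 function): bulk ASCIIHex decode."""
    # Locate the EOD marker '>' once; anything after it is ignored.
    # Assumption: if no marker is present, the whole input is decoded.
    eod = data.find(b">")
    if eod != -1:
        data = data[:eod]
    # Strip all ASCII whitespace in one pass instead of character by character.
    hex_digits = b"".join(data.split())
    # A trailing odd hex digit is treated as if it were followed by "0".
    if len(hex_digits) % 2:
        hex_digits += b"0"
    # Single linear-time conversion; raises on non-hex characters.
    return binascii.unhexlify(hex_digits)
```

Each step touches the input a constant number of times, so the work grows linearly with stream size rather than with the square of it.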
A crafted LZWDecode stream could amplify into gigabytes of output with no
limit, exhausting memory (CWE-770). 1.28.6's LZWDecode.Decoder.decode
accumulated into `baos` in an unbounded loop.

Add LZW_MAX_OUTPUT_LENGTH (75 MB) and a per-iteration check in
Decoder.decode that raises PdfReadError once the output exceeds it. The
internal Decoder gains a defaulted max_output_length kwarg; the public
LZWDecode.decode signature is unchanged.

Upstream addressed this in pypdf 6.1.3 (PR py-pdf#3502, output cap) and tightened
the default to 75 MB in 6.4.0 (PR, CVE-2025-66019). 1.28.6 has no LzwCodec
/ LimitReachedError, so this is a hand-written cap on the old decoder
reusing PdfReadError.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
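
The per-iteration cap described in the commit message above can be illustrated with a small standalone decoder. This is a sketch under stated assumptions, not the 1.28.6 `LZWDecode.Decoder`: `lzw_decode` and `pack_codes_9bit` are hypothetical names, `PdfReadError` here is a local stand-in for the PyPDF2 exception, the decoder assumes the PDF defaults (MSB-first codes, 9 to 12 bits, early change), and reading "75 MB" as 75,000,000 bytes is also an assumption.

```python
class PdfReadError(Exception):
    """Local stand-in for PyPDF2's PdfReadError so the sketch is self-contained."""


LZW_MAX_OUTPUT_LENGTH = 75 * 1000 * 1000  # assumed byte value for "75 MB"
CLEAR_TABLE, EOD = 256, 257


def lzw_decode(data, max_output_length=LZW_MAX_OUTPUT_LENGTH):
    """Hypothetical capped LZW decoder (MSB-first, early change assumed)."""
    data = bytearray(data)
    base = [bytes(bytearray([i])) for i in range(256)] + [b"", b""]
    table, bits, prev = list(base), 9, None
    out = bytearray()
    bit_buf = bit_count = pos = 0
    while True:
        # Pull in whole bytes until a full code is available.
        while bit_count < bits:
            if pos >= len(data):
                return bytes(out)  # input ended without an EOD code
            bit_buf = (bit_buf << 8) | data[pos]
            pos += 1
            bit_count += 8
        bit_count -= bits
        code = (bit_buf >> bit_count) & ((1 << bits) - 1)
        bit_buf &= (1 << bit_count) - 1
        if code == EOD:
            return bytes(out)
        if code == CLEAR_TABLE:
            table, bits, prev = list(base), 9, None
            continue
        if code < len(table):
            entry = table[code]
        elif code == len(table) and prev is not None:
            entry = prev + prev[:1]  # code refers to the entry being defined
        else:
            raise PdfReadError("Invalid LZW code")
        if prev is not None and len(table) < 4096:
            table.append(prev + entry[:1])
        out += entry
        # The cap: checked on every iteration, so growth stops promptly.
        if len(out) > max_output_length:
            raise PdfReadError("LZW output exceeds %d bytes" % max_output_length)
        prev = entry
        # Early change: widen the code one entry before the table fills.
        if len(table) >= (1 << bits) - 1 and bits < 12:
            bits += 1


def pack_codes_9bit(codes):
    """Demo helper: pack 9-bit codes MSB-first, zero-padded to a whole byte."""
    value = 0
    for code in codes:
        value = (value << 9) | code
    total = 9 * len(codes)
    pad = -total % 8
    value <<= pad
    return bytes(bytearray((value >> s) & 0xFF for s in range(total + pad - 8, -1, -8)))
```

In the demo stream `[256, 65, 258, 259, 260, 261]` each code emits one more byte than the previous one, so a few bits of input produce steadily growing output. That is the amplification pattern the cap is meant to stop: with `max_output_length=10` the call raises, while a limit of 15 lets the full output through.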

@martinPavesio martinPavesio left a comment


LGTM!

Copilot AI left a comment


Pull request overview

Backports security hardening to the PyPDF2/filters.py decoders to mitigate decompression/CPU-exhaustion vectors in /ASCIIHexDecode and /LZWDecode, and adds regression tests.

Changes:

  • Replace quadratic /ASCIIHexDecode decoding with whitespace-stripping + bulk binascii.unhexlify.
  • Introduce a default maximum output length for /LZWDecode and plumb it through LZWDecode.Decoder.
  • Add regression tests covering ASCIIHex basic behavior and LZW output capping.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| PyPDF2/filters.py | Implements ASCIIHex bulk decode and adds an LZW decoded-output cap. |
| Tests/test_security_lzw_hex.py | Adds regression tests for ASCIIHex behavior and LZW output-length limiting. |


Comment threads: PyPDF2/filters.py (2), Tests/test_security_lzw_hex.py (6)
@icanhasmath icanhasmath merged commit ca1cbdb into 1.28.6.x Jun 23, 2026
1 check passed
3 participants