Skip to content

fix: audit cleanup — installer NameError CRIT + 12 mediums/lows + lint#100

Merged
nelsonduarte merged 14 commits into
mainfrom
chore/audit-pr-j
Jun 21, 2026
Merged

fix: audit cleanup — installer NameError CRIT + 12 mediums/lows + lint#100
nelsonduarte merged 14 commits into
mainfrom
chore/audit-pr-j

Conversation

@nelsonduarte

Copy link
Copy Markdown
Owner

Summary

Cleanup round covering 13 catalogued bugs plus 5 fresh findings from ruff/mypy/pytest. Headline: installer NameError CRIT that hid every failed-install root cause behind a Python traceback.

CRITICAL

  • installer.py NameError on error pathexcept Exception as exc: QTimer.singleShot(0, lambda: ...str(exc)) referenced exc outside its scope, raising NameError at lambda execution time. Every failed install showed a Python traceback instead of the intended friendly message. Captured via err_msg = str(exc); lambda msg=err_msg: pattern.

MEDIUM (5)

  • Image dimension caps (editor/dialogs.py, editor/tab.py, tools/import_pdf.py) — Signature picker and image-to-PDF converter accepted unbounded sizes; gigapixel TIFFs crashed the process. New utils.check_image_size enforces a 100 megapixel ceiling with a translated warning (editor.image_too_large).
  • Atomic write in convert.py (6 sites) — Non-PDF outputs (DOCX/PPTX/XLSX/etc.) wrote directly; crash mid-write corrupted destination. New _atomic_save helper does tempfile + os.replace with proper CancelledError handling on streaming paths.
  • GhostScript Windows non-ASCII paths (utils.py) — _compress_pdf could fail on user folders with non-ASCII chars (e.g. "José") in old locales. New _win_short_path helper converts to 8.3 short-name via GetShortPathNameW on Windows (no-op elsewhere).
  • PasswordDialog hardcoded 72px — replaced with devicePixelRatioF()-scaled pixmap so the icon renders crisply on HiDPI displays.

LOW (8 — 1 NEW CRIT pre-existed, the rest were polish)

  • text_edit excluded from non-Latin warning — added to the detector tuple.
  • _MAX_RECENT configurable (max_recent_files, clamp 1-50, default bumped 5->10).
  • Ruff cleanup — 41 violations (F401/F541/E741) -> 0 in app/ + pdfapps.py + installer.py. Defensive imports preserved with # noqa: F401.
  • mypy n_pages annotationint | str union explicit.
  • Stale test stubs_nfc staticmethod added; slice now uses def boundary instead of fixed character window.
  • Updater backup in temp dirtempfile.mkstemp + shutil.move so the .bak doesn't show in OneDrive sync.
  • CI raw .exe dropdist/PDFApps.exe removed from 3 upload sites in build.yml; users install via Setup or MSIX.

i18n

1 new key x 8 languages: editor.image_too_large (with {width}, {height}, {megapix} placeholders). Parity: 8 x 605.

Tests

tests/test_audit_pr_j.py — 20 tests + 1 skipped (Windows-only). Covers all 13 fixes plus i18n parity.

Final: 339 passed, 2 skipped (was 319 + 1 flatpak + 1 skipped; +20 new; the 2 broken stubs in test_pdfapps.py also recovered).

Adversarial review

Verdict: SAFE TO MERGE. Two non-blocking observations:

  1. get_recent_files doesn't truncate at read-time; legacy entries above the cap survive until next add_recent_file. Cosmetic.
  2. _atomic_save is defined before PySide6 imports in convert.py. PEP-8 nit, no behavioural impact.

Bandit/pip-audit/ruff/mypy: no new findings introduced. Pre-existing items (B103, B310, F841 in 2 unrelated files) are out of scope.

Validation

  • 14 atomic commits
  • ruff clean (F401/F541/E741) in scope
  • pytest 339/2; suite recovered the 2 stale tests
  • i18n parity verified (8x605)
  • No APP_VERSION bump
  • Compatible with all merged PRs

nelsonduarte and others added 14 commits June 21, 2026 14:57
…RIT)

When install_app raised, the except block scheduled lambdas via
self.after() that referenced `exc` after the worker thread frame
had unwound, so the after-callback hit NameError instead of showing
the friendly error dialog. Capture str(exc) up front and bind it via
a lambda default argument so the closure is independent of the
enclosing frame's local-variable lifetime.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A user (or a malicious actor) picking a gigapixel TIFF (50000x50000)
crashed the editor process when QPixmap allocated multi-GB to hold
the raw decoded buffer. Add a shared utils.check_image_size helper
using PIL.Image (already a dependency) to short-circuit before
allocation, returning width/height for a localised warning.

Apply at the two entry points the audit flagged: _SignatureDialog
import tab in app/editor/dialogs.py and EditorTab._pick_image in
app/editor/tab.py. Cap is 100 megapixels — generous enough for any
phone-camera / scanner output, tight enough to refuse gigapixel
attacks.

Add editor.image_too_large key across all 8 supported languages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The same gigapixel-image attack surface the editor pickers hardened
exists in the Import-PDF tool when the user multi-selects images:
a single 50000x50000 TIFF in the source list crashed the worker
thread (and on frozen PyInstaller builds takes the whole app).

Use the shared utils.check_image_size guard, increment the existing
`skipped` counter on rejection so the user gets the standard
"skipped N images" status rather than a silent failure.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The password dialog's lock icon was rendered at 72x72 with a
hardcoded DPR of 2.0, which assumes every monitor is Retina/HiDPI.
On a 1.0 DPR (regular 1080p) monitor the icon was over-sized then
downscaled (blur); on a mixed-DPI multi-monitor setup it drifted
by a few pixels when the dialog moved between screens.

Sample devicePixelRatioF() on the dialog itself and size the
pixmap to `logical_size * dpr` while keeping the QLabel fixed at
40x40 logical pixels. Defaults to 1.0 if the API is missing or
returns a non-positive value (headless/mocked widgets).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The non-Latin warning loop (R11-L4) covered the 'text' and 'note'
edit types but missed 'text_edit' — yet text_edit also writes via
the built-in helv font, so users editing existing spans containing
non-Latin characters got silent tofu output without any warning.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Every non-image converter (DOCX/TXT/PPTX/XLSX/HTML/EPUB) was writing
to the user-supplied out_path directly, so a crash or cancel mid-save
truncated whatever was previously at that path (the docx_doc.save,
prs.save, wb.save, epub.write_epub, raw open(... 'w') calls all
truncate-on-open). PDF outputs already went through
BasePage._atomic_pdf_write; non-PDF outputs were the gap.

Add a private _atomic_save helper that wraps tempfile.mkstemp +
os.replace (same-directory tempfile so the rename stays a single FS
op). For TXT, refactor the streaming-write loop to raise
CancelledError if the worker cancels mid-page — _atomic_save's
BaseException handler discards the half-written tmp, leaving the
previous output untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…on Windows

When the input PDF or the tempfile output sits under a user profile
with non-ASCII characters (e.g. C:\Users\José\report.pdf),
Ghostscript on Windows mis-parses the command line because it still
reads argv through the legacy ANSI code page — the resulting error
is the unhelpful 'could not open output file' message.

Add _win_short_path helper that calls GetShortPathNameW to map the
path to its 8.3 alias (ASCII on NTFS with short-names enabled,
which is the default). No-op on POSIX or when the path does not
yet exist. Apply at the gs subprocess call site for both the
input PDF and the tempfile output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ult 10)

Replace the hardcoded _MAX_RECENT=5 with a _get_max_recent() helper
that reads max_recent_files from the on-disk config, falls back to
the new default of 10 (matches Office/Chrome/Acrobat conventions; 5
was leftover from the 1.0 prototype), and clamps to [1, 50] so a
hostile / corrupt config cannot make the recents menu freeze the
UI for seconds while re-rendering thousands of entries.

Keep the _MAX_RECENT module-level alias bound to the default so any
external import of the legacy constant keeps working. Live config
override is only honoured via the new helper.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sweep the four ruff codes the audit flagged across app/, pdfapps.py
and installer.py. 41 violations → 0:

- 14 F401: drop dead PdfReader/Qt/QPixmap/QPainter/QColor/Signal/
  Thread/glob/ACCENT_H/BORDER/TEXT_SEC/QApplication/QPoint/
  QGridLayout/QLayout/QComboBox/_TextDialog/_TextEditDialog/is_dark
  imports. The defensive pypdf import in pdfapps.py and the
  pytesseract probe in tools/ocr.py keep their imports but get a
  noqa:F401 with an inline explanation.
- 11 F541: strip the spurious f-prefix on Qt stylesheet fragments
  that contain no placeholders (toast in base.py, mode-button
  inactive variants in editor/tab.py).
- 1 E741: rename ambiguous `l` to `lbl` in worker.py drain loop.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The fallback path reassigned an int-typed variable to '?' on read
failure, which mypy flagged as incompatible. Add the int|str union
annotation up front so the fallback is type-clean; the renderer
(t('edit.status.pages', n=...)) treats both transparently because
str.format handles either.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1. test_encrypted_pdf_helpers_unlock_with_stored_password: PR-H added
   a staticmethod _nfc on BasePage that _open_reader and _open_fitz
   call via self._nfc(...). The standalone test _Stub never grew the
   attribute, so the test failed with AttributeError before exercising
   the actual decrypt path. Bind staticmethod(BasePage._nfc).

2. test_editor_handles_encrypted_pdfs: the test sliced the source of
   _load_pdf with a fixed 1500-char window, which PR-H/PR-I outgrew
   when they added the password prompt + NFC normalisation lines. The
   'password=self._pdf_password' assertion fell off the slice. Use
   the next `def ` boundary as the slice end so the assertion stays
   robust against further additions inside the same function.

Both tests now pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The old binary was renamed to '<current>.bak' alongside the install
location. On OneDrive-installed copies (Windows Store + manual
installs of self-extracting setup into a synced folder) the .bak
file became a visible sync entry, prompting OneDrive to upload the
prior binary on every update — adds a couple of hundred MB of churn
per release for users with the app under their OneDrive root.

Move the backup to tempfile.gettempdir() via mkstemp, keeping the
original basename + .pdfapps-backup suffix for forensics. Use
shutil.move (already imported) instead of os.rename for the
cross-volume case — temp dir is often on a different filesystem
from /usr/bin on Linux installs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Windows artifact bundle uploaded both the NSIS installer
(PDFAppsSetup.exe) AND the raw PyInstaller binary (PDFApps.exe),
then attached both to the public release with checksums. The raw
exe embeds the runner's internal build paths in its PyInstaller
debug strings — not a credential leak but free reconnaissance for
anyone scraping the .exe for build-machine signatures.

Users install via PDFAppsSetup.exe or the .msix; the in-app
auto-updater (`_find_asset` in app/updater.py) already targets
PDFAppsSetup.exe on Windows, so this is a no-op for the update
flow. Drop PDFApps.exe from the three places it appeared in
build.yml: matrix upload, sha256 checksum loop, and release files
list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
One test per fix in the PR plus the i18n parity check. Where the
fix has an observable runtime effect (image-cap guard, atomic save,
config-driven max-recent, short-path helper) the test calls the
helper directly with stubs/monkeypatch; where the effect is in
CI / installer / GUI code, the test grep-asserts on the source
file so future regressions are caught at lint time.

Notably:
- check_image_size: normal-image accept, gigapixel reject (PIL stub
  to avoid OOM), corrupt-file fallback
- _atomic_save: success path replaces, failure leaves prior content
  untouched and cleans up the tempfile
- _get_max_recent: default + config override + clamp tests
- ruff F401/F541/E741 subprocess check guards future regressions

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@nelsonduarte nelsonduarte added bug Something isn't working enhancement New feature or request labels Jun 21, 2026
Comment thread tests/test_audit_pr_j.py
target.write_bytes(b"OLD_CONTENT")

# 1) Successful save replaces the old file.
_atomic_save(str(target), lambda p: open(p, "wb").write(b"NEW_CONTENT"))
Comment thread tests/test_audit_pr_j.py
# and does not leak a tempfile in the directory.
with pytest.raises(RuntimeError):
def _boom(p: str) -> None:
open(p, "wb").write(b"PARTIAL")
Comment thread app/i18n.py
# Back-compat alias for any external code that still imports the legacy
# constant. New call sites should use _get_max_recent() so the user's
# config override is honoured.
_MAX_RECENT = _DEFAULT_MAX_RECENT
Comment thread app/utils.py
n = ctypes.windll.kernel32.GetShortPathNameW(path, buf, 512)
if n and buf.value:
return buf.value
except Exception:
Comment thread pdfapps.py
# Defensive import — surfaces the missing-dependency dialog before
# the rest of app.* drags pypdf in via tools/*.py and crashes with
# an opaque traceback. The names are intentionally unused here.
from pypdf import PdfReader, PdfWriter # noqa: F401
Comment thread tests/test_audit_pr_j.py
subprocess / OS call we cannot reasonably stage from a unit test.
"""

import io
Comment thread tests/test_audit_pr_j.py

import io
import json
import os
@nelsonduarte nelsonduarte merged commit 9c8a392 into main Jun 21, 2026
3 checks passed
@nelsonduarte nelsonduarte deleted the chore/audit-pr-j branch June 21, 2026 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants