fix(encoding): explicitly use UTF-8 for all file I/O (#630) #798

Open
oboehmer wants to merge 4 commits into main from fix/630-read-write-utf8

Conversation

@oboehmer
Collaborator

@oboehmer oboehmer commented Apr 27, 2026

Description

Fix UnicodeEncodeError on Windows when generating HTML reports containing Unicode characters (↕, ✓, →).

By default, Python's open() uses the locale's preferred encoding (cp1252 on Windows, ASCII in minimal containers), which cannot encode the Unicode sort indicators and navigation arrows in the HTML report templates.
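
The failure mode can be reproduced without touching any files. The snippet below is a minimal sketch, not code from this PR: `INDICATORS` and `can_encode` are hypothetical names, and the string simply reuses the three characters mentioned in the description.

```python
import locale

# The characters named in the description (hypothetical constant name).
INDICATORS = "↕ ✓ →"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# What open() would pick on this machine when no encoding is passed
# (outside of UTF-8 mode).
print("locale default:", locale.getpreferredencoding(False))

for name in ("cp1252", "ascii", "utf-8"):
    print(name, "->", "ok" if can_encode(INDICATORS, name) else "UnicodeEncodeError")
```

Only the UTF-8 row succeeds, which is why a write through a cp1252 or ASCII text stream aborts partway through the report.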

Closes

Related Issue(s)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactoring / Technical debt (internal improvements with no user-facing changes)
  • Documentation update
  • Chore (build process, CI, tooling, dependencies)
  • Other (please describe):

Test Framework Affected

  • PyATS
  • Robot Framework
  • Both
  • N/A (not test-framework specific)

Network as Code (NaC) Architecture Affected

  • ACI (APIC)
  • NDO (Nexus Dashboard Orchestrator)
  • NDFC / VXLAN-EVPN (Nexus Dashboard Fabric Controller)
  • Catalyst SD-WAN (SDWAN Manager / vManage)
  • Catalyst Center (DNA Center)
  • ISE (Identity Services Engine)
  • FMC (Firepower Management Center)
  • Meraki (Cloud-managed)
  • NX-OS (Nexus Direct-to-Device)
  • IOS-XE (Direct-to-Device)
  • IOS-XR (Direct-to-Device)
  • Hyperfabric
  • All architectures
  • N/A (architecture-agnostic)

Platform Tested

  • macOS (version tested: macOS 15)
  • Linux (distro/version tested: )

Key Changes

  • Add encoding="utf-8" to all open(), read_text(), write_text(), and NamedTemporaryFile(mode="w") calls across 16 production files
  • Fix the Python scripts generated from f-strings in subprocess_auth.py and subprocess_client.py, which are written to temp files and executed via os.system(), so that their own open() calls also pass encoding="utf-8"
  • Set os.environ.setdefault("PYTHONUTF8", "1") at CLI entry point to propagate UTF-8 mode to child processes
  • Remove redundant PYTHONIOENCODING workaround from CI workflow (#723: UnicodeEncodeError on Windows when stdout uses cp1252 encoding)
  • Consolidate e2e UTF-8 fixtures to success + mixed scenarios with meaningful UTF-8 in data fields that flow through the full YAML → Jinja2 → report pipeline
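
The pattern behind the first bullet looks roughly like the following. This is an illustrative sketch rather than an excerpt from the changed files: `REPORT_SNIPPET`, `write_and_read_back`, and `write_temp_file` are made-up names, and the HTML fragment is invented for the example.

```python
import tempfile
from pathlib import Path

# Invented report fragment containing the characters from the description.
REPORT_SNIPPET = "<th>Status ↕</th><td>✓ passed → details</td>"


def write_and_read_back(directory: Path) -> str:
    """Write with Path.write_text and read back via open() and read_text."""
    report = directory / "report.html"
    report.write_text(REPORT_SNIPPET, encoding="utf-8")
    with open(report, encoding="utf-8") as handle:
        via_open = handle.read()
    # Both read paths decode the same bytes with the same explicit codec.
    assert via_open == report.read_text(encoding="utf-8")
    return via_open


def write_temp_file(directory: Path) -> Path:
    """NamedTemporaryFile in text mode also accepts an explicit encoding."""
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", suffix=".html", dir=directory, delete=False
    ) as tmp:
        tmp.write(REPORT_SNIPPET)
    return Path(tmp.name)


with tempfile.TemporaryDirectory() as tmp_dir:
    base = Path(tmp_dir)
    round_tripped = write_and_read_back(base)
    temp_contents = write_temp_file(base).read_text(encoding="utf-8")

print(round_tripped == temp_contents == REPORT_SNIPPET)
```

Passing the encoding at each call site keeps the behavior independent of the locale, at the cost of having to remember it on every new I/O call.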

Testing Done

  • Unit tests added/updated
  • Integration tests performed
  • Manual testing performed:
    • PyATS tests executed successfully
    • Robot Framework tests executed successfully
    • D2D/SSH tests executed successfully (if applicable)
    • HTML reports generated correctly
  • All existing tests pass (pytest / pre-commit run -a)

Test Commands Used

uv run pytest tests/unit/ tests/e2e/ -x -q -n auto --dist loadscope
# 1768 passed, 377 skipped

Checklist

  • Code follows project style guidelines (pre-commit run -a passes)
  • Self-review of code completed
  • Code is commented where necessary (especially complex logic)
  • Documentation updated (if applicable)
  • No new warnings introduced
  • Changes work on both macOS and Linux
  • CHANGELOG.md updated (if applicable)

Additional Notes

  • Python 3.15 (PEP 686) will make UTF-8 mode the default, making PYTHONUTF8=1 a transitional measure
  • The explicit encoding="utf-8" on every I/O call is the primary fix; PYTHONUTF8=1 is belt-and-suspenders for child processes spawned via os.system() and subprocess
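
A small sketch of how the environment variable reaches child interpreters, assuming the children are started with the inherited environment. `child_utf8_mode` is a hypothetical helper, and the one-line child program is only there to report its own `sys.flags.utf8_mode`.

```python
import os
import subprocess
import sys

# setdefault keeps any value the user already exported and only fills the gap.
os.environ.setdefault("PYTHONUTF8", "1")


def child_utf8_mode() -> str:
    """Start a child interpreter and return its sys.flags.utf8_mode as text."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


child_flag = child_utf8_mode()
print("parent utf8_mode:", sys.flags.utf8_mode)
print("child utf8_mode:", child_flag)
```

The variable is read at interpreter startup, so setting it at runtime changes only the children and not the already running parent. That is consistent with treating the explicit encoding="utf-8" arguments as the primary fix.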

All open(), read_text(), write_text(), and aiofiles.open() calls in
production code now pass encoding="utf-8", removing reliance on the
system locale (which can be cp1252 on Windows or ASCII in minimal
containers).

Also adds encoding="utf-8" to the subprocess.run(text=True) call in
the e2e test harness. Without it, capturing nac-test's emoji-containing
stdout under a non-UTF-8 locale raised UnicodeDecodeError.
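
For illustration, the capture side can be sketched as follows. This is not the harness code: `CHILD_CODE` and `capture_stdout` are hypothetical, and the child stands in for nac-test by writing UTF-8 bytes containing an emoji directly to its stdout buffer.

```python
import subprocess
import sys

# Stand-in for a tool that prints an emoji; the \u escape keeps the
# command-line argument itself pure ASCII.
CHILD_CODE = r'import sys; sys.stdout.buffer.write("\u2705 done\n".encode("utf-8"))'


def capture_stdout() -> str:
    """Capture child stdout as text, decoding explicitly as UTF-8."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        encoding="utf-8",  # without this, the locale encoding is used to decode
        check=True,
    )
    return result.stdout


captured = capture_stdout()
print(captured.strip() == "\u2705 done")
```

With text=True alone, the captured bytes are decoded using the locale encoding, which is where a mismatch with the child's UTF-8 output shows up.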

E2e fixtures now contain multi-byte UTF-8 characters (in comments and
docstrings only, not data values) to act as regression guards for the
affected file I/O paths.
…CLI entry (#630)

Add encoding='utf-8' to all NamedTemporaryFile and open() calls that were
missed in the initial fix: subprocess_auth.py, subprocess_client.py,
device_executor.py, subprocess_runner.py, orchestrator.py.

Also fix generated Python scripts inside f-strings (subprocess_auth.py,
subprocess_client.py) — these are written to temp files and executed via
os.system(), so their open() calls need encoding too.
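
The generated-script case is easy to miss because the open() call lives inside a string. Below is a hedged sketch of the idea only: `build_script` and `run_generated_script` are invented names, the script body is made up, and subprocess.run is used here in place of os.system() purely to avoid shell quoting in the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_script(output_path: Path) -> str:
    """Render a tiny Python script; its own open() call must set the encoding."""
    return f"""
with open({str(output_path)!r}, "w", encoding="utf-8") as handle:
    handle.write("✓ authenticated")
"""


def run_generated_script(work_dir: Path) -> str:
    """Write the script as UTF-8, run it, and read its output back."""
    output_path = work_dir / "result.txt"
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", suffix=".py", dir=work_dir, delete=False
    ) as script:
        script.write(build_script(output_path))
    # Assumption for this sketch: subprocess.run instead of os.system().
    subprocess.run([sys.executable, script.name], check=True)
    return output_path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp_dir:
    result_text = run_generated_script(Path(tmp_dir))

print(result_text)
```

Two encodings matter here: the one used to write the script file itself, and the one the script uses for its own I/O once it runs in the child process.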

Set os.environ.setdefault('PYTHONUTF8', '1') at CLI entry point
(nac_test/cli/main.py) to propagate UTF-8 mode to all child processes.

Add targeted Unicode regression test for combined report generation.
…os (#630)

Revert decorative UTF-8 comments from 9 fixture scenarios (24 files)
that only had UTF-8 in YAML comments and Python comments — content
that never flows through the report pipeline.

Keep and enhance success + mixed fixtures with meaningful UTF-8:
- data.yaml: UTF-8 site names (SITE_München_100, SITE_日本_100)
  that flow through YAML → Jinja2 → test names → HTML report
- pyATS TITLE/DESCRIPTION constants with German and Japanese text
  that render directly in the combined HTML report
- Robot Documentation line with non-ASCII characters
@oboehmer oboehmer force-pushed the fix/630-read-write-utf8 branch from c9ec5c2 to 872bb23 Compare April 27, 2026 08:35
The PYTHONIOENCODING env var in the Windows smoke test was a band-aid
for #723 (cp1252 can't encode emoji). Now properly fixed at the source:
PYTHONUTF8=1 at CLI entry point + explicit encoding='utf-8' on all I/O.
@oboehmer oboehmer force-pushed the fix/630-read-write-utf8 branch from 872bb23 to b9bbf8f Compare April 27, 2026 08:52
@oboehmer oboehmer marked this pull request as draft April 27, 2026 09:29
@oboehmer oboehmer marked this pull request as ready for review May 23, 2026 13:27
@oboehmer
Collaborator Author

@aitestino, do you have an opinion on whether this is needed? Technically it is not: current nac-test handles this fine on Linux (where UTF-8 is the default encoding) as well as on Windows. Certain Windows terminals will require UTF-8 encoding to be set in the environment to display the emojis (nothing we can do here). Please review, but we can also close it.

@oboehmer oboehmer requested a review from aitestino May 23, 2026 13:27
Development

Successfully merging this pull request may close these issues.

Windows: UnicodeEncodeError when generating HTML reports