fix(encoding): explicitly use UTF-8 for all file I/O (#630)#798
Open
oboehmer wants to merge 4 commits into
Open
fix(encoding): explicitly use UTF-8 for all file I/O (#630)#798oboehmer wants to merge 4 commits into
oboehmer wants to merge 4 commits into
Conversation
All open(), read_text(), write_text(), and aiofiles.open() calls in production code now pass encoding="utf-8", removing reliance on the system locale (which can be cp1252 on Windows or ASCII in minimal containers). Also adds encoding="utf-8" to the subprocess.run(text=True) call in the e2e test harness, which caused UnicodeDecodeError when capturing nac-test's emoji-containing stdout under a non-UTF-8 locale. E2e fixtures now contain multi-byte UTF-8 characters (in comments and docstrings only, not data values) to act as regression guards for the affected file I/O paths.
…CLI entry (#630) Add encoding='utf-8' to all NamedTemporaryFile and open() calls that were missed in the initial fix: subprocess_auth.py, subprocess_client.py, device_executor.py, subprocess_runner.py, orchestrator.py. Also fix generated Python scripts inside f-strings (subprocess_auth.py, subprocess_client.py) — these are written to temp files and executed via os.system(), so their open() calls need encoding too. Set os.environ.setdefault('PYTHONUTF8', '1') at CLI entry point (nac_test/cli/main.py) to propagate UTF-8 mode to all child processes. Add targeted Unicode regression test for combined report generation.
…os (#630) Revert decorative UTF-8 comments from 9 fixture scenarios (24 files) that only had UTF-8 in YAML comments and Python comments — content that never flows through the report pipeline. Keep and enhance success + mixed fixtures with meaningful UTF-8: - data.yaml: UTF-8 site names (SITE_München_100, SITE_日本_100) that flow through YAML → Jinja2 → test names → HTML report - pyATS TITLE/DESCRIPTION constants with German and Japanese text that render directly in the combined HTML report - Robot Documentation line with non-ASCII characters
c9ec5c2 to
872bb23
Compare
872bb23 to
b9bbf8f
Compare
Collaborator
Author
|
@aitestino , do you have an opinion if this is needed? technically it is not, current nac-test handles this fine on Linux (where utf-8 is default encoding) as well as Windows. Certain windows terminals will require utf-8 encoding env to display the emojis (nothing we can do here).. please review, but we can also close it.. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fix
UnicodeEncodeErroron Windows when generating HTML reports containing Unicode characters (↕, ✓, →).Python uses the system locale for
open()by default —cp1252on Windows,ASCIIin minimal containers — which fails on the Unicode sort indicators and navigation arrows in the HTML report templates.Closes
Related Issue(s)
PYTHONIOENCODINGCI workaround is now removed as redundant)Type of Change
Test Framework Affected
Network as Code (NaC) Architecture Affected
Platform Tested
Key Changes
encoding="utf-8"to allopen(),read_text(),write_text(), andNamedTemporaryFile(mode="w")calls across 16 production filessubprocess_auth.py,subprocess_client.py) written to temp files and executed viaos.system()os.environ.setdefault("PYTHONUTF8", "1")at CLI entry point to propagate UTF-8 mode to child processesPYTHONIOENCODINGworkaround from CI workflow (UnicodeEncodeError on Windows when stdout uses cp1252 encoding #723)success+mixedscenarios with meaningful UTF-8 in data fields that flow through the full YAML → Jinja2 → report pipelineTesting Done
pytest/pre-commit run -a)Test Commands Used
uv run pytest tests/unit/ tests/e2e/ -x -q -n auto --dist loadscope # 1768 passed, 377 skippedChecklist
pre-commit run -apasses)Additional Notes
PYTHONUTF8=1a transitional measureencoding="utf-8"on every I/O call is the primary fix;PYTHONUTF8=1is belt-and-suspenders for child processes spawned viaos.system()andsubprocess