Fix: level CANN dlog before rtSetDevice so device logs honor log_level #763
Merged
ChaoWao merged 1 commit on May 13, 2026
Code Review
This pull request reorders the initialization sequence in simpler_init so that dlog_setlevel is called before the device context is opened via rtSetDevice. The change is necessary because CANN snapshots the log level at context-open time, which makes later level changes ineffective for the device-side session. Documentation in several files has been updated to reflect the new order. The review feedback suggests explicitly including the <cstdlib> header in the platform-specific implementation files that use std::getenv, to ensure portability.
PR hw-native-sys#723 (collapsed ChipWorker init/set_device into a single simpler_init) flipped the order of attach_current_thread and the dlog_setlevel block inside simpler_init on both a2a3 and a5 onboard. CANN snapshots the device-side log session's level at device-context open time (rtSetDevice inside attach_current_thread), so a dlog_setlevel issued after that is a no-op for the device side.

Net effect: when ASCEND_GLOBAL_LOG_LEVEL is not set in the environment, the log_level the user passed to Worker(...) / configure_logging(...) silently fails to reach the device-side filter, and ~/ascend/log/{debug,run}/device-N/*.log files are either missing or pinned at CANN's default (level 3 / ERROR). Pre-hw-native-sys#715/hw-native-sys#723 the order was correct because init and set_device were two separate C entries called in the right sequence; hw-native-sys#723 merged them and the dlog ordering was silent collateral. Sim has no CANN dlog and is unaffected.

The fix: hoist the existing dlog_setlevel block above attach_current_thread in both onboard simpler_init implementations. HostLogger is already seeded by libsimpler_log.so's simpler_log_init() (runs earlier in ChipWorker::init), so HostLogger::get_instance().level() is already the user's choice at this point and no new plumbing is needed. The comment on the hoisted block explains the rtSetDevice ordering constraint so this doesn't silently regress again. The header doc (pto_runtime_c_api.h) reorders the three responsibilities, and the docs (logging.md, dynamic-linking.md, chip-level-arch.md) update their call-flow diagrams to match the new order.
Hardware verification on Ascend910 / a2a3 onboard (ASCEND_GLOBAL_LOG_LEVEL unset, configure_logging("debug")):
  before  ~/ascend/log/run/device-2/device-845511_*.log: logLevel=3 (no DEBUG entries, debug/ dir empty)
  after   ~/ascend/log/run/device-2/device-856602_*.log: logLevel=0 (76 KB of DEBUG entries)
With ASCEND_GLOBAL_LOG_LEVEL=1 set, the device shows logLevel=1 regardless of configure_logging; the env-var path is unchanged.
Existing onboard ST (tests/st/aicore_op_timeout, PR hw-native-sys#762) still passes after rebuild.
8131abf to 82a7da2
ChaoWao approved these changes on May 13, 2026
Summary
PR #723 (collapsed ChipWorker::init/set_device into a single simpler_init) silently flipped the order of attach_current_thread and the dlog_setlevel block inside simpler_init on both a2a3 and a5 onboard. CANN snapshots the device-side log session's level at device-context open time (rtSetDevice inside attach_current_thread), so a dlog_setlevel issued after that is a no-op for the device side.

Net effect: when ASCEND_GLOBAL_LOG_LEVEL is not set in the environment, the log_level the user passed to Worker(...) / configure_logging(...) silently fails to reach the device-side filter, and ~/ascend/log/{debug,run}/device-N/*.log files are either missing or pinned at CANN's default (logLevel=3 / ERROR).

Pre-#715/#723 the order was correct because init and set_device were two separate C entries called in the right sequence; #723 merged them and the dlog ordering was silent collateral. Sim has no CANN dlog and is unaffected.
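The snapshot behaviour described above can be illustrated with a small model. This is only a sketch under stated assumptions: `g_host_level`, `set_level`, and `DeviceSession` are hypothetical stand-ins invented for illustration, not CANN or simpler APIs. The model only mirrors the claim that the device-side session copies the level once, when the device context is opened.

```cpp
namespace sketch {

// Hypothetical process-wide level, standing in for the state that
// dlog_setlevel changes. 3 is the ERROR default quoted in this PR.
static int g_host_level = 3;

// Hypothetical stand-in for a log-level setter.
static void set_level(int level) { g_host_level = level; }

// Hypothetical device-side log session: it copies the level exactly once,
// at construction, modelling "snapshot at device-context open time".
struct DeviceSession {
    int level;
    DeviceSession() : level(g_host_level) {}
};

// Regressed order: open the device context first, then set the level.
static int device_level_when_opened_first(int requested) {
    g_host_level = 3;       // start from the default
    DeviceSession session;  // snapshot taken here, still the default
    set_level(requested);   // too late to affect the session
    return session.level;
}

// Fixed order: set the level first, then open the device context.
static int device_level_when_set_first(int requested) {
    g_host_level = 3;       // start from the default
    set_level(requested);   // level is in place before the snapshot
    DeviceSession session;  // snapshot now carries the requested level
    return session.level;
}

}  // namespace sketch
```

In this model, requesting level 0 (DEBUG) with the regressed order leaves the session at 3, while the fixed order yields 0, which matches the before/after values reported in the hardware verification.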
The fix
Hoist the existing dlog_setlevel block above attach_current_thread in both onboard simpler_init implementations. HostLogger is already seeded by libsimpler_log.so's simpler_log_init() (runs earlier in ChipWorker::init), so HostLogger::get_instance().level() is already the user's choice at this point and no new plumbing is needed.

The comment on the hoisted block now explains the rtSetDevice ordering constraint so this doesn't silently regress again.
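The resulting call order can be sketched as follows. This is a minimal illustration, not the code from the PR: `stub_dlog_setlevel`, `stub_attach_current_thread`, and `init_sketch` are hypothetical names, and the stubs only record that they ran, because the real signatures are not shown in this description. The environment-variable check mirrors the getenv guard mentioned in the hardware verification notes.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

namespace sketch {

// Records the order in which the stubbed steps run.
static std::vector<std::string> g_calls;

// Hypothetical stub for the CANN log-level call (real signature not shown here).
static void stub_dlog_setlevel(int level) {
    g_calls.push_back("dlog_setlevel(" + std::to_string(level) + ")");
}

// Hypothetical stub for attach_current_thread, which per the PR opens the
// device context through rtSetDevice.
static void stub_attach_current_thread(int device_id) {
    g_calls.push_back("attach_current_thread(" + std::to_string(device_id) + ")");
}

// Hypothetical init routine showing the hoisted order.
static void init_sketch(int device_id, int host_level) {
    // Ordering constraint: the level must be set BEFORE the device context
    // opens, because the device-side session snapshots it at that moment.
    // When ASCEND_GLOBAL_LOG_LEVEL is set, the env var wins and we skip this.
    if (std::getenv("ASCEND_GLOBAL_LOG_LEVEL") == nullptr) {
        stub_dlog_setlevel(host_level);
    }
    stub_attach_current_thread(device_id);
}

}  // namespace sketch
```

With the variable unset, `init_sketch(2, 0)` records the level call followed by the attach call; with the variable set, only the attach call is recorded.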
Files
- src/a2a3/platform/onboard/host/pto_runtime_c_api.cpp: hoist
- src/a5/platform/onboard/host/pto_runtime_c_api.cpp: same hoist
- src/common/worker/pto_runtime_c_api.h: simpler_init doc comment with responsibilities reordered (dlog first); wording explains the constraint
- docs/logging.md, docs/dynamic-linking.md, docs/chip-level-arch.md: call-flow diagrams updated to match the new order (per .claude/rules/doc-consistency.md)

Sim variants (src/{a2a3,a5}/platform/sim/host/pto_runtime_c_api.cpp) are untouched; they have no CANN dlog.
Hardware verification
Ascend910 / a2a3 onboard, device 2, tiny driver script (Worker.init → close) with configure_logging("debug") and ASCEND_GLOBAL_LOG_LEVEL unset. Log files inspected: ~/ascend/log/run/device-2/device-{pid}_*.log and ~/ascend/log/debug/device-2/device-{pid}_*.log.

Before: logLevel=3, ccecpulogLevel=-1, aicpulogLevel=-1
After: logLevel=0, ccecpulogLevel=-1, aicpulogLevel=-1

With ASCEND_GLOBAL_LOG_LEVEL=1 exported (PID 860263), the device shows logLevel=1 regardless of configure_logging("debug"): the getenv guard correctly defers to the env var, so that path is unchanged.
Existing onboard ST (tests/st/aicore_op_timeout, #762) still passes in ~8 s after the rebuild.
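For repeating this check, the level fields quoted in the verification could be extracted with a small helper. This is a sketch that assumes the fields appear as comma-separated `key=value` text exactly as quoted in this description; `parse_levels` is a hypothetical helper, and the full layout of the device log files is not specified here. Splitting on whole fields matters because `ccecpulogLevel` and `aicpulogLevel` both contain `logLevel` as a substring.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {

// Parses "key=value, key=value" text into a map of integer levels.
// Fields without '=' or with a non-numeric value are skipped.
static std::map<std::string, int> parse_levels(const std::string& text) {
    std::map<std::string, int> levels;
    std::istringstream in(text);
    std::string field;
    while (std::getline(in, field, ',')) {
        const std::size_t eq = field.find('=');
        if (eq == std::string::npos) continue;
        // Trim leading spaces left over from the ", " separator.
        const std::size_t first = field.find_first_not_of(' ');
        if (first == std::string::npos || first >= eq) continue;
        const std::string key = field.substr(first, eq - first);
        try {
            levels[key] = std::stoi(field.substr(eq + 1));
        } catch (const std::exception&) {
            // Not a number: ignore this field.
        }
    }
    return levels;
}

}  // namespace sketch
```

Looking up the exact key `logLevel` in the returned map avoids accidentally matching the other two fields.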
Test plan
check-headers) pass on all 6 files without SKIP=.
ASCEND_GLOBAL_LOG_LEVEL-set path unchanged.

Fixes the regression introduced by #723.