[v2] Implement conditional autoprompt importing#10231
Conversation
hssyoo
left a comment
There was a problem hiding this comment.
The measurements in the PR description show a ~5ms improvement, which is far from the ~66ms mean import time we measured. Based on these numbers, it seems highly likely that prompt_toolkit is still being imported elsewhere.
I was able to confirm this by checking sys.modules at the end of a aws --version call.
Great catch! I have analyzed |
e8d5797 to
e82544d
Compare
…ing in a more idiomatic way.
…gated to SingleWizardCommand instances.
hssyoo
left a comment
There was a problem hiding this comment.
It's looking better but it looks like there's some gaps:
- I noticed that
aws configure liststill importsprompt_toolkiteven though it doesn't need it. - I'm not seeing anything in the PR that will guard against someone from adding unconditional
prompt_toolkitimports in the future.
I'd like to see us step back and first enumerate all cases where we know we'll need to import prompt_toolkit. Knowing all cases upfront will help us verify that the implementation behaves as intended and help us write tests to ensure future changes don't reintroduce unconditional imports.
I'm also not sure if scattering import statements across different files and within methods is the right approach. I'm wondering if what we want instead is an authoritative source for importing prompt_toolkit. For example, say we had 2 modules:
_prompt.py- The only module in the AWS CLI repository that's allowed to import anything fromprompt_toolkit.prompt.py- A module that exposes factory functions that return objects that need requireprompt_toolkit.
The imports from _prompt.py would happen within the factory functions. This is nice because then the rest of the AWS CLI codebase can import from prompt.py, but prompt_toolkit won't be imported until the factory function is actually called.
A single entrypoint for importing prompt_toolkit also enables us to write linting rules that assert that prompt_toolkit is never imported outside of _prompt.py, providing an extra layer of protection.
Thoughts?
| { | ||
| "type": "enhancement", | ||
| "category": "Performance", | ||
| "description": "Refactor imports to reduce command initialization time (e.g. loading all imported modules), resulting in about 11.70-13.35% reduced execution time across most commands (modeled commands and ``aws --version``)." |
There was a problem hiding this comment.
- We're only deferring
prompt_toolkitimport here, right? Let's be specific about that instead of generally claiming "refactor imports". - I wouldn't put any concrete numbers in the changelog. This figure changes between hardware and command run. Just say it avoids
prompt_toolkitimport time for commands that don't need it.
Configure listRegarding In summary, every Single point of importing prompt_toolkit (with linting enforcement)prompt_toolkit audit resultsBelow is the audit results for when prompt_toolkit is definitely needed:
Always needed — these are inherently interactive commands that always prompt the user Current bug (as you pointed out): imported during init of BaseSSOConfigurationCommand, which fires during subcommand table build for any Fix: move imports from init to _run_main
Currently correct:
Current bug: Fix: defer
Conditionally needed — only when Currently correct:
Conditionally needed — only when autoprompt mode resolves to 'on' or 'on-partial' Currently correct:
Currently correct: lazy import inside In summary, there are two places on this current branch that import prompt_toolkit when it is not needed. One is all of the Thoughts on your factory-with-enforcement proposal
While I see your point, I don't see how the proposed factory approach solves the concern. So instead of importing prompt_toolkit, devs will need to call the factory functions that import prompt_toolkit. But the devs still have to make sure they call the factory function an optimal place. For example, a dev can just as easily call the factory function in the ConfigureCommand's init function, so it seems to replace the problem of where to scatter the imports with where to scatter the factory function calls.
I don't understand why forcing prompt_toolkit to be imported by a single module is a good restriction to have, since, as stated in the previous paragraph, the factory function calls still need to be placed well to avoid unnecessarily importing prompt_toolkit. A lint rule saying "only these files may import prompt_toolkit" is just an allowlist — it prevents new files from adding prompt_toolkit imports, but it doesn't prevent the actual problem (those allowed files being imported too early). The rule we actually want to enforce seems more like "prompt_toolkit must not be in sys.modules after create_clidriver() + driver.main() for commands that don't need it." That seems more of a runtime property, not a static lint check. We can't determine it from source analysis alone because it depends on which code paths execute. What are your thoughts on adding a test for a subset of commands that asserts that def test_no_prompt_toolkit_imported_for_basic_commands():
for cmd in ['s3 ls', 'configure list', 'logs describe-log-groups', 'ecs describe-services']:
# run in subprocess, assert prompt_toolkit not in sys.modules |
Description of changes:
AutoPromptDriverobject. The mode resolution code has been moved toclidriver.pyas the standalone functionsresolve_auto_prompt_modeandvalidate_auto_prompt_args_are_mutually_exclusive.AutoPromptDriveris now only imported lazily when autoprompt mode is actually active.resolve_modeandvalidate_auto_prompt_args_are_mutually_exclusivetests fromtests/unit/autoprompt/test_core.pytotests/unit/test_clidriver.pyto reflect the new location of those functions.PrompterKeyboardInterruptinto a new minimalawscli/autoprompt/exceptions.pymodule, so thatawscli.errorhandlercan import it without pulling inawscli.autoprompt.factory(which has many top-levelprompt_toolkitimports).Displayimport inecs/monitorexpressgatewayservice.pylazy, deferring the import ofprompt_toolkit_display(and thereforeprompt_toolkit) until the interactive ECS monitoring display is actually used.PTKPromptandRequiredInputValidatorfromconfigure/sso.pyincustomizations/login/login.pylazy, deferring the import ofconfigure/sso.pyuntil the login region prompt is actually needed.StartLiveTailCommandandTailCommandincustomizations/logs/__init__.pylazy, deferring the import ofprompt_toolkituntil theaws logssubcommand is invoked (i.e. the logs subcommand table is built).ConfigureSSOCommand,ConfigureSSOSessionCommand, and their shared base classBaseSSOConfigurationCommandinto a newcustomizations/configure/sso_commands.pymodule, which has noprompt_toolkitimports. This allowsconfigure/configure.pyto keep its reference to these SSO commands and defer the import ofprompt_toolkit(viaconfigure/sso.py) untilaws configure ssois actually invoked.select_menufromwizard/ui/selectmenuinconfigure/sso.pylazy, deferring the import ofwizard/ui/__init__.py(which importsprompt_toolkit) untilConfigureSSOCommandis actually instantiated.create_wizard_appfromwizard/factoryinwizard/devcommands.pylazy, deferring the import of the wizard UI chain (which importsprompt_toolkit) until a wizard dev command is actually run.wizard/factoryinwizard/commands.pylazy by introducing a_get_runnermethod onTopLevelWizardCommand, deferring the import of the wizard factory and UI chain until a wizard command is actually executed.devcommandsinawscli/customizations/wizard/commands.pyfrom module-level to within theregister_wizard_commandsfunction. Although this currently gives no behavioral difference sinceregister_wizard_commandsis always called for every command, in the future when plugin/customization initialization is only called when a relevant command is entered (lazy plugin loading), this change will lead to lazy-importing ofdevcommandsonly when it's needed.Description of tests:
Benchmark Results: Performance Improvement Analysis
The following benchmarking analysis evaluates the impact of the Conditional Autoprompt feature across two standard CLI operations. Tests were performed with 2,000 samples per state to ensure statistical precision.
Tested on m7i.xlarge EC2 instance, x86 architecture, AL2023 os, 64 GB storage, 2,000 iterations each.
Consolidated Performance Gains
Statistical Significance & Reliability
To filter out environmental noise (CPU spikes, OS scheduling), we measured the performance shift in terms of Standard Error (SE).
aws --versionand ~379 SEs foraws sts get-caller-identity.Visual Analysis
Average execution times compared
Significant time savings are shown. We can expect similar improvements across all modeled commands, and some improvement(s) to customization commands.
Probability density functions of execution time
aws --version execution time probability.
aws sts get-caller-identity execution time probability.
The probability density curves in both graphs show no overlap between the "Before" and "After" states. The narrow, tall peaks confirm that the large sample size has provided a high-confidence estimate of the true performance improvement. For the version command, we note that the peak is taller and narrower in the "After" state, signaling that the patch not only improves execution time but also reduces noise.
Summary: The feature provides a consistent and significant performance reduction (11.70-13.35%) across different command types with statistical certainty.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.