Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2,433 changes: 145 additions & 2,288 deletions packages/google-cloud-compute/google/cloud/compute/__init__.py

Large diffs are not rendered by default.

2,364 changes: 145 additions & 2,219 deletions packages/google-cloud-compute/google/cloud/compute_v1/__init__.py

Large diffs are not rendered by default.

136 changes: 136 additions & 0 deletions scripts/import_profiler/documentation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
# Python SDK Import Profiler: Documentation & Breakdown

This document provides a comprehensive guide to the `import_profiler` scripts, directory files, and how to analyze the generated import trace logs to target optimization areas, along with the details of the implemented lazy-loading refactor.

---

## 1. File Guide & Directory Structure
The profiling tool is located in the [scripts/import_profiler/](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/) directory:

* **[profiler.py](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/profiler.py)**: The core executable script. It is designed as a single-file, self-spawning harness that performs process-isolated importing benchmarks and generates trace logs.
* **[plan.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/plan.md)**: The current project phases and roadmap checklist.
* **[status.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/status.md)**: Tracks the active task state and hosts recorded baseline performance metrics.

---

## 2. Profiler Mechanism (`profiler.py`)
To benchmark import times accurately, we must bypass Python's dynamic module cache:

* **The Cache Problem:** When Python imports a module (e.g. `import os`), it registers it in `sys.modules`. Any subsequent import statement for the same module instantly retrieves it from `sys.modules` in `<0.1 ms`.
* **Process Isolation:** The script uses a **Master/Worker Architecture**.
* The **Master** process loops $N$ times.
* In each iteration, it launches a new interpreter subprocess using `sys.executable` pointing to itself with the `--worker` flag.
* The **Worker** process executes a fresh, isolated import, records memory/time metrics natively, dumps them as a JSON string to stdout, and exits.
* This guarantees a 100% "cold start" environment for every single import measurement.
* **Natively Captured Metrics:**
* **Peak Memory (RAM):** Tracked using `tracemalloc` to capture peak memory block allocation.
* **Load Time:** Tracked using `time.perf_counter()` to get high-resolution timing.

---

## 3. How to Interpret Python `-X importtime` Trace Logs
When running with the `--trace` flag, the script captures the raw stderr trace produced by Python's `-X importtime` option. The trace looks like this:

```
import time: self [us] | cumulative | imported package
import time: 536 | 536 | _io
import time: 1077 | 2385 | _frozen_importlib_external
import time: 773659 | 793010 | google.cloud.compute_v1.types.compute
```

### Explaining the Fields:
1. **`self [us]` (Microseconds):** The time spent importing the module itself, *excluding* any time spent importing its child dependencies.
2. **`cumulative` (Microseconds):** The total time spent loading the module *including* all nested imports. This represents the total wait time introduced by this line.
3. **Hierarchy Indentation:** Indented packages are sub-imports triggered by the parent module. A package with higher indentation is loaded deeper in the call stack.

---

## 4. Breakdown & Interpretation of Generated Traces

### A. Compute Client Trace
* **Total Import Latency:** ~10.4 seconds.
* **Primary Bottleneck - Massive Types File:**
* `google.cloud.compute_v1.types.compute` takes **`773.6 ms` self time** (almost a full second on a single file compilation/execution).
* **The Boilerplate Cascade:**
* Since compute contains over 100 submodules (`addresses`, `instances`, `firewalls`, `networks`, etc.), eager importing of `google.cloud.compute` imports *every single service client and transport*.
* Each service client executes a chain of duplicate helper modules (transport configurations, pagination, options, exceptions, and `urllib3`/`requests` dependencies), aggregating to a 10-second delay.

### B. AI Platform Client Trace
* **Total Import Latency:** ~13.3 seconds.
* **Primary Bottleneck - Massive Types Module:**
* `google.cloud.aiplatform_v1.types` triggers a chain of individual proto type files, loading hundreds of model and job definitions which aggregates to **`377 ms` cumulative time**.
* **Heavy Service Clients:**
* `data_foundry_service` alone introduces **`403 ms` cumulative latency**.
* `dataset_service` adds further transport and client initialization overheads.

---

## 5. Lazy-Loading Design and Implementation (PEP 562)

To solve eager initialization cost, we implemented a lazy-loading layer using Python PEP 562's module-level `__getattr__` and `__dir__` definitions.

```
Eager Import (Before) Lazy Import (After)
[10s - 13s] [~6ms - 279ms]
┌─────────────────────┐ ┌─────────────────────┐
│ import package │ │ import package │
└──────────┬──────────┘ └──────────┬──────────┘
│ │ (Instant)
┌────────────┴────────────┐ ▼
│ Eagerly import all │ ┌─────────────────────┐
│ submodules, types, LROs │ │ Return module proxy │
└─────────────────────────┘ └──────────┬──────────┘
│ (Only when class
│ is accessed)
┌─────────────────────┐
│ Dynamically resolve │
│ via __getattr__() │
└─────────────────────┘
```

### Key Solution Steps:
1. **Module Level `__getattr__`**: Overrode the attributes lookup of `__init__.py` modules so they intercept class name lookups (e.g. `compute.InstancesClient`).
2. **Registry Mapping (`_lazy_registry`)**: Maps each class name to its exact relative or absolute submodule file path.
3. **Dynamic Compilation**: Uses `importlib.import_module` to import the targeted submodule and load the requested class dynamically, caching it in `globals()` to make subsequent accesses execute in `<0.002 ms`.
4. **Deferred Monolith Imports**: Deferred the loading of massive type definitions (`google.cloud.compute_v1.types.compute` or `compat.types`) so they are compiled ONLY when a user actively instantiates a request/response type object.

---

## 6. Circular Import Resolution

Transitioning to lazy-loading disrupted Python's natural eager import ordering, revealing a deep circular dependency loop:
`base.py` ➔ `utils/__init__.py` ➔ `initializer.py` ➔ `metadata.py` ➔ `pipeline_jobs.py` ➔ `models.py` ➔ `jobs.py` ➔ `_publisher_models.py` ➔ `base.py`.

* **The Problem:** The compiler was trying to instantiate class `_PublisherModel(base.VertexAiResourceNoun)` before class `base.VertexAiResourceNoun` was defined because `base.py` was stuck waiting for `initializer.py` to compile.
* **The Resolution:** We analyzed the usage of `initializer` in `base.py` and `utils/__init__.py` and realized it was never called at the class definition or module-level scope. We commented out the top-level imports of `initializer` in both files and moved them **locally inside the methods** (`_submit`, `__init__`, `_instantiate_client`, `_list`, `full_resource_name`). This successfully broke the compilation cycle.

---

## 7. Performance Gains Summary

Running `profiler.py` on the lazy-loaded implementation demonstrates the following savings:

### google-cloud-compute:
* **Eager Import Time**: Reduced from **`10,356.92 ms`** to **`6.28 ms`** (**99.9% savings**).
* **Peak RAM Usage**: Reduced from **`96.29 MB`** to **`0.28 MB`** (**99.7% savings**).

### google-cloud-aiplatform:
* **Eager Import Time**: Reduced from **`13,342.05 ms`** to **`279.16 ms`** (**97.9% savings**).
* **Peak RAM Usage**: Reduced from **`156.56 MB`** to **`4.88 MB`** (**96.9% savings**).

---

## 8. Execution Reference
Ensure you are in the correct pyenv virtual environment where packages are installed in editable mode:

```bash
# 1. Run the profiler to get baseline/optimized outcomes (e.g. 5 iterations)
PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --iterations=5
PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --iterations=5

# 2. Run the profiler to generate trace logs
PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --trace
PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --trace
```
Loading
Loading