Migrate qBraid Target in CUDA-Q to qBraid v2 (#5)
Open
TheGupta2012 wants to merge 133 commits into main from
Conversation
@ryanhill1 I discarded the commit in current
Force-pushed from a72236c to 46593c0
* working implementation using openQASM
* modified and added test files (incomplete)
* fix emulate command alignment
* update polling + format
* update polling interval and make code more readable
* remove ionq fields from target-arguments
* fix formatting
* Add qBraid mock python server for testing
* Update __init__.py
* QbraidTester running correctly
* added documentation for qbraid
---------
Signed-off-by: Ryan Hill <ryanjh88@gmail.com>
Co-authored-by: feelerx <superfeelerxx@gmail.com>
Force-pushed from 46593c0 to 3b0a1e4
The deployments cleanup job only removes `default` environment deployments but not `ghcr-ci` ones. Every CI run creates multiple ghcr-ci deployments via dev_environment.yml, leaving "copy-pr-bot temporarily deployed to ghcr-ci — Inactive" entries cluttering PR timelines. Extend the existing cleanup loop to also delete ghcr-ci deployments. The production `ghcr-deployment` environment used by deployments.yml is not affected. Signed-off-by: mitchdz <mitch_dz@hotmail.com>
…DIA#4320) Fixes NVIDIA#4319. The basis-driven pattern selection in `decomposition{basis=...}` failed to select decomposition chains involving `SToR1` and `TToR1` because these patterns were registered with `s(1)`/`t(1)` metadata (controlled-only) despite their implementations handling any control count. The graph lookup in `DecompositionPatternSelection.cpp` used exact hash matching on `OperatorInfo`, so an unbounded `(n)` entry could not match a concrete control count. This left `CCX` gates undecomposed when `t` was not directly in the target basis. The fix updates `SToR1`/`TToR1`/`R1ToU3`/`U3ToRotations` registration to `(n)` and adds `OperatorInfo::matches()` for wildcard control count matching in `incomingPatterns()` and `findGateDist()`. Signed-off-by: Thomas Alexander <talexander@nvidia.com>
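The wildcard control-count matching described above can be illustrated with a small Python sketch. The names `OperatorInfo` and `matches` follow the commit message, but the layout and semantics here are simplified assumptions, not the actual C++ in `DecompositionPatternSelection.cpp`:

```python
# Simplified sketch of wildcard control-count matching for decomposition
# pattern lookup. A None control count models the unbounded "(n)" registration.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OperatorInfo:
    name: str
    controls: Optional[int]  # None = matches any number of controls

    def matches(self, other: "OperatorInfo") -> bool:
        """Names must agree exactly; a None control count matches any count."""
        if self.name != other.name:
            return False
        return (self.controls is None or other.controls is None
                or self.controls == other.controls)

# A pattern registered as t(n) now matches a concrete t(2) (a doubly
# controlled T), which an exact hash lookup keyed on t(1) could not.
t_any = OperatorInfo("t", None)
assert t_any.matches(OperatorInfo("t", 2))
assert not t_any.matches(OperatorInfo("s", 2))
```

This is why an exact-hash graph lookup left `CCX` undecomposed: the concrete `t(2)` query hashed differently from the registered `t(1)` entry, whereas predicate-based matching admits the wildcard.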
…4332) Signed-off-by: Adam Geller <adgeller@nvidia.com>
…IA#4330) This updates the unittest so that cudaq::state objects are used to capture and pass state information (amplitude vectors) into kernels. The new API contract is that this sort of state information shall be passed into CUDA-Q kernels as state objects and not raw vectors. --------- Signed-off-by: Eric Schweitz <eschweitz@nvidia.com>
Migrating Python bindings from pybind11 to nanobind - Adding nanobind as a submodule - Creating NanobindAdaptors for MLIR C-API type casters - Keeping pybind11 only for upstream MLIR Python extensions - Converting all `*_py.cpp ` binding files, headers, CUDAQuantumExtension.cpp, pyDynamics, interop library, and PYSCF plugin to nanobind --------- Signed-off-by: Sachin Pisal <spisal@nvidia.com>
I, Harshit <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: 9cd62cf I, Harshit <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: 3b0a1e4 I, Harshit <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: 1a24c66 Signed-off-by: Harshit <harshit.11235@gmail.com>
I, TheGupta2012 <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: 925ae39 I, TheGupta2012 <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: 41fe248 I, TheGupta2012 <harshit.11235@gmail.com>, hereby add my Signed-off-by to this commit: d74243d Signed-off-by: TheGupta2012 <harshit.11235@gmail.com>
This is a rewrite of NVIDIA#4329, using a stateless class with static functions rather than a builder pattern. Signed-off-by: Luca Mondada <luca@mondada.net>
Fixes NVIDIA#4343. Signed-off-by: Sachin Pisal <spisal@nvidia.com>
…VIDIA#4335) When a kernel returns a vector (for `cudaq::run`), we insert `__nvqpp_vectorCopyCtor` which performs a `malloc` + `memcpy` to copy stack data to the heap. After `AggressiveInlining` and `ReturnToOutputLog`, the heap copy becomes dead but remains in the IR. This is normally cleaned up by LLVM's optimization passes, but on code paths that emit MLIR directly (e.g., `nop` for backends that consume `quake`), these dead allocations persist and get sent to the server. This PR adds a new MLIR pass, `eliminate-dead-heap-copy`, that redirects reads from the `malloc`'d buffer to the original `memcpy` source (the stack `alloca`), then erases the dead `malloc`, `memcpy`, and `cc.stdvec_init` ops. This can be added on-demand via target yml file. Update the mock server test to demonstrate that. --------- Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>
Updating cuquantum version to 26.03.1 --------- Signed-off-by: Sachin Pisal <spisal@nvidia.com>
## Background
`cudaq.sample` with `set_target("braket")` fails on v0.14.0+ with:
```
RuntimeError: [line 10] cannot declare bit register. Only 1 bit register(s) is/are supported
```
Amazon Braket's OpenQASM 2.0 parser enforces exactly one classical
register per circuit. The payload CUDA-Q emits for the Bell-state
reproducer in NVIDIA#4341 contains two.
## Root cause
`addPipelineTranslateToOpenQASM` (`lib/Optimizer/CodeGen/Pipelines.cpp`)
was refactored in NVIDIA#3693 to run `ExpandMeasurements` unconditionally. For
`qasm2` backends that run `combine-measurements` in the mid pipeline
(Braket, Scaleway, Quantum Machines), the sequence becomes:
1. Mid pipeline: `combine-measurements` merges per-qubit measurements
into a single `quake.mz` on the whole `!quake.veq` - the intent being
"emit one `creg` spanning all qubits".
2. Translate pipeline: `ExpandMeasurements` re-expands the combined `mz`
into one `mz` per qubit, then loop-unrolls.
3. OpenQASM2.0 emitter: writes one `creg` declaration per `mz`.
Target-specific YAML intent is silently overridden in the translate
pipeline.
## Fix
1. `lib/Optimizer/CodeGen/Pipelines.cpp`: revert
`addPipelineTranslateToOpenQASM` to the thin cleanup it was before
NVIDIA#3693. Each backend's YAML now drives measurement expansion.
2. `infleqtion.yml` and `tii.yml`: add `jit-high-level-pipeline:
"expand-measurements"`. These targets previously depended on the
unconditional expansion to get one `creg` per measured qubit; the
explicit entry preserves that behavior.
3. `test/Translate/OpenQASM/basic.qke` and
`test/Translate/openqasm2_*.cpp`: update CHECK lines to match the
single-`creg` output for a vector `mz` (which is what the emitter
produces after the fix).
## Impact
| Backend | creg count for `mz(qvector(n))` |
|---|---|
| Braket, Scaleway, Quantum Machines | 1 (single `creg` of size n) |
| Infleqtion, TII | n (preserved via new YAML entry) |
| Quantinuum, IQM, OQC, Anyon, QCI | n (unchanged; already had `expand-measurements` in YAML) |
The change is scoped to `addPipelineTranslateToOpenQASM`, which only
runs for `codegen-emission: qasm2`. Simulators and non-OpenQASM2.0
backends are unaffected.
## Testing
- `ninja check-cudaq-mlir` passes with the updated CHECK lines.
- `cudaq.translate(kernel, format="openqasm2")` under `set_target(...)`
for Braket, Scaleway, Infleqtion, TII — creg counts match the matrix
above.
- Reproducer from NVIDIA#4341 now emits exactly the "expected" OpenQASM2.0
shown in the issue: `creg var3[2]; measure var0 -> var3;`.
- Manually tested against real servers: `test_braket.py`,
`test_Infleqtion.py`, `test_tii.py`, `test_scaleway.py`.
## Follow-up
An automated local test set up for OpenQASM payload validator will be
added in a separate PR.
Fixes NVIDIA#4341.
---------
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
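The pass-ordering problem at the heart of this fix can be sketched with a toy model. The function names mirror the passes but operate on plain tuples, not Quake MLIR:

```python
# Toy model of the pre-fix pipeline: a measurement is a (name, qubit-list)
# pair, and the OpenQASM 2.0 emitter writes one creg per measurement op.
def combine_measurements(ops):
    # Mid pipeline: merge per-qubit measurements into one op over all qubits,
    # i.e. "emit one creg spanning all qubits".
    qubits = [q for _, qs in ops for q in qs]
    return [("mz", qubits)]

def expand_measurements(ops):
    # Translate pipeline (pre-fix): unconditionally re-expand per qubit,
    # silently undoing the mid-pipeline merge.
    return [(name, [q]) for name, qs in ops for q in qs]

per_qubit = [("mz", [0]), ("mz", [1])]
combined = combine_measurements(per_qubit)   # 1 op -> emitter writes 1 creg
reexpanded = expand_measurements(combined)   # 2 ops -> 2 cregs, which
                                             # Braket's parser rejects
assert len(combined) == 1 and len(reexpanded) == 2
```

Removing the unconditional re-expansion from the translate pipeline leaves the mid-pipeline result intact, so the emitter sees one combined `mz` and writes one `creg`.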
…frastructure (NVIDIA#4349)
## Summary
Reverts PRs NVIDIA#3800, NVIDIA#4204, NVIDIA#4208, NVIDIA#4266, NVIDIA#4267.
Following an architecture alignment meeting (Apr 17), we are changing direction on how measurement results are represented in CUDA-Q. The `measure_result` standalone class and `!quake.measurements<N>` Quake type introduced by these PRs are being replaced by a new `measure_handle` approach with fundamentally different semantics.
This revert restores:
* `measure_result` as a typedef to bool (compiler mode)
* Multi-qubit mz returning `!cc.stdvec<!quake.measure>`
* Removes `!quake.measurements<N>` type, `quake.get_measure`, `quake.measurements_size` ops
* Removes `quake.relax_size` extension for measurements
* Removes `QIRResultArrayCreate` / `QIRResultArrayGetElementPtr1d` QIR intrinsics
* Removes 8 test files added by the reverted PRs
### Forward direction (follow-up PRs): New `measure_handle`
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
Skipping identity terms when building the Pauli word and coefficient lists passed to the Krylov kernel. Controlled exp_pauli does not handle the identity terms. We add their contribution back when assembling the Hamiltonian matrix. Fixes https://github.com/NVIDIA/cuda-quantum/actions/runs/24584888146/job/71904057326#step:5:1955 Signed-off-by: Sachin Pisal <spisal@nvidia.com>
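The split described here can be sketched in a few lines of Python. The dictionary representation of the Hamiltonian and the function name `split_identity` are illustrative assumptions, not the actual CUDA-Q data structures:

```python
# Sketch: represent a Hamiltonian as {pauli_word: coefficient}. Identity
# terms are dropped before building the word/coefficient lists passed to
# the Krylov kernel (controlled exp_pauli cannot apply them); their summed
# coefficient is returned so it can be added back later as
# identity_coeff * I when assembling the Hamiltonian matrix.
def split_identity(hamiltonian: dict):
    identity_coeff = sum(c for w, c in hamiltonian.items()
                         if set(w) <= {"I"})
    non_identity = [(w, c) for w, c in hamiltonian.items()
                    if set(w) - {"I"}]
    words = [w for w, _ in non_identity]
    coeffs = [c for _, c in non_identity]
    return words, coeffs, identity_coeff

words, coeffs, id_coeff = split_identity({"II": 0.5, "ZZ": -1.0, "XI": 0.25})
assert "II" not in words
assert id_coeff == 0.5
# Later, e.g.: H_matrix = kernel_part + id_coeff * identity_matrix
```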
…rs (NVIDIA#4351) Fixed the `test_state_mps.py - AttributeError: 'list' object has no attribute 'dtype'` errors in https://github.com/NVIDIA/cuda-quantum/actions/runs/24624569814/job/72005503960#step:7:43857 The fix for the rest of the failure (`RuntimeError: invalid value`) will come in a separate PR. Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>
This PR removes argument synthesis by default for Python kernels run on the local simulator, instead directly invoking them with the arguments (currently, by constructing a message buffer through `.argsCreator` which is passed to the kernel's `thunk`). This only affects entry point kernels. Benefits: 1. This makes it unnecessary to recompile kernels for different arguments in this setting, simplifying the `reuse_compiler_artifacts` logic. 2. It aligns the python local simulation path more closely with C++, where arguments are similarly not synthesized. 3. As a result of 1 and 2, it is a useful and important first step towards an inter-launch caching strategy for python. --------- Signed-off-by: Adam Geller <adgeller@nvidia.com> Signed-off-by: Luca Mondada <luca@mondada.net> Co-authored-by: Luca Mondada <luca@mondada.net>
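The "message buffer" idea — packing the arguments into one contiguous buffer that is handed to the kernel's `thunk` — can be sketched with Python's `struct` module. The format-string signature is a stand-in for the layout the compiler derives from the kernel's argument types; the real `.argsCreator` is generated per kernel:

```python
import struct

def args_creator(signature: str, args: tuple) -> bytes:
    """Pack arguments into one contiguous buffer with an explicit layout.

    `signature` is a struct format string standing in for the layout
    derived from the kernel's MLIR argument types.
    """
    return struct.pack(signature, *args)

def thunk(buffer: bytes, signature: str) -> tuple:
    # The kernel-side thunk unpacks the same fixed layout.
    return struct.unpack(signature, buffer)

# int32 count + double angle, little-endian, no padding
buf = args_creator("<id", (3, 0.5))
assert thunk(buf, "<id") == (3, 0.5)
```

Because the buffer layout depends only on the argument *types*, not their values, the same compiled kernel artifact can be reused across launches with different arguments — the property that simplifies the `reuse_compiler_artifacts` logic.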
Signed-off-by: TheGupta2012 <harshit.11235@gmail.com>
Signed-off-by: Adam Geller <adgeller@nvidia.com>
…VIDIA#4450) Signed-off-by: Adam Geller <adgeller@nvidia.com>
CUDA 12.6 doesn't work with clang++ 22.1. Re-enable gcc12 toolchain support to work around this. --------- Signed-off-by: Adam Geller <adgeller@nvidia.com> Signed-off-by: Mitchell <mitchdz@plasticmemories.xyz> Co-authored-by: Mitchell <mitch_dz@hotmail.com> Co-authored-by: Mitchell <mitchdz@plasticmemories.xyz>
While building flang with gcc12, OOM errors persisted. This update uses beefier runners with 64 GB of RAM and restricts ninja to 8 concurrent threads. Signed-off-by: mdzurick <mitch_dz@hotmail.com>
Signed-off-by: Adam Geller <adgeller@nvidia.com>
- If the launched server exits before becoming reachable, `waitpid(WNOHANG)` breaks out of the 50s ping loop so we move to the next port immediately, instead of throwing `RuntimeError: No usable ports available` only after a few minutes.
- Dropping `static` from the `mt19937` so `seed_offset` is honoured on every construction, not just the first one.
Fixes:
```
@pytest.fixture(scope="session", autouse=True)
def startUpMockServer():
>       cudaq.set_target("remote-mqpu", auto_launch=str(num_qpus))
E       RuntimeError: No usable ports available
tmp/tests/remote/test_remote_platform.py:71: RuntimeError
```
https://github.com/NVIDIA/cuda-quantum/actions/runs/25393104770/job/74490650292#step:7:1282
Signed-off-by: Sachin Pisal <spisal@nvidia.com>
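The early-exit check can be sketched in Python, where `Popen.poll()` plays the role of `waitpid(WNOHANG)`: a non-blocking liveness probe inside the ping loop. Function and parameter names here are illustrative, not the CUDA-Q internals:

```python
# Sketch: stop waiting on a port as soon as the launched server process
# dies, rather than burning the full timeout pinging a corpse.
import subprocess
import time

def wait_until_reachable(proc: subprocess.Popen, ping, timeout_s=50) -> bool:
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if ping():
            return True
        # Non-blocking child check (analogous to waitpid with WNOHANG):
        # a dead child will never become reachable, so fail fast.
        if proc.poll() is not None:
            return False
        time.sleep(0.1)
    return False

# A server that exits immediately fails in ~0.1s, not after 50s.
p = subprocess.Popen(["true"])
start = time.monotonic()
assert not wait_until_reachable(p, ping=lambda: False)
assert time.monotonic() - start < 5
```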
Signed-off-by: Adam Geller <adgeller@nvidia.com>
Signed-off-by: Adam Geller <adgeller@nvidia.com>
…4459) - `move_artifacts` in `scripts/migrate_assets.sh` emitted an `rm` + `rmdir -p` pair per file. With LLVM 22 (~7k files) bundled into `cudaq/lib/llvm`, the generated `uninstall.sh` ballooned to a ~15k-line `if $continue; then ... fi` body, causing bash to segfault mid-uninstall in the "Additional validation (MPI and uninstall)" CI step on ubuntu/debian/fedora/redhat. - Capture top-level entries in `$1` before the move and emit one `rm -rf -- "$2/<entry>"` per entry. Trailing `rm -rf "$CUDA_QUANTUM_PATH"` is unchanged. Co-authored by: AI Signed-off-by: Sachin Pisal <spisal@nvidia.com>
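The size reduction comes from emitting one removal command per *top-level* entry instead of one per file. A Python sketch of the generation step (paths and the `/opt/cudaq` prefix are placeholders; the real logic lives in `scripts/migrate_assets.sh`):

```python
# Sketch of the uninstall-script generation fix: capture top-level entries
# of the source tree before the move, then emit one `rm -rf` per entry
# rather than an `rm` + `rmdir -p` pair per file.
import pathlib
import tempfile

src = pathlib.Path(tempfile.mkdtemp())
(src / "lib" / "llvm").mkdir(parents=True)
(src / "bin").mkdir()
(src / "lib" / "llvm" / "libA.so").touch()  # stands in for ~7k LLVM files
(src / "bin" / "nvq++").touch()

dest = "/opt/cudaq"  # placeholder install prefix ($2 in the script)
lines = [f'rm -rf -- "{dest}/{p.name}"' for p in sorted(src.iterdir())]

# Two lines total (bin, lib) regardless of how many files live underneath,
# instead of a ~15k-line if-body that segfaults bash.
assert lines == ['rm -rf -- "/opt/cudaq/bin"', 'rm -rf -- "/opt/cudaq/lib"']
```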
…4463) Tests will soon be removed due to NVIDIA#4276 anyway. Signed-off-by: Adam Geller <adgeller@nvidia.com>
Signed-off-by: Adam Geller <adgeller@nvidia.com> Signed-off-by: Adam T. Geller <adgeller@nvidia.com>
Stacks on top of NVIDIA#4392 This PR introduces a `KernelArgs` type that stores kernel arguments EITHER as 'packed' arguments in a contiguous memory buffer OR as a vector of void*. `KernelArgs` also supports storing both representations, which is used by `hybridLaunchKernel`. Using this type allows us to make the signatures of the launch endpoints more homogeneous, effectively hiding the different conventions as implementation details that only need to be handled within the function implementations. Resulting signatures: ```c++ [[nodiscard]] virtual KernelThunkResultType launchKernel(const std::string &name, KernelThunkType kernelFunc, KernelArgs args); [[nodiscard]] virtual KernelThunkResultType launchModule(const CompiledModule &compiled, KernelArgs args); ``` --------- Signed-off-by: Luca Mondada <luca@mondada.net>
Port to llvm-22. --------- Signed-off-by: Eric Schweitz <eschweitz@nvidia.com>
…VIDIA#4405) ## Summary * Extend `--expand-measurements` to scalarize `quake.mz`/`mx`/`my` `%veq -> !cc.stdvec<!cc.measure_handle>`. * Builds on NVIDIA#4404 * No source-language or runtime API change. ## Motivation The pass previously hardcoded `!quake.measure` for per-element output and only handled `quake.discriminate` consumers of the vector result. Handle-typed vector measurements require per-element `!cc.measure_handle` output and can flow to non-discriminate consumers (returns, stores, calls), neither of which the legacy `vector<bool>`-only rewrite supported. ## What Changed - `ExpandRewritePattern` tracks the input stdvec's element type and emits per-element measurements of the matching type (`!quake.measure` or `!cc.measure_handle`). - Consumers are classified as discriminate vs non-discriminate. Handle inputs allocate each buffer only when its consumer class is present; legacy `!cc.stdvec<!quake.measure>` inputs always allocate the i1 buffer so existing AST-Quake CHECK lines stay stable. - Original op is replaced via `replaceOp` (atomic) instead of `eraseOp`, so partial conversion does not try to re-legalize downstream `func.return` consumers. - New lit test `test/Transforms/expand_measurements_handle.qke` covers handle stdvec with each consumer class (return-only, discriminate-only, mixed, `cc.store`), mixed `ref + veq` operands, and `mx`/`my` parity. --------- Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
NVIDIA#2608 is already fixed in main, but this adds regression tests for the future. Signed-off-by: mdzurick <mitch_dz@hotmail.com>
…IDIA#4474) Signed-off-by: Eric Schweitz <eschweitz@nvidia.com>
## Summary macOS 26 (Tahoe) SDK removed `__has_builtin` guards from libc++ headers (e.g., `__builtin_ctzg`, `__is_nothrow_convertible`), making them incompatible with LLVM 16's clang. Additionally, Apple Clang 21 introduced new warnings that break the build under `-Werror`. This PR fixes both issues so that CUDA-Q builds and passes all tests on macOS 26 with Apple Clang 21. ### LLVM libc++ runtimes (SDK 26+) - `set_env_defaults.sh`: Auto-detect active SDK version via `xcrun --show-sdk-version`. When SDK >= 26, include `runtimes` in `LLVM_PROJECTS` to build LLVM's own libc++. - `cudaq-quake.cpp`: Use `-nostdinc++` to suppress SDK C++ headers when LLVM's libc++ is available, while keeping `-isysroot` for C standard headers. - `nvq++.in`: Add `-Wl,-syslibroot` for `ld64.lld` to find `libSystem` in the SDK. Add `-lc++abi` during final link when LLVM's `libc++abi` is present. - `build_llvm.sh`: Do not bake SDK sysroot paths into `clang++.cfg` on macOS (they become stale after Xcode updates). Sysroot is resolved at runtime by `nvq++`. ### Relocatable linking - `nvq++.in`: LLVM 16's `ld64.lld` does not implement `-r` (relocatable linking). Probe the configured linker and fall back to system `ld` for the object merge step if needed. - `device_call.cpp`: Update FileCheck pattern to match both Apple `ld` and `ld64.lld` error formats. ### Apple Clang 21 warnings - `CMakeLists.txt`: Add `-Wno-character-conversion` for gtest (third-party). - `server_impl/CMakeLists.txt`: Add `-Wno-deprecated-literal-operator` for Crow (third-party). - `cudaq.cpp`: Fix `-Wnontrivial-memcall` on `memset` with `std::vector<bool>`. - `vqe_tester.cpp`: Add missing `#pragma` to suppress `-Wdeprecated-declarations` for backward-compatibility test (matching existing pattern in `builder_tester.cpp`). --------- Signed-off-by: ikkoham <ikkoham@users.noreply.github.com> Signed-off-by: Thomas Alexander <talexander@nvidia.com> Co-authored-by: Thomas Alexander <talexander@nvidia.com>
The gcc-12 -Wrestrict workaround in runtime/common/CMakeLists.txt and realtime/unittests/CMakeLists.txt was added as a PUBLIC compile option, so cmake propagates it through interface inheritance into nvcc command lines on consumer CUDA targets. nvcc forwards it to its host gcc, which may be a different gcc than CMAKE_CXX_COMPILER (e.g. gcc-12 vs gcc-13) and may not recognize the flag. Wrap it in a $<$<COMPILE_LANGUAGE:CXX>:..> generator expression so it only appears on CXX command lines. Signed-off-by: Chuck Ketcham <cketcham@nvidia.com>
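A hypothetical CMake fragment showing the shape of the fix (the target name and the exact flag are placeholders; the point is the `COMPILE_LANGUAGE` guard):

```cmake
# Scope the gcc-12 workaround flag to C++ compile lines only, so interface
# inheritance never forwards it through nvcc to a host gcc that may not
# recognize it.
target_compile_options(cudaq-common PUBLIC
  $<$<COMPILE_LANGUAGE:CXX>:-Wno-restrict>)
```

Generator expressions are evaluated per source file at generation time, so CUDA sources on consumer targets simply never see the flag, while CXX sources keep the workaround.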
…A#4432) Follow ups as noted by @schweitzpgi and @boschmitt. --------- Signed-off-by: Thomas Alexander <talexander@nvidia.com>
Similar to NVIDIA#4477, we need this fix for the `gtest` target, which will be used to compile `get_state_tester.cu`. Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>
As we prepare to surface qpu headers in user code, `cudaq.h` has become
significantly slower to parse in a prototyping sandbox. This PR
addresses a slowdown introduced by `logger.h`, as can be seen in the
profile below.
The compiled code is:
```cpp
#include <cudaq.h>
int main() {
return 0;
}
```
The compile time is 3.6s.
```
time clang++ cudaq_inc.cpp -std=c++20 -I ~/cudaq/cq2/install/cudaq/include
real 0m3.603s
```
The profile shows logger.h takes 441ms of parse time.
![logger-profile-1](https://github.com/user-attachments/assets/2a677887-0ef9-432c-a07a-6b5f3c8aabed)
With this patch, the compilation time becomes 2.9s
```
time clang++ cudaq_inc.cpp -std=c++20 -I ~/cudaq/cq2/install/cudaq/include
real 0m2.936s
```
The profile shows `logger.h` taking only 7ms.
![logger-profile-2](https://github.com/user-attachments/assets/9034acc9-829e-4960-97a8-f433c41107de)
This is achieved by replacing the `std::variant` with a `FormatArgument`
which only stores a pointer and an out-of-line appending callback. The
callback is instantiated in logger.cpp.
---------
Signed-off-by: Renaud Kauffmann <rkauffmann@nvidia.com>
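The type-erasure trick behind `FormatArgument` — an opaque value plus an out-of-line "append" callback, replacing a `std::variant` whose visitation machinery must be parsed in every translation unit — can be sketched in Python. Class and function names are illustrative, not the actual CUDA-Q types:

```python
# Sketch of the std::variant -> pointer + callback replacement: the header
# only needs this tiny carrier type; the per-type formatting functions are
# instantiated once, out of line (logger.cpp in the real change).
class FormatArgument:
    __slots__ = ("value", "append")  # ~ a pointer and a callback in C++

    def __init__(self, value, append):
        self.value = value
        self.append = append

def append_int(out: list, v) -> None:    # out-of-line formatter for ints
    out.append(str(v))

def append_float(out: list, v) -> None:  # out-of-line formatter for floats
    out.append(f"{v:.2f}")

def log(fmt_args: list) -> str:
    out = []
    for a in fmt_args:
        a.append(out, a.value)           # dispatch via stored callback,
    return " ".join(out)                 # no variant visitation needed

assert log([FormatArgument(7, append_int),
            FormatArgument(3.14159, append_float)]) == "7 3.14"
```

The header-side cost is now constant in the number of supported types, which is what takes `logger.h` from 441ms to 7ms of parse time.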
Removing sub-skills placeholder as those can be added back when ready Signed-off-by: Sachin Pisal <spisal@nvidia.com>
## Summary
- Bump `DEFAULT_VERSION` in `TiiServerHelper.cpp` from `0.2.2` to
`0.2.4`
## Motivation
- The TII server (`q-cloud.tii.ae`) now enforces a minimum `qibo-client`
version of 0.2.3, returning HTTP 426 (Upgrade Required) for older
clients.
- This breaks both C++ and Python TII targets with:
```
{"detail":"Outdated client version: 0.2.2. Please upgrade to qibo-client >= 0.2.3."}
```
## Testing
- Verified locally against `q-cloud.tii.ae` that requests succeed with
version `0.2.4`.
Signed-off-by: Pradnya Khalate <pkhalate@nvidia.com>
These changes make sure that inlined functions have their scopes preserved through the inlining process. Doing this facilitates keeping track of live ranges from variables declared in the called function's body, which allows for more precise allocation and deallocation of qubits in simulation, etc. Add python regression test. --------- Signed-off-by: Eric Schweitz <eschweitz@nvidia.com>
This will fix the errors we are seeing since the apt proxy cache is getting rate limited. Potentially enable in the future again. Signed-off-by: mitchdz <mitch_dz@hotmail.com>
NOTE: This is a re-post of NVIDIA#4413, which I merged into the wrong branch! It's already been reviewed, discussed and approved. --- Stacks on top of NVIDIA#4398 PR NVIDIA#4398 made the first of two steps towards homogenizing all launch endpoints into a same function signature by introducing `KernelArgs`. This PR introduces a `SourceModule` type that stores the kernels themselves in different formats depending on the host language: - a C++ kernel is stored as a name + function pointer. The name is used in the runtime to retrieve the Quake representation of the kernel if required. The function pointer points to the function that was compiled by nvq++ and is used for local simulation. - a Python kernel is stored as a name + MLIR ModuleOp. Both local and remote executions use the MLIR ModuleOp as the source of truth for the kernel definition. If the kernel launch must be executed locally, the MLIR will be compiled and JITed, otherwise it will be compiled and submitted to the remote endpoint. Using this type, we can effectively hide ALL differences between the host languages and the various launching conventions behind a homogeneous API. Resulting signatures: ```c++ [[nodiscard]] virtual KernelThunkResultType launchKernel(const SourceModule &src, KernelArgs args); [[nodiscard]] virtual KernelThunkResultType launchModule(const CompiledModule &compiled, KernelArgs args); [[nodiscard]] virtual CompiledModule compileModule(const SourceModule &src, KernelArgs args, bool isEntryPoint); ``` Note that for Python, the current execution is broken up into `compileModule` -> `launchModule`, whereas for C++, both compile and launch steps are still in one -- hence the signature difference. Changing the C++ launch path to mirror the Python one is planned upcoming work. Signed-off-by: Luca Mondada <luca@mondada.net>
Signed-off-by: TheGupta2012 <harshit.11235@gmail.com>
Add the updates for migrating the CUDA-Q qBraid target to use the qBraid platform v2: updates for jobs, the main API, and API-key authentication, among others.