Vulkan: export allocator for custom contexts by xFile3160 · Pull Request #8871 · halide/Halide

xFile3160 · 2025-11-17T21:55:24Z

This adds Vulkan allocator lifecycle APIs for applications that provide their own VkInstance, VkDevice, and queue.
Halide still owns internal Vulkan allocator/shader-module state for that external context. The application can acquire that Halide allocator state, keep it with its Vulkan context, and release it before destroying its VkDevice.
The shader cache stays keyed by VkDevice for sharing. Releasing an allocator now removes only the cache entries owned by that allocator.
Added Vulkan acquire/release AOT coverage.

AI disclosure:
I used OpenAI Codex to help inspect the code and to draft and revise the patch. I reviewed the final code and take responsibility for it.

xFile3160 · 2025-11-17T21:56:48Z

Full discussion available here: #8715 (comment)

alexreinking · 2025-11-18T16:51:19Z

@derek-gerstmann could you review this PR? Seems you were discussing the related issue.

xFile3160 · 2026-01-20T21:42:35Z

Any news about this? Should I rebase?

alexreinking · 2026-01-20T23:14:32Z

Hi @xFile3160 -- happy new year! Yes, please rebase (looks like you just did). I'll ping @derek-gerstmann again to look at this since he has in-depth knowledge of the Vulkan backend.

derek-gerstmann · 2026-01-20T23:27:22Z

Thanks for the reminder! I'll look this over this week!

derek-gerstmann · 2026-01-20T23:49:21Z

+//   with the same locking used by the custom acquire/release implementations. This allows the allocator to be
+//   saved for future halide_vulkan_acquire_context calls that Halide will automatically issue to retrieve
+//   the custom context.
+extern int halide_vulkan_export_memory_allocator(void *user_context,


I don't understand the need for this method, or for the corresponding release method. The allocator should be stored in your custom context, and held onto for the lifetime of the context. The context manages lifespan of the allocator.

derek-gerstmann · 2026-01-20T23:49:32Z

+// - halide_vulkan_memory_allocator_release
+//   releases the internally allocated memory allocator, important for proper memory cleanup. Must have overridden halide_vulkan_acquire_context
+//   and halide_vulkan_release_context, and must coordinate with the same locking as the custom implementations.
+extern int halide_vulkan_memory_allocator_release(void *user_context,


See above comment.

derek-gerstmann · 2026-01-20T23:50:38Z

    return is_initialized;
 }

+WEAK int halide_vulkan_export_memory_allocator(void *user_context, halide_vulkan_memory_allocator *allocator) {


This doesn't actually do anything other than check to see if the allocator is null.

derek-gerstmann · 2026-01-20T23:52:00Z

    return destroy_status;
 }

+WEAK int halide_vulkan_memory_allocator_release(void *user_context,


Not sure I understand the intent ... was it to have a public method to invoke the destructor for the allocator?

derek-gerstmann · 2026-01-20T23:57:12Z

            error = halide_error_code_device_interface_no_device;
            halide_error_no_device_interface(user_context);
        }
+        // If user overrode halide_vulkan_acquire_context and returned nullptr for allocator,


This class shouldn't be doing anything other than holding a lock on the context. It's just a convenient wrapper for the internal methods to have a lock that lives within a scope.

derek-gerstmann · 2026-04-22T20:51:17Z

+//   with the same locking used by the custom acquire/release implementations. This allows the allocator to be
+//   saved for future halide_vulkan_acquire_context calls that Halide will automatically issue to retrieve
+//   the custom context.
+extern int halide_vulkan_export_memory_allocator(void *user_context,


I'd suggest following the conventions of the context methods and naming this halide_vulkan_acquire_memory_allocator.

derek-gerstmann · 2026-04-22T20:51:39Z

+// - halide_vulkan_memory_allocator_release
+//   releases the internally allocated memory allocator, important for proper memory cleanup. Must have overridden halide_vulkan_acquire_context
+//   and halide_vulkan_release_context, and must coordinate with the same locking as the custom implementations.
+extern int halide_vulkan_memory_allocator_release(void *user_context,


Same as above. I'd suggest naming this halide_vulkan_release_memory_allocator.

derek-gerstmann · 2026-04-22T20:52:36Z

 }

+WEAK int halide_vulkan_export_memory_allocator(void *user_context, halide_vulkan_memory_allocator *allocator) {
+    halide_mutex_lock(&thread_lock);


This default implementation doesn't actually do anything ... shouldn't it return the allocator associated with the context?

derek-gerstmann · 2026-04-22T20:54:55Z

+        return halide_error_code_buffer_argument_is_null;
+    }
+
+    return vk_release_memory_allocator(user_context, (VulkanMemoryAllocator *)allocator,


Lifetime management is an issue here. How do we know there are no remaining uses for the allocator? Also, allocators are specific to the context, so we need to make sure the given allocator matches the one associated with the given context.
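The two checks raised in this comment (the allocator must belong to the given context, and nothing may still be using it) can be sketched as below. This is only an illustrative model: `FakeAllocator`, `ReleaseStatus`, `checked_release`, and the `live_regions` counter are hypothetical names, not part of the Halide runtime, and the real release path may track outstanding uses differently.

```cpp
#include <cassert>

// Hypothetical stand-in, not a Halide type.
struct FakeAllocator {
    const void *device;  // device the allocator was created for
    int live_regions;    // allocations still handed out to callers
};

enum ReleaseStatus { kReleased = 0, kNullAllocator, kWrongContext, kStillInUse };

// Refuse to release unless the allocator belongs to the given device
// and nothing allocated from it is still live.
ReleaseStatus checked_release(const void *context_device, FakeAllocator *&allocator) {
    if (allocator == nullptr) return kNullAllocator;
    if (allocator->device != context_device) return kWrongContext;
    if (allocator->live_regions != 0) return kStillInUse;
    delete allocator;
    allocator = nullptr;
    return kReleased;
}
```

Returning a distinct status for each failure lets the caller tell a mismatched context apart from a premature release.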

derek-gerstmann · 2026-04-22T20:57:11Z

+            halide_start_clock(user_context);
+#endif
+            // make sure halide vulkan is loaded BEFORE creating allocator
+            debug(user_context) << "VulkanContext: Loading Vulkan function pointers for context override...\n";


This is not the right place to initialize device function pointers. They are specific to the context, and should only be initialized once, which is why they are only done in the acquire_context method.

xFile3160 · 2026-04-22T21:19:04Z

I've actually changed this patch quite a bit. But @derek-gerstmann, your comments make absolute sense, and I'm going to address them and explain the intent and the new API a bit better soon. Sorry I haven't followed up by updating this patch.

derek-gerstmann

This seems like it was vibe coded ... there's too many unnecessary/unrelated changes and it doesn't match what we discussed.

The proposal was to add two methods halide_vulkan_acquire_memory_allocator and halide_vulkan_release_memory_allocator.

This would allow you to override halide_vulkan_acquire_context() and halide_vulkan_release_context() by declaring them in your code base and relying upon the weak linking to override the default.

In your custom halide_vulkan_acquire_context() you have the ability to now call halide_vulkan_acquire_memory_allocator() to create an allocator instance, and return it. In your custom halide_vulkan_release_context() you can now call halide_vulkan_release_memory_allocator().

Likewise, the existing halide_vulkan_acquire_context() would need to be modified to also call the newly added halide_vulkan_acquire_memory_allocator() to create the allocator, and return it. And then the default halide_vulkan_release_context() would need to be modified to call halide_vulkan_release_memory_allocator().
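One possible arrangement of the proposed flow is sketched below as a self-contained model. Nothing here is the Halide runtime: `FakeDevice`, `FakeAllocator`, `AppVulkanContext`, and every function are hypothetical stand-ins, and the real signatures of halide_vulkan_acquire_memory_allocator and halide_vulkan_release_memory_allocator may differ. The sketch also assumes that the per-dispatch release only unlocks and that the allocator is released once, at application teardown, which is a design choice the thread has not settled.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins; none of these are Halide or Vulkan types.
struct FakeDevice { int id; };
struct FakeAllocator { const FakeDevice *device; };

// Application-owned context: the app owns the device and keeps the
// Halide-created allocator alongside it.
struct AppVulkanContext {
    std::mutex lock;
    FakeDevice device{1};
    FakeAllocator *allocator = nullptr;
    int allocators_created = 0;
};

// Stand-in for halide_vulkan_acquire_memory_allocator: creates the
// allocator for the context's device only if none exists yet.
int acquire_memory_allocator(AppVulkanContext &ctx, FakeAllocator **out) {
    if (ctx.allocator == nullptr) {
        ctx.allocator = new FakeAllocator{&ctx.device};
        ctx.allocators_created++;
    }
    *out = ctx.allocator;
    return 0;
}

// Stand-in for halide_vulkan_release_memory_allocator: destroys the
// allocator but never touches the application-owned device.
int release_memory_allocator(AppVulkanContext &ctx) {
    delete ctx.allocator;
    ctx.allocator = nullptr;
    return 0;
}

// Custom acquire: take the context lock and hand back the stored allocator.
int app_acquire_context(AppVulkanContext &ctx, FakeAllocator **allocator, FakeDevice **device) {
    ctx.lock.lock();
    *device = &ctx.device;
    return acquire_memory_allocator(ctx, allocator);
}

// Custom release: per-dispatch unlock only.
void app_release_context(AppVulkanContext &ctx) {
    ctx.lock.unlock();
}

// Application teardown: release the allocator before destroying the device.
void app_destroy_context(AppVulkanContext &ctx) {
    std::lock_guard<std::mutex> guard(ctx.lock);
    release_memory_allocator(ctx);
    // ... the application would destroy its own device here ...
}
```

In this model, repeated acquire/release pairs reuse the same allocator, and the device outlives it.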

derek-gerstmann · 2026-05-04T19:51:05Z


        for (int i = 0; i < (1 << log2_compilations_size); i++) {
-            if (compilations[i].kernel_id > kInvalidId &&
+            if (compilations[i].kernel_id > kDeletedId &&


Why are you changing things in the GPUCompilationCache?

Shader modules cached in GPUCompilationCache are owned by Halide runtime state associated with the allocator used to create/destroy them. For externally managed contexts, VkDevice lifetime and Halide allocator lifetime are not the same boundary. Keying by allocator lets release_memory_allocator delete only the cache entries owned by that allocator. I did this to prevent stale shader-module cache entries when an external context tears down Halide allocator state without destroying the VkDevice, which Halide does not own.

There's two separate issues here. The first is the line I commented on in gpu_context_common.h. Why are you changing anything in this file?

The second issue is the change in the type definition for the GPUCompilationCache Key being used in the Vulkan runtime. The reason it was specified with the Device pointer was to allow sharing across contexts for the same devices created by the same instance to minimize kernel launch overhead.

Changing this to the allocator pointer now means the compilation cache isn't shared for all contexts for the same device, since the allocator pointer is created dynamically for the context.

I'd suggest leaving it as it is, and releasing the compilation cache inside of halide_vulkan_release_memory_allocator() to detach the external VkDevice.

derek-gerstmann · 2026-05-04T19:53:03Z

+/** Override the Vulkan context acquisition callback. Returns the previous
+ * handler. If unset, Halide uses its built-in Vulkan context management.
+ */
+extern halide_vulkan_acquire_context_t halide_set_vulkan_acquire_context(halide_vulkan_acquire_context_t handler);


No .... I don't think we can allow this. This doesn't match the runtime interface design. These methods are overloaded via weak linking.

I added the setter callbacks to support embedding environments where weak-symbol interposition is unreliable or unavailable, especially with Windows-style linkage. Isn't Vulkan cross-platform? If you don't want this in this PR I can move it out, but I think it is important to discuss.

~~Yes, let's move the setter callbacks into a separate PR.~~

I'll raise this at the next community dev meeting.

I believe the CUDA get/set acquire/release context methods were added to support JIT compilation many years ago, but we really don't want to force an indirect call for everyone if we don't have to. With AOT, you can always override this method yourself regardless of the weak symbols.
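The trade-off discussed here can be sketched in a few lines. The setter below follows the contract stated in the doc comment of the diff (it returns the previous handler, and the built-in behaviour applies when nothing is installed), but everything else is an assumption: `acquire_handler_t` is simplified to a single argument, and `default_acquire`, `current_handler`, `set_acquire_handler`, `acquire_context`, and `app_acquire` are hypothetical names. The point to notice is that every call to `acquire_context` goes through a function pointer, which is the indirect call the weak-linking design avoids.

```cpp
#include <cassert>

// Hypothetical, simplified handler type; the real
// halide_vulkan_acquire_context_t takes the full set of Vulkan handles.
using acquire_handler_t = int (*)(void *user_context);

// Built-in behaviour used when no override is installed.
inline int default_acquire(void *) { return 0; }

// Storage for the current handler.
inline acquire_handler_t &current_handler() {
    static acquire_handler_t handler = default_acquire;
    return handler;
}

// Install a new handler and return the previous one.
inline acquire_handler_t set_acquire_handler(acquire_handler_t handler) {
    acquire_handler_t previous = current_handler();
    current_handler() = handler;
    return previous;
}

// Entry point: forwards to whichever handler is installed
// (one indirect call per invocation).
inline int acquire_context(void *user_context) {
    return current_handler()(user_context);
}

// Example application override.
inline int app_acquire(void *) { return 42; }
```

Because the setter returns the previous handler, an embedder can restore it when its own context goes away.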

derek-gerstmann · 2026-05-04T19:53:37Z

+extern halide_vulkan_acquire_context_t halide_set_vulkan_acquire_context(halide_vulkan_acquire_context_t handler);
+
+/** Override the Vulkan context release callback. Returns the previous handler. */
+extern halide_vulkan_release_context_t halide_set_vulkan_release_context(halide_vulkan_release_context_t handler);


Same as above.

Windows doesn't like WEAK. Vulkan should eventually be supported there, or am I mistaken?
To give you some context: I'm building a cross-platform studio that uses Halide as the recommended way to implement image-processing kernels. The studio owns the VkDevice, the VkInstance, and the rest of the Vulkan context, but the intention is to safely leverage the memory allocator inside Halide. This leads me to:

1. Introducing these APIs without weak linkage to support Windows, as I think was done for CUDA.

2. Keying the GPU compilation cache by allocator instead of VkDevice, because Halide doesn't own the device.

This is a design decision for the Halide runtime more than anything else. All the runtimes follow the same interface, which has been very stable for a long time. Yes, the MSVS toolchain is a pain to deal with for weak linking, but that doesn't prevent you from writing your own custom runtime, which is what most integrators do when they wish to customize the behavior of the runtime for their app/framework.

My main concern is forcing an indirect call for all acquire/release context invocations. I'll raise this at the dev meeting this week and let you know how to proceed!

@xFile3160 Okay, chatted with the rest of the team, and adding the get/set acquire/release access methods is fine if it makes things easier to use. Feel free to leave them in this PR!

derek-gerstmann · 2026-05-04T19:57:10Z

+namespace Vulkan {

-// --------------------------------------------------------------------------
+ALWAYS_INLINE int vk_load_external_context_functions(void *user_context, VkInstance instance, VkDevice device) {


This isn't really specific to external contexts ... just call it vk_load_vulkan_interface

This should only be done once per context creation, not repeatedly.

This helper was introduced because, with the new API to acquire an external context, the caller returns an already created VkInstance/VkDevice etc. Halide still needs the device functions internally. The helper makes sure Halide's dispatch table is initialized for the supplied external instance/device. But you're right, vk_load_vulkan_interface is probably a better name, and it should not be called every time.

We really don't want to be reloading dispatch tables. They should be loaded once, when the context is created.
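One way to guarantee a single load is to tie it to the moment the per-context state is first created and to skip it whenever that state already exists. The sketch below models only that guard. `DispatchTable`, `FakeAllocator`, `FakeContext`, `load_vulkan_interface`, and `acquire_allocator` are hypothetical stand-ins, and the real loader resolves Vulkan function pointers rather than bumping a counter.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-context dispatch table.
struct DispatchTable {
    bool loaded = false;
    int load_count = 0;
};

// Hypothetical stand-in for state that is created once per context.
struct FakeAllocator {};

struct FakeContext {
    DispatchTable dispatch;
    FakeAllocator *allocator = nullptr;
};

// Stand-in for the function-pointer lookup; counts how often it runs.
inline void load_vulkan_interface(DispatchTable &table) {
    table.loaded = true;
    table.load_count++;
}

// Load only when the allocator is first created; skip when it already exists.
inline FakeAllocator *acquire_allocator(FakeContext &ctx) {
    if (ctx.allocator == nullptr) {
        load_vulkan_interface(ctx.dispatch);
        ctx.allocator = new FakeAllocator();
    }
    return ctx.allocator;
}
```

Later acquisitions return the stored allocator without touching the dispatch table again.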

derek-gerstmann · 2026-05-04T19:59:24Z

 //   call to halide_release_vulkan_context. halide_acquire_vulkan_context
 //   should block while a previous call (if any) has not yet been
 //   released via halide_release_vulkan_context.
-WEAK int halide_vulkan_acquire_context(void *user_context,


~~These can't be changed ... they need to match all the other runtimes.~~

derek-gerstmann · 2026-05-04T20:03:23Z

+        return halide_error_code_internal_error;
+    }
+
+    int error_code = vk_load_external_context_functions(user_context, instance, device);


Again, this should only be called during context creation, and only once.

derek-gerstmann · 2026-05-04T20:04:06Z

+        return halide_error_code_symbol_not_found;
+    }
+
+    vk_destroy_shader_modules(user_context, runtime_allocator);


Why are you destroying shader modules in this method?

derek-gerstmann · 2026-05-04T20:09:09Z

        return;
    }

-    if (shader_module->descriptor_set_layouts) {


Why did you reorder this set of statements and move them down below?

derek-gerstmann · 2026-05-04T20:11:00Z

 };

-WEAK Halide::Internal::GPUCompilationCache<VkDevice, VulkanCompilationCacheEntry *> compilation_cache;
+WEAK Halide::Internal::GPUCompilationCache<VulkanMemoryAllocator *, VulkanCompilationCacheEntry *> compilation_cache;


Why did you change the cache entry to use an Allocator pointer?

derek-gerstmann · 2026-05-04T20:18:08Z

+                                       uint32_t *queue_family_index,
+                                       VkDebugUtilsMessengerEXT *messenger,
+                                       bool create) {
+    return vulkan_acquire_context_handler(user_context, allocator, instance, device,


~~This shouldn't be changed.~~

xFile3160 · 2026-05-04T22:00:09Z

> This seems like it was vibe coded ... there's too many unnecessary/unrelated changes and it doesn't match what we discussed.
>
> The proposal was to add two methods halide_vulkan_acquire_memory_allocator and halide_vulkan_release_memory_allocator.
>
> This would allow you to override halide_vulkan_acquire_context() and halide_vulkan_release_context() by declaring them in your code base and relying upon the weak linking to override the default.
>
> In your custom halide_vulkan_acquire_context() you have the ability to now call halide_vulkan_acquire_memory_allocator() to create an allocator instance, and return it. In your custom halide_vulkan_release_context() you can now call halide_vulkan_release_memory_allocator().
>
> Likewise, the existing halide_vulkan_acquire_context() would need to be modified to also call the newly added halide_vulkan_acquire_memory_allocator() to create the allocator, and return it. And then the default halide_vulkan_release_context() would need to be modified to call halide_vulkan_release_memory_allocator().

Thanks a lot for reviewing, first of all. I didn't exactly vibe code this, but I definitely leaned heavily on AI tools. I think I took too much liberty in changing things without properly explaining what led to these changes.

The usage I’m trying to support is an embedder-owned Vulkan context: the application owns the VkInstance, VkDevice, and queue, but Halide still needs its own runtime allocator for shader modules, staging buffers, descriptor resources, and other internal Vulkan state.

That allocator needs to live with the external context and be released before that context/device is torn down, without Halide destroying the application-owned Vulkan handles. At least, that is how I'm using this, and I think it should be a pretty common case.

I can rework this PR down to just the allocator lifecycle. I can preserve the WEAK linkage, but please let me know if my Windows concern is real. I had problems cross-compiling to Windows, mainly due to unsupported WEAK linkage. I saw that the CUDA runtime takes an approach similar to the one I took here.

I can remove the setters and keep only acquire_memory_allocator and release_memory_allocator. I can also keep the allocator release scoped to the Halide-owned allocator and the Vulkan instance/device/queue. The GPUCompilationCache change is there because those shader modules are Halide-owned and associated with the allocator used to create/destroy them. When I manage the context externally, VkDevice is not the right lifetime boundary: the device is owned by my application, while the Halide allocator/cache needs explicit teardown. Keying by the allocator lets release_memory_allocator release only the cache entries owned by that allocator.

derek-gerstmann · 2026-05-08T20:45:07Z

@xFile3160 Okay, chatted with the rest of the team, and adding the get/set acquire/release access methods is fine if it makes things easier to use. Feel free to leave them in this PR!

The remaining issues are making sure the dispatch tables are only loaded once, resolving how to cache the shader modules, how to cleanup the shader modules, and how to resolve allocator lifetime issues.

xFile3160 · 2026-05-08T20:51:23Z

> @xFile3160 Okay, chatted with the rest of the team, and adding the get/set acquire/release access methods is fine if it makes things easier to use. Feel free to leave them in this PR!
>
> The remaining issues are making sure the dispatch tables are only loaded once, resolving how to cache the shader modules, how to cleanup the shader modules, and how to resolve allocator lifetime issues.

Thanks, Derek and all the reviewers/devs. I'm going to take a stab at this as soon as I can. Will request review once ready.

xFile3160 · 2026-05-17T20:27:16Z

> @xFile3160 Okay, chatted with the rest of the team, and adding the get/set acquire/release access methods is fine if it makes things easier to use. Feel free to leave them in this PR!
>
> The remaining issues are making sure the dispatch tables are only loaded once, resolving how to cache the shader modules, how to cleanup the shader modules, and how to resolve allocator lifetime issues.

Hey Derek, I fixed the dispatch table loading so it only happens during allocator/context acquisition.
For the shader cache, I think I understand your concern now: VkDevice is the right cache key for sharing across contexts on the same device, so I can keep that.

The part I still need to solve is cleanup. The cached Vulkan module state stores resources allocated/destroyed through a specific VulkanMemoryAllocator. When an embedder owns the VkInstance/VkDevice but asks Halide to create the allocator, halide_vulkan_release_memory_allocator() needs to remove only the cache entries owned by that allocator.

Would you be open to a small targeted cleanup API on GPUCompilationCache, e.g. delete_if(predicate, free_fn), so Vulkan can keep lookup keyed by VkDevice but cleanup entries where entry->allocator == allocator?
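To make the proposal concrete, here is a toy cache with a `delete_if(predicate, free_fn)` operation. It is a sketch under stated assumptions, not the real GPUCompilationCache: `ToyCompilationCache`, `CacheEntry`, `FakeAllocator`, and the vector-based storage are hypothetical, and the actual cache is a hash table with a different interface. Lookup stays keyed by device, while cleanup selects entries by their owning allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not the real GPUCompilationCache types.
struct FakeAllocator {};
struct CacheEntry {
    const void *device;              // lookup key, shared across contexts
    const FakeAllocator *allocator;  // owner of the entry's resources
    int kernel_id;
};

class ToyCompilationCache {
public:
    void insert(const CacheEntry &entry) { entries_.push_back(entry); }

    // Lookup stays keyed by device (plus kernel id).
    const CacheEntry *lookup(const void *device, int kernel_id) const {
        for (const CacheEntry &e : entries_) {
            if (e.device == device && e.kernel_id == kernel_id) return &e;
        }
        return nullptr;
    }

    // Remove every entry the predicate selects, calling free_fn on each first.
    // Returns the number of entries removed.
    template<typename Pred, typename FreeFn>
    std::size_t delete_if(Pred predicate, FreeFn free_fn) {
        std::size_t removed = 0;
        for (std::size_t i = 0; i < entries_.size();) {
            if (predicate(entries_[i])) {
                free_fn(entries_[i]);
                entries_.erase(entries_.begin() + static_cast<std::ptrdiff_t>(i));
                removed++;
            } else {
                i++;
            }
        }
        return removed;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<CacheEntry> entries_;
};
```

A release path could then call `delete_if` with a predicate such as `entry.allocator == allocator`, leaving entries created through other allocators on the same device in place.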

xFile3160 · 2026-05-18T06:09:06Z

@derek-gerstmann I force-pushed an update to this PR, rebased on current main.
I kept the shader cache keyed by VkDevice, as you suggested.
I've added allocator cleanup so halide_vulkan_release_memory_allocator() only removes the shader cache entries created with that allocator. The main thing I'd like feedback on is the GPUCompilationCache::delete_context_if function I've added: it lets Vulkan keep the VkDevice as the cache key while still cleaning up the allocator-owned state when an embedder (in this case an external app providing the device/instance and acquiring the allocator) releases the Halide allocator before destroying its own device. In this case the device is not managed by Halide.
I've also kept the default Vulkan context behavior unchanged: release_context is just a per-dispatch unlock, and the full teardown still happens in halide_vulkan_device_release.
I moved the dispatch table loading into allocator acquisition rather than per-dispatch, so it should happen once. The default Halide-owned contexts are not changed, so their dispatch loading is unchanged. vk_load_vulkan_interface is for an externally supplied VkInstance/VkDevice, when halide_vulkan_acquire_memory_allocator creates the Halide allocator for the external context. It's skipped if the allocator already exists. I've also added test coverage, which passes on my local machine. I will also update the other PR to address your review comments (thanks a lot for them).

Please do not hesitate to provide feedback; it's my first contribution after all, and I'm hands-deep in Vulkan lately.

codecov · 2026-05-18T15:51:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@d58798a). Learn more about missing BASE report.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #8871   +/-   ##
=======================================
  Coverage        ?   69.61%           
=======================================
  Files           ?      255           
  Lines           ?    77525           
  Branches        ?    18534           
=======================================
  Hits            ?    53966           
  Misses          ?    17989           
  Partials        ?     5570

alexreinking requested a review from halidebuildbots November 17, 2025 22:31

xFile3160 force-pushed the main branch from b0258f8 to 5b36964 Compare January 20, 2026 21:56

xFile3160 closed this Feb 16, 2026

xFile3160 force-pushed the main branch from f802490 to 1877f41 Compare February 16, 2026 15:49

xFile3160 reopened this Feb 16, 2026

alexreinking requested a review from derek-gerstmann February 16, 2026 16:36

derek-gerstmann requested changes Apr 22, 2026

xFile3160 force-pushed the main branch from 5bb201a to eaa2054 Compare April 25, 2026 08:40

xFile3160 mentioned this pull request Apr 25, 2026

Vulkan: wrap external buffers as regions #9110

Open

xFile3160 marked this pull request as draft April 28, 2026 08:00

xFile3160 force-pushed the main branch from 72360f6 to 396421a Compare April 28, 2026 19:27

xFile3160 marked this pull request as ready for review April 28, 2026 20:28

xFile3160 requested a review from derek-gerstmann April 28, 2026 20:29

derek-gerstmann requested changes May 4, 2026

derek-gerstmann added the dev_meeting Topic to be discussed at the next dev meeting label May 8, 2026

Vulkan: add external allocator lifecycle

ff03bd3

xFile3160 force-pushed the main branch from 4b60b83 to ff03bd3 Compare May 18, 2026 05:59
