metal : wind down leftover residency sets at teardown instead of aborting #3870
Open
AlexCherrypi wants to merge 1 commit into
…ting

ggml_metal_rsets_free() did GGML_ASSERT([rsets->data count] == 0) and so called abort() when the Metal device is torn down (a C++ static destructor at process exit) while residency sets are still registered.

On macOS 15+ this crashes the app on every quit: a residency set is removed from the collection only by ggml_metal_buffer_free(), so an app that exits without freeing every buffer (letting the OS reclaim the model on quit) leaves sets registered.

The device does not own the buffers and cannot free them from its destructor, so make teardown defensive instead: stop the keep-alive heartbeat, then wind down residency on any leftover sets (endResidency + removeAllAllocations, mirroring ggml_metal_buffer_rset_free but without -release, since each set is still owned by its not-yet-freed buffer) before releasing the collection. The backing buffers are reclaimed by the OS as the process exits.

No behavior change when all buffers were freed (the array is empty).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced Jun 9, 2026
What

On macOS 15+ (Apple Silicon), ggml_metal_rsets_free() does GGML_ASSERT([rsets->data count] == 0), which calls abort() when the Metal device is torn down while residency sets are still registered. The device is freed from a C++ static destructor at process exit (ggml_metal_device_get's function-local static vector), so any app embedding the Metal backend that exits without freeing every Metal buffer first crashes on every quit.

Why it happens

A residency set is added in ggml_metal_buffer_init_* and removed from the collection in exactly one place: ggml_metal_buffer_free(). An application that lets the OS reclaim its model/weights on exit (a common, historically fine pattern) never calls ggml_backend_buffer_free for those buffers, so the collection is non-empty when the device's static destructor runs ggml_metal_rsets_free(), and the assert fires.

The device does not own those buffers and cannot free them from its destructor, so the assert can't be made to legitimately hold from within ggml_metal_rsets_free().

Fix

Make teardown defensive instead of aborting:

- stop the keep-alive heartbeat (d_stop + dispatch_group_wait),
- wind down residency on any leftover sets (endResidency + removeAllAllocations, mirroring ggml_metal_buffer_rset_free() but without -release: each set is still owned by its not-yet-freed buffer, so releasing here would over-release),
- release the collection.

The backing buffers are reclaimed by the OS as the process exits. No behavior change when all buffers were freed: the array is empty and the loop is a no-op. Guarded by the existing GGML_METAL_HAS_RESIDENCY_SETS + @available(macOS 15.0, …).

Notes

… rsets->data after this point, so leftover entries cause no UB, only the abort.

Happy to adjust; feedback welcome.
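
For reviewers who want to reason about the ownership argument without a Mac at hand, here is a minimal C++ model of the teardown described above. It is a sketch under assumptions, not the code in this PR: `ResidencySet`, `ResidencySets`, `teardown_would_abort` and `teardown_defensive` are hypothetical names, the `heartbeat_running` flag only stands in for the d_stop + dispatch_group_wait step, and no Metal API is called.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a residency set; NOT the real Metal type.
struct ResidencySet {
    bool resident    = true; // cleared by end_residency()
    int  allocations = 1;    // cleared by remove_all_allocations()

    void end_residency()          { resident = false; }
    void remove_all_allocations() { allocations = 0; }
};

// Hypothetical stand-in for the device-side collection (rsets->data in the PR).
// It stores non-owning pointers: each set is owned by its buffer.
struct ResidencySets {
    std::vector<ResidencySet *> data;
    bool heartbeat_running = true;

    // Models registration at buffer init.
    void add(ResidencySet * s) { data.push_back(s); }

    // Models the single removal path, taken when a buffer is freed.
    void remove(ResidencySet * s) {
        data.erase(std::remove(data.begin(), data.end(), s), data.end());
    }
};

// Old behavior, modeled as a predicate instead of a real abort():
// the assert fails whenever any set is still registered.
bool teardown_would_abort(const ResidencySets & rsets) {
    return !rsets.data.empty();
}

// Defensive teardown: stop the heartbeat, wind down leftovers, drop the list.
// Returns how many leftover sets were wound down.
int teardown_defensive(ResidencySets & rsets) {
    rsets.heartbeat_running = false; // stands in for d_stop + dispatch_group_wait

    int wound_down = 0;
    for (ResidencySet * s : rsets.data) {
        s->end_residency();
        s->remove_all_allocations();
        // Deliberately no delete/release: the not-yet-freed buffer owns the set.
        ++wound_down;
    }

    rsets.data.clear(); // models releasing the collection itself
    return wound_down;
}
```

In this model, registering two sets and removing only one leaves `teardown_would_abort` returning true, while `teardown_defensive` winds down the one leftover set and returns 1. With every set removed beforehand the loop does nothing and the call returns 0, which corresponds to the "no behavior change" case.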