fix: thread-safe cache writes and feature update handling #114
vazarkevych wants to merge 1 commit into
@vazarkevych - thanks for identifying these tricky issues. I guess the most pressing issue here is the cache-miss stampede in and. The P2 list includes callback list mutation during notification and the sticky bucket boolean lock, but the blast radius is limited. So I’d salvage this by porting these ideas onto current
madhuchavva left a comment
Please resolve the conflicts and address the review comments.
Problem
Several race conditions existed in cache and feature update handling:
- `InMemoryFeatureCache` had no locking: concurrent reads/writes could corrupt cache entries
- `FeatureRepository.load_features` and `load_features_async` had no double-checked locking: multiple threads/coroutines could trigger redundant HTTP fetches simultaneously
- `_feature_update_callbacks` list was mutated and iterated without a lock: concurrent `add`/`remove`/`notify` could cause callbacks to be skipped during notification (CPython does not raise when a list changes size mid-iteration; it silently misiterates)
- `_sticky_bucket_cache_lock` was a boolean flag instead of a real lock: the spin-loop was not thread-safe and silently returned `{}` when the "lock" was held
- `FeatureCache.get_current_state` returned a mutable reference to `savedGroups` instead of a copy
- `GrowthBook.load_features_async` called `save_in_cache` with the wrong cache key (`client_key` instead of `api_host::client_key`), making the cached value unreachable
- `_features_event_handler` had the same incorrect cache key

Changes
- `InMemoryFeatureCache`: added `threading.Lock` to `get`, `set`, `clear`
- `FeatureRepository`: added `_load_lock` and `_async_load_lock` with double-checked locking pattern in `load_features` and `load_features_async`
- `_feature_update_callbacks`: protected `add`/`remove` with `_refresh_lock`; `_notify` copies the list under the lock and iterates outside to prevent deadlocks from slow callbacks
- `_sticky_bucket_cache_lock`: replaced boolean spin-lock with `asyncio.Lock()`; simplified `_refresh_sticky_buckets`
- `FeatureCache.get_current_state`: returns `dict()` copy of `savedGroups`
- `GrowthBook.load_features_async`: removed redundant `save_in_cache` call (already handled by `FeatureRepository`)
- `_features_event_handler`: fixed cache key to `api_host::client_key`; changed `return None` to `return`
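
For reviewers who want the two main patterns in one place, here is a minimal sketch of double-checked locking around a cache-miss fetch and of notifying callbacks from a snapshot taken under a lock. It is not the code in this PR: `FeatureStore`, `fetch`, `add_callback`, `remove_callback`, and `notify` are hypothetical names chosen for illustration, and the real `FeatureRepository` may be structured differently.

```python
import threading
from typing import Callable, Dict, List, Optional


class FeatureStore:
    """Hypothetical stand-in for the SDK classes; not the real API."""

    def __init__(self, fetch: Callable[[], Dict]) -> None:
        self._fetch = fetch                  # assumed to perform the HTTP request
        self._cached: Optional[Dict] = None
        self._load_lock = threading.Lock()   # serializes cache-miss fetches
        self._callbacks: List[Callable[[Dict], None]] = []
        self._callbacks_lock = threading.Lock()

    def load_features(self) -> Dict:
        cached = self._cached
        if cached is not None:               # first check, no lock taken
            return cached
        with self._load_lock:
            if self._cached is None:         # second check, under the lock
                self._cached = self._fetch()
            return self._cached

    def add_callback(self, cb: Callable[[Dict], None]) -> None:
        with self._callbacks_lock:
            self._callbacks.append(cb)

    def remove_callback(self, cb: Callable[[Dict], None]) -> None:
        with self._callbacks_lock:
            if cb in self._callbacks:
                self._callbacks.remove(cb)

    def notify(self, features: Dict) -> None:
        # Copy under the lock, then call outside it, so a slow callback
        # (or one that adds/removes callbacks) cannot block or deadlock.
        with self._callbacks_lock:
            snapshot = list(self._callbacks)
        for cb in snapshot:
            cb(features)
```

Threads that miss the cache at the same time queue on `_load_lock`; only the first one fetches, and the rest find the value on the second check. Because `notify` iterates over a copy, a callback may unregister itself without affecting the current round of notifications.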