Speed up the capture set list view #1301
Conversation
📝 Walkthrough

Adds a CachedCountField marker and a per-transaction dedup scheduler that enqueues Celery recompute tasks; migrates and backfills denormalized count columns, implements model/queryset update_cached_counts, wires signals and tasks, adds reconciliation checks, and expands tests for correctness and query behavior.

Changes

Denormalized Count Field Infrastructure …
Force-pushed af24585 to 39a8f02
```python
# Denormalized counts. Kept in sync via m2m_changed and pipeline-completion
# hooks. Reads are O(1).
source_images_count = models.IntegerField(default=0)
```
How can we indicate which fields are cached counts? Field properties? A prefix? I believe we have a `cached_` prefix on other models, but then it makes the method replacements not 1:1. Maybe that's okay. Open to ideas.
Claude says: Checked — no cached_ prefix exists on any model field in the codebase. All denormalized count fields use the bare <thing>_count pattern: Deployment.events_count, Deployment.captures_count, Event.captures_count, SourceImage.detections_count, etc. The new fields match this convention.
I'd lean against adding a cached_ prefix just to this PR's fields — it would create a split where some counts are prefixed and most aren't, and method names (occurrences_count, taxa_count on the same model) wouldn't be 1:1 anymore as you noted. If we want the prefix project-wide, that's a separate refactor PR covering all the count fields.
Happy to do either — flag if you'd prefer the new fields prefixed and we adopt project-wide later, or leave matching the existing convention.
Is there a simple way to mark/tag/categorize/group model fields? To indicate these are cached counts that could be potentially stale or empty at any time? declare in a list on the model? add a model field property, etc?
Force-pushed 39a8f02 to 401a613
Claude says: PR body updated. Two new pieces since the initial draft:
@coderabbitai full review please. Base branch is …

✅ Actions performed: Full review triggered.
```python
from ami.users.models import User


class CachedCountField(models.IntegerField):
```
@mohamedelabbas1996 I think we discussed a field type for cached counts last year. It finally happened!
Force-pushed d12f477 to f1019cd
* perf(api): trim captures list SELECT via .only()
Replaces the unconditional wide SELECT (every column on main_sourceimage +
main_deployment + main_event) on SourceImageViewSet.list with a `.only()`
list scoped to fields SourceImageListSerializer actually reads. Row width
drops from 52 columns to 26.
select_related("deployment__data_source") added so SourceImage.public_url()'s
presigned-URL fallback (when public_base_url is blank) does not fire a per-row
lazy fetch.
SOURCE_IMAGE_LIST_ONLY_FIELDS is exposed as a module-level constant with a
maintenance docstring covering how to keep it in sync as the serializer or
model methods change. SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES is split out as a
sub-constant so the 11 fields that exist only to support presigned-URL
generation are visibly grouped — that whole chain goes away once images are
served through the upcoming image-resizing/CDN layer instead of directly
from source buckets.
Only applied to the list action; retrieve keeps the wide SELECT so the
detail serializer's broader field set is unaffected.
Tests:
- TestSourceImageListQueryCount extended with response-shape assertion
guarding against `.only()` over-restriction (verifies url, size_display,
deployment.name, event present without lazy loads).
Refs: docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md PR-B1.
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: list-endpoint perf planning trail
Adds:
- 2026-05-11-newrelic-post-upgrade-findings.md — NR prod data after the
agent upgrade that surfaced the list-endpoint hot paths.
- 2026-05-11-list-endpoint-perf-analysis.md — SQL-level root-cause
analysis for the 3 endpoints (collection list, captures list, project
charts).
- 2026-05-11-list-endpoint-perf-continuation-plan.md — sequenced PR plan.
PR-A (collection subquery rewrite) superseded by the denormalize-counts
approach on perf/sourceimagecollection-cached-counts. PR-B1 ships in
this PR. PR-C/D/E queued.
Co-Authored-By: Claude <noreply@anthropic.com>
* refactor(api): move SourceImage list-only field groups into viewset class
Per review on #1300: pulled `SOURCE_IMAGE_LIST_ONLY_FIELDS` /
`SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES` out of module scope and into
`SourceImageViewSet` as class attributes, grouped by what reads them
(core, deployment-nested, event-nested, public-url). Trimmed the
docstring and added a short note on why we whitelist with `.only()`
instead of blacklisting with `.defer()`, with a link to the Django ref.
Co-Authored-By: Claude <noreply@anthropic.com>
* refactor(api): drop .only() — select_related is the N+1 fix
Per @mihow review on #1300: the `.only()` machinery added in this PR was
acting as a row-width trim on top of select_related, not as the N+1 fix.
The actual N+1 prevention is `select_related("deployment__data_source")`
ensuring `SourceImage.public_url()` doesn't lazy-load the data_source
chain per row. With that in place, dropping `.only()` keeps the captures
list query count flat (verified empirically — `TestSourceImageListQueryCount`
still passes at limit=1, 5, 25).
This removes the maintenance burden of keeping the field tuples in sync
with the serializer surface. django-zen-queries and the existing perf
tests are the safety net for regressions.
Renamed `test_list_response_shape_preserved_after_only` to
`test_list_response_shape_has_no_lazy_loads` so the contract it asserts
is independent of any specific deferred-fields mechanism.
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: drop list-endpoint perf planning docs
Three planning notes from 3f34b78 were committed under docs/planning/
(wrong convention — should be docs/claude/planning/ per CLAUDE.md) and
have since rotted: PR-A was superseded by the denormalize approach in
#1301; the .only() recommendation was reverted per review; PR-B1 ships
in this PR. Two genuinely useful items have been harvested:
- New Relic baseline numbers for 9 list endpoints (post-NR-12.1.0)
→ memory entry reference_nr_baseline_2026_05_11.md
- 22:22 UTC PG connection-pool burst hypothesis
→ issue #1302
- SourceImageViewSet.retrieve 571-call N+1 audit
→ issue #1303
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
The collection list endpoint previously ran 3 correlated count subqueries per row (source_images_count, source_images_with_detections_count, source_images_processed_count). On large collections each subquery scans the M2M and Detection tables. Reads are now O(1) against cached columns.

- Add 3 IntegerField columns on SourceImageCollection with default=0
- Migration 0085 backfills via a single GROUP BY over the M2M with FILTER clauses; atomic=False so prod-sized M2M can chunk
- update_calculated_fields() recomputes all 3 counts in one aggregate query
- m2m_changed on images.through keeps counts fresh on add/remove (both directions; reverse post_clear documented as caller responsibility)
- Detection post_save/post_delete recompute for collections containing the affected source image
- pipeline.save_results() recomputes for affected collections after the bulk_create batch (which skips signals)
- Drop the 3 with_*_count() annotation methods on SourceImageCollectionQuerySet and the chain in SourceImageCollectionViewSet.queryset
- TestSourceImageCollectionCountsDenormalize covers initial state, add/remove, Detection create/delete, null-bbox processed-but-not-with-detections case, and update_calculated_fields() recovery from drift
- TestSourceImageCollectionListQueryCount tightened: with_counts now <=10q (was <=15), ordering by source_images_count uses column directly

Follow-up to PR #1300 — the paginator COUNT(*) win there is mostly invisible since the UI does not paginate collections; this is the actual UX-visible improvement.

Co-Authored-By: Claude <noreply@anthropic.com>
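The "all 3 counts in one aggregate query" recompute can be illustrated outside Django. This is a hedged plain-Python sketch of the GROUP BY + FILTER semantics (one scan over a collection's images producing all three counters at once); the real implementation is a single ORM aggregate and the `Image` stand-in class is invented for this sketch, with a simple `processed` flag approximating the real processed semantics.

```python
from dataclasses import dataclass, field

@dataclass
class Image:
    """Stand-in for a SourceImage row (hypothetical shape for this sketch)."""
    detections: list = field(default_factory=list)  # detection bboxes, may be empty
    processed: bool = False

def get_source_image_counts(images):
    """Compute all three collection counts in one pass, mirroring the
    single GROUP BY + FILTER aggregate used for the backfill."""
    total = with_detections = processed = 0
    for img in images:
        total += 1
        if img.detections:
            with_detections += 1
        if img.processed:
            processed += 1
    return {
        "source_images_count": total,
        "source_images_with_detections_count": with_detections,
        "source_images_processed_count": processed,
    }

images = [
    Image(detections=[[10, 10, 20, 20]], processed=True),
    Image(detections=[], processed=True),  # processed but no detections
    Image(),                               # untouched
]
counts = get_source_image_counts(images)
```

The second fixture mirrors the null-bbox processed-but-not-with-detections case the tests cover: the three counts diverge, which is exactly why they cannot be derived from each other.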
…ormalized count columns

Add `CachedCountField(IntegerField)` to `ami/base/models.py` as a marker subclass for cached aggregate columns. Swap it in for the 12 existing denormalized count fields across Deployment, Event, SourceImage, and SourceImageCollection. Discoverable via `Model._meta.get_fields()` + `isinstance(f, CachedCountField)` for future use (refresh tasks, admin display, list-endpoint defer()).

The DB column type is unchanged (IntegerField -> IntegerField), so the generated migration `0086_use_cached_count_field` is a pure no-op at the SQL level; it only updates Django's recorded model state.

Co-Authored-By: Claude <noreply@anthropic.com>
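The discovery mechanism this commit describes, `Model._meta.get_fields()` plus an `isinstance` check, can be sketched without Django. A hedged illustration: the stand-in classes, the `get_fields` helper, and the `name_length` field are all invented for this sketch; only the marker-subclass idea comes from the PR.

```python
class IntegerField:
    """Minimal stand-in for django.db.models.IntegerField."""
    def __init__(self, default=0):
        self.default = default
        self.name = None

class CachedCountField(IntegerField):
    """Marker subclass: no new behavior, exists only so cached-count
    columns can be enumerated generically (refresh tasks, admin, defer)."""

def get_fields(model_cls):
    """Stand-in for Model._meta.get_fields(): collect field instances
    declared as class attributes, recording their attribute names."""
    fields = []
    for name, value in vars(model_cls).items():
        if isinstance(value, IntegerField):
            value.name = name
            fields.append(value)
    return fields

class SourceImageCollection:
    name_length = IntegerField()  # ordinary column, not a cached count (hypothetical)
    source_images_count = CachedCountField(default=0)
    source_images_with_detections_count = CachedCountField(default=0)
    source_images_processed_count = CachedCountField(default=0)

# The isinstance filter skips the ordinary IntegerField and keeps the markers.
cached = [f.name for f in get_fields(SourceImageCollection)
          if isinstance(f, CachedCountField)]
```

Because the subclass adds no behavior, the column type is unchanged and any generated migration is a state-only no-op, as the commit notes.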
- Remove `test_initial_counts_zero` from `TestSourceImageCollectionCountsDenormalize`; it asserted the `default=0` field-level default rather than any code path we own.
- Rewrite a handful of em-dash fragment comments added during PR review rounds (`get_source_image_counts` docstring, captures-list query-count assertion message, collection-list ordering comment, denormalize-test side-effect note) in normal prose.

Co-Authored-By: Claude <noreply@anthropic.com>
Force-pushed f1019cd to 2725599
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ami/main/signals.py`:
- Around line 210-211: The signal currently calls
instance.update_calculated_fields(save=True) synchronously for actions
("post_add", "post_remove", "post_clear"), which can persist stale counts under
concurrent transactions; change these immediate calls to run after the DB
transaction commits by wrapping the call in django.db.transaction.on_commit
(e.g., transaction.on_commit(lambda:
instance.update_calculated_fields(save=True))). Apply the same replacement for
the other occurrence around lines referencing the same actions so recomputation
is deferred until successful commit.
In `@ami/main/tests.py`:
- Line 3386: Test fixtures are using normalized [0–1] bbox values but this repo
expects pixel-space coordinates; update the Detection/BoundingBox fixtures
(e.g., the Detection instances setting bbox=[0.1,0.1,0.2,0.2] and the other
occurrences at the mentioned sites) to use absolute pixel coordinates consistent
with the image dimensions used in these tests. Locate the test rows creating
Detection objects (look for Detection(..., bbox=...) and any direct
BoundingBox(...) usages) and replace the normalized floats with the
corresponding pixel values (or explicit integers) so the fixtures represent true
pixel-space bboxes across the cases mentioned (lines ~3386, ~3398, ~3424,
~3443).
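The fixture fix this comment asks for amounts to scaling normalized coordinates by the image dimensions. A hedged sketch: the helper name and the 100×100 test-image size are assumptions for illustration, not taken from the repo.

```python
def to_pixel_bbox(norm_bbox, width, height):
    """Convert a normalized [x1, y1, x2, y2] bbox (0-1 floats) to
    absolute pixel coordinates for the given image dimensions."""
    x1, y1, x2, y2 = norm_bbox
    return [round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height)]

# The normalized fixture value flagged in the review, on a hypothetical
# 100x100 image, becomes the pixel-space value the convention expects:
pixel = to_pixel_bbox([0.1, 0.1, 0.2, 0.2], width=100, height=100)
```

This matches the `[0.1...]` → `[10, 10, 20, 20]` fixture change described later in the PR.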
📒 Files selected for processing (8)

- ami/base/models.py
- ami/main/api/views.py
- ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py
- ami/main/migrations/0086_use_cached_count_field.py
- ami/main/models.py
- ami/main/signals.py
- ami/main/tests.py
- ami/ml/models/pipeline.py
Address opus takeaway-review findings on the SourceImageCollection denormalize work.

Signal handlers no longer block on the recompute:

- Detection post_save/post_delete dedupes affected source_image_ids on a per-connection set and fans out one task per affected collection via transaction.on_commit. A 10k-detection bulk_create-loop (where signals fire) now triggers at most one task per affected collection instead of 10k synchronous aggregates per detection.
- m2m_changed on SourceImageCollection.images defers to a Celery task via transaction.on_commit.

New ami.main.checks.cached_counts module catches drift from bulk write paths that skip signals (bulk_create, bulk_update, raw SQL, ML post-processors like small_size_filter). Generic across every model that declares a CachedCountField column — discovered via Model._meta.get_fields(). Layered defense: signals = best-effort sync, periodic reconcile_cached_counts_task = safety net. This also gives CachedCountField its first consumer, addressing the "marker class with no consumer" critique.

Other tweaks from the same review pass:

- CachedCountField docstring documents migration-friction tradeoff (future AlterField for unrelated IntegerFields near cached ones).
- SourceImageCollection.update_calculated_fields docstring explains the .filter(pk=).update() vs .save() divergence on the other 3 models.
- TestSourceImageCollectionCountsDenormalize → TransactionTestCase + eager Celery so on_commit hooks actually fire in tests.
- TestCachedCountsIntegrityCheck (5 tests) covers discover / find / reconcile dry-run / reconcile repair / no-drift baseline.
- Detection bbox fixtures: [0.1...] → [10, 10, 20, 20] for consistency with codebase pixel-space convention (CodeRabbit nit).

Co-Authored-By: Claude <noreply@anthropic.com>
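The dedup-then-fan-out behavior described above can be simulated without Django or Celery. This is a hedged sketch of just the scheduling semantics: callbacks are collected and run once at "commit", the pending set deduplicates per key, and the add-before-register ordering matters because outside an atomic block `transaction.on_commit` fires synchronously at registration time. All names here are invented for the simulation; only the mechanism mirrors the PR.

```python
class FakeConnection:
    """Stand-in for the per-thread DB connection the scheduler hangs
    its pending set on."""

PENDING_ATTR = "_pending_cached_count_recomputes"
connection = FakeConnection()
on_commit_callbacks = []   # simulates transaction.on_commit registration
dispatched = []            # simulates Celery .delay() calls

def schedule_recompute(model_label, pk):
    pending = getattr(connection, PENDING_ATTR, None)
    is_new = pending is None
    if is_new:
        pending = set()
        setattr(connection, PENDING_ATTR, pending)
    # Add BEFORE registering the flush: in real Django, outside an atomic
    # block on_commit would fire immediately at registration time.
    pending.add((model_label, pk))
    if is_new:
        on_commit_callbacks.append(flush_pending)

def flush_pending():
    pending = getattr(connection, PENDING_ATTR, set())
    if hasattr(connection, PENDING_ATTR):
        delattr(connection, PENDING_ATTR)
    for label, pk in sorted(pending):
        dispatched.append((label, pk))  # real code: task.delay(label, pk)

# A tight loop touching 10k detections spread across 2 collections
# collapses to one registered flush and one task per collection.
for _ in range(10_000):
    schedule_recompute("main.SourceImageCollection", 1)
    schedule_recompute("main.SourceImageCollection", 2)

for cb in on_commit_callbacks:  # "commit" fires the registered callbacks
    cb()
```

The flush also clears the connection attribute before dispatching, so a fresh transaction starts with an empty pending set.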
Pull request overview
This PR speeds up capture-set list queries by replacing per-row source-image count annotations with denormalized cached columns and adding infrastructure to discover/reconcile cached count drift.
Changes:
- Adds cached count fields and migrations for `SourceImageCollection`, plus `CachedCountField` marker usage.
- Updates list/query logic, signals, tasks, and pipeline bulk-save handling to refresh cached collection counts.
- Adds integrity-check helpers and regression/query-count tests.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| ami/base/models.py | Adds CachedCountField marker class. |
| ami/main/models.py | Converts cached count fields and adds SourceImageCollection count recomputation. |
| ami/main/api/views.py | Removes source-image count annotations from collection queryset. |
| ami/main/signals.py | Adds M2M/detection signal handlers for collection count refresh. |
| ami/main/tasks.py | Adds collection refresh and cached-count reconciliation Celery tasks. |
| ami/main/checks/cached_counts.py | Adds cached-count discovery, drift detection, and reconciliation helpers. |
| ami/ml/models/pipeline.py | Refreshes affected collection counts after bulk-created detections. |
| ami/main/tests.py | Adds denormalized count, integrity-check, and list query-count tests. |
| ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py | Adds/backfills SourceImageCollection cached count columns. |
| ami/main/migrations/0086_use_cached_count_field.py | Updates migration state to use CachedCountField. |
Comments suppressed due to low confidence (3)
ami/main/signals.py:268

- This only refreshes collections for the detection's current `source_image_id`. Because `Detection.source_image` is writable through the serializer/API, moving a detection to another capture leaves collections containing the old source image with stale processed/with-detections counts; the old source image id needs to be captured before save and refreshed too when it changes.
```python
@receiver(post_save, sender=Detection)
@receiver(post_delete, sender=Detection)
def update_collection_counts_on_detection_change(sender, instance, **kwargs):
    """Schedule a collection-counts refresh for every collection containing the affected SourceImage.

    The dedup + on_commit indirection means even tight per-row Detection write
    loops fan out to at most one task per affected collection. ``bulk_create``
    / ``bulk_update`` skip signals entirely — those rely on the periodic
    reconciliation task to repair drift.
    """
    if not instance.source_image_id:
        return
    _schedule_collection_refresh_for_source_image(instance.source_image_id)
```
ami/main/signals.py:253

- This does not cover `SourceImage` deletion: Django removes auto-created M2M through rows during cascade delete without sending `m2m_changed`, and the detection `post_delete` path can no longer discover the collections once the image membership is gone. Deleting captures will therefore leave `source_images_count` (and derived processed/detection counts) stale unless affected collection ids are captured before the image is deleted.
```python
@receiver(m2m_changed, sender=SourceImageCollection.images.through)
def update_collection_counts_on_m2m(sender, instance, action, **kwargs):
    """Recompute denormalized counts when images are added to or removed from a collection."""
    if action in ("post_add", "post_remove", "post_clear"):
        collection_pk = instance.pk
        transaction.on_commit(lambda: refresh_collection_cached_counts.delay(collection_pk))
```
ami/main/signals.py:253

- The handler assumes `instance` is always a `SourceImageCollection`, but `m2m_changed` also fires for the reverse manager (`source_image.collections.add/remove(...)`) with `instance` as a `SourceImage` and the collection ids in `pk_set`. In that path this schedules a refresh for the source-image id instead of the affected collection ids, leaving the real collection counts stale.
```python
@receiver(m2m_changed, sender=SourceImageCollection.images.through)
def update_collection_counts_on_m2m(sender, instance, action, **kwargs):
    """Recompute denormalized counts when images are added to or removed from a collection."""
    if action in ("post_add", "post_remove", "post_clear"):
        collection_pk = instance.pk
        transaction.on_commit(lambda: refresh_collection_cached_counts.delay(collection_pk))
```
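One way to address the reverse-manager gap flagged above is to branch on the `reverse` flag Django passes to `m2m_changed` handlers: forward calls refresh `instance.pk`, reverse calls refresh the collection ids in `pk_set`. Sketched here as a plain function over the handler's arguments; the function name is a stand-in and this is not the PR's code.

```python
def collection_pks_to_refresh(instance_pk, reverse, pk_set):
    """Resolve which SourceImageCollection pks need a count refresh for a
    given m2m_changed invocation.

    Forward (collection.images.add/remove): instance is the collection.
    Reverse (source_image.collections.add/remove): instance is the source
    image and pk_set holds the affected collection ids.
    """
    if reverse:
        return sorted(pk_set or ())
    return [instance_pk]

# Forward: collection 7 gained some images -> refresh collection 7 only.
forward = collection_pks_to_refresh(7, reverse=False, pk_set={1, 2, 3})
# Reverse: source image 42 was added to collections 5 and 9.
backward = collection_pks_to_refresh(42, reverse=True, pk_set={5, 9})
```

A handler built on this would schedule one refresh per returned pk instead of assuming `instance` is always the collection.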
```python
pending = getattr(connection, _PENDING_SOURCE_IMAGE_IDS_ATTR, None)
is_new = pending is None
if is_new:
    pending = set()
    setattr(connection, _PENDING_SOURCE_IMAGE_IDS_ATTR, pending)
pending.add(source_image_id)
if is_new:
    # Outside an atomic block, ``on_commit`` fires synchronously at
    # registration time — so the ``add`` above must precede it or the
    # flush sees an empty set.
    transaction.on_commit(_flush_pending_collection_refreshes)
```
```python
for instance, stored, computed in find_stale_cached_counts(m, project_id=project_id):
    result.checked += 1
```
```python
source_images_count = CachedCountField(default=0)
source_images_with_detections_count = CachedCountField(default=0)
source_images_processed_count = CachedCountField(default=0)
```
```python
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_count",
    field=models.IntegerField(default=0),
),
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_with_detections_count",
    field=models.IntegerField(default=0),
),
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_processed_count",
    field=models.IntegerField(default=0),
),
```
```python
# bulk_create above skips Detection signals; refresh affected collections explicitly.
source_image_ids = [img.pk for img in source_images]
for collection in SourceImageCollection.objects.filter(images__id__in=source_image_ids).distinct():
    collection.update_calculated_fields(save=True)
```
```python
# occurrences/taxa subquery annotations.
self.assertLessEqual(count, 10, f"Collection list with_counts too many queries: {count}")

def test_list_query_count_ordering_by_annotated_count(self):
```
```python
def test_list_query_count_does_not_scale_with_page_size(self):
    small = self._list_query_count(f"/api/v2/captures/collections/?project_id={self.project.pk}&limit=1")
    large = self._list_query_count(f"/api/v2/captures/collections/?project_id={self.project.pk}&limit=25")
    print(f"\n[AUDIT] Collection list: limit=1 -> {small}q, limit=25 -> {large}q")
    self.assertLessEqual(large, small + 2, f"Collection list scaling: {small} -> {large} (likely N+1)")
```
Claude says: Self-suggestion for follow-up PR (or fold into this one): simplify the cached-counts plumbing introduced here. Current shape works but has three per-model pieces (dedup state attr + flush helper + per-model Celery task). The next cached-count addition repeats all three.

Proposed consolidation

One free helper + queryset/instance methods on `BaseModel`/`BaseQuerySet`:

```python
# ami/base/cached_counts.py
_PENDING_ATTR = "_pending_cached_count_recomputes"

def _schedule_recompute(model, pk):
    """Per-connection dedup + transaction.on_commit dispatch.

    Outside an atomic block, on_commit fires synchronously at registration
    time, so we set the pending set BEFORE registering the flush callback.
    """
    pending = getattr(connection, _PENDING_ATTR, None)
    is_new = pending is None
    if is_new:
        pending = set()
        setattr(connection, _PENDING_ATTR, pending)
    pending.add((model._meta.label, pk))
    if is_new:
        transaction.on_commit(_flush)

def _flush():
    pending = getattr(connection, _PENDING_ATTR, set())
    try:
        delattr(connection, _PENDING_ATTR)
    except AttributeError:
        pass
    for label, pk in pending:
        recompute_cached_counts_task.delay(label, pk)

@shared_task(ignore_result=True)
def recompute_cached_counts_task(model_label, pk):
    model = apps.get_model(model_label)
    try:
        instance = model.objects.get(pk=pk)
    except model.DoesNotExist:
        return
    instance.update_calculated_fields(save=True)
```

```python
# ami/base/models.py
class BaseModel(models.Model):
    def update_cached_counts(self, run_async: bool = True) -> None:
        if run_async:
            _schedule_recompute(type(self), self.pk)
            return
        self.update_calculated_fields(save=True)

class BaseQuerySet(QuerySet):
    def update_cached_counts(self, run_async: bool = True) -> None:
        for pk in self.values_list("pk", flat=True):
            if run_async:
                _schedule_recompute(self.model, pk)
            else:
                self.model.objects.get(pk=pk).update_calculated_fields(save=True)
```

Call sites collapse

Pipeline:

```python
SourceImageCollection.objects.filter(
    images__id__in=source_image_ids
).update_cached_counts(run_async=False)
```

Signals:

```python
@receiver(post_save, sender=Detection)
@receiver(post_delete, sender=Detection)
def update_collection_counts_on_detection_change(sender, instance, **kwargs):
    if not instance.source_image_id:
        return
    SourceImageCollection.objects.filter(
        images__id=instance.source_image_id
    ).update_cached_counts()  # async by default
```

Reconcile loop in `ami/main/checks/cached_counts.py` …

What goes away …

What stays …

Next cached-count addition

Three things instead of six: …

Reconcile + drift detection are free.

What I'd NOT do …

Happy to land as a follow-up PR (cleaner review scope) or fold into this branch if reviewers prefer the unified pattern in one place.
Captures the design behind replacing PR #1301's per-source-table dedup state + per-model Celery refresh tasks with a generic instance/queryset method on BaseModel/BaseQuerySet. Wraps update_calculated_fields(save=True) with caller-controlled sync vs async (run_async=True default) and per-(model, pk) dedup keyed on the DB connection.

Co-Authored-By: Claude <noreply@anthropic.com>
…t method

Replace the per-source-table dedup state + per-model Celery refresh tasks introduced in 1a242aa with a generic `update_cached_counts(run_async=True)` method on `BaseModel` and `BaseQuerySet`. Wraps the existing `update_calculated_fields(save=True)` body; per-(model_label, pk) dedup keyed on the active DB connection collapses high-volume signal fan-out to one task per affected row.

Architecture:

- `ami/base/cached_counts.py` — `schedule_recompute(label, pk)` queues a recompute in a per-connection dedup set; `_flush_pending_recomputes` is registered via `transaction.on_commit` (once per call, idempotent on the drain side so it survives rollback-then-fresh-transaction in tests).
- `BaseModel.update_cached_counts(run_async=True)` — schedules a recompute for `self`. With `run_async=False` runs inline.
- `BaseQuerySet.update_cached_counts(run_async=True)` — schedules one per row in the queryset.
- `ami/main/tasks.py::recompute_cached_counts_task` — generic Celery task, loads model via `apps.get_model(label)` and calls `update_calculated_fields(save=True)`.

Call site changes:

- Detection signal handler is now one line: `SourceImageCollection.objects.filter(images__id=instance.source_image_id).update_cached_counts()`
- Collection m2m handler: `instance.update_cached_counts()`.
- `refresh_project_cached_counts` body keeps the existing `project.update_related_calculated_fields()` (preserves the bulk subquery UPDATE for `SourceImage.detections_count`); body no longer duplicates the Events/Deployments loop here.
- Reconcile loop in `ami/main/checks/cached_counts.py` calls `instance.update_cached_counts(run_async=False)` for stylistic unity.

Removed:

- `_PENDING_SOURCE_IMAGE_IDS_ATTR`, `_flush_pending_collection_refreshes`, `_schedule_collection_refresh_for_source_image` (signals.py)
- `refresh_collection_cached_counts` task (tasks.py)

Cost of adding the next cached-count column drops from ~50 LOC (new field + recompute body + dedup attr + flush helper + Celery task + signal handler) to ~10 LOC (new field + recompute body + signal handler calling the method).

Design rationale and trade-offs in `docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md`.

Co-Authored-By: Claude <noreply@anthropic.com>
Claude says: Landed in 9ac2ce17. Test isolation note worth flagging: the per-connection dedup set survives rollback-then-fresh-transaction in tests; the drain side is idempotent to cope. Reconcile loop in `ami/main/checks/cached_counts.py`. Design doc + trade-offs in `docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md`.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ami/base/models.py (1)
74-80: ⚡ Quick win: Optimize the sync path to avoid N+1 queries.

The sync path (lines 79-80) loads instances one at a time with `self.model.objects.get(pk=pk)`, creating N+1 queries. Since the async path only needs PKs but the sync path needs full instances, branch on `run_async` and use `self.iterator()` for the sync path to load instances efficiently in a single query.

⚡ Proposed refactor

```diff
-        model_label = self.model._meta.label
-        for pk in self.values_list("pk", flat=True):
-            if run_async:
+        if run_async:
+            model_label = self.model._meta.label
+            for pk in self.values_list("pk", flat=True):
                 schedule_recompute(model_label, pk)
-            else:
-                instance = self.model.objects.get(pk=pk)
+        else:
+            for instance in self.iterator():
                 instance.update_calculated_fields(save=True)
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ami/base/models.py` around lines 74-80: the current sync path does N+1 queries by calling self.model.objects.get(pk=pk) for each pk; branch on run_async — if run_async, keep the existing loop over self.values_list("pk", flat=True) and call schedule_recompute(model_label, pk); for the sync path, iterate the queryset itself using self.iterator() to load instances efficiently and call instance.update_calculated_fields(save=True) on each instance.
📒 Files selected for processing (8)

- ami/base/cached_counts.py
- ami/base/models.py
- ami/main/checks/cached_counts.py
- ami/main/models.py
- ami/main/signals.py
- ami/main/tasks.py
- ami/main/tests.py
- docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md
✅ Files skipped from review due to trivial changes (1)
- docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md
🚧 Files skipped from review as they are similar to previous changes (2)
- ami/main/signals.py
- ami/main/models.py
```python
@celery_app.task(ignore_result=True)
def recompute_cached_counts_task(model_label: str, pk: int) -> None:
```
Widen the pk type annotation to support non-integer primary keys.
The signature declares pk: int, but schedule_recompute in ami/base/cached_counts.py accepts pk: Any (line 27), and Django models can use UUIDField or other types as primary keys. While Celery serialization handles this at runtime, type checkers will flag the mismatch.
🔧 Proposed fix

```diff
-def recompute_cached_counts_task(model_label: str, pk: int) -> None:
+def recompute_cached_counts_task(model_label: str, pk: Any) -> None:
```

Add the import at the top if not already present:

```diff
+from typing import Any
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ami/main/tasks.py` at line 11, The task function recompute_cached_counts_task
currently types pk as int but should accept any primary key type to match
schedule_recompute in ami/base/cached_counts.py; change the signature to use pk:
Any and add "from typing import Any" at the top of ami/main/tasks.py, then pass
pk through to schedule_recompute unchanged so type checkers no longer flag the
mismatch (refer to recompute_cached_counts_task and schedule_recompute).

Summary
Two changes, both targeting `SourceImageCollectionViewSet.list` query cost on large M2M tables:

1. Denormalized counts on `SourceImageCollection`. Replaces 3 per-row `Subquery(Count(...))` annotations with a plain column SELECT; O(1) read regardless of M2M size.
2. Introduce `CachedCountField` in `ami/base/models.py` — a marker subclass of `IntegerField` — and apply it to all 12 existing cached count columns project-wide. Discoverable via `Model._meta.get_fields()` + `isinstance(f, CachedCountField)` so future cross-cutting tasks (admin display, refresh jobs, list-endpoint defer) can enumerate them.

Stacked on #1300 (`select_related` on captures list). Base branch is `perf/list-endpoint-subquery-rewrite`; will retarget to `main` once #1300 merges.

Denormalization details
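The one-pass backfill aggregation (GROUP BY plus `FILTER`, all three counts in a single scan) can be sketched in plain Python. The per-row flags (`has_detections`, `processed`) are illustrative assumptions, not the real schema:

```python
# Plain-Python sketch of the one-pass backfill: a single scan of the M2M rows
# produces all three counts per collection, mirroring
# GROUP BY collection_id + COUNT(*) FILTER (WHERE ...) in SQL.
# The per-row flags (has_detections, processed) are illustrative assumptions.
from collections import defaultdict

def backfill_counts(m2m_rows):
    counts = defaultdict(lambda: {"total": 0, "with_detections": 0, "processed": 0})
    for collection_id, has_detections, processed in m2m_rows:
        row = counts[collection_id]
        row["total"] += 1
        row["with_detections"] += has_detections
        row["processed"] += processed
    return dict(counts)

rows = [
    (1, True, True),
    (1, False, True),   # processed, but no detections (e.g. null bbox)
    (2, False, False),
]
print(backfill_counts(rows))
# → {1: {'total': 2, 'with_detections': 1, 'processed': 2}, 2: {'total': 1, 'with_detections': 0, 'processed': 0}}
```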
New columns on `SourceImageCollection`: `source_images_count`, `source_images_with_detections_count`, `source_images_processed_count`.

- Schema migration `0085_denormalize_sourceimagecollection_counts` creates them and backfills via a single GROUP BY + `FILTER` scan of the M2M (computes all three counts in one pass). `atomic=False` so the UPDATE can chunk on prod-sized data.
- `SourceImageCollection.get_source_image_counts()` returns the 3 counts as a dict from a single aggregate query (no writes — pure compute, testable in isolation). `update_calculated_fields(save=True)` is the side-effecting wrapper that also writes the columns.
- `ami/main/signals.py`: `m2m_changed` on `SourceImageCollection.images.through` for `post_add`, `post_remove`, `post_clear`. `Detection` `post_save`/`post_delete` recomputes for any collection containing the affected `source_image`. `pipeline.save_results()` explicitly recomputes for affected collections after its `bulk_create` batch, since `bulk_create` skips signals.
- The `with_*_count()` annotation methods stay on `SourceImageCollectionQuerySet`; the viewset chain no longer calls them.

CachedCountField details
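The marker-field idea — a subclass that adds identity but no behavior, so cached counts can be enumerated with `isinstance` — can be shown with a pure-Python stub. Django's `IntegerField` and field introspection are faked here; only the pattern is real:

```python
# Pure-Python stub of the marker-subclass pattern. IntegerField and the
# fields dict stand in for Django's field classes and Model._meta.get_fields();
# CachedCountField itself adds no behavior, only an isinstance() identity.
class IntegerField:
    def __init__(self, default=0):
        self.default = default

class CachedCountField(IntegerField):
    """Marker: a denormalized count kept in sync outside normal writes."""

def cached_count_field_names(fields):
    return [name for name, f in fields.items() if isinstance(f, CachedCountField)]

collection_fields = {
    "method": IntegerField(),  # ordinary field, not a cached count
    "source_images_count": CachedCountField(),
    "source_images_with_detections_count": CachedCountField(),
    "source_images_processed_count": CachedCountField(),
}
print(cached_count_field_names(collection_fields))
# → ['source_images_count', 'source_images_with_detections_count', 'source_images_processed_count']
```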
`ami/base/models.py` — `class CachedCountField(models.IntegerField)`. Pure marker; no field-level behavior change.

- Applied to `Deployment.{events,occurrences,captures,detections,taxa}_count`, `Event.{captures,detections,occurrences}_count`, `SourceImage.detections_count`, `SourceImageCollection.{source_images,source_images_with_detections,source_images_processed}_count`.
- Migration `0086_use_cached_count_field` is recorded, but the generated SQL is `(no-op)` for every `AlterField` — the column type is unchanged. The migration only updates Django's recorded model state so future operations see the right field class.

Why denormalize rather than the subquery rewrite from the earlier draft of #1300
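To ground the comparison, here is a toy `sqlite3` model of the two read strategies: a per-row correlated count subquery versus a backfilled column. The schema is illustrative only, not the project's:

```python
# Toy sqlite3 comparison: per-row correlated COUNT subquery vs a
# denormalized column read. Schema and data are illustrative stand-ins.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE collection (id INTEGER PRIMARY KEY, source_images_count INTEGER DEFAULT 0);
    CREATE TABLE collection_images (collection_id INTEGER, image_id INTEGER);
    INSERT INTO collection (id) VALUES (1), (2);
    INSERT INTO collection_images VALUES (1, 10), (1, 11), (2, 12);
    """
)

# Strategy A: a correlated COUNT subquery per row; O(M2M rows) work per request.
subquery = db.execute(
    """
    SELECT c.id,
           (SELECT COUNT(*) FROM collection_images ci WHERE ci.collection_id = c.id)
    FROM collection c ORDER BY c.id
    """
).fetchall()

# Strategy B: backfill once, then the list view reads a plain column; O(1) per row.
db.execute(
    """
    UPDATE collection SET source_images_count =
        (SELECT COUNT(*) FROM collection_images ci WHERE ci.collection_id = collection.id)
    """
)
cached = db.execute("SELECT id, source_images_count FROM collection ORDER BY id").fetchall()

print(subquery)  # [(1, 2), (2, 1)]
print(cached)    # [(1, 2), (2, 1)]
```

Both strategies return the same rows; the difference is where the counting work happens (at read time on every request, or once at write time).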
The collections list UI does not paginate. The paginator `COUNT(*)` win from rewriting the annotations as correlated subqueries (the original #1300 approach) is invisible to UX. The remaining cost was 3 per-row count subqueries against the M2M + `main_detection`; on a collection holding 500k+ images, each request still spent hundreds of ms per subquery.

Bench data from the earlier draft also showed the subquery rewrite was slower than the original
`Count(images, filter=..., distinct=True)` on small projects and roughly equal on the heaviest project. Cached columns are an unambiguous win across all project sizes.

Precedent: `Deployment` already had 5 cached counts (`events_count`, `occurrences_count`, `captures_count`, `detections_count`, `taxa_count`). The `CachedCountField` marker formalizes that pattern.

Recompute granularity tradeoff
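The review threads also mention a per-transaction dedup scheduler (`schedule_recompute` in `ami/base/cached_counts.py`). A stdlib-only sketch of that shape follows; the class name and the `enqueue` callback are assumptions, not the project's real API:

```python
# Stdlib-only sketch of a per-transaction dedup scheduler: collect
# (model_label, pk) pairs during a burst of writes, then flush each pair
# exactly once. The enqueue callback stands in for a Celery task submission
# and is an assumption, not the project's real API.
class RecomputeScheduler:
    def __init__(self, enqueue):
        self._enqueue = enqueue
        self._pending = set()

    def schedule(self, model_label, pk):
        self._pending.add((model_label, pk))

    def flush(self):
        # In the real pattern this would run on transaction commit.
        for model_label, pk in sorted(self._pending):
            self._enqueue(model_label, pk)
        self._pending.clear()

calls = []
scheduler = RecomputeScheduler(lambda label, pk: calls.append((label, pk)))
for _ in range(3):  # three saves touching the same collection...
    scheduler.schedule("main.SourceImageCollection", 7)
scheduler.schedule("main.SourceImageCollection", 9)
scheduler.flush()
print(calls)  # [('main.SourceImageCollection', 7), ('main.SourceImageCollection', 9)]
```

Deduping before enqueueing keeps a burst of per-row signals from fanning out into one Celery task per save.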
Per-`Detection` signal recompute fires on every individual `Detection.save()`/`Detection.delete()`. For ML pipeline writes this is a non-issue: `Detection.objects.bulk_create()` skips signals, and `pipeline.save_results()` batches one recompute per job. The per-row signal handles non-bulk paths (manual workflows, retroactive ML, admin actions).

Tests
- `TestSourceImageCollectionCountsDenormalize` (7 cases) — image add/remove, `Detection` create/delete, the null-bbox processed-but-not-with-detections case, `update_calculated_fields()` recovery from drift, and pure-compute `get_source_image_counts()` returning a dict without writes. A tautological `test_initial_counts_zero` was dropped during review — it asserted Django's `default=0`, not our code.
- `TestSourceImageCollectionListQueryCount` (3 cases) — page-size scaling, `with_counts=true`, and `ordering=-source_images_count` (the sort hits the cached column directly, no extra subquery).
- `ami.main` + `ami.ml` + `ami.jobs` suites pass locally.

Rollout
The migration creates the columns with `default=0` and backfills in one deploy. Backfill is the slow step on prod-sized data — measure on staging arbutus-2026 first.

Test plan
- `ami.main` + `ami.ml` + `ami.jobs` test suites
- `/api/v2/captures/collections/` p99 drops vs production

Summary by CodeRabbit
New Features
Improvements
Tests