
Speed up the capture set list view #1301

Open

mihow wants to merge 6 commits into main from perf/sourceimagecollection-cached-counts

Conversation


@mihow mihow commented May 13, 2026

Summary

Two changes, both targeting SourceImageCollectionViewSet.list query cost on large M2M tables:

  1. Denormalize the 3 source-image counts as cached columns on SourceImageCollection. Replaces 3 per-row Subquery(Count(...)) annotations with a plain column SELECT; O(1) read regardless of M2M size.
  2. Introduce CachedCountField in ami/base/models.py — a marker subclass of IntegerField — and apply it to all 12 existing cached count columns project-wide. Discoverable via Model._meta.get_fields() + isinstance(f, CachedCountField) so future cross-cutting tasks (admin display, refresh jobs, list-endpoint defer) can enumerate them.

Stacked on #1300 (select_related on captures list). Base branch is perf/list-endpoint-subquery-rewrite; will retarget to main once #1300 merges.

Denormalization details

  • 3 new columns on SourceImageCollection: source_images_count, source_images_with_detections_count, source_images_processed_count. Schema migration 0085_denormalize_sourceimagecollection_counts creates them and backfills via a single GROUP BY + FILTER scan of the M2M (computes all three counts in one pass). atomic=False so the UPDATE can chunk on prod-sized data.
  • SourceImageCollection.get_source_image_counts() returns the 3 counts as a dict from a single aggregate query (no writes — pure compute, testable in isolation). update_calculated_fields(save=True) is the side-effecting wrapper that also writes the columns.
  • Signal handlers in ami/main/signals.py:
    • m2m_changed on SourceImageCollection.images.through for post_add, post_remove, post_clear.
    • Detection post_save / post_delete recompute for any collection containing the affected source_image.
  • pipeline.save_results() explicitly recomputes for affected collections after its bulk_create batch, since bulk_create skips signals.
  • Drops the 3 with_*_count() annotation methods on SourceImageCollectionQuerySet and removes them from the viewset chain.
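The single-pass backfill shape described above can be sketched in plain SQL. This is an illustrative schema with hypothetical table and column names (collection_images, detection), not the project's real ones; the point is that one GROUP BY with FILTER clauses yields all three counts per collection in a single scan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-ins for the M2M and detection tables.
    CREATE TABLE collection_images (collection_id INTEGER, source_image_id INTEGER);
    CREATE TABLE detection (source_image_id INTEGER, bbox TEXT);

    INSERT INTO collection_images VALUES (1, 10), (1, 11), (1, 12);
    INSERT INTO detection VALUES (10, '[10, 10, 20, 20]');  -- real bbox
    INSERT INTO detection VALUES (11, NULL);                -- processed, null bbox
    -- image 12 has no detection rows at all
    """
)

# One grouped query computes all three counts: total images, images with at
# least one non-null-bbox detection, and images with any detection row at
# all ("processed", covering the null-bbox case).
rows = conn.execute(
    """
    SELECT ci.collection_id,
           COUNT(DISTINCT ci.source_image_id) AS source_images_count,
           COUNT(DISTINCT ci.source_image_id)
               FILTER (WHERE d.bbox IS NOT NULL) AS with_detections_count,
           COUNT(DISTINCT ci.source_image_id)
               FILTER (WHERE d.source_image_id IS NOT NULL) AS processed_count
    FROM collection_images ci
    LEFT JOIN detection d ON d.source_image_id = ci.source_image_id
    GROUP BY ci.collection_id
    """
).fetchall()
print(rows)  # [(1, 3, 1, 2)]
```

The real migration runs an equivalent grouped query as an UPDATE backfill; the "processed but not with-detections" distinction shown by image 11 mirrors the null-bbox case called out in the tests.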

CachedCountField details

  • New class in ami/base/models.py: class CachedCountField(models.IntegerField). Pure marker; no field-level behavior change.
  • Applied to: Deployment.{events,occurrences,captures,detections,taxa}_count, Event.{captures,detections,occurrences}_count, SourceImage.detections_count, SourceImageCollection.{source_images,source_images_with_detections,source_images_processed}_count.
  • Migration 0086_use_cached_count_field is recorded, but the generated SQL is a no-op for every AlterField, since the column type is unchanged. The migration only updates Django's recorded model state so future operations see the right field class.
  • Kept separate from the 0085 backfill migration so the schema-and-data migration is auditable on its own.
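The discovery mechanism is plain isinstance filtering over a marker subclass. A minimal sketch with stand-in classes (these are not the real Django field classes, and FakeModel.fields stands in for what Django exposes as Model._meta.get_fields()):

```python
class IntegerField:
    """Stand-in for django.db.models.IntegerField."""

class CachedCountField(IntegerField):
    """Marker subclass: adds no behavior, only an identifiable type."""

class FakeModel:
    # Stand-in for a model's declared fields.
    fields = [
        ("id", IntegerField()),
        ("events_count", CachedCountField()),
        ("captures_count", CachedCountField()),
    ]

def cached_count_fields(model):
    """Enumerate cached-count columns so cross-cutting tasks (admin
    display, refresh jobs, list-endpoint defer) can find them."""
    return [name for name, f in model.fields if isinstance(f, CachedCountField)]

print(cached_count_fields(FakeModel))  # ['events_count', 'captures_count']
```

Because the marker subclass changes nothing about the column, adopting it is free at the SQL level; it only buys this enumerability.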

Why denormalize rather than the subquery rewrite from the earlier draft of #1300

The collections list UI does not paginate. The paginator COUNT(*) win from rewriting the annotations as correlated subqueries (the original #1300 approach) is invisible to UX. The remaining cost was 3 per-row count subqueries against the M2M + main_detection; on a collection holding 500k+ images, each request still spent hundreds of ms per subquery.

Bench data from the earlier draft also showed the subquery rewrite was slower than the original Count(images, filter=..., distinct=True) on small projects and roughly equal on the heaviest project. Cached columns are an unambiguous win across all project sizes.

Precedent: Deployment already had 5 cached counts (events_count, occurrences_count, captures_count, detections_count, taxa_count). The CachedCountField marker formalizes that pattern.

Recompute granularity tradeoff

Per-Detection signal recompute fires on every individual Detection.save() / Detection.delete(). For ML pipeline writes this is a non-issue: Detection.objects.bulk_create() skips signals, and pipeline.save_results() batches one recompute per job. The per-row signal handles non-bulk paths (manual workflows, retroactive ML, admin actions).
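The batching behavior described above can be sketched as a toy dedup scheduler. This is a simplification with invented names (RecomputeScheduler, enqueue); the real code keys pending state per DB connection and flushes via transaction.on_commit into a Celery task:

```python
class RecomputeScheduler:
    """Collect (model_label, pk) pairs during a transaction and enqueue
    each unique pair exactly once at commit time."""

    def __init__(self, enqueue):
        self.enqueue = enqueue   # stand-in for a Celery task's .delay
        self.pending = set()

    def schedule(self, model_label, pk):
        # Cheap enough to call in a tight per-row signal loop: a set add.
        self.pending.add((model_label, pk))

    def flush(self):
        # The real code registers this via transaction.on_commit, so it
        # runs only after the surrounding transaction commits.
        for key in sorted(self.pending):
            self.enqueue(*key)
        self.pending.clear()


enqueued = []
scheduler = RecomputeScheduler(lambda label, pk: enqueued.append((label, pk)))

# 10k per-row Detection saves all touching the same collection...
for _ in range(10_000):
    scheduler.schedule("main.SourceImageCollection", 42)
scheduler.flush()

print(enqueued)  # [('main.SourceImageCollection', 42)]: one task, not 10k
```

The same dedup is why a non-bulk loop degrades gracefully: the per-row signal cost is a set insertion, and the expensive aggregate runs once per affected collection after commit.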

Tests

  • New TestSourceImageCollectionCountsDenormalize (7 cases) — image add/remove, Detection create/delete, null-bbox processed-but-not-with-detections case, update_calculated_fields() recovery from drift, pure-compute get_source_image_counts() returns dict without writes. A tautological test_initial_counts_zero was dropped during review — it asserted Django's default=0, not our code.
  • New TestSourceImageCollectionListQueryCount (3 cases) — page-size scaling, with_counts=true, ordering=-source_images_count (sort hits the cached column directly, no extra subquery).
  • Full ami.main + ami.ml + ami.jobs suites pass on local.

Rollout

  • 0085 creates columns with default=0 and backfills in one deploy. Backfill is the slow step on prod-sized data — measure on staging arbutus-2026 first.
  • 0086 is no-op SQL.
  • Signals fire on new writes immediately after deploy. The viewset switches from annotation to column reads in the same commit, so the deploy must include both the migration and the code.

Test plan

  • Local: full ami.main + ami.ml + ami.jobs test suites
  • Staging arbutus-2026: measure 0085 backfill duration on production-shape M2M
  • Staging: NR check that /api/v2/captures/collections/ p99 drops vs production

Summary by CodeRabbit

  • New Features

    • Added persistent cached count columns for collections, deployments, events and images plus transaction-aware async recompute to keep them up to date.
  • Improvements

    • Simplified collection list queries to rely on stored counts and prefetching for more predictable performance.
    • Added an integrity check and repair flow to detect and reconcile stale cached counts.
  • Tests

    • Expanded tests to validate count consistency across M2M, detection changes, and bulk operations.

Review Change Stack


netlify Bot commented May 13, 2026

Deploy Preview for antenna-preview ready!

Name Link
🔨 Latest commit 9ac2ce1
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a06c92eb7c50b0008eaea85
😎 Deploy Preview https://deploy-preview-1301--antenna-preview.netlify.app
Lighthouse
1 paths audited
Performance: 65 (no change from production)
Accessibility: 89 (no change from production)
Best Practices: 92 (🔴 down 8 from production)
SEO: 92 (no change from production)
PWA: 80 (no change from production)


coderabbitai Bot commented May 13, 2026

Walkthrough

Adds a CachedCountField marker and a per-transaction dedup scheduler that enqueues Celery recompute tasks; migrates and backfills denormalized count columns, implements model/queryset update_cached_counts, wires signals and tasks, adds reconciliation checks, and expands tests for correctness and query behavior.

Changes

Denormalized Count Field Infrastructure

  • CachedCountField & scheduler (ami/base/models.py, ami/base/cached_counts.py): Adds CachedCountField and a per-DB-connection schedule_recompute with an on_commit flush that enqueues one recompute_cached_counts_task per unique (model_label, pk).
  • Database schema and backfill migrations (ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py, ami/main/migrations/0086_use_cached_count_field.py): Adds three denormalized count columns to SourceImageCollection and backfills them with a grouped UPDATE; then converts cached count fields across models to CachedCountField.
  • Model conversions & SourceImageCollection computation (ami/main/models.py): Converts Deployment/Event/SourceImage count fields to CachedCountField; removes with_source_images_* queryset helpers; adds source_images_* cached fields on SourceImageCollection; implements get_source_image_counts() and update_calculated_fields(save=...).
  • Signals, ML pipeline, and API view (ami/main/signals.py, ami/ml/models/pipeline.py, ami/main/api/views.py): Adds m2m_changed and Detection post_save/post_delete handlers that call .update_cached_counts(); refreshes collections after bulk Detection creation; simplifies SourceImageCollectionViewSet.queryset.
  • Celery tasks & reconciliation checks (ami/main/tasks.py, ami/main/checks/cached_counts.py): Adds recompute_cached_counts_task and reconcile_cached_counts_task; implements discovery, stale-finding, and reconcile_cached_counts, which can repair synchronously via update_cached_counts(run_async=False).
  • Tests (ami/main/tests.py): Adds transaction-aware tests for list query counts, denormalization correctness under M2M/Detection changes, and cached-count reconciliation; adjusts query-count assertions and imports.
  • Design spec (docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md): Design document describing the scheduler, the update_cached_counts API, call-site changes, risks, migration path, and test plan (including a dedup-focused test).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • RolnickLab/antenna#1300: Prior rewrite of with_source_images_*_count annotation logic; this PR replaces that approach with denormalized cached fields.
  • RolnickLab/antenna#1093: Adds/uses the SourceImageCollection "processed" count; this PR migrates that field to denormalized storage and adjusts where it's computed/used.

Suggested labels

backend

Suggested reviewers

  • annavik

Poem

"🐰 I hop through counts both new and old,
Cached fields snug, their stories told,
Signals nudge and tasks dedupe,
Tests confirm we didn't goof,
Hooray — small carrots for the code!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 43.14%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Title check ✅ Passed: The title 'Speed up the capture set list view' accurately describes the main objective of this PR: optimizing the performance of the SourceImageCollection list view by denormalizing cached counts and introducing CachedCountField markers.
  • Description check ✅ Passed: The PR description comprehensively covers all required template sections: summary, detailed list of changes, denormalization/CachedCountField details with technical depth, rationale, test coverage, and rollout/deployment notes.
  • Linked Issues check ✅ Passed: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check ✅ Passed: Check skipped because no linked issues were found for this pull request.


@mihow mihow changed the title from "Denormalize SourceImageCollection counts as cached columns" to "Speed up the capture set list view" on May 13, 2026
Comment thread ami/main/signals.py Outdated
Comment thread ami/ml/models/pipeline.py Outdated
@mihow mihow force-pushed the perf/sourceimagecollection-cached-counts branch from af24585 to 39a8f02 on May 13, 2026 23:08
Comment thread ami/main/models.py Outdated

# Denormalized counts. Kept in sync via m2m_changed and pipeline-completion
# hooks. Reads are O(1).
source_images_count = models.IntegerField(default=0)
mihow (Collaborator, Author) commented:

How can we indicate which fields are cached counts? Field properties? A prefix? I believe we have a cached_ prefix on other models, but then it makes the method replacements not 1:1. Maybe that's okay. Open to ideas.

mihow (Collaborator, Author) commented:

Claude says: Checked — no cached_ prefix exists on any model field in the codebase. All denormalized count fields use the bare <thing>_count pattern: Deployment.events_count, Deployment.captures_count, Event.captures_count, SourceImage.detections_count, etc. The new fields match this convention.

I'd lean against adding a cached_ prefix just to this PR's fields — it would create a split where some counts are prefixed and most aren't, and method names (occurrences_count, taxa_count on the same model) wouldn't be 1:1 anymore as you noted. If we want the prefix project-wide, that's a separate refactor PR covering all the count fields.

Happy to do either — flag if you'd prefer the new fields prefixed and we adopt project-wide later, or leave matching the existing convention.

mihow (Collaborator, Author) commented:

Is there a simple way to mark/tag/categorize/group model fields? To indicate these are cached counts that could be potentially stale or empty at any time? declare in a list on the model? add a model field property, etc?

Comment thread ami/main/models.py Outdated
@mihow mihow force-pushed the perf/sourceimagecollection-cached-counts branch from 39a8f02 to 401a613 on May 13, 2026 23:29
@mihow mihow mentioned this pull request May 14, 2026

mihow commented May 14, 2026

Claude says: PR body updated. Two new pieces since the initial draft:

  1. CachedCountField marker class added to ami/base/models.py and applied to all 12 existing cached-count columns project-wide (Deployment×5, Event×3, SourceImage×1, SourceImageCollection×3). Migration 0086_use_cached_count_field is recorded but the generated SQL is no-op for every AlterField — column type unchanged. Kept separate from the 0085 backfill migration on purpose.
  2. Dropped a tautological test_initial_counts_zero (asserted Django's default=0, not our code) and rewrote a handful of em-dash fragment comments added during earlier review rounds.

@coderabbitai full review please. Base branch is perf/list-endpoint-subquery-rewrite (sister PR #1300, which strips down to a single select_related line). Will retarget to main once #1300 merges.


coderabbitai Bot commented May 14, 2026

@mihow Sure, I'll kick off a full review of this PR now, taking into account the updated PR body and the two new additions (the CachedCountField marker class with its no-op migration 0086, and the dropped tautological test).

✅ Actions performed

Full review triggered.

Comment thread ami/base/models.py
from ami.users.models import User


class CachedCountField(models.IntegerField):
mihow (Collaborator, Author) commented:

@mohamedelabbas1996 I think we discussed a field type for cached counts last year. it finally happened!

mihow added a commit that referenced this pull request May 14, 2026
Three planning notes from 3f34b78 were committed under docs/planning/
(wrong convention — should be docs/claude/planning/ per CLAUDE.md) and
have since rotted: PR-A was superseded by the denormalize approach in
#1301; the .only() recommendation was reverted per review; PR-B1 ships
in this PR. Two genuinely useful items have been harvested:

- New Relic baseline numbers for 9 list endpoints (post-NR-12.1.0)
  → memory entry reference_nr_baseline_2026_05_11.md
- 22:22 UTC PG connection-pool burst hypothesis
  → issue #1302
- SourceImageViewSet.retrieve 571-call N+1 audit
  → issue #1303

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow force-pushed the perf/sourceimagecollection-cached-counts branch from d12f477 to f1019cd on May 14, 2026 01:13
mihow added a commit that referenced this pull request May 14, 2026
* perf(api): trim captures list SELECT via .only()

Replaces the unconditional wide SELECT (every column on main_sourceimage +
main_deployment + main_event) on SourceImageViewSet.list with a `.only()`
list scoped to fields SourceImageListSerializer actually reads. Row width
drops from 52 columns to 26.

select_related("deployment__data_source") added so SourceImage.public_url()'s
presigned-URL fallback (when public_base_url is blank) does not fire a per-row
lazy fetch.

SOURCE_IMAGE_LIST_ONLY_FIELDS is exposed as a module-level constant with a
maintenance docstring covering how to keep it in sync as the serializer or
model methods change. SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES is split out as a
sub-constant so the 11 fields that exist only to support presigned-URL
generation are visibly grouped — that whole chain goes away once images are
served through the upcoming image-resizing/CDN layer instead of directly
from source buckets.

Only applied to the list action; retrieve keeps the wide SELECT so the
detail serializer's broader field set is unaffected.

Tests:
- TestSourceImageListQueryCount extended with response-shape assertion
  guarding against `.only()` over-restriction (verifies url, size_display,
  deployment.name, event present without lazy loads).

Refs: docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md PR-B1.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: list-endpoint perf planning trail

Adds:
- 2026-05-11-newrelic-post-upgrade-findings.md — NR prod data after the
  agent upgrade that surfaced the list-endpoint hot paths.
- 2026-05-11-list-endpoint-perf-analysis.md — SQL-level root-cause
  analysis for the 3 endpoints (collection list, captures list, project
  charts).
- 2026-05-11-list-endpoint-perf-continuation-plan.md — sequenced PR plan.
  PR-A (collection subquery rewrite) superseded by the denormalize-counts
  approach on perf/sourceimagecollection-cached-counts. PR-B1 ships in
  this PR. PR-C/D/E queued.

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(api): move SourceImage list-only field groups into viewset class

Per review on #1300: pulled `SOURCE_IMAGE_LIST_ONLY_FIELDS` /
`SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES` out of module scope and into
`SourceImageViewSet` as class attributes, grouped by what reads them
(core, deployment-nested, event-nested, public-url). Trimmed the
docstring and added a short note on why we whitelist with `.only()`
instead of blacklisting with `.defer()`, with a link to the Django ref.

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(api): drop .only() — select_related is the N+1 fix

Per @mihow review on #1300: the `.only()` machinery added in this PR was
acting as a row-width trim on top of select_related, not as the N+1 fix.
The actual N+1 prevention is `select_related("deployment__data_source")`
ensuring `SourceImage.public_url()` doesn't lazy-load the data_source
chain per row. With that in place, dropping `.only()` keeps the captures
list query count flat (verified empirically — `TestSourceImageListQueryCount`
still passes at limit=1, 5, 25).

This removes the maintenance burden of keeping the field tuples in sync
with the serializer surface. django-zen-queries and the existing perf
tests are the safety net for regressions.

Renamed `test_list_response_shape_preserved_after_only` to
`test_list_response_shape_has_no_lazy_loads` so the contract it asserts
is independent of any specific deferred-fields mechanism.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: drop list-endpoint perf planning docs

Three planning notes from 3f34b78 were committed under docs/planning/
(wrong convention — should be docs/claude/planning/ per CLAUDE.md) and
have since rotted: PR-A was superseded by the denormalize approach in
#1301; the .only() recommendation was reverted per review; PR-B1 ships
in this PR. Two genuinely useful items have been harvested:

- New Relic baseline numbers for 9 list endpoints (post-NR-12.1.0)
  → memory entry reference_nr_baseline_2026_05_11.md
- 22:22 UTC PG connection-pool burst hypothesis
  → issue #1302
- SourceImageViewSet.retrieve 571-call N+1 audit
  → issue #1303

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Base automatically changed from perf/list-endpoint-subquery-rewrite to main May 14, 2026 01:18
mihow and others added 3 commits May 13, 2026 18:34
The collection list endpoint previously ran 3 correlated count subqueries
per row (source_images_count, source_images_with_detections_count,
source_images_processed_count). On large collections each subquery scans
the M2M and Detection tables. Reads are now O(1) against cached columns.

- Add 3 IntegerField columns on SourceImageCollection with default=0
- Migration 0085 backfills via a single GROUP BY over the M2M with FILTER
  clauses; atomic=False so prod-sized M2M can chunk
- update_calculated_fields() recomputes all 3 counts in one aggregate query
- m2m_changed on images.through keeps counts fresh on add/remove (both
  directions; reverse post_clear documented as caller responsibility)
- Detection post_save/post_delete recompute for collections containing the
  affected source image
- pipeline.save_results() recomputes for affected collections after the
  bulk_create batch (which skips signals)
- Drop the 3 with_*_count() annotation methods on SourceImageCollectionQuerySet
  and the chain in SourceImageCollectionViewSet.queryset
- TestSourceImageCollectionCountsDenormalize covers initial state, add/remove,
  Detection create/delete, null-bbox processed-but-not-with-detections case,
  and update_calculated_fields() recovery from drift
- TestSourceImageCollectionListQueryCount tightened: with_counts now <=10q
  (was <=15), ordering by source_images_count uses column directly

Follow-up to PR #1300 — the paginator COUNT(*) win there is mostly
invisible since the UI does not paginate collections; this is the actual
UX-visible improvement.

Co-Authored-By: Claude <noreply@anthropic.com>
…ormalized count columns

Add `CachedCountField(IntegerField)` to `ami/base/models.py` as a marker
subclass for cached aggregate columns. Swap it in for the 12 existing
denormalized count fields across Deployment, Event, SourceImage, and
SourceImageCollection.

Discoverable via `Model._meta.get_fields()` + `isinstance(f, CachedCountField)`
for future use (refresh tasks, admin display, list-endpoint defer()). The
DB column type is unchanged (IntegerField -> IntegerField), so the
generated migration `0086_use_cached_count_field` is a pure no-op at the
SQL level; it only updates Django's recorded model state.

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove `test_initial_counts_zero` from `TestSourceImageCollectionCountsDenormalize`;
  it asserted the `default=0` field-level default rather than any code path
  we own.
- Rewrite a handful of em-dash fragment comments added during PR review
  rounds (`get_source_image_counts` docstring, captures-list query-count
  assertion message, collection-list ordering comment, denormalize-test
  side-effect note) in normal prose.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow force-pushed the perf/sourceimagecollection-cached-counts branch from f1019cd to 2725599 on May 14, 2026 01:34

netlify Bot commented May 14, 2026

Deploy Preview for antenna-ssec ready!

Name Link
🔨 Latest commit 9ac2ce1
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/6a06c92e77c58b00085a5dfd
😎 Deploy Preview https://deploy-preview-1301--antenna-ssec.netlify.app


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2


Inline comments:
In `@ami/main/signals.py`:
- Around line 210-211: The signal currently calls
instance.update_calculated_fields(save=True) synchronously for actions
("post_add", "post_remove", "post_clear"), which can persist stale counts under
concurrent transactions; change these immediate calls to run after the DB
transaction commits by wrapping the call in django.db.transaction.on_commit
(e.g., transaction.on_commit(lambda:
instance.update_calculated_fields(save=True))). Apply the same replacement for
the other occurrence around lines referencing the same actions so recomputation
is deferred until successful commit.

In `@ami/main/tests.py`:
- Line 3386: Test fixtures are using normalized [0–1] bbox values but this repo
expects pixel-space coordinates; update the Detection/BoundingBox fixtures
(e.g., the Detection instances setting bbox=[0.1,0.1,0.2,0.2] and the other
occurrences at the mentioned sites) to use absolute pixel coordinates consistent
with the image dimensions used in these tests. Locate the test rows creating
Detection objects (look for Detection(..., bbox=...) and any direct
BoundingBox(...) usages) and replace the normalized floats with the
corresponding pixel values (or explicit integers) so the fixtures represent true
pixel-space bboxes across the cases mentioned (lines ~3386, ~3398, ~3424,
~3443).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e0f8a9e-539a-4e36-8acc-5da8d6ba207a

📥 Commits

Reviewing files that changed from the base of the PR and between 0dedd4c and 2725599.

📒 Files selected for processing (8)
  • ami/base/models.py
  • ami/main/api/views.py
  • ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py
  • ami/main/migrations/0086_use_cached_count_field.py
  • ami/main/models.py
  • ami/main/signals.py
  • ami/main/tests.py
  • ami/ml/models/pipeline.py

Comment thread ami/main/signals.py Outdated
Comment thread ami/main/tests.py Outdated
Address opus takeaway-review findings on the SourceImageCollection
denormalize work.

Signal handlers no longer block on the recompute:
- Detection post_save/post_delete dedupes affected source_image_ids
  on a per-connection set and fans out one task per affected collection
  via transaction.on_commit. A 10k-detection bulk_create-loop (where
  signals fire) now triggers at most one task per affected collection
  instead of 10k synchronous aggregates per detection.
- m2m_changed on SourceImageCollection.images defers to a Celery task
  via transaction.on_commit.

New ami.main.checks.cached_counts module catches drift from bulk write
paths that skip signals (bulk_create, bulk_update, raw SQL, ML post-
processors like small_size_filter). Generic across every model that
declares a CachedCountField column — discovered via
Model._meta.get_fields(). Layered defense: signals = best-effort sync,
periodic reconcile_cached_counts_task = safety net.

This also gives CachedCountField its first consumer, addressing the
"marker class with no consumer" critique.

Other tweaks from the same review pass:
- CachedCountField docstring documents migration-friction tradeoff
  (future AlterField for unrelated IntegerFields near cached ones).
- SourceImageCollection.update_calculated_fields docstring explains
  the .filter(pk=).update() vs .save() divergence on the other 3 models.
- TestSourceImageCollectionCountsDenormalize → TransactionTestCase +
  eager Celery so on_commit hooks actually fire in tests.
- TestCachedCountsIntegrityCheck (5 tests) covers discover / find /
  reconcile dry-run / reconcile repair / no-drift baseline.
- Detection bbox fixtures: [0.1...] → [10, 10, 20, 20] for consistency
  with codebase pixel-space convention (CodeRabbit nit).

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 14, 2026 01:50

Copilot AI left a comment


Pull request overview

This PR speeds up capture-set list queries by replacing per-row source-image count annotations with denormalized cached columns and adding infrastructure to discover/reconcile cached count drift.

Changes:

  • Adds cached count fields and migrations for SourceImageCollection, plus CachedCountField marker usage.
  • Updates list/query logic, signals, tasks, and pipeline bulk-save handling to refresh cached collection counts.
  • Adds integrity-check helpers and regression/query-count tests.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
ami/base/models.py Adds CachedCountField marker class.
ami/main/models.py Converts cached count fields and adds SourceImageCollection count recomputation.
ami/main/api/views.py Removes source-image count annotations from collection queryset.
ami/main/signals.py Adds M2M/detection signal handlers for collection count refresh.
ami/main/tasks.py Adds collection refresh and cached-count reconciliation Celery tasks.
ami/main/checks/cached_counts.py Adds cached-count discovery, drift detection, and reconciliation helpers.
ami/ml/models/pipeline.py Refreshes affected collection counts after bulk-created detections.
ami/main/tests.py Adds denormalized count, integrity-check, and list query-count tests.
ami/main/migrations/0085_denormalize_sourceimagecollection_counts.py Adds/backfills SourceImageCollection cached count columns.
ami/main/migrations/0086_use_cached_count_field.py Updates migration state to use CachedCountField.
Comments suppressed due to low confidence (3)

ami/main/signals.py:268

  • This only refreshes collections for the detection's current source_image_id. Because Detection.source_image is writable through the serializer/API, moving a detection to another capture leaves collections containing the old source image with stale processed/with-detections counts; the old source image id needs to be captured before save and refreshed too when it changes.
```python
@receiver(post_save, sender=Detection)
@receiver(post_delete, sender=Detection)
def update_collection_counts_on_detection_change(sender, instance, **kwargs):
    """Schedule a collection-counts refresh for every collection containing the affected SourceImage.

    The dedup + on_commit indirection means even tight per-row Detection write
    loops fan out to at most one task per affected collection. ``bulk_create``
    / ``bulk_update`` skip signals entirely — those rely on the periodic
    reconciliation task to repair drift.
    """
    if not instance.source_image_id:
        return
    _schedule_collection_refresh_for_source_image(instance.source_image_id)
```

ami/main/signals.py:253

  • This does not cover SourceImage deletion: Django removes auto-created M2M through rows during cascade delete without sending m2m_changed, and the detection post_delete path can no longer discover the collections once the image membership is gone. Deleting captures will therefore leave source_images_count (and derived processed/detection counts) stale unless affected collection ids are captured before the image is deleted.
```python
@receiver(m2m_changed, sender=SourceImageCollection.images.through)
def update_collection_counts_on_m2m(sender, instance, action, **kwargs):
    """Recompute denormalized counts when images are added to or removed from a collection."""
    if action in ("post_add", "post_remove", "post_clear"):
        collection_pk = instance.pk
        transaction.on_commit(lambda: refresh_collection_cached_counts.delay(collection_pk))
```

ami/main/signals.py:253

  • The handler assumes instance is always a SourceImageCollection, but m2m_changed also fires for the reverse manager (source_image.collections.add/remove(...)) with instance as a SourceImage and the collection ids in pk_set. In that path this schedules a refresh for the source-image id instead of the affected collection ids, leaving the real collection counts stale.
```python
@receiver(m2m_changed, sender=SourceImageCollection.images.through)
def update_collection_counts_on_m2m(sender, instance, action, **kwargs):
    """Recompute denormalized counts when images are added to or removed from a collection."""
    if action in ("post_add", "post_remove", "post_clear"):
        collection_pk = instance.pk
        transaction.on_commit(lambda: refresh_collection_cached_counts.delay(collection_pk))
```
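The direction bug flagged here can be isolated as a pure function. The `reverse` and `pk_set` semantics below follow Django's documented `m2m_changed` arguments; the function name `affected_collection_pks` is illustrative, not the project's API:

```python
def affected_collection_pks(instance_pk, pk_set, reverse):
    """Return the SourceImageCollection PKs whose counts need a refresh.

    Forward path (collection.images.add(...)): instance IS the collection.
    Reverse path (image.collections.add(...)): instance is a SourceImage
    and the collection ids arrive in pk_set. Note post_clear sends
    pk_set=None, so on the reverse path the cleared collections must be
    captured in pre_clear instead; this sketch just returns an empty set.
    """
    if not reverse:
        return {instance_pk}
    return set(pk_set or ())

print(affected_collection_pks(7, {101, 102}, reverse=False))  # forward: the collection itself
print(affected_collection_pks(3, {10, 11}, reverse=True))     # reverse: collections from pk_set
```

A handler built on this would branch once on the `reverse` kwarg and schedule a refresh per returned PK, covering both manager directions.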


Comment thread ami/main/signals.py Outdated
Comment on lines +235 to +245
```python
pending = getattr(connection, _PENDING_SOURCE_IMAGE_IDS_ATTR, None)
is_new = pending is None
if is_new:
    pending = set()
    setattr(connection, _PENDING_SOURCE_IMAGE_IDS_ATTR, pending)
pending.add(source_image_id)
if is_new:
    # Outside an atomic block, ``on_commit`` fires synchronously at
    # registration time — so the ``add`` above must precede it or the
    # flush sees an empty set.
    transaction.on_commit(_flush_pending_collection_refreshes)
```
Comment on lines +88 to +89
for instance, stored, computed in find_stale_cached_counts(m, project_id=project_id):
result.checked += 1
Comment thread ami/main/models.py
Comment on lines +4184 to +4186
```python
source_images_count = CachedCountField(default=0)
source_images_with_detections_count = CachedCountField(default=0)
source_images_processed_count = CachedCountField(default=0)
```
Collaborator Author replied:
Great catch

Comment on lines +73 to +87
```python
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_count",
    field=models.IntegerField(default=0),
),
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_with_detections_count",
    field=models.IntegerField(default=0),
),
migrations.AddField(
    model_name="sourceimagecollection",
    name="source_images_processed_count",
    field=models.IntegerField(default=0),
),
```
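The single-pass `GROUP BY` + `FILTER` backfill that migration 0085 performs can be sketched against SQLite, which also supports `FILTER` on aggregates (3.30+). Table and column names here are illustrative, not the real schema, and "processed" is reduced to a boolean flag for the demo:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE image (id INTEGER PRIMARY KEY,
                        detections_count INTEGER, processed INTEGER);
    CREATE TABLE collection_images (collection_id INTEGER, image_id INTEGER);
    INSERT INTO image VALUES (1, 2, 1), (2, 0, 1), (3, 0, 0);
    INSERT INTO collection_images VALUES (10, 1), (10, 2), (10, 3), (11, 1);
""")

# All three counts in one scan of the M2M table: one GROUP BY,
# with FILTER narrowing each aggregate independently.
rows = con.execute("""
    SELECT m.collection_id,
           COUNT(*)                                       AS total,
           COUNT(*) FILTER (WHERE i.detections_count > 0) AS with_detections,
           COUNT(*) FILTER (WHERE i.processed = 1)        AS processed
    FROM collection_images m
    JOIN image i ON i.id = m.image_id
    GROUP BY m.collection_id
    ORDER BY m.collection_id
""").fetchall()
print(rows)  # [(10, 3, 1, 2), (11, 1, 1, 1)]
```

The same shape lets the backfill `UPDATE` read each collection's three counts from one derived table instead of three correlated subqueries.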
Comment thread ami/ml/models/pipeline.py
Comment on lines +1002 to +1005
```python
# bulk_create above skips Detection signals; refresh affected collections explicitly.
source_image_ids = [img.pk for img in source_images]
for collection in SourceImageCollection.objects.filter(images__id__in=source_image_ids).distinct():
    collection.update_calculated_fields(save=True)
```
Comment thread ami/main/tests.py
```python
# occurrences/taxa subquery annotations.
self.assertLessEqual(count, 10, f"Collection list with_counts too many queries: {count}")

def test_list_query_count_ordering_by_annotated_count(self):
```
Comment thread ami/main/tests.py
Comment on lines +3325 to +3329
```python
def test_list_query_count_does_not_scale_with_page_size(self):
    small = self._list_query_count(f"/api/v2/captures/collections/?project_id={self.project.pk}&limit=1")
    large = self._list_query_count(f"/api/v2/captures/collections/?project_id={self.project.pk}&limit=25")
    print(f"\n[AUDIT] Collection list: limit=1 -> {small}q, limit=25 -> {large}q")
    self.assertLessEqual(large, small + 2, f"Collection list scaling: {small} -> {large} (likely N+1)")
```
@mihow (Collaborator, Author) commented May 14, 2026

Claude says: Self-suggestion for follow-up PR (or fold into this one): simplify the cached-counts plumbing introduced here. Current shape works but has three per-model pieces (dedup state attr + flush helper + per-model Celery task). Next cached-count addition repeats all three.

Proposed consolidation

One free helper + queryset/instance methods on BaseModel / BaseQuerySet. Fits the established run_async= flag pattern (Deployment.save(regroup_async=), Job.retry(async_task=), process_single_source_image(run_async=)).

```python
# ami/base/cached_counts.py
from celery import shared_task
from django.apps import apps
from django.db import connection, transaction

_PENDING_ATTR = "_pending_cached_count_recomputes"

def _schedule_recompute(model, pk):
    """Per-connection dedup + transaction.on_commit dispatch.

    Outside an atomic block, on_commit fires synchronously at registration
    time, so we set the pending set BEFORE registering the flush callback.
    """
    pending = getattr(connection, _PENDING_ATTR, None)
    is_new = pending is None
    if is_new:
        pending = set()
        setattr(connection, _PENDING_ATTR, pending)
    pending.add((model._meta.label, pk))
    if is_new:
        transaction.on_commit(_flush)

def _flush():
    pending = getattr(connection, _PENDING_ATTR, set())
    try:
        delattr(connection, _PENDING_ATTR)
    except AttributeError:
        pass
    for label, pk in pending:
        recompute_cached_counts_task.delay(label, pk)

@shared_task(ignore_result=True)
def recompute_cached_counts_task(model_label, pk):
    model = apps.get_model(model_label)
    try:
        instance = model.objects.get(pk=pk)
    except model.DoesNotExist:
        return
    instance.update_calculated_fields(save=True)
```

```python
# ami/base/models.py
from django.db import models
from django.db.models import QuerySet

from ami.base.cached_counts import _schedule_recompute

class BaseModel(models.Model):
    def update_cached_counts(self, run_async: bool = True) -> None:
        if run_async:
            _schedule_recompute(type(self), self.pk)
            return
        self.update_calculated_fields(save=True)

class BaseQuerySet(QuerySet):
    def update_cached_counts(self, run_async: bool = True) -> None:
        for pk in self.values_list("pk", flat=True):
            if run_async:
                _schedule_recompute(self.model, pk)
            else:
                self.model.objects.get(pk=pk).update_calculated_fields(save=True)
```
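The dedup behaviour this proposal relies on — many per-row writes collapsing to one task per (model, pk) — can be exercised without Django. In this toy model, `connection` stands in for the DB connection the pending set is stashed on, `commit_hooks` for `transaction.on_commit`, and `dispatched` records what would become Celery `.delay()` calls; all names are illustrative:

```python
class FakeConnection:
    pass

connection = FakeConnection()
commit_hooks = []
dispatched = []
_PENDING_ATTR = "_pending_cached_count_recomputes"

def schedule_recompute(label, pk):
    pending = getattr(connection, _PENDING_ATTR, None)
    is_new = pending is None
    if is_new:
        pending = set()
        setattr(connection, _PENDING_ATTR, pending)
    pending.add((label, pk))
    if is_new:
        commit_hooks.append(flush)  # stands in for transaction.on_commit(_flush)

def flush():
    pending = getattr(connection, _PENDING_ATTR, set())
    try:
        delattr(connection, _PENDING_ATTR)
    except AttributeError:
        pass
    for label, pk in sorted(pending):
        dispatched.append((label, pk))  # stands in for task.delay(label, pk)

# 10,000 per-row Detection writes touching only two collections...
for i in range(10_000):
    schedule_recompute("main.SourceImageCollection", 1 + (i % 2))

for hook in commit_hooks:  # the "commit"
    hook()

print(len(commit_hooks), dispatched)
```

Ten thousand schedule calls register the flush exactly once and dispatch exactly two recomputes, one per affected collection.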

Call sites collapse

Pipeline save_results:

```python
SourceImageCollection.objects.filter(
    images__id__in=source_image_ids
).update_cached_counts(run_async=False)
```

Signals:

```python
@receiver(post_save, sender=Detection)
@receiver(post_delete, sender=Detection)
def update_collection_counts_on_detection_change(sender, instance, **kwargs):
    if not instance.source_image_id:
        return
    SourceImageCollection.objects.filter(
        images__id=instance.source_image_id
    ).update_cached_counts()  # async by default
```

Reconcile loop in ami/main/checks/cached_counts.py swaps `update_calculated_fields(save=True)` → `update_cached_counts(run_async=False)`.

What goes away

  • _PENDING_SOURCE_IMAGE_IDS_ATTR + _flush_pending_collection_refreshes + _schedule_collection_refresh_for_source_image in signals.py (per-model state attrs disappear)
  • refresh_collection_cached_counts(collection_id) task in ami/main/tasks.py (replaced by generic recompute_cached_counts_task(model_label, pk))
  • Per-model signal handlers that need to know which Celery task to call

What stays

  • CachedCountField marker (still drives reconcile discovery)
  • Per-model update_calculated_fields(save=True) (recompute logic lives where the model lives)
  • Periodic reconcile_cached_counts_task (safety net for bulk-write paths that bypass signals)
  • .filter(pk=).update() vs .save() shape divergence in update_calculated_fields — documented inline, leaving alone

Next cached-count addition

Three things instead of six:

  1. Field: foo_count = CachedCountField()
  2. Recompute body in update_calculated_fields(save)
  3. One signal handler per write source, each ending in .update_cached_counts()

Reconcile + drift detection are free.

What I'd NOT do

  • Declarative invalidate_on=[...] on the field — hides signal wiring, adds magic
  • Drop signals entirely + rely on TTL reconciliation — breaks interactive freshness (just-uploaded image not showing in collection)
  • Lazy recompute-on-read — read path doing writes is its own footgun

Happy to land as a follow-up PR (cleaner review scope) or fold into this branch if reviewers prefer the unified pattern in one place.

mihow and others added 2 commits May 14, 2026 23:50
Captures the design behind replacing PR #1301's per-source-table dedup
state + per-model Celery refresh tasks with a generic instance/queryset
method on BaseModel/BaseQuerySet. Wraps update_calculated_fields(save=True)
with caller-controlled sync vs async (run_async=True default) and per-
(model, pk) dedup keyed on the DB connection.

Co-Authored-By: Claude <noreply@anthropic.com>
…t method

Replace the per-source-table dedup state + per-model Celery refresh tasks
introduced in 1a242aa with a generic `update_cached_counts(run_async=True)`
method on `BaseModel` and `BaseQuerySet`. Wraps the existing
`update_calculated_fields(save=True)` body; per-(model_label, pk) dedup
keyed on the active DB connection collapses high-volume signal fan-out to
one task per affected row.

Architecture:
- `ami/base/cached_counts.py` — `schedule_recompute(label, pk)` queues a
  recompute in a per-connection dedup set; `_flush_pending_recomputes` is
  registered via `transaction.on_commit` (once per call, idempotent on the
  drain side so it survives rollback-then-fresh-transaction in tests).
- `BaseModel.update_cached_counts(run_async=True)` — schedules a recompute
  for `self`. With `run_async=False` runs inline.
- `BaseQuerySet.update_cached_counts(run_async=True)` — schedules one per
  row in the queryset.
- `ami/main/tasks.py::recompute_cached_counts_task` — generic Celery task,
  loads model via `apps.get_model(label)` and calls
  `update_calculated_fields(save=True)`.

Call site changes:
- Detection signal handler is now one line:
  `SourceImageCollection.objects.filter(images__id=instance.source_image_id).update_cached_counts()`
- Collection m2m handler: `instance.update_cached_counts()`.
- `refresh_project_cached_counts` body keeps the existing
  `project.update_related_calculated_fields()` (preserves the bulk subquery
  UPDATE for `SourceImage.detections_count`); body no longer duplicates the
  Events/Deployments loop here.
- Reconcile loop in `ami/main/checks/cached_counts.py` calls
  `instance.update_cached_counts(run_async=False)` for stylistic unity.

Removed:
- `_PENDING_SOURCE_IMAGE_IDS_ATTR`, `_flush_pending_collection_refreshes`,
  `_schedule_collection_refresh_for_source_image` (signals.py)
- `refresh_collection_cached_counts` task (tasks.py)

Cost of adding the next cached-count column drops from ~50 LOC (new field +
recompute body + dedup attr + flush helper + Celery task + signal handler)
to ~10 LOC (new field + recompute body + signal handler calling the method).

Design rationale and trade-offs in
`docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md`.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow (Collaborator, Author) commented May 15, 2026

Claude says: Landed in 9ac2ce17. update_cached_counts(run_async=True) lives on BaseModel and BaseQuerySet; backing module is ami/base/cached_counts.py. Detection signal handler is one line. refresh_collection_cached_counts task is gone; refresh_project_cached_counts body unchanged from this PR (kept the bulk-subquery UPDATE for SourceImage.detections_count).

Test isolation note worth flagging: the per-connection dedup set survives TestCase rollback because connection attribute state is not reset between tests, but the registered on_commit callbacks are discarded. Fix is to register on_commit on every schedule_recompute call and make _flush_pending_recomputes idempotent on the drain side (first call drains, subsequent calls see an empty set and no-op). That keeps the dedup benefit (10k Detection writes still collapse to one task per collection) without the rollback leak.

Reconcile loop in ami/main/checks/cached_counts.py switched to update_cached_counts(run_async=False) for stylistic unity — same behavior since reconcile is already in a worker.

Design doc + trade-offs in docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md (committed in 451c8be).
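The register-every-call + idempotent-drain fix described above can be shown in miniature. Nothing here is Django or Celery; every name is a stand-in:

```python
pending = set()
dispatched = []

def schedule(key):
    pending.add(key)
    # In the real code, transaction.on_commit(flush) is registered here on
    # EVERY call; duplicate callbacks are harmless because the drain below
    # empties the set on its first run.

def flush():
    while pending:
        dispatched.append(pending.pop())

schedule("collection:1")
schedule("collection:1")
schedule("collection:2")

flush()  # first registered callback drains both keys
flush()  # a stale duplicate callback sees an empty set and no-ops

print(sorted(dispatched))
```

Stale callbacks left over from a rolled-back test transaction behave like the second `flush()` here: they find nothing pending and do nothing, so the dedup benefit survives without leaking work across tests.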

@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
ami/base/models.py (1)

74-80: ⚡ Quick win

Optimize the sync path to avoid N+1 queries.

The sync path (lines 79-80) loads instances one at a time with self.model.objects.get(pk=pk), creating N+1 queries. Since the async path only needs PKs but the sync path needs full instances, branch on run_async and use self.iterator() for the sync path to load instances efficiently in a single query.

⚡ Proposed refactor
```diff
-        model_label = self.model._meta.label
-        for pk in self.values_list("pk", flat=True):
-            if run_async:
+        if run_async:
+            model_label = self.model._meta.label
+            for pk in self.values_list("pk", flat=True):
                 schedule_recompute(model_label, pk)
-            else:
-                instance = self.model.objects.get(pk=pk)
+        else:
+            for instance in self.iterator():
                 instance.update_calculated_fields(save=True)
```

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e47d60b5-368e-40bd-a816-14549cda47b0

📥 Commits

Reviewing files that changed from the base of the PR and between 2725599 and 9ac2ce1.

📒 Files selected for processing (8)
  • ami/base/cached_counts.py
  • ami/base/models.py
  • ami/main/checks/cached_counts.py
  • ami/main/models.py
  • ami/main/signals.py
  • ami/main/tasks.py
  • ami/main/tests.py
  • docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md
✅ Files skipped from review due to trivial changes (1)
  • docs/superpowers/specs/2026-05-14-cached-counts-update-method-design.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • ami/main/signals.py
  • ami/main/models.py

Comment thread ami/main/tasks.py


```python
@celery_app.task(ignore_result=True)
def recompute_cached_counts_task(model_label: str, pk: int) -> None:
```
Contributor commented:

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Widen the pk type annotation to support non-integer primary keys.

The signature declares pk: int, but schedule_recompute in ami/base/cached_counts.py accepts pk: Any (line 27), and Django models can use UUIDField or other types as primary keys. While Celery serialization handles this at runtime, type checkers will flag the mismatch.

🔧 Proposed fix
```diff
-def recompute_cached_counts_task(model_label: str, pk: int) -> None:
+def recompute_cached_counts_task(model_label: str, pk: Any) -> None:
```

Add the import at the top if not already present:

```diff
+from typing import Any
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-def recompute_cached_counts_task(model_label: str, pk: int) -> None:
+from typing import Any
+
+def recompute_cached_counts_task(model_label: str, pk: Any) -> None:
```
