Speed up the captures list view #1300
Conversation
✅ Deploy Preview for antenna-preview canceled.
✅ Deploy Preview for antenna-ssec canceled.
📝 Walkthrough
SourceImage list responses now restrict hydrated DB columns via a module-level constant (SOURCE_IMAGE_LIST_ONLY_FIELDS).
Changes: Query Performance Optimization
Sequence Diagram
sequenceDiagram
participant Client
participant SourceImageViewSet
participant SourceImageListSerializer
participant Database
Client->>SourceImageViewSet: GET /api/v2/captures/
SourceImageViewSet->>Database: queryset.select_related(...).only(*SOURCE_IMAGE_LIST_ONLY_FIELDS)
SourceImageViewSet->>SourceImageListSerializer: serialize(page)
SourceImageListSerializer->>Database: read whitelisted fields (no deferred loads)
SourceImageViewSet->>Client: JSON response
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
@coderabbitai ready for review! Our outstanding question: is this even necessary if we cache the detections_count field? |
Does that make this PR unnecessary? No — the biggest win here is the paginator COUNT(*) fix, which caching detections_count would not address.
Is there an incremental improvement available within this PR though? Yes — since …
The caveat: If you're asking about full collection-level caching (planning doc §1.B — adding …)
Recommendation: Land this PR as-is — the paginator fix alone justifies it.
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ami/main/tests.py (1)
3297-3330: ⚡ Quick win — Increase fixture rows so the page-size scaling test is meaningful.
Right now only 2 collections are created, so limit=25 still returns at most 2 rows. That can miss row-scaling regressions.
Proposed change:
```diff
-        # Two collections so list endpoint has >1 row to paginate.
-        for i in range(2):
+        # Create enough rows so limit=25 actually exercises a larger page.
+        for i in range(30):
             c = SourceImageCollection.objects.create(
                 name=f"qcount-collection-{i}",
                 project=self.project,
                 method="manual",
                 kwargs={"image_ids": [img.pk for img in images]},
             )
             c.images.set(images)
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ami/main/tests.py` around lines 3297 - 3330, The test test_list_query_count_does_not_scale_with_page_size currently only creates 2 SourceImageCollection rows so comparing limit=1 vs limit=25 isn't meaningful; update the setup that creates collections (the for i in range(2) block that builds SourceImageCollection objects used by the test and referenced by _list_query_count) to create enough rows (e.g., 25–50 collections) so the list endpoint can actually return >25 items and expose any N+1 scaling issues; ensure each created SourceImageCollection still sets c.images.set(images) and leave _list_query_count and the assertions unchanged.
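As context for the fixture-count suggestion above, a page-size scaling check of this kind is typically written with Django's CaptureQueriesContext. A minimal sketch, assuming a hypothetical endpoint URL and helper name (not the exact code in ami/main/tests.py):

```python
from django.db import connection
from django.test.utils import CaptureQueriesContext
from rest_framework.test import APITestCase


class ListQueryCountFlatnessTest(APITestCase):
    """Sketch: the list endpoint's query count should not grow with page size."""

    list_url = "/api/v2/captures/"  # illustrative endpoint

    def _list_query_count(self, limit: int) -> int:
        # Count every SQL statement issued while serving one list request.
        with CaptureQueriesContext(connection) as ctx:
            response = self.client.get(self.list_url, {"limit": limit})
            self.assertEqual(response.status_code, 200)
        return len(ctx.captured_queries)

    def test_query_count_does_not_scale_with_page_size(self):
        # With the eager loads in place, a 25-row page should cost roughly the
        # same number of queries as a 1-row page; per-row growth means an N+1.
        small_page = self._list_query_count(limit=1)
        large_page = self._list_query_count(limit=25)
        self.assertLessEqual(large_page, small_page + 1)
```

The nitpick's point is that this comparison only bites if the fixture actually yields more than 25 rows for the limit=25 request.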
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ami/main/api/views.py`:
- Around line 545-570: The queryset used for listing trims fields via
queryset.only(...) but leaves SourceImage.public_url() able to access
self.deployment.data_source, causing an N+1; update the list view to either
select_related('deployment__data_source') (or
select_related('deployment').select_related('deployment__data_source') as
appropriate) so the related DataSource is preloaded for
SourceImage.public_url(), or change SourceImageListSerializer to only use cached
fields (public_base_url/path) and avoid calling SourceImage.public_url(); locate
the queryset.only block and the SourceImage.public_url /
SourceImageListSerializer usages to apply the select_related preload or remove
the serializer branch that reads deployment.data_source.
---
Nitpick comments:
In `@ami/main/tests.py`:
- Around line 3297-3330: The test
test_list_query_count_does_not_scale_with_page_size currently only creates 2
SourceImageCollection rows so comparing limit=1 vs limit=25 isn't meaningful;
update the setup that creates collections (the for i in range(2) block that
builds SourceImageCollection objects used by the test and referenced by
_list_query_count) to create enough rows (e.g., 25–50 collections) so the list
endpoint can actually return >25 items and expose any N+1 scaling issues; ensure
each created SourceImageCollection still sets c.images.set(images) and leave
_list_query_count and the assertions unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 52a221ca-5249-4b18-b29e-cf3d064913de
📒 Files selected for processing (3)
ami/main/api/views.py, ami/main/models.py, ami/main/tests.py
Pull request overview
This PR targets two high-traffic list endpoints in the ami.main API to reduce database work on pagination and to reduce payload/row width during list serialization.
Changes:
- Rewrites the SourceImageCollection list count annotations from join-based `Count(..., distinct=True)` to correlated scalar `Subquery(Count(...))`, avoiding the outer `GROUP BY` and the associated paginator `COUNT(*)` blow-up (a simplified sketch follows this list).
- Adds `.only(...)` to `SourceImageViewSet.get_queryset()` for the `list` action to limit selected columns to those needed by `SourceImageListSerializer`.
- Adds/extends query-count and response-shape regression tests for both endpoints.
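To make the first bullet concrete, the general shape of a correlated-count annotation looks roughly like this — a hedged sketch using assumed related names (`collections` as the M2M reverse accessor), not the exact ami.main code:

```python
from django.db.models import Count, IntegerField, OuterRef, Subquery
from django.db.models.functions import Coalesce

from ami.main.models import SourceImage, SourceImageCollection

# Join-based counting (the old approach) forces a GROUP BY on the outer query
# and inflates the paginator's COUNT(*):
#   SourceImageCollection.objects.annotate(
#       source_images_count=Count("images", distinct=True)
#   )

# Correlated scalar subquery: each collection row runs one small COUNT over the
# M2M rows, and the outer query keeps its simple, GROUP-BY-free shape.
image_counts = (
    SourceImage.objects.filter(collections=OuterRef("pk"))  # "collections" is an assumed reverse name
    .order_by()
    .values("collections")
    .annotate(n=Count("pk"))
    .values("n")
)

collections = SourceImageCollection.objects.annotate(
    source_images_count=Coalesce(
        Subquery(image_counts, output_field=IntegerField()), 0
    )
)
```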
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| ami/main/models.py | Replaces collection image-count annotations with correlated subqueries + Coalesce to prevent join amplification during pagination. |
| ami/main/api/views.py | Applies .only() on the SourceImage list queryset to trim selected columns while keeping existing filtering/prefetch behavior intact. |
| ami/main/tests.py | Adds regression tests to guard query-count flatness and ensure .only() doesn't trigger deferred-field lazy loads during serialization. |
Addresses inline review comments on PR #1300:

- mihow:556 — drop misleading inline comment that mixed up `detections_count` vs `occurrences_count` / `taxa_count`. Maintenance rules now live in the docstring above `SOURCE_IMAGE_LIST_ONLY_FIELDS`.
- mihow:548 — pull the `.only()` field list into a module-level constant with a docstring describing how to keep it in sync (which serializer + model methods drive each column).
- coderabbit:570 — add `select_related("deployment__data_source")` and preload the S3StorageSource columns that `SourceImage.public_url()`'s presigned-URL branch reads (bucket, region, prefix, access_key, secret_key, endpoint_url, public_base_url). Without this, rows whose `public_base_url` is blank pay one extra query per row during serialization.
- coderabbit:3297 — bump `TestSourceImageCollectionListQueryCount` fixture from 2 → 30 collections so `limit=25` actually exercises a real page boundary; the per-row subquery scaling regression only surfaces with N > 1 page.

Tests: 16 SourceImage* + SourceImageCollection* tests pass.

Co-Authored-By: Claude <noreply@anthropic.com>
Claude says: Pushed 8ec3e81 addressing all 4 review comments:
All 16 tests in the SourceImage* and SourceImageCollection* suites pass.
The collection-counts denormalize-vs-CTE question (E1/E2/E3 in the PR body) is still open — flagged as follow-up work, not blocking this PR.
Adds:

- `docs/planning/2026-05-11-newrelic-post-upgrade-findings.md` — NR prod data after the agent upgrade that surfaced the list-endpoint hot paths.
- `docs/planning/2026-05-11-list-endpoint-perf-analysis.md` — SQL-level root-cause analysis for the 3 endpoints (collection list, captures list, project charts).
- `docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md` — sequenced PR plan; PR-A + PR-B1 landed as #1300, PR-C/D/E queued.
- `docs/claude/prompts/NEXT_SESSION_PROMPT.md` — implementation plan for the cached-column denormalize (E3 in #1300's options) follow-up PR.

Co-Authored-By: Claude <noreply@anthropic.com>
Replaces the unconditional wide SELECT (every column on main_sourceimage +
main_deployment + main_event) on SourceImageViewSet.list with a `.only()`
list scoped to fields SourceImageListSerializer actually reads. Row width
drops from 52 columns to 26.
select_related("deployment__data_source") added so SourceImage.public_url()'s
presigned-URL fallback (when public_base_url is blank) does not fire a per-row
lazy fetch.
SOURCE_IMAGE_LIST_ONLY_FIELDS is exposed as a module-level constant with a
maintenance docstring covering how to keep it in sync as the serializer or
model methods change. SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES is split out as a
sub-constant so the 11 fields that exist only to support presigned-URL
generation are visibly grouped — that whole chain goes away once images are
served through the upcoming image-resizing/CDN layer instead of directly
from source buckets.
Only applied to the list action; retrieve keeps the wide SELECT so the
detail serializer's broader field set is unaffected.
Tests:
- TestSourceImageListQueryCount extended with response-shape assertion
guarding against `.only()` over-restriction (verifies url, size_display,
deployment.name, event present without lazy loads).
Refs: docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md PR-B1.
Co-Authored-By: Claude <noreply@anthropic.com>
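A rough sketch of the viewset shape this commit describes — the real constants in ami/main/api/views.py list more columns, and the base class and field names here are abbreviated assumptions (the `.only()` whitelist was later removed; see the follow-up commits below):

```python
from rest_framework import viewsets

from ami.main.models import SourceImage

# Abridged stand-ins for the committed constants; the real tuples cover more columns.
SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES = (
    # Columns SourceImage.public_url()'s presigned-URL fallback reads.
    "deployment__data_source__public_base_url",
    "deployment__data_source__bucket",
)

SOURCE_IMAGE_LIST_ONLY_FIELDS = (
    "id",
    "path",
    "timestamp",
    "deployment__id",
    "deployment__name",
    "event__id",
    "event__start",
) + SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES


class SourceImageViewSet(viewsets.ModelViewSet):  # simplified base class
    def get_queryset(self):
        qs = SourceImage.objects.select_related(
            "deployment", "event", "deployment__data_source"
        )
        if self.action == "list":
            # Narrow the SELECT for the list serializer only; retrieve keeps
            # the wide row for the detail serializer's broader field set.
            qs = qs.only(*SOURCE_IMAGE_LIST_ONLY_FIELDS)
        return qs
```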
Adds:

- 2026-05-11-newrelic-post-upgrade-findings.md — NR prod data after the agent upgrade that surfaced the list-endpoint hot paths.
- 2026-05-11-list-endpoint-perf-analysis.md — SQL-level root-cause analysis for the 3 endpoints (collection list, captures list, project charts).
- 2026-05-11-list-endpoint-perf-continuation-plan.md — sequenced PR plan. PR-A (collection subquery rewrite) superseded by the denormalize-counts approach on perf/sourceimagecollection-cached-counts. PR-B1 ships in this PR. PR-C/D/E queued.

Co-Authored-By: Claude <noreply@anthropic.com>
Force-pushed from 7438f76 to 3f34b78
The collection list endpoint previously ran 3 correlated count subqueries per row (source_images_count, source_images_with_detections_count, source_images_processed_count). On large collections each subquery scans the M2M and Detection tables. Reads are now O(1) against cached columns.

- Add 3 IntegerField columns on SourceImageCollection with default=0
- Migration 0085 backfills via a single GROUP BY over the M2M with FILTER clauses; atomic=False so prod-sized M2M can chunk
- update_calculated_fields() recomputes all 3 counts in one aggregate query
- m2m_changed on images.through keeps counts fresh on add/remove (both directions; reverse post_clear documented as caller responsibility)
- Detection post_save/post_delete recompute for collections containing the affected source image
- pipeline.save_results() recomputes for affected collections after the bulk_create batch (which skips signals)
- Drop the 3 with_*_count() annotation methods on SourceImageCollectionQuerySet and the chain in SourceImageCollectionViewSet.queryset
- TestSourceImageCollectionCountsDenormalize covers initial state, add/remove, Detection create/delete, null-bbox processed-but-not-with-detections case, and update_calculated_fields() recovery from drift
- TestSourceImageCollectionListQueryCount tightened: with_counts now <=10q (was <=15), ordering by source_images_count uses column directly

Follow-up to PR #1300 — the paginator COUNT(*) win there is mostly invisible since the UI does not paginate collections; this is the actual UX-visible improvement.

Co-Authored-By: Claude <noreply@anthropic.com>
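The recount-on-change pattern this commit describes could look roughly like the following — a minimal sketch assuming illustrative relation names (`images`, `detections`) and covering only two of the three cached columns; not the committed implementation:

```python
from django.db.models import Count, Q
from django.db.models.signals import m2m_changed
from django.dispatch import receiver

from ami.main.models import SourceImageCollection


def update_calculated_fields(collection: SourceImageCollection) -> None:
    """Recompute the cached counts for one collection in a single aggregate query.

    The "with detections" filter below is an illustrative stand-in for the real
    definitions of the three counts.
    """
    counts = collection.images.aggregate(
        total=Count("pk", distinct=True),
        with_detections=Count("pk", filter=Q(detections__isnull=False), distinct=True),
    )
    SourceImageCollection.objects.filter(pk=collection.pk).update(
        source_images_count=counts["total"],
        source_images_with_detections_count=counts["with_detections"],
    )


@receiver(m2m_changed, sender=SourceImageCollection.images.through)
def collection_images_changed(sender, instance, action, reverse, **kwargs):
    # Keep the cached columns fresh whenever images are added to or removed from
    # a collection (forward direction only in this sketch).
    if not reverse and action in ("post_add", "post_remove", "post_clear"):
        update_calculated_fields(instance)
```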
…lass

Per review on #1300: pulled `SOURCE_IMAGE_LIST_ONLY_FIELDS` / `SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES` out of module scope and into `SourceImageViewSet` as class attributes, grouped by what reads them (core, deployment-nested, event-nested, public-url).

Trimmed the docstring and added a short note on why we whitelist with `.only()` instead of blacklisting with `.defer()`, with a link to the Django ref.

Co-Authored-By: Claude <noreply@anthropic.com>
Per @mihow review on #1300: the `.only()` machinery added in this PR was acting as a row-width trim on top of select_related, not as the N+1 fix. The actual N+1 prevention is `select_related("deployment__data_source")` ensuring `SourceImage.public_url()` doesn't lazy-load the data_source chain per row.

With that in place, dropping `.only()` keeps the captures list query count flat (verified empirically — `TestSourceImageListQueryCount` still passes at limit=1, 5, 25). This removes the maintenance burden of keeping the field tuples in sync with the serializer surface. django-zen-queries and the existing perf tests are the safety net for regressions.

Renamed `test_list_response_shape_preserved_after_only` to `test_list_response_shape_has_no_lazy_loads` so the contract it asserts is independent of any specific deferred-fields mechanism.

Co-Authored-By: Claude <noreply@anthropic.com>
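For reference on the safety net mentioned above: django-zen-queries can fence off serialization so that any lazy load raises instead of silently issuing a query. A small sketch of that pattern (the function and its wiring are placeholders, not code from this PR):

```python
from zen_queries import fetch, queries_disabled


def serialize_capture_page(queryset, serializer_class):
    # Force the queryset (and its select_related joins) to evaluate up front...
    page = fetch(queryset)
    # ...then serialize with queries disabled: a deferred field or a missing
    # eager load raises an exception here instead of quietly adding a query.
    with queries_disabled():
        return serializer_class(page, many=True).data
```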
Claude says: PR body updated to reflect current state — @coderabbitai full review please. Sister PR #1301 stacks on this branch and ports the collection-count denormalization out of the original draft.
✅ Actions performed: Full review triggered.
Three planning notes from 3f34b78 were committed under docs/planning/ (wrong convention — should be docs/claude/planning/ per CLAUDE.md) and have since rotted: PR-A was superseded by the denormalize approach in #1301; the .only() recommendation was reverted per review; PR-B1 ships in this PR.

Two genuinely useful items have been harvested:

- New Relic baseline numbers for 9 list endpoints (post-NR-12.1.0) → memory entry reference_nr_baseline_2026_05_11.md
- 22:22 UTC PG connection-pool burst hypothesis → issue #1302
- SourceImageViewSet.retrieve 571-call N+1 audit → issue #1303

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Adds `select_related("deployment__data_source")` to `SourceImageViewSet.list`. Net diff is a single line in `ami/main/api/views.py` plus a regression test.

`SourceImage.public_url()` reads 8 S3-credential columns off `deployment.data_source` when the deployment has no `public_base_url`. Without the eager join, every such row fires one extra query during list serialization. Preloading collapses those into the main SELECT.

What changed since the original draft

The earlier draft of this PR layered three things together: this `select_related`, an explicit `.only(*SOURCE_IMAGE_LIST_ONLY_FIELDS)` whitelist on the list queryset, and a denormalization of `SourceImageCollection` counts. Following review:

- `.only()` was removed (commit 04999ab). Per @mihow's pushback: the N+1 fix is `select_related`, not `.only()`. Future heavy fields will be caught by django-zen-queries + perf testing, and the whitelist friction outweighed the marginal row-width win on a 26-of-52-column table where the bulk are already small scalars.
- The collection-count work moved out: recasting the `with_*_count` annotations as correlated subqueries was invisible to UX since the collections list does not paginate. "Speed up the capture set list view" #1301 denormalizes the 3 counts as cached columns instead.

What stays here: the `select_related` join + a regression test that pins the row shape (`url`, `size_display`, `deployment.name`, `event`, `detections_count`, `path`) and asserts no per-row lazy loads fire.

Tests

- `TestSourceImageListQueryCount.test_list_response_shape_has_no_lazy_loads` — fails if the eager join is dropped or a serializer field is wired through a method that re-fetches.
- `TestSourceImage*` and `TestOccurrence*QueryCount` suites pass unchanged.

Out of scope

- A `(project_id, timestamp DESC)` index on `main_sourceimage` — needs `CREATE INDEX CONCURRENTLY` on prod + measured before/after (a hedged migration sketch follows the test plan below).
- The `SOURCE_IMAGE_LIST_ONLY_FIELDS` constant and its `SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES` sub-group were committed and then reverted (8dfc3a0 → 04999ab). The reasoning trail lives in the planning docs under `docs/claude/planning/`.

Test plan

- `docker compose run --rm django python manage.py test ami.main.tests.TestSourceImageListQueryCount` — 4/4 pass cold-cache
- `/api/v2/captures/`
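If the out-of-scope index is picked up later, one way to add it without locking writes is a non-atomic migration using Django's AddIndexConcurrently — a hedged sketch with an assumed dependency, field set, and index name, not a committed migration:

```python
from django.contrib.postgres.operations import AddIndexConcurrently
from django.db import migrations, models


class Migration(migrations.Migration):
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
    atomic = False

    dependencies = [
        ("main", "0085_previous_migration"),  # hypothetical predecessor
    ]

    operations = [
        AddIndexConcurrently(
            model_name="sourceimage",
            index=models.Index(
                fields=["project", "-timestamp"],
                name="sourceimage_proj_ts_idx",
            ),
        ),
    ]
```

Before/after timings on the captures list query would confirm whether the planner actually uses the index.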