
Speed up the captures list view#1300

Merged
mihow merged 5 commits into main from perf/list-endpoint-subquery-rewrite
May 14, 2026

Conversation

@mihow
Collaborator

@mihow mihow commented May 12, 2026

Summary

Adds select_related("deployment__data_source") to SourceImageViewSet.list. Net diff is a single line in ami/main/api/views.py plus a regression test.

SourceImage.public_url() reads 8 S3-credential columns off deployment.data_source when the deployment has no public_base_url. Without the eager join, every such row fires one extra query during list serialization. Preloading collapses those into the main SELECT.
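The lazy-load-vs-preload behavior described above can be sketched with a toy model in plain Python (illustrative only — this is not the project's ORM code; the class names and the query counter just mimic the pattern select_related addresses):

```python
# Toy model of the N+1 pattern: accessing deployment.data_source lazily costs
# one query per row, while preloading (what select_related does) costs zero.

class QueryLog:
    def __init__(self):
        self.count = 0

log = QueryLog()

class DataSource:
    def __init__(self, public_base_url=None):
        self.public_base_url = public_base_url

class Deployment:
    def __init__(self, data_source):
        self._data_source = data_source
        self._loaded = None  # None until fetched or preloaded

    @property
    def data_source(self):
        if self._loaded is None:          # lazy load: one extra query per row
            log.count += 1
            self._loaded = self._data_source
        return self._loaded

    def preload(self):                    # analogue of select_related:
        self._loaded = self._data_source  # joined into the main SELECT up front

def public_url(deployment):
    # mirrors the fallback shape: use public_base_url if set, else build a
    # presigned URL from the data_source credentials
    ds = deployment.data_source
    return ds.public_base_url or "presigned://..."

# Without preloading: 1 list query + 25 lazy loads.
rows = [Deployment(DataSource()) for _ in range(25)]
log.count = 1
for d in rows:
    public_url(d)
assert log.count == 26

# With preloading: everything collapses into the main SELECT.
rows = [Deployment(DataSource()) for _ in range(25)]
for d in rows:
    d.preload()
log.count = 1
for d in rows:
    public_url(d)
assert log.count == 1
```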

What changed since the original draft

The earlier draft of this PR layered three things together: this select_related, an explicit .only(*SOURCE_IMAGE_LIST_ONLY_FIELDS) whitelist on the list queryset, and a denormalization of SourceImageCollection counts. Following review:

  • .only() was removed (commit 04999ab). Per @mihow's pushback: the N+1 fix is select_related, not .only(). Future heavy fields will be caught by django-zen-queries + perf testing, and the whitelist friction outweighed the marginal row-width win on a 26-of-52-column table where the bulk are already small scalars.
  • Collection-count denormalization moved to #1301 ("Speed up the capture set list view", stacked on this branch). The paginator-COUNT win from rewriting with_*_count annotations as correlated subqueries was invisible to UX since the collections list does not paginate. #1301 denormalizes the 3 counts as cached columns instead.

What stays here: the select_related join + a regression test that pins the row shape (url, size_display, deployment.name, event, detections_count, path) and asserts no per-row lazy loads fire.

Tests

  • TestSourceImageListQueryCount.test_list_response_shape_has_no_lazy_loads — fails if the eager join is dropped or a serializer field is wired through a method that re-fetches.
  • Existing TestSourceImage* and TestOccurrence*QueryCount suites pass unchanged.

Out of scope

  • Composite index (project_id, timestamp DESC) on main_sourceimage — needs CREATE INDEX CONCURRENTLY on prod + measured before/after.
  • The SOURCE_IMAGE_LIST_ONLY_FIELDS constant and its SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES sub-group were committed in 8dfc3a0 and reverted in 04999ab. The reasoning trail lives in the planning docs under docs/claude/planning/.

Test plan

  • docker compose run --rm django python manage.py test ami.main.tests.TestSourceImageListQueryCount — 4/4 pass cold-cache
  • Staging arbutus-2026 deploy + NR p50/p95/p99 check on /api/v2/captures/

@netlify

netlify Bot commented May 12, 2026

Deploy Preview for antenna-preview canceled.

🔨 Latest commit: 875ded2
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-preview/deploys/6a05219e2dc2030007da39f9

@netlify

netlify Bot commented May 12, 2026

Deploy Preview for antenna-ssec canceled.

🔨 Latest commit: 875ded2
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-ssec/deploys/6a05219ec25cf8000812b68d

@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

Warning

Rate limit exceeded

@mihow has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 57 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eb4dd7a2-7bb2-4997-981a-fb22853ff869

📥 Commits

Reviewing files that changed from the base of the PR and between 7438f76 and 875ded2.

📒 Files selected for processing (2)
  • ami/main/api/views.py
  • ami/main/tests.py
📝 Walkthrough

Walkthrough

SourceImage list responses now restrict hydrated DB columns via a module-level SOURCE_IMAGE_LIST_ONLY_FIELDS applied with .only(); collection-counting annotations were refactored from join-based Count() to scalar Subquery + Coalesce (with Exists where needed). Tests assert serialized fields and bounded query counts.

Changes

Query Performance Optimization

  • SourceImage list field optimization (ami/main/api/views.py, ami/main/tests.py) — Adds SOURCE_IMAGE_LIST_ONLY_FIELDS and applies queryset.only(...) for the SourceImageViewSet list action; adds a test that verifies serialized fields (including deployment.name) and bounds DB queries during list serialization.
  • Counting method documentation (ami/main/models.py) — Adds inline comments to SourceImageQuerySet.with_occurrences_count() and with_taxa_count() documenting the scalar subquery + Coalesce(..., 0) approach.
  • Collection counting query optimization (ami/main/models.py, ami/main/tests.py) — Replaces join-based Count() annotations with scalar Subquery(...).values(...) + Coalesce(..., 0) and Exists() filters in SourceImageCollectionQuerySet counting methods; adds tests verifying query counts across page sizes, with_counts=true, and ordering by annotated counts.

Sequence Diagram

```mermaid
sequenceDiagram
  participant Client
  participant SourceImageViewSet
  participant SourceImageListSerializer
  participant Database
  Client->>SourceImageViewSet: GET /api/v2/captures/
  SourceImageViewSet->>Database: queryset.select_related(...).only(*SOURCE_IMAGE_LIST_ONLY_FIELDS)
  SourceImageViewSet->>SourceImageListSerializer: serialize(page)
  SourceImageListSerializer->>Database: read whitelisted fields (no deferred loads)
  SourceImageViewSet->>Client: JSON response
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • RolnickLab/antenna#1043: Modifies SourceImageViewSet.get_queryset and related list/count behaviors, touching similar viewset/queryset logic.

Suggested labels

backend

Suggested reviewers

  • annavik

Poem

🐰 I nibble fields to keep them tight,
Only what’s needed sees the light.
Subqueries hum, zeros in tow,
Collections count without a show —
Hop, the queries run just right!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 25.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Description check — ⚠️ Warning: the PR description deviates from the template and omits required sections. Resolution: add the missing sections ('List of Changes', 'Related Issues', 'How to Test the Changes', 'Screenshots' if applicable, 'Deployment Notes') and complete the 'Checklist'.
✅ Passed checks (3 passed)
  • Linked Issues check — ✅ Passed: skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: skipped because no linked issues were found for this pull request.
  • Title check — ✅ Passed: the title 'Speed up the captures list view' matches the main change, optimizing SourceImageViewSet.list through .only() field restriction and select_related preloading.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/list-endpoint-subquery-rewrite


@mihow mihow marked this pull request as ready for review May 12, 2026 05:32
Copilot AI review requested due to automatic review settings May 12, 2026 05:32
@mihow
Collaborator Author

mihow commented May 12, 2026

@coderabbitai ready for review! Our outstanding question: is this even necessary if we cache the detections_count field?

@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Look at the changed files in the PR
git diff origin/main...HEAD --name-only

# Look at the collection viewset changes
fd -t f "collection" --exec grep -l "Count\|subquery\|Subquery\|detections_count\|images_count" {} \; 2>/dev/null | head -20
```

Length of output: 150


🏁 Script executed:

```bash
#!/bin/bash
# Look at the collection viewset changes - specifically the Count/Subquery annotations
grep -n "Count\|Subquery\|subquery\|detections_count\|source_images_count\|processed_count\|with_detections" ami/main/api/views.py | head -60

# Look at SourceImageCollection model for any cached/denormalized fields
grep -n "detections_count\|images_count\|processed_count\|cached\|denorm\|signal\|post_save\|source_images_count" ami/main/models.py | head -60

# See the diff for models.py to understand what changed
git diff origin/main...HEAD -- ami/main/models.py | head -100

# See the diff for views.py
git diff origin/main...HEAD -- ami/main/api/views.py | head -150
```

Length of output: 11971


🏁 Script executed:

```bash
#!/bin/bash
# Check NULL_DETECTIONS_FILTER definition
grep -n "NULL_DETECTIONS_FILTER" ami/main/models.py | head -20

# Check how detections_count on SourceImage is updated (signals, tasks, etc.)
grep -n "detections_count" ami/main/models.py | grep -v "def\|#\|assert\|blank\|null\|annotate\|filter\|Count\|Sum\|Min\|Max\|cached_property\|values\|interval\|update_fields\|detection_counts\|occurrences_count" | head -30

# Also check if there's a signal or task that updates SourceImage.detections_count
grep -n "populate_detections_count\|update_detections_count\|save.*detections_count\|detections_count.*save" ami/main/models.py | head -20

# Check the function that populates detections_count
sed -n '2220,2260p' ami/main/models.py
```

Length of output: 2799


@mihow — great question. Here's the breakdown after looking at the code:

SourceImage.detections_count is already a cached DB column (populated by update_detection_counts() with Coalesce(subquery, 0) using the same NULL_DETECTIONS_FILTER). Semantically:

  • NULL → not yet processed / count not yet backfilled
  • 0 → processed, no valid detections
  • >0 → has valid detections

Does that make this PR unnecessary? No — the biggest win here is the paginator COUNT(*) fix, which is independent of how the per-row annotations are computed. The 13,800× speedup on project 18's COUNT(*) comes purely from eliminating the M2M GROUP BY blowup; caching collection-level counts wouldn't change that.

Is there an incremental improvement available within this PR though? Yes — since detections_count is cached on SourceImage, the two subqueries that still join through main_detection could be simplified:

  • with_source_images_with_detections_count — current shape: Exists(Detection…exclude(NULL_DETECTIONS_FILTER)); possible simplification: filter(detections_count__gt=0), dropping the Detection join.
  • with_source_images_processed_count — current shape: Exists(Detection…), i.e. any detection; possible simplification: filter(detections_count__isnull=False), dropping the Detection join.

The caveat: update_detection_counts() is not signal-driven — it has an @TODO Needs testing comment and is called manually/via task. Images that were processed but haven't had detections_count backfilled yet would show as "not processed" under the simplified approach, making it potentially less accurate than hitting main_detection directly.
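The trade-off between the EXISTS probe and the cached column, including the NULL-backfill caveat, can be demonstrated with a small sqlite3 sketch (schema illustrative, not the project's):

```python
import sqlite3

# Sketch of the proposed simplification: filtering on a cached
# detections_count column vs an EXISTS probe into the detection table.
# Schema and NULL semantics (NULL = count not yet backfilled) are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_image (id INTEGER PRIMARY KEY, detections_count INTEGER);
    CREATE TABLE detection (id INTEGER PRIMARY KEY, source_image_id INTEGER);
    -- image 1: processed, 2 detections; image 2: processed, none;
    -- image 3: processed but count never backfilled (NULL)
    INSERT INTO source_image VALUES (1, 2), (2, 0), (3, NULL);
    INSERT INTO detection VALUES (100, 1), (101, 1);
""")

# EXISTS probe: always reflects the detection table directly.
with_detections_exists = conn.execute("""
    SELECT id FROM source_image si
    WHERE EXISTS (SELECT 1 FROM detection d WHERE d.source_image_id = si.id)
    ORDER BY id
""").fetchall()

# Cached-column filter: no Detection join, same answer here.
with_detections_cached = conn.execute(
    "SELECT id FROM source_image WHERE detections_count > 0 ORDER BY id"
).fetchall()

assert with_detections_exists == with_detections_cached == [(1,)]

# But a processed image whose count was never backfilled (NULL) is
# invisible to the cached-column "processed" filter:
processed_cached = conn.execute(
    "SELECT id FROM source_image WHERE detections_count IS NOT NULL ORDER BY id"
).fetchall()
assert processed_cached == [(1,), (2,)]  # image 3 misclassified as unprocessed
```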

If you're asking about full collection-level caching (planning doc §1.B — adding source_images_count etc. as denormalized columns on SourceImageCollection maintained via signals), that would make the subquery approach unnecessary entirely and is the "V2" path noted in your PR description. That's a larger scope change worth tracking separately.

Recommendation: Land this PR as-is — the paginator fix alone justifies it. The detections_count-based simplification for the two Detection-table subqueries is a worthwhile follow-up, but should be gated on first making update_detection_counts() reliable (signal-driven or at least thoroughly tested). I'll proceed with the full review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ami/main/tests.py (1)

3297-3330: ⚡ Quick win

Increase fixture rows so the page-size scaling test is meaningful.

Right now only 2 collections are created, so limit=25 still returns at most 2 rows. That can miss row-scaling regressions.

Proposed change

```diff
-        # Two collections so list endpoint has >1 row to paginate.
-        for i in range(2):
+        # Create enough rows so limit=25 actually exercises a larger page.
+        for i in range(30):
             c = SourceImageCollection.objects.create(
                 name=f"qcount-collection-{i}",
                 project=self.project,
                 method="manual",
                 kwargs={"image_ids": [img.pk for img in images]},
             )
             c.images.set(images)
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` around lines 3297 - 3330, The test
test_list_query_count_does_not_scale_with_page_size currently only creates 2
SourceImageCollection rows so comparing limit=1 vs limit=25 isn't meaningful;
update the setup that creates collections (the for i in range(2) block that
builds SourceImageCollection objects used by the test and referenced by
_list_query_count) to create enough rows (e.g., 25–50 collections) so the list
endpoint can actually return >25 items and expose any N+1 scaling issues; ensure
each created SourceImageCollection still sets c.images.set(images) and leave
_list_query_count and the assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/api/views.py`:
- Around line 545-570: The queryset used for listing trims fields via
queryset.only(...) but leaves SourceImage.public_url() able to access
self.deployment.data_source, causing an N+1; update the list view to either
select_related('deployment__data_source') (or
select_related('deployment').select_related('deployment__data_source') as
appropriate) so the related DataSource is preloaded for
SourceImage.public_url(), or change SourceImageListSerializer to only use cached
fields (public_base_url/path) and avoid calling SourceImage.public_url(); locate
the queryset.only block and the SourceImage.public_url /
SourceImageListSerializer usages to apply the select_related preload or remove
the serializer branch that reads deployment.data_source.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52a221ca-5249-4b18-b29e-cf3d064913de

📥 Commits

Reviewing files that changed from the base of the PR and between ad0826d and 4a45b40.

📒 Files selected for processing (3)
  • ami/main/api/views.py
  • ami/main/models.py
  • ami/main/tests.py

Comment thread ami/main/api/views.py Outdated
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets two high-traffic list endpoints in the ami.main API to reduce database work on pagination and to reduce payload/row width during list serialization.

Changes:

  • Rewrites SourceImageCollection list count annotations from join-based Count(... distinct=True) to correlated scalar Subquery(Count(...)), avoiding outer GROUP BY and the associated paginator COUNT(*) blow-up.
  • Adds .only(...) to SourceImageViewSet.get_queryset() for the list action to limit selected columns to those needed by SourceImageListSerializer.
  • Adds/extends query-count and response-shape regression tests for both endpoints.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

  • ami/main/models.py — Replaces collection image-count annotations with correlated subqueries + Coalesce to prevent join amplification during pagination.
  • ami/main/api/views.py — Applies .only() on the SourceImage list queryset to trim selected columns while keeping existing filtering/prefetch behavior intact.
  • ami/main/tests.py — Adds regression tests to guard query-count flatness and ensure .only() doesn't trigger deferred-field lazy loads during serialization.


mihow added a commit that referenced this pull request May 12, 2026
Addresses inline review comments on PR #1300:

- mihow:556 — drop misleading inline comment that mixed up `detections_count`
  vs `occurrences_count` / `taxa_count`. Maintenance rules now live in the
  docstring above `SOURCE_IMAGE_LIST_ONLY_FIELDS`.
- mihow:548 — pull the `.only()` field list into a module-level constant
  with a docstring describing how to keep it in sync (which serializer +
  model methods drive each column).
- coderabbit:570 — add `select_related("deployment__data_source")` and
  preload the S3StorageSource columns that `SourceImage.public_url()`'s
  presigned-URL branch reads (bucket, region, prefix, access_key,
  secret_key, endpoint_url, public_base_url). Without this, rows whose
  `public_base_url` is blank pay one extra query per row during
  serialization.
- coderabbit:3297 — bump `TestSourceImageCollectionListQueryCount` fixture
  from 2 → 30 collections so `limit=25` actually exercises a real page
  boundary; the per-row subquery scaling regression only surfaces with
  N > 1 page.

Tests: 16 SourceImage* + SourceImageCollection* tests pass.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow
Collaborator Author

mihow commented May 12, 2026

Claude says: Pushed 8ec3e81 addressing all 4 review comments:

  1. mihow:556 (misleading comment) — dropped, maintenance notes moved to docstring above the new SOURCE_IMAGE_LIST_ONLY_FIELDS constant.
  2. mihow:548 (.only() maintainability) — refactored into module-level constant + docstring with maintenance rules. Behavioral guard already exists via test_list_response_shape_preserved_after_only.
  3. coderabbit:570 (data_source N+1) — added select_related("deployment__data_source") + the 7 S3StorageSource columns public_url() reads.
  4. coderabbit nitpick (tests.py:3297) — bumped TestSourceImageCollectionListQueryCount fixtures from 2 → 30 collections so limit=25 exercises a real page boundary.

All 16 tests in TestSourceImage* / TestSourceImageCollection* pass locally.

The collection-counts denormalize-vs-CTE question (E1/E2/E3 in the PR body) is still open — flagged as follow-up work, not blocking this PR.

mihow added a commit that referenced this pull request May 12, 2026
Adds:
- `docs/planning/2026-05-11-newrelic-post-upgrade-findings.md` — NR prod
  data after the agent upgrade that surfaced the list-endpoint hot paths.
- `docs/planning/2026-05-11-list-endpoint-perf-analysis.md` — SQL-level
  root-cause analysis for the 3 endpoints (collection list, captures list,
  project charts).
- `docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md` —
  sequenced PR plan; PR-A + PR-B1 landed as #1300, PR-C/D/E queued.
- `docs/claude/prompts/NEXT_SESSION_PROMPT.md` — implementation plan for
  the cached-column denormalize (E3 in #1300's options) follow-up PR.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow and others added 2 commits May 12, 2026 11:23
Replaces the unconditional wide SELECT (every column on main_sourceimage +
main_deployment + main_event) on SourceImageViewSet.list with a `.only()`
list scoped to fields SourceImageListSerializer actually reads. Row width
drops from 52 columns to 26.

select_related("deployment__data_source") added so SourceImage.public_url()'s
presigned-URL fallback (when public_base_url is blank) does not fire a per-row
lazy fetch.

SOURCE_IMAGE_LIST_ONLY_FIELDS is exposed as a module-level constant with a
maintenance docstring covering how to keep it in sync as the serializer or
model methods change. SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES is split out as a
sub-constant so the 11 fields that exist only to support presigned-URL
generation are visibly grouped — that whole chain goes away once images are
served through the upcoming image-resizing/CDN layer instead of directly
from source buckets.

Only applied to the list action; retrieve keeps the wide SELECT so the
detail serializer's broader field set is unaffected.

Tests:
- TestSourceImageListQueryCount extended with response-shape assertion
  guarding against `.only()` over-restriction (verifies url, size_display,
  deployment.name, event present without lazy loads).

Refs: docs/planning/2026-05-11-list-endpoint-perf-continuation-plan.md PR-B1.

Co-Authored-By: Claude <noreply@anthropic.com>
Adds:
- 2026-05-11-newrelic-post-upgrade-findings.md — NR prod data after the
  agent upgrade that surfaced the list-endpoint hot paths.
- 2026-05-11-list-endpoint-perf-analysis.md — SQL-level root-cause
  analysis for the 3 endpoints (collection list, captures list, project
  charts).
- 2026-05-11-list-endpoint-perf-continuation-plan.md — sequenced PR plan.
  PR-A (collection subquery rewrite) superseded by the denormalize-counts
  approach on perf/sourceimagecollection-cached-counts. PR-B1 ships in
  this PR. PR-C/D/E queued.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow force-pushed the perf/list-endpoint-subquery-rewrite branch from 7438f76 to 3f34b78 Compare May 12, 2026 18:24
@mihow mihow changed the title perf(api): rewrite collection counts as subqueries; trim capture list SELECT Speed up the captures list view May 12, 2026
mihow added a commit that referenced this pull request May 13, 2026
The collection list endpoint previously ran 3 correlated count subqueries
per row (source_images_count, source_images_with_detections_count,
source_images_processed_count). On large collections each subquery scans
the M2M and Detection tables. Reads are now O(1) against cached columns.

- Add 3 IntegerField columns on SourceImageCollection with default=0
- Migration 0085 backfills via a single GROUP BY over the M2M with FILTER
  clauses; atomic=False so prod-sized M2M can chunk
- update_calculated_fields() recomputes all 3 counts in one aggregate query
- m2m_changed on images.through keeps counts fresh on add/remove (both
  directions; reverse post_clear documented as caller responsibility)
- Detection post_save/post_delete recompute for collections containing the
  affected source image
- pipeline.save_results() recomputes for affected collections after the
  bulk_create batch (which skips signals)
- Drop the 3 with_*_count() annotation methods on SourceImageCollectionQuerySet
  and the chain in SourceImageCollectionViewSet.queryset
- TestSourceImageCollectionCountsDenormalize covers initial state, add/remove,
  Detection create/delete, null-bbox processed-but-not-with-detections case,
  and update_calculated_fields() recovery from drift
- TestSourceImageCollectionListQueryCount tightened: with_counts now <=10q
  (was <=15), ordering by source_images_count uses column directly

Follow-up to PR #1300 — the paginator COUNT(*) win there is mostly
invisible since the UI does not paginate collections; this is the actual
UX-visible improvement.

Co-Authored-By: Claude <noreply@anthropic.com>
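The signal-driven count maintenance this commit describes (m2m_changed keeping cached columns fresh) can be mimicked in plain SQL with triggers; a toy sqlite3 analogue (table and column names are illustrative, not the project's schema):

```python
import sqlite3

# Toy analogue of the denormalized-count maintenance described above:
# a cached source_images_count kept fresh on M2M add/remove. Django uses
# m2m_changed signals; SQL triggers illustrate the same invariant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collection (id INTEGER PRIMARY KEY,
                             source_images_count INTEGER DEFAULT 0);
    CREATE TABLE collection_images (collection_id INTEGER, image_id INTEGER);
    CREATE TRIGGER inc AFTER INSERT ON collection_images BEGIN
        UPDATE collection SET source_images_count = source_images_count + 1
        WHERE id = NEW.collection_id;
    END;
    CREATE TRIGGER dec AFTER DELETE ON collection_images BEGIN
        UPDATE collection SET source_images_count = source_images_count - 1
        WHERE id = OLD.collection_id;
    END;
    INSERT INTO collection (id) VALUES (1);
""")

conn.executemany("INSERT INTO collection_images VALUES (1, ?)",
                 [(10,), (11,), (12,)])
conn.execute("DELETE FROM collection_images WHERE image_id = 12")

# Reads are now O(1) against the cached column -- no join, no subquery.
(count,) = conn.execute(
    "SELECT source_images_count FROM collection WHERE id = 1"
).fetchone()
assert count == 2
```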
@mihow mihow mentioned this pull request May 13, 2026
3 tasks
…lass

Per review on #1300: pulled `SOURCE_IMAGE_LIST_ONLY_FIELDS` /
`SOURCE_IMAGE_PUBLIC_URL_DEPENDENCIES` out of module scope and into
`SourceImageViewSet` as class attributes, grouped by what reads them
(core, deployment-nested, event-nested, public-url). Trimmed the
docstring and added a short note on why we whitelist with `.only()`
instead of blacklisting with `.defer()`, with a link to the Django ref.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow added a commit that referenced this pull request May 13, 2026
Per @mihow review on #1300: the `.only()` machinery added in this PR was
acting as a row-width trim on top of select_related, not as the N+1 fix.
The actual N+1 prevention is `select_related("deployment__data_source")`
ensuring `SourceImage.public_url()` doesn't lazy-load the data_source
chain per row. With that in place, dropping `.only()` keeps the captures
list query count flat (verified empirically — `TestSourceImageListQueryCount`
still passes at limit=1, 5, 25).

This removes the maintenance burden of keeping the field tuples in sync
with the serializer surface. django-zen-queries and the existing perf
tests are the safety net for regressions.

Renamed `test_list_response_shape_preserved_after_only` to
`test_list_response_shape_has_no_lazy_loads` so the contract it asserts
is independent of any specific deferred-fields mechanism.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow added a commit that referenced this pull request May 13, 2026
@mihow
Collaborator Author

mihow commented May 14, 2026

Claude says: PR body updated to reflect current state — .only() machinery dropped per @mihow's review, net diff is now a single select_related("deployment__data_source") line plus a regression test. Collection-count work moved to #1301 (stacked on this branch).

@coderabbitai full review please. Sister PR #1301 stacks on this branch and ports the collection-count denormalization out of the original draft.

@coderabbitai
Contributor

coderabbitai Bot commented May 14, 2026

@mihow: Understood — noted that the scope has been trimmed significantly: the .only() constant machinery is out, the net change is select_related("deployment__data_source") plus the regression test, and the collection-count denormalization is tracked separately in #1301. Kicking off a full review now.

✅ Actions performed

Full review triggered.

Three planning notes from 3f34b78 were committed under docs/planning/
(wrong convention — should be docs/claude/planning/ per CLAUDE.md) and
have since rotted: PR-A was superseded by the denormalize approach in
#1301; the .only() recommendation was reverted per review; PR-B1 ships
in this PR. Two genuinely useful items have been harvested:

- New Relic baseline numbers for 9 list endpoints (post-NR-12.1.0)
  → memory entry reference_nr_baseline_2026_05_11.md
- 22:22 UTC PG connection-pool burst hypothesis
  → issue #1302
- SourceImageViewSet.retrieve 571-call N+1 audit
  → issue #1303

Co-Authored-By: Claude <noreply@anthropic.com>
mihow added a commit that referenced this pull request May 14, 2026
@mihow mihow merged commit 0dedd4c into main May 14, 2026
4 of 7 checks passed
@mihow mihow deleted the perf/list-endpoint-subquery-rewrite branch May 14, 2026 01:18
mihow added a commit that referenced this pull request May 14, 2026
