Skip to content

docs/planner: avoid partial indexes with WHERE vector IS NOT NULL on NOT NULL vector columns #453

@Leask

Description

@Leask

Summary

We hit a non-obvious planner failure mode while using vchordrq on PostgreSQL 18.

If a vector column is declared NOT NULL, and the index is still created as a partial index with:

WHERE vector IS NOT NULL

then PostgreSQL may fail to use the vchordrq KNN path for queries of the form:

ORDER BY vector <-> $query LIMIT k

This looked at first like a broken ANN index on a large table, but we reduced it to a much smaller reproduction. The issue appears to be planner behavior around a redundant partial-index predicate, not a vchordrq search-kernel bug.

Why this is worth reporting

This is easy to write by habit, especially when teams have other nullable-vector tables and want a consistent index definition. The failure mode is expensive and hard to diagnose:

  • the index is valid
  • the query shape is correct
  • rebuilding the same partial index does not help
  • planner falls back to Seq Scan / Parallel Seq Scan

A docs warning would probably save a lot of debugging time.

Minimal reproduction

Case 1: nullable vector column + partial index

This works.

DROP TABLE IF EXISTS public.vchord_nullable_test;

CREATE TABLE public.vchord_nullable_test AS
SELECT id, vector
FROM some_source_table
LIMIT 50000;

CREATE INDEX vchord_nullable_test_idx
ON public.vchord_nullable_test
USING vchordrq (vector halfvec_l2_ops)
WITH (options='
    residual_quantization = true
    [build.internal]
    lists = [128]
    build_threads = 4
    spherical_centroids = false
')
WHERE vector IS NOT NULL;

ANALYZE public.vchord_nullable_test;

EXPLAIN
SELECT id
FROM public.vchord_nullable_test
WHERE vector IS NOT NULL
ORDER BY vector <-> (
    SELECT vector
    FROM public.vchord_nullable_test
    LIMIT 1
)
LIMIT 20;

Expected / observed result:

  • Index Scan using vchord_nullable_test_idx

Case 2: same table shape, but vector column marked NOT NULL

This fails.

DROP TABLE IF EXISTS public.vchord_notnull_test;

CREATE TABLE public.vchord_notnull_test AS
SELECT id, vector
FROM some_source_table
LIMIT 50000;

ALTER TABLE public.vchord_notnull_test
ALTER COLUMN vector SET NOT NULL;

CREATE INDEX vchord_notnull_test_idx
ON public.vchord_notnull_test
USING vchordrq (vector halfvec_l2_ops)
WITH (options='
    residual_quantization = true
    [build.internal]
    lists = [128]
    build_threads = 4
    spherical_centroids = false
')
WHERE vector IS NOT NULL;

ANALYZE public.vchord_notnull_test;

EXPLAIN
SELECT id
FROM public.vchord_notnull_test
WHERE vector IS NOT NULL
ORDER BY vector <-> (
    SELECT vector
    FROM public.vchord_notnull_test
    LIMIT 1
)
LIMIT 20;

Observed result:

  • Parallel Seq Scan plus Sort
  • no vchordrq KNN path

Case 3: same NOT NULL table, but full index instead of partial index

This works again.

DROP INDEX IF EXISTS public.vchord_notnull_test_full_idx;

CREATE INDEX vchord_notnull_test_full_idx
ON public.vchord_notnull_test
USING vchordrq (vector halfvec_l2_ops)
WITH (options='
    residual_quantization = true
    [build.internal]
    lists = [128]
    build_threads = 4
    spherical_centroids = false
');

ANALYZE public.vchord_notnull_test;

EXPLAIN
SELECT id
FROM public.vchord_notnull_test
ORDER BY vector <-> (
    SELECT vector
    FROM public.vchord_notnull_test
    LIMIT 1
)
LIMIT 20;

Observed result:

  • Index Scan using vchord_notnull_test_full_idx

Btree control case

We also reproduced the same planner pattern with a plain btree partial index.

Nullable column

DROP TABLE IF EXISTS public.partial_nullable_btree_test;

CREATE TABLE public.partial_nullable_btree_test (
    id bigint PRIMARY KEY,
    x bigint
);

INSERT INTO public.partial_nullable_btree_test
SELECT g, g
FROM generate_series(1, 50000) AS g;

CREATE INDEX partial_nullable_btree_test_x_idx
ON public.partial_nullable_btree_test (x)
WHERE x IS NOT NULL;

ANALYZE public.partial_nullable_btree_test;
SET enable_seqscan = off;

EXPLAIN
SELECT id
FROM public.partial_nullable_btree_test
WHERE x IS NOT NULL
ORDER BY x
LIMIT 20;

Observed result:

  • planner uses the partial btree index

NOT NULL column

DROP TABLE IF EXISTS public.partial_notnull_btree_test;

CREATE TABLE public.partial_notnull_btree_test (
    id bigint PRIMARY KEY,
    x bigint NOT NULL
);

INSERT INTO public.partial_notnull_btree_test
SELECT g, g
FROM generate_series(1, 50000) AS g;

CREATE INDEX partial_notnull_btree_test_x_idx
ON public.partial_notnull_btree_test (x)
WHERE x IS NOT NULL;

ANALYZE public.partial_notnull_btree_test;
SET enable_seqscan = off;

EXPLAIN
SELECT id
FROM public.partial_notnull_btree_test
WHERE x IS NOT NULL
ORDER BY x
LIMIT 20;

Observed result:

  • planner still falls back to Seq Scan plus Sort

This strongly suggests the root cause is PostgreSQL planner behavior around redundant partial predicates, not a vchordrq ANN execution bug.

Real-world impact

We hit this on a serving table with ~4.38M rows and halfvec(256) NOT NULL.

The original index shape was:

CREATE INDEX ... USING vchordrq (vector halfvec_l2_ops)
WHERE vector IS NOT NULL;

The minimal KNN query did not use the index.

Rebuilding the index as a full vchordrq index immediately restored the expected path:

  • Index Scan using ...__vector__vchord_idx
  • latency dropped from seconds to milliseconds

Suggested docs update

I think the docs should explicitly warn against building:

WHERE vector IS NOT NULL

on a vector column that is already declared NOT NULL.

Suggested guidance:

  • for NOT NULL vector columns, use a full vchordrq index
  • use partial indexes only for real predicates such as:
    • model = '...'
    • source_slug = '...'
    • or genuinely nullable vector columns where vector IS NOT NULL really shrinks the indexed subset

This looks small, but it is a very expensive trap in production.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions