Optimize FindQuery.update() to use key-only search and partial HSET #777

@abrookins

Summary

Optimize FindQuery.update() to avoid loading full documents. Instead, fetch only keys and apply partial field updates directly.

Problem

Current implementation (abridged):

async def update(self, use_transaction=True, **field_values):
    for model in await self.all():       # Loads ALL documents into Python
        for field, value in field_values.items():
            setattr(model, field, value)
        await model.save(pipeline=pipeline)  # Full document HSET

Issues:

  1. self.all() loads full documents when we only need keys
  2. save() writes all fields when we only changed specific ones
  3. Pydantic validation runs N times (once per model) with identical values

Proposed Implementation

async def update(self, use_transaction=True, **field_values) -> int:
    # 1. Validate field values once upfront
    validate_model_fields(self.model, field_values)
    serialized = self._serialize_field_values(field_values)
    
    # 2. Get matching keys only (no document content)
    keys = await self._search_keys_only()
    
    # 3. Pipeline partial updates (preserve existing use_transaction behavior)
    pipeline = await self.model.db().pipeline(transaction=use_transaction)
    for key in keys:
        if self.model._is_json_model():
            for field, value in serialized.items():
                pipeline.json().set(key, f"$.{field}", value)
        else:
            pipeline.hset(key, mapping=serialized)
    
    await pipeline.execute()
    return len(keys)

Key Changes

1. Key-only search

async def _search_keys_only(self) -> List[str]:
    # Use NOCONTENT to get keys without document data
    results = await self._execute_search(nocontent=True)
    return [doc.id for doc in results.docs]
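
One detail worth spelling out: FT.SEARCH returns only the first 10 results unless a LIMIT is given, so a key-only search has to page through the result set. The sketch below illustrates that loop under stated assumptions. It is synchronous for brevity, `search_page` is a hypothetical callable standing in for `FT.SEARCH <index> <query> NOCONTENT LIMIT offset limit`, and the reply shape assumed is the raw RESP2 form `[total, key1, key2, ...]`. It is not the library's `_execute_search`.

```python
from typing import Callable, List, Sequence, Union

RawReply = Sequence[Union[int, bytes, str]]


def keys_from_nocontent_reply(reply: RawReply) -> List[str]:
    """Parse a raw NOCONTENT reply of the form [total, key1, key2, ...]."""
    return [k.decode() if isinstance(k, bytes) else str(k) for k in reply[1:]]


def collect_all_keys(
    search_page: Callable[[int, int], RawReply], page_size: int = 1000
) -> List[str]:
    """Page through the search until every matching key is collected.

    `search_page(offset, limit)` is a hypothetical callable (an assumption,
    not a redis-om API) that runs one paginated NOCONTENT search.
    """
    keys: List[str] = []
    offset = 0
    while True:
        reply = search_page(offset, page_size)
        total = int(reply[0])
        page = keys_from_nocontent_reply(reply)
        keys.extend(page)
        offset += len(page)
        # Stop on an empty page or once all reported matches are collected.
        if not page or offset >= total:
            return keys


# Stand-in for a real index: 25 matching keys, served 10 at a time.
_ALL_KEYS = [f"user:{i}".encode() for i in range(25)]


def fake_search_page(offset: int, limit: int) -> RawReply:
    return [len(_ALL_KEYS), *_ALL_KEYS[offset : offset + limit]]


all_keys = collect_all_keys(fake_search_page, page_size=10)
```

If `_execute_search` already exhausts results internally, this loop is unnecessary and only the NOCONTENT flag matters.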

2. Validate once, not N times

def _serialize_field_values(self, field_values: Dict[str, Any]) -> Dict[str, Any]:
    # Validate and serialize each field value once
    # Uses Pydantic field validation
    ...
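
To make the "validate once" idea concrete without committing to a particular Pydantic API, here is a minimal plain-Python stand-in. Everything in it is an assumption for illustration: `declared_fields` is a hypothetical name-to-type mapping that replaces the model's real field metadata, and values are serialized with `str()`, which may not match how the library encodes hash fields. The point is only that the work happens once per update() call, not once per matching document.

```python
from typing import Any, Dict, Mapping


def serialize_field_values(
    declared_fields: Mapping[str, type], field_values: Mapping[str, Any]
) -> Dict[str, str]:
    """Check and serialize the update values a single time.

    `declared_fields` is a hypothetical stand-in for the model's field
    definitions; the real implementation would use Pydantic validation.
    """
    serialized: Dict[str, str] = {}
    for name, value in field_values.items():
        if name not in declared_fields:
            raise ValueError(f"unknown field: {name!r}")
        expected = declared_fields[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{name!r} expects {expected.__name__}, got {type(value).__name__}"
            )
        # Assumption: string encoding is sufficient for a hash field.
        serialized[name] = str(value)
    return serialized


USER_FIELDS = {"status": str, "age": int}
serialized = serialize_field_values(USER_FIELDS, {"status": "active", "age": 3})
```

The resulting dictionary can then be reused unchanged for every key in the pipeline.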

3. Partial HSET

# Only the fields being updated are written
pipeline.hset("user:123", mapping={"status": "active"})  # Only status
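
As a rough illustration of what ends up on the wire, the sketch below builds the commands a partial update would queue for a list of keys: one HSET per hash key carrying only the changed fields, or one JSON.SET per changed field for JSON documents. The function name and the tuple representation are hypothetical, chosen so the example runs without a Redis server.

```python
import json
from typing import Any, Dict, List, Tuple

Command = Tuple[Any, ...]


def build_partial_update_commands(
    keys: List[str], field_values: Dict[str, Any], is_json: bool
) -> List[Command]:
    """Return the commands a partial update would queue (hypothetical helper)."""
    commands: List[Command] = []
    for key in keys:
        if is_json:
            # One JSON.SET per changed field, addressed by JSONPath.
            for field, value in field_values.items():
                commands.append(("JSON.SET", key, f"$.{field}", json.dumps(value)))
        else:
            # One HSET per key; untouched hash fields are left as they are.
            flat: List[Any] = []
            for field, value in field_values.items():
                flat.extend([field, value])
            commands.append(("HSET", key, *flat))
    return commands


hash_cmds = build_partial_update_commands(
    ["user:123", "user:456"], {"status": "active"}, is_json=False
)
json_cmds = build_partial_update_commands(
    ["user:123"], {"status": "active", "age": 30}, is_json=True
)
```

Note that the JSON path issues one command per field per key, so the number of queued commands grows with the number of fields being updated.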

API

Preserves existing signature:

async def update(self, use_transaction=True, **field_values) -> int:

The only change is the return value: update() now returns the number of updated records.

Performance Comparison

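| Metric                   | Current      | Proposed       |
| ------------------------ | ------------ | -------------- |
| Documents loaded         | N            | 0              |
| Pydantic validations     | N            | 1              |
| Fields written per doc   | All          | Only updated   |
| Data transferred (read)  | N × doc_size | N × key_size   |
| Data transferred (write) | N × doc_size | N × field_size |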
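
To give the read and write rows a sense of scale, the formulas can be plugged into a small calculation. The sizes below are made-up illustrative numbers, not measurements, and the estimate ignores protocol overhead and round trips.

```python
def estimated_bytes(n_docs: int, doc_size: int, key_size: int, field_size: int):
    """Apply the comparison formulas: bytes read and written per update() call."""
    current = {"read": n_docs * doc_size, "write": n_docs * doc_size}
    proposed = {"read": n_docs * key_size, "write": n_docs * field_size}
    return current, proposed


# Hypothetical workload: 10,000 matching documents of about 2 KB each,
# 24-byte keys, and a single 16-byte field being updated.
current, proposed = estimated_bytes(
    n_docs=10_000, doc_size=2_000, key_size=24, field_size=16
)
print(current)   # {'read': 20000000, 'write': 20000000}
print(proposed)  # {'read': 240000, 'write': 160000}
```

Under these assumed sizes, reads drop from about 20 MB to 240 KB and writes from about 20 MB to 160 KB.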
