refactor: derive last_accessed_at from claimed_at to unify timestamp tracking #16

Closed
jolovicdev wants to merge 1 commit into master from test/consolidate-timestamp-tracking

@jolovicdev
Owner

Summary

Unifies last_accessed_at with claimed_at across SQLite and Redis backends, eliminating a redundant timestamp and the side-effect write in find_by_fingerprint.

Problem

The SQLite store tracked two independent timestamps:

  • claimed_at — set when a worker acquired the claim
  • last_accessed_at — updated on every find_by_fingerprint read

This caused two issues:

  1. Write contention on reads: find_by_fingerprint did an UPDATE commits SET last_accessed_at = ?, turning every cache lookup into a write.
  2. Drift: put_commit passed a fresh now for last_accessed_at, which could diverge from claimed_at if the commit row was updated by a heartbeat or status change elsewhere.
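
To make the first issue concrete, here is a minimal sketch of a lookup that writes on every read next to one that does not. The table layout and the helper names `find_with_touch` and `find_pure` are assumptions for illustration only, not the actual cashet schema or code.

```python
import sqlite3

# Hypothetical, reduced schema; the real commits table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (hash TEXT PRIMARY KEY, fingerprint TEXT, last_accessed_at TEXT)"
)
conn.execute("INSERT INTO commits VALUES ('abc', 'fp1', '2026-01-01T00:00:00+00:00')")
conn.commit()


def find_with_touch(conn, fingerprint, now_iso):
    """Lookup that also bumps last_accessed_at, so every read is a write."""
    row = conn.execute(
        "SELECT hash FROM commits WHERE fingerprint = ?", (fingerprint,)
    ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE commits SET last_accessed_at = ? WHERE hash = ?", (now_iso, row[0])
        )
        conn.commit()
    return row


def find_pure(conn, fingerprint):
    """Side-effect-free lookup: a plain SELECT."""
    return conn.execute(
        "SELECT hash FROM commits WHERE fingerprint = ?", (fingerprint,)
    ).fetchone()


# total_changes counts rows modified by INSERT/UPDATE/DELETE on this connection.
before = conn.total_changes
find_pure(conn, "fp1")
pure_writes = conn.total_changes - before

before = conn.total_changes
find_with_touch(conn, "fp1", "2026-01-02T00:00:00+00:00")
touch_writes = conn.total_changes - before
```

In this sketch the pure lookup modifies no rows, while the touching lookup modifies one row per call and needs a write transaction each time.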

Changes

  • src/cashet/store.py
    • put_commit now derives last_accessed_at from commit.claimed_at.isoformat() instead of a separate now.
    • find_by_fingerprint is now side-effect-free; the UPDATE last_accessed_at has been removed.
  • src/cashet/redis_store.py
    • _encode_commit uses commit.claimed_at for last_accessed_at to keep both backends consistent.
  • tests/test_store.py & tests/test_async_client.py
    • test_last_accessed_at_derived_from_claimed_at — verifies the invariant after task completion.
    • test_cache_hit_does_not_shift_last_accessed_at — verifies that repeated cache hits no longer move the access timestamp.
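
The invariant those two tests describe can be sketched against a toy in-memory store. `Commit` and `MemoryStore` below are hypothetical stand-ins written for this illustration; they are not the classes in src/cashet/store.py.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Commit:
    """Hypothetical, reduced commit record."""
    hash: str
    fingerprint: str
    claimed_at: datetime


class MemoryStore:
    """Toy store that mirrors the behavior described in this PR."""

    def __init__(self):
        self._rows = {}

    def put_commit(self, commit):
        claimed_iso = commit.claimed_at.isoformat()
        self._rows[commit.hash] = {
            "fingerprint": commit.fingerprint,
            "claimed_at": claimed_iso,
            # Derived from claimed_at instead of a separate "now".
            "last_accessed_at": claimed_iso,
        }

    def find_by_fingerprint(self, fingerprint):
        # Read-only: returns a copy and never rewrites last_accessed_at.
        for commit_hash, row in self._rows.items():
            if row["fingerprint"] == fingerprint:
                return commit_hash, dict(row)
        return None


store = MemoryStore()
claimed = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
store.put_commit(Commit("abc", "fp1", claimed))

first = store.find_by_fingerprint("fp1")
second = store.find_by_fingerprint("fp1")
```

Both lookups return the same row, and last_accessed_at stays equal to claimed_at until the next put_commit.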

Why this is safe

Callers that need to "touch" a commit (cache-hit promotion, heartbeat) already call put_commit, which writes the row with the current claimed_at. Removing the implicit UPDATE inside find_by_fingerprint just makes the behavior explicit and idempotent.

Testing

All 298 tests pass (45 skipped for Redis).

…tracking

The SQLite store maintained two independent timestamps (claimed_at and
last_accessed_at) that were updated through different code paths. This
led to subtle inconsistencies:

- find_by_fingerprint did a side-effect UPDATE on every read, adding
  write contention and making reads non-deterministic.
- put_commit passed a separate 'now' for last_accessed_at, which could
  drift from claimed_at if the commit was modified elsewhere.

Changes:
- put_commit now uses commit.claimed_at as the source of truth for
  last_accessed_at. A commit is 'accessed' when it is claimed.
- find_by_fingerprint is now side-effect-free; callers that need to
  touch a commit already call put_commit on cache hits.
- Redis _encode_commit aligned to use commit.claimed_at for
  last_accessed_at for cross-backend consistency.

Tests verify the invariant last_accessed_at == claimed_at and that
repeated cache hits no longer shift the access timestamp.

@ds-review (bot) left a comment

PR Review

PR: refactor: derive last_accessed_at from claimed_at to unify timestamp tracking

Important

Verdict: Request changes - 7 actionable findings, highest severity P1.

Findings (7)

P1 High Redis find_by_fingerprint still updates access index via _touch_commit

src/cashet/redis_store.py:395

The SQLite store no longer updates any timestamp on cache hits, but Redis still calls _touch_commit (line 395) which updates the access index sorted set. This creates inconsistent eviction behavior between backends. Remove the _touch_commit call from find_by_fingerprint for consistency.

P2 Medium _index_commit_commands sets access index to current time on every put_commit

src/cashet/redis_store.py:249

Line 249 writes now_ts to the access index for every put_commit call, even for updates that are not cache hits (e.g., status changes). This unintentionally defers eviction for commits that are updated but not accessed. Use commit.claimed_at or do not update the access index on put_commit at all.
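
A small model of the effect on eviction order, using a plain dict as a stand-in for the Redis sorted set. The function `eviction_order` and the event tuples are hypothetical; this is a sketch of the reasoning in the finding, not the code in redis_store.py.

```python
def eviction_order(events, use_now):
    """Replay put_commit calls and return hashes from oldest to newest score.

    Each event is (commit_hash, claimed_ts, now_ts). With use_now=True the
    index score is the wall-clock time of the call; otherwise it is claimed_ts.
    """
    index = {}  # member -> score, standing in for the access-index sorted set
    for commit_hash, claimed_ts, now_ts in events:
        index[commit_hash] = now_ts if use_now else claimed_ts
    return sorted(index, key=index.get)


events = [
    ("a", 100.0, 100.0),  # a is claimed
    ("b", 200.0, 200.0),  # b is claimed
    ("a", 100.0, 300.0),  # a gets a status change, not a cache hit
]

order_now = eviction_order(events, use_now=True)
order_claimed = eviction_order(events, use_now=False)
```

Scoring by the current time moves "a" behind "b" in the eviction queue after a mere status change; scoring by claimed_at keeps "a" first.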

P2 Medium _touch_commit updates access index but not commit's last_accessed_at field

src/cashet/redis_store.py:626-628

The _touch_commit method only updates the sorted set cashet:index:last_accessed with a new timestamp, but does not update the last_accessed_at field inside the commit JSON stored at cashet:commit:{hash}. Any code reading the commit's own field will see the original claimed_at instead of the touch time. Either update the commit data or remove _touch_commit entirely.
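
The divergence can be reproduced with two dicts standing in for the Redis keys named above. The helpers `touch_index_only` and `touch_both` are hypothetical; a real fix against Redis would presumably also need the two writes to happen atomically (for example in one pipeline or transaction), which this sketch does not model.

```python
import json

commits = {}       # stands in for the cashet:commit:{hash} values (JSON strings)
access_index = {}  # stands in for the cashet:index:last_accessed sorted set


def put_commit(commit_hash, claimed_ts):
    commits[commit_hash] = json.dumps(
        {"claimed_at": claimed_ts, "last_accessed_at": claimed_ts}
    )
    access_index[commit_hash] = claimed_ts


def touch_index_only(commit_hash, now_ts):
    """Behavior described in the finding: only the index score moves."""
    access_index[commit_hash] = now_ts


def touch_both(commit_hash, now_ts):
    """One possible fix: rewrite the stored field together with the score."""
    data = json.loads(commits[commit_hash])
    data["last_accessed_at"] = now_ts
    commits[commit_hash] = json.dumps(data)
    access_index[commit_hash] = now_ts


put_commit("abc", 100.0)
touch_index_only("abc", 250.0)
stored = json.loads(commits["abc"])["last_accessed_at"]
indexed = access_index["abc"]
```

After `touch_index_only`, the stored field still reads 100.0 while the index says 250.0; `touch_both` keeps the two in step.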

P2 Medium Possible AttributeError if commit.claimed_at is None

src/cashet/store.py:275

The call commit.claimed_at.isoformat() will raise AttributeError if claimed_at is None. While claimed_at is typically set before put_commit, defensive code should verify it is not None to prevent crashes from malformed Commit objects.

P3 Low Missing documentation or changelog for changed last_accessed_at semantics

src/cashet/store.py:275

The PR changes last_accessed_at to always equal claimed_at, and removes the side-effect update on reads. This is a breaking change for users relying on the old behavior (e.g., for monitoring or custom eviction logic). Add a note to the changelog and update the README or API docs to explain the new semantics.

P3 Low Potential timezone inconsistency with claimed_at.isoformat()

src/cashet/store.py:275

The call commit.claimed_at.isoformat() may include a timezone offset if claimed_at is timezone-aware with a non-UTC offset. Other timestamps in the store use datetime.now(UTC).isoformat() which always appends +00:00. This could cause comparison issues if the two ISO strings are compared lexicographically. Ensure claimed_at is always in UTC before calling isoformat().

P3 Low Dead variable now_iso in find_by_fingerprint

src/cashet/store.py:331

After removing the UPDATE last_accessed_at, the variable now_iso computed on line 332 is no longer used. This wastes a datetime formatting call and may trigger linting warnings. Remove the line.

Warning

Some findings could not be anchored to changed diff lines, so they were kept here instead of being posted as plain timeline comments.

How To Recheck

Reply @ds-review recheck under the relevant inline finding after pushing a fix.

Comment thread src/cashet/store.py
  now = datetime.now(UTC).isoformat()
  try:
-     self._put_commit_row(conn, commit, now)
+     self._put_commit_row(conn, commit, commit.claimed_at.isoformat())

P2 Medium Possible AttributeError if commit.claimed_at is None

The call commit.claimed_at.isoformat() will raise AttributeError if claimed_at is None. While claimed_at is typically set before put_commit, defensive code should verify it is not None to prevent crashes from malformed Commit objects.

Suggested change
- self._put_commit_row(conn, commit, commit.claimed_at.isoformat())
+ assert commit.claimed_at is not None, "claimed_at must not be None"
+ self._put_commit_row(conn, commit, commit.claimed_at.isoformat())
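
One caveat on the suggestion above: assert statements are removed when Python runs with -O, so the guard would disappear in optimized mode. A sketch of an explicit check follows; the helper name `claimed_at_iso` is hypothetical and only illustrates the pattern.

```python
from datetime import datetime, timezone
from typing import Optional


def claimed_at_iso(claimed_at: Optional[datetime]) -> str:
    """Return claimed_at as an ISO string, failing loudly if it is missing."""
    if claimed_at is None:
        # Explicit raise: unlike assert, this is not stripped by `python -O`.
        raise ValueError("claimed_at must be set before put_commit")
    return claimed_at.isoformat()


example = claimed_at_iso(datetime(2026, 5, 1, tzinfo=timezone.utc))
```

A call site could then pass `claimed_at_iso(commit.claimed_at)` in place of `commit.claimed_at.isoformat()`.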

Comment thread src/cashet/store.py
  conn = self._connect()
- now = datetime.now(UTC)
- now_iso = now.isoformat()
+ now_iso = datetime.now(UTC).isoformat()

P3 Low Dead variable now_iso in find_by_fingerprint

After removing the UPDATE last_accessed_at, the variable now_iso computed on line 332 is no longer used. This wastes a datetime formatting call and may trigger linting warnings. Remove the line.

Comment thread src/cashet/store.py
  now = datetime.now(UTC).isoformat()
  try:
-     self._put_commit_row(conn, commit, now)
+     self._put_commit_row(conn, commit, commit.claimed_at.isoformat())

P3 Low Missing documentation or changelog for changed last_accessed_at semantics

The PR changes last_accessed_at to always equal claimed_at, and removes the side-effect update on reads. This is a breaking change for users relying on the old behavior (e.g., for monitoring or custom eviction logic). Add a note to the changelog and update the README or API docs to explain the new semantics.

Comment thread src/cashet/store.py
  now = datetime.now(UTC).isoformat()
  try:
-     self._put_commit_row(conn, commit, now)
+     self._put_commit_row(conn, commit, commit.claimed_at.isoformat())

P3 Low Potential timezone inconsistency with claimed_at.isoformat()

The call commit.claimed_at.isoformat() may include a timezone offset if claimed_at is timezone-aware with a non-UTC offset. Other timestamps in the store use datetime.now(UTC).isoformat() which always appends +00:00. This could cause comparison issues if the two ISO strings are compared lexicographically. Ensure claimed_at is always in UTC before calling isoformat().

Suggested change
- self._put_commit_row(conn, commit, commit.claimed_at.isoformat())
+ # Normalize to UTC before converting
+ claimed_utc = commit.claimed_at.astimezone(UTC)
+ self._put_commit_row(conn, commit, claimed_utc.isoformat())
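
A short worked example of why the lexicographic comparison can go wrong and how normalizing to UTC addresses it. The timestamps are made up for illustration. Note also that astimezone() on a naive datetime assumes local time, so the suggestion above only behaves as intended if claimed_at is timezone-aware.

```python
from datetime import datetime, timedelta, timezone

plus_two = timezone(timedelta(hours=2))

earlier = datetime(2026, 5, 1, 13, 0, tzinfo=plus_two)    # 11:00 UTC
later = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)  # 12:00 UTC

raw_earlier = earlier.isoformat()  # carries a +02:00 offset
raw_later = later.isoformat()      # carries a +00:00 offset

# As datetimes the order is correct, but as strings "13:00" sorts after "12:00".
datetime_order_ok = earlier < later
string_order_wrong = raw_earlier > raw_later

# Normalizing to UTC first makes string order agree with chronological order.
norm_earlier = earlier.astimezone(timezone.utc).isoformat()
string_order_fixed = norm_earlier < raw_later
```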

@jolovicdev
Owner Author

@ds-review review

jolovicdev closed this May 7, 2026
jolovicdev deleted the test/consolidate-timestamp-tracking branch May 10, 2026 23:26