Two-phase, transaction-safe garbage collection (quarantine -> grace -> purge) #1445

## Context

Follow-up to #1442 (data-loss bug in `dj.gc.scan` / `collect`, fixed in #1444). After that fix, `scan` correctly identifies referenced files. This issue covers the broader concern: GC is still **not transaction-safe**, even with the type mismatch resolved.

There is a TOCTOU (time-of-check to time-of-use) window between when `scan_*_references` enumerates references and when `collect` deletes orphans. A concurrent transaction that inserts a row referencing previously-orphaned content during that window will have its file deleted out from under it. This was raised by @dimitri-yatsenko during review of the #1442 root-cause analysis:
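
To make the window concrete, here is a minimal, deterministic sketch of the interleaving. It uses plain in-memory sets, not the real `dj.gc` functions; `scan_orphans` and `collect` below are hypothetical stand-ins for illustration only.

```python
# Hypothetical stand-ins for the object store and the reference set; not dj.gc code.
store = {"a.bin", "b.bin"}   # objects present in external storage
referenced = {"a.bin"}       # paths referenced by committed rows


def scan_orphans(store, referenced):
    """Time of check: snapshot the objects that no row references."""
    return set(store) - set(referenced)


def collect(store, orphans):
    """Time of use: delete whatever the earlier scan reported."""
    store.difference_update(orphans)


orphans = scan_orphans(store, referenced)  # GC sees b.bin as an orphan
referenced.add("b.bin")                    # concurrent insert commits inside the window
collect(store, orphans)                    # GC deletes b.bin anyway

dangling = referenced - store              # rows whose file is now gone
print(dangling)  # {'b.bin'}
```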

> In general, garbage collection is not 100% safe even after fixing since it's not transaction-safe. We need to implement a process of garbage retrieval for a bit and then a second step for complete removal.

## Proposed direction: two-phase API

Replace the single-call `collect()` with a quarantine → grace window → purge state machine. The single-call form can stay as a deprecated convenience, but production usage should be the two-phase form.
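
As a rough sketch of that state machine: the state names, the transition table, and the `step` helper are illustrative assumptions, not existing `dj.gc` API. The property worth noticing is that there is no direct edge from a live object to a purged one.

```python
from enum import Enum


class GCState(Enum):
    ACTIVE = "active"            # referenced, or not yet flagged by a scan
    QUARANTINED = "quarantined"  # orphan candidate, still recoverable
    PURGED = "purged"            # permanently deleted (terminal)


# Allowed transitions. Operation names follow the proposal; the table is assumed.
TRANSITIONS = {
    (GCState.ACTIVE, "quarantine"): GCState.QUARANTINED,
    (GCState.QUARANTINED, "restore"): GCState.ACTIVE,
    (GCState.QUARANTINED, "purge"): GCState.PURGED,
}


def step(state: GCState, op: str) -> GCState:
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, op)]
    except KeyError:
        raise ValueError(f"cannot {op} an object in state {state.value}") from None
```

Under this model, `step(GCState.ACTIVE, "purge")` raises `ValueError`: deletion is reachable only through quarantine.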

### Phase 1: `quarantine()`

- Identifies orphan candidates the same way `scan` does (now correct after #1442).
- Moves (or marks) candidates — does **not** delete.
- Idempotent: re-running picks up only objects newly orphaned since the last run.

Two options for state persistence — both worth a design pass:

- **Storage prefix (`_trash/`).** Move objects into a `_trash/` prefix under the same store. Atomic per object. Restore is a move-back. Storage layout encodes the quarantine state directly; no separate metadata table to keep in sync.
- **Database table (`_gc_quarantine`).** Record candidates in a project-level (cross-schema) table: `(path, store, quarantined_at, source_schema, source_table, source_attribute)`. Storage layout is unchanged; restore leaves the object in place and deletes the row.
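
A minimal sketch of the table variant, to show how idempotency could fall out of a primary key. It uses an in-memory SQLite database purely as a stand-in for the real backend. The column names come from the tuple above; the types, the `(store, path)` primary key, and the `quarantine` signature are assumptions, and `INSERT OR IGNORE` is SQLite-specific syntax.

```python
import sqlite3
import time

# Columns follow the tuple in the proposal; types and primary key are assumptions.
DDL = """
CREATE TABLE IF NOT EXISTS _gc_quarantine (
    path             TEXT NOT NULL,
    store            TEXT NOT NULL,
    quarantined_at   REAL NOT NULL,
    source_schema    TEXT,
    source_table     TEXT,
    source_attribute TEXT,
    PRIMARY KEY (store, path)
)
"""
COUNT = "SELECT COUNT(*) FROM _gc_quarantine"


def quarantine(conn, store, orphans, now=None):
    """Record orphan candidates; return how many rows were newly added."""
    now = time.time() if now is None else now
    before = conn.execute(COUNT).fetchone()[0]
    with conn:  # commit the whole run as one transaction
        conn.executemany(
            "INSERT OR IGNORE INTO _gc_quarantine "
            "(path, store, quarantined_at) VALUES (?, ?, ?)",
            [(path, store, now) for path in orphans],
        )
    return conn.execute(COUNT).fetchone()[0] - before


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
first = quarantine(conn, "main", ["x/1.bin", "x/2.bin"], now=100.0)
second = quarantine(conn, "main", ["x/1.bin", "x/2.bin", "x/3.bin"], now=200.0)
print(first, second)  # 2 1
```

Because already-recorded paths are skipped, a re-run does not reset their `quarantined_at`, so the grace clock keeps counting from the first sighting.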

### Phase 2: `purge()`

- Operates only on candidates whose `quarantined_at` is older than `grace_seconds` (sensible default >= 24h, configurable via `dj.config["gc.grace_seconds"]` or similar).
- **Re-checks reference status before delete** (cheap, metadata-only after #1442). If a row inserted during the grace window now references the candidate, **unquarantine** it instead of deleting.
- Deletes only candidates that pass the re-check.
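
The three rules above can be sketched as a single loop. This is a sketch under assumptions, not a proposed implementation: quarantine state is a plain dict, and `is_referenced` and `delete` are hypothetical callables standing in for the metadata re-check and the storage delete.

```python
import time

DAY = 24 * 60 * 60


def purge(quarantined, is_referenced, delete, now=None, grace_seconds=DAY):
    """Purge expired candidates, re-checking references first.

    quarantined: dict of path -> quarantined_at (epoch seconds); mutated in place.
    is_referenced: callable(path) -> bool, evaluated at purge time.
    delete: callable(path) that permanently removes the object.
    Returns (deleted, unquarantined) as lists of paths.
    """
    now = time.time() if now is None else now
    deleted, unquarantined = [], []
    for path, quarantined_at in list(quarantined.items()):
        if now - quarantined_at < grace_seconds:
            continue  # still inside the grace window; leave it quarantined
        if is_referenced(path):
            unquarantined.append(path)  # referenced during the window: keep it
        else:
            delete(path)
            deleted.append(path)
        del quarantined[path]  # either way, it leaves quarantine
    return deleted, unquarantined


quarantined = {"old-orphan": 0.0, "old-reclaimed": 0.0, "fresh": 90_000.0}
live_refs = {"old-reclaimed"}
removed = []
result = purge(quarantined, live_refs.__contains__, removed.append, now=100_000.0)
print(result)       # (['old-orphan'], ['old-reclaimed'])
print(quarantined)  # {'fresh': 90000.0}
```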

### Phase 0 / convenience: `restore(path_or_filter)`

- Explicit unquarantine for operator use: pulled-too-soon recovery, debugging, manual reclaim.
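
For the `_trash/` variant, restore is just the inverse move. Below is a minimal sketch on a local filesystem only; the `(root, rel_path)` signatures are assumptions (the proposal names `restore(path_or_filter)`), and remote backends would need their own move primitive.

```python
import os
import tempfile
from pathlib import Path

TRASH = "_trash"  # prefix name from the proposal


def quarantine(root: Path, rel_path: str) -> Path:
    """Move root/rel_path to root/_trash/rel_path and return the new location."""
    src, dst = root / rel_path, root / TRASH / rel_path
    dst.parent.mkdir(parents=True, exist_ok=True)
    os.replace(src, dst)  # a rename; atomic on POSIX within one filesystem
    return dst


def restore(root: Path, rel_path: str) -> Path:
    """Move root/_trash/rel_path back to root/rel_path."""
    src, dst = root / TRASH / rel_path, root / rel_path
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; refusing to overwrite")
    dst.parent.mkdir(parents=True, exist_ok=True)
    os.replace(src, dst)
    return dst


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "blobs").mkdir()
    (root / "blobs" / "abc.bin").write_bytes(b"payload")

    quarantine(root, "blobs/abc.bin")
    in_trash = (root / TRASH / "blobs" / "abc.bin").exists()

    restore(root, "blobs/abc.bin")
    restored = (root / "blobs" / "abc.bin").read_bytes()
```

Keeping the relative path intact under `_trash/` means restore needs no extra metadata to find the original location.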

## Open design questions

- **Cross-schema scope.** Quarantine state spans every schema sharing the store. The `_gc_quarantine` table needs a project-level home (not per-schema), or each scan must look up quarantine state across schemas. The `_trash/` prefix variant sidesteps this naturally.
- **Concurrent-insert handling.** What happens if a row is inserted referencing a quarantined path during the grace window? Phase 2's re-check covers it, but should we also block the insert at write time? Probably not, since the re-check is cheaper than coordination, but it is worth deciding explicitly.
- **Recovery from interrupted runs.** The state machine must be resumable: a `quarantine()` killed mid-run should leave the system in a defined state, and the next call should pick up where the previous one stopped.
- **Storage backend uniformity.** The `_trash/` prefix needs atomic move semantics on every supported backend (local, S3, UC Volumes). This should be verified per backend: a local rename within one filesystem is atomic, but object stores such as S3 generally implement a move as a copy followed by a delete.
- **CLI ergonomics.** `dj.gc.quarantine(...)` / `dj.gc.purge(...)` / `dj.gc.restore(...)` / `dj.gc.format_quarantine_stats(...)`. Same `*schemas` + `store_name` shape as today.
- **Backwards compatibility.** Keep `collect(dry_run=False)` as a single-call shorthand that does quarantine + purge in sequence (with `grace_seconds=0`), but emit a `DeprecationWarning` recommending the two-phase form for production. Default `dry_run=True` already protects against accidental runs.
- **Operational visibility.** Quarantine listing / stats should be queryable: how much is currently quarantined, oldest item, stores touched, expected purge time.
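
As one possible shape for the stats query in the last item, here is a sketch over in-memory entries. The `QuarantineStats` fields and the `(store, path, size_bytes, quarantined_at)` entry layout are assumptions; in particular, `size_bytes` is not among the columns proposed for `_gc_quarantine` and would have to come from the storage backend.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class QuarantineStats:
    count: int
    total_bytes: int
    oldest_quarantined_at: Optional[float]
    stores: Set[str]
    next_purge_at: Optional[float]


def quarantine_stats(entries, grace_seconds):
    """Summarize entries given as (store, path, size_bytes, quarantined_at)."""
    entries = list(entries)
    if not entries:
        return QuarantineStats(0, 0, None, set(), None)
    oldest = min(e[3] for e in entries)
    return QuarantineStats(
        count=len(entries),
        total_bytes=sum(e[2] for e in entries),
        oldest_quarantined_at=oldest,
        stores={e[0] for e in entries},
        # Earliest moment at which any entry becomes eligible for purge.
        next_purge_at=oldest + grace_seconds,
    )


entries = [
    ("main", "a.bin", 10, 100.0),
    ("main", "b.bin", 5, 50.0),
    ("raw", "c.bin", 1, 70.0),
]
stats = quarantine_stats(entries, grace_seconds=86_400)
```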

## Industry references

The pattern is well-established for non-transactional GC across an external store + a transactional DB:

- **Cassandra tombstones** with `gc_grace_seconds` (default 10 days)
- **Databricks `VACUUM`** with retention period (default 7 days)
- **S3 versioning** delete markers (soft delete) plus lifecycle rules that permanently expire noncurrent versions
- **POSIX deferred unlink** when a file has open handles

In each case the grace window absorbs in-flight transactions that the GC can't see at scan time.

## Scope

This is a design + implementation request. The right next step is a written spec covering:

1. State persistence choice (`_trash/` prefix vs. `_gc_quarantine` table).
2. Public API surface (`quarantine` / `purge` / `restore` / config keys).
3. Concurrency model (re-check before purge, behavior on interrupted runs).
4. Migration / compatibility (what `collect()` does going forward).
5. Test plan including concurrent-insert race coverage.
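
For item 4, one possible shape of the deprecated shorthand described under backwards compatibility. All three functions are hypothetical stand-ins operating on in-memory sets; only the `dry_run` default and the use of `DeprecationWarning` come from the text above.

```python
import warnings


def quarantine(store, referenced):
    """Stand-in for phase 1: return orphan candidates without deleting anything."""
    return sorted(set(store) - set(referenced))


def purge(store, candidates, referenced, grace_seconds):
    """Stand-in for phase 2: re-check references, then delete.

    grace_seconds is accepted only to mirror the proposed signature; this stub
    keeps no timestamps, so it behaves as if every candidate were eligible.
    """
    deleted = [path for path in candidates if path not in referenced]
    for path in deleted:
        store.discard(path)
    return deleted


def collect(store, referenced, dry_run=True):
    """Deprecated single-call form: quarantine + purge with no grace window."""
    candidates = quarantine(store, referenced)
    if dry_run:
        return candidates  # report only; nothing is deleted
    warnings.warn(
        "collect(dry_run=False) purges without a grace window; "
        "use quarantine() followed by purge() in production",
        DeprecationWarning,
        stacklevel=2,
    )
    return purge(store, candidates, referenced, grace_seconds=0)
```

With the default `dry_run=True`, the call only reports candidates and emits no warning; the warning fires only on the destructive path.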

Happy to draft that spec; flagging here so it doesn't get lost.
