Intermittent metadata write failures under load (possible connection pool exhaustion?) #9

Environment

- plex-postgresql version: v0.9.35  
- Install method: Docker (docker-compose.yml from repo)  
- Host OS: Ubuntu 22.04.4 LTS (kernel 5.15)  
- Docker version: 26.1.3  
- PostgreSQL: 15.6 (containerized, max_connections=200)  
- Plex Media Server: 1.41.0.8994 (linuxserver image)  
- Library size: ~42k movies, ~9k TV episodes  
- Storage backend: NFS mount (rclone → S3)

Summary

After migrating from SQLite using scripts/migrate_sqlite_to_pg.sh, the system works normally when idle or during light playback. Under heavier activity (a library scan plus 3–4 concurrent streams), metadata writes fail intermittently.

This looks like one of the following:

- connection pool starvation  
- retry/backoff not triggering consistently  
- or a translation edge-case between SQLite-style transactions and PostgreSQL
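
To make the second hypothesis concrete, here is a minimal sketch of the retry behavior I would expect around a transient failure: exponential backoff with full jitter. This is not the shim's code. TransientDbError, retry_with_backoff, and all the defaults are hypothetical names and values chosen for illustration.

```python
import random
import time


class TransientDbError(Exception):
    """Hypothetical stand-in for a retryable error such as a pool timeout."""


def retry_with_backoff(op, attempts=5, base_delay=0.05, max_delay=2.0,
                       sleep=time.sleep, rng=random.random):
    """Call op() until it succeeds or the attempts are used up.

    Before retry n (0-based) the delay is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)], i.e. "full jitter".
    """
    for attempt in range(attempts):
        try:
            return op()
        except TransientDbError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

If the shim escalates to Plex after a single failed retry, as the logs below suggest, a policy like this would give the pool more time to drain before giving up. The sleep and rng parameters are injectable only so the behavior can be checked without real waiting.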

Expected Behavior

Writes should succeed without blocking. Avoiding SQLite-style lock contention is one of the main benefits the project describes.

Actual Behavior

During scans or metadata refresh:

- Plex UI intermittently shows:
  Something went wrong while updating the library.

- Retrying usually succeeds.  
- No corruption observed, but failures are frequent enough to break automated scans.

Relevant Logs

Plex container:

[Req#8f3] ERROR - [DB] Failed to execute statement: database is locked (shim translated)
[Req#8f3] DEBUG - Retrying via pg retry policy...
[Req#8f3] ERROR - libpq: could not obtain connection from pool

Shim debug output (PLEX_PG_LOG_LEVEL=DEBUG):

PGPOOL: acquire timeout (50 connections in use)
pg_client.c: connection checkout exceeded 5000ms
Retryable step failed, escalating to Plex
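
For illustration, the acquire timeout above is what a fixed-size pool produces when every connection is checked out and none is returned in time. The sketch below is a toy model, not the shim's pool: BoundedPool, PoolTimeout, and simulate_starvation are hypothetical names, and integers stand in for real connections.

```python
import queue


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class BoundedPool:
    """Toy fixed-size pool; integers stand in for real connections."""

    def __init__(self, size):
        self._free = queue.Queue()
        for conn_id in range(size):
            self._free.put(conn_id)

    def acquire(self, timeout):
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout(f"no free connection after {timeout}s") from None

    def release(self, conn_id):
        self._free.put(conn_id)


def simulate_starvation(pool_size=2, timeout=0.05):
    """Hold every connection, then attempt one more checkout."""
    pool = BoundedPool(pool_size)
    held = [pool.acquire(timeout) for _ in range(pool_size)]
    try:
        pool.acquire(timeout)
        starved = False
    except PoolTimeout:
        starved = True
    # Releasing a single connection is enough for the next checkout to succeed.
    pool.release(held.pop())
    recovered = pool.acquire(timeout) is not None
    return starved, recovered
```

The point of the model: if holders never release (for example a connection pinned for the life of a stream, or left inside an open transaction), a larger pool only delays the timeout, which would match what I saw going from 50 to 80.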

PostgreSQL logs:

LOG:  connection authorized: user=plex database=plex
WARNING:  there is already a transaction in progress
STATEMENT:  BEGIN;
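
PostgreSQL emits this warning when BEGIN arrives on a session that is already inside a transaction. SQLite itself rejects a nested BEGIN, so the duplicate may come from a pooled connection being reused mid-transaction rather than from Plex nesting transactions. If nesting is the cause, one possible mapping is to turn inner BEGIN/COMMIT/ROLLBACK into savepoint statements. A minimal sketch of that idea follows; TxTranslator and the sp_N naming are hypothetical and I am not claiming the shim works this way.

```python
class TxTranslator:
    """Rewrite nested BEGIN/COMMIT/ROLLBACK into savepoint statements."""

    def __init__(self):
        self.depth = 0  # number of currently open (nested) transactions

    def translate(self, stmt):
        word = stmt.strip().rstrip(";").upper()
        if word == "BEGIN":
            self.depth += 1
            if self.depth == 1:
                return "BEGIN"  # outermost level is a real transaction
            return f"SAVEPOINT sp_{self.depth}"
        if word in ("COMMIT", "ROLLBACK"):
            if self.depth == 0:
                raise ValueError(f"{word} without a matching BEGIN")
            level = self.depth
            self.depth -= 1
            if level == 1:
                return word  # closes the real transaction
            if word == "COMMIT":
                return f"RELEASE SAVEPOINT sp_{level}"
            # Note: the savepoint stays defined after ROLLBACK TO in PostgreSQL.
            return f"ROLLBACK TO SAVEPOINT sp_{level}"
        return stmt  # everything else passes through unchanged
```

With one translator per connection, the sequence BEGIN, BEGIN, COMMIT, COMMIT would reach PostgreSQL as BEGIN, SAVEPOINT sp_2, RELEASE SAVEPOINT sp_2, COMMIT, and the warning above would not appear.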

docker-compose.yml (sanitized)

environment:
  - PLEX_PG_HOST=/var/run/postgresql
  - PLEX_PG_DATABASE=plex
  - PLEX_PG_USER=plex
  - PLEX_PG_PASSWORD=plex
  - PLEX_PG_SCHEMA=plex
  - PLEX_PG_POOL_SIZE=50
  - PLEX_PG_LOG_LEVEL=DEBUG

The only change from the defaults was increasing PostgreSQL max_connections.

What I’ve Tried

- Increasing pool size to 80 → same behavior  
- Switching from Unix socket to TCP → slightly worse latency  
- Verified schema with:
  ./scripts/doctor.sh --check
  No structural issues reported.

- Confirmed PostgreSQL itself is healthy:
  pg_isready -U plex
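
pg_isready only reports whether the server is accepting connections, so it says nothing about sessions stuck inside a transaction. A possible follow-up check, sketched under assumptions: the query uses the standard pg_stat_activity view, while stuck_sessions, the 5-second threshold, and the row layout are hypothetical. The query has to be run with a PostgreSQL client or driver of your choice; only the filtering is shown here.

```python
# Standard catalog view and columns; 'plex' is the database name from my config.
XACT_AGE_SQL = """
SELECT pid, state, EXTRACT(EPOCH FROM now() - xact_start) AS xact_age_s
FROM pg_stat_activity
WHERE datname = 'plex'
  AND backend_type = 'client backend'
  AND xact_start IS NOT NULL
"""


def stuck_sessions(rows, max_age_s=5.0):
    """Return pids that have sat idle inside a transaction too long.

    rows: iterable of (pid, state, xact_age_s) tuples as returned by
    XACT_AGE_SQL. Matches both 'idle in transaction' and
    'idle in transaction (aborted)'.
    """
    return [
        pid
        for pid, state, age in rows
        if state is not None
        and state.startswith("idle in transaction")
        and age > max_age_s
    ]
```

A non-empty result during a scan would point at connections being held with an open transaction, which would fit both the pool timeout and the duplicate BEGIN warning.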

Questions

1. Is a pooled connection intended to be held for the lifetime of a streaming query?  
2. Are there known cases where Plex opens nested SQLite transactions that don’t map cleanly?  
3. Would PgBouncer in transaction mode help, or break assumptions in the shim?  
4. Any recommended pool sizing relative to concurrent streams?
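
To frame questions 1 and 4, a rough sizing estimate can be made with Little's law: the average number of connections in use equals the checkout rate times the average hold time. The sketch below is only that arithmetic. estimate_pool_size and the headroom factor are hypothetical, and I have no measured rates for my setup yet.

```python
import math


def estimate_pool_size(checkouts_per_second, mean_hold_seconds, headroom=1.5):
    """Estimate a pool size from Little's law.

    Mean connections in use = arrival rate * mean hold time; headroom is
    an arbitrary multiplier to absorb bursts.
    """
    in_use = checkouts_per_second * mean_hold_seconds
    return math.ceil(in_use * headroom)
```

The hold time dominates: short per-statement checkouts keep the estimate small even at high rates, while a connection held for the duration of a stream or a scan pushes it up quickly. That is why the answer to question 1 matters for question 4.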

Additional Context

This is a fairly large remote-backed library (rclone mount), so I may be hitting a workload different from the local-disk assumptions.

Happy to run additional debug builds or capture traces if useful.
