Distributed Rate Limiter

A production-grade distributed rate limiting system built with Node.js, TypeScript, Redis, and Docker. Implements multiple rate limiting algorithms with atomic Redis operations, Prometheus observability, configurable fault tolerance, and a horizontally scalable architecture behind an Nginx load balancer.

Overview

Most rate limiter implementations demonstrate the concept. This one demonstrates what a rate limiter looks like when it needs to actually work — across multiple servers, under load, with Redis potentially unavailable, and with the operational visibility to know what's happening at all times.

What makes this different from a typical implementation:

Atomic Redis operations via Lua scripts — no race conditions under concurrency
Strategy pattern — algorithms are interchangeable without changing middleware
Per-tier dynamic limits (free/pro/enterprise) resolved at request time
Configurable fail-open/fail-closed behavior when Redis is unreachable
Prometheus metrics and structured pino logs on every request
Proper HTTP rate limit headers (X-RateLimit-Remaining, Retry-After)
Integration tests against a real Redis instance via testcontainers
Quantified load test results

Architecture

                         ┌─────────────────────────────┐
                         │          Client              │
                         └──────────────┬──────────────┘
                                        │
                         ┌──────────────▼──────────────┐
                         │       Nginx (Port 80)        │
                         │     Round-robin upstream     │
                         └──┬─────────────┬────────────┘
                            │             │             │
               ┌────────────▼──┐  ┌───────▼────┐  ┌───▼────────┐
               │   Node App 1  │  │ Node App 2 │  │ Node App 3 │
               │   Port 3001   │  │ Port 3002  │  │ Port 3003  │
               └──────┬────────┘  └─────┬──────┘  └────┬───────┘
                      │                 │               │
               ┌──────▼─────────────────▼───────────────▼───────┐
               │                  Redis 7                        │
               │   Rate limit counters  ·  Tier configs          │
               │   User→tier mappings   ·  Blocked key index     │
               └────────────────────────────────────────────────┘

Every Node instance shares a single Redis. A request hitting any server reads and writes to the same keys, so the limit holds correctly across the cluster.

Algorithms

Fixed Window

Divides time into fixed intervals (e.g., 0–60s, 60–120s). Counts requests per interval per key. Fast and memory-efficient.

Tradeoff: A client can send double the limit across a window boundary. With a limit of 100, a client can send 100 requests at second 59 and 100 more at second 61: each batch is within its own window, so all 200 pass in about two seconds.

 window 1        │ window 2
 ────────────────┼────────────────
 [░░░░░░░░░░ 100]│[░░░░░░░░░░ 100]
                 ↑ boundary burst possible here

Use when: The exact request rate at boundaries doesn't matter. Good for billing/quota systems where daily or hourly buckets are fine.
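
The boundary burst can be reproduced with a small in-memory model. This is only a sketch: a Map stands in for the Redis counter, old buckets are never expired (Redis would handle that with a TTL), and FixedWindowModel, allow, and the injected nowMs clock are illustrative names, not the project's API.

```typescript
// In-memory model of a fixed-window counter (a Map stands in for Redis).
class FixedWindowModel {
  private counts = new Map<string, number>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request arriving at `nowMs` is allowed.
  allow(key: string, nowMs: number): boolean {
    // Every request in the same interval maps to the same bucket id.
    const windowId = Math.floor(nowMs / this.windowMs);
    const bucket = `${key}:${windowId}`;
    const count = (this.counts.get(bucket) ?? 0) + 1;
    this.counts.set(bucket, count);
    return count <= this.limit;
  }
}

// Limit 100 per 60s: 100 requests at second 59, then 100 more at second 61.
const fw = new FixedWindowModel(100, 60_000);
let allowedAcrossBoundary = 0;
for (let i = 0; i < 100; i++) {
  if (fw.allow("user:123", 59_000)) allowedAcrossBoundary++;
}
for (let i = 0; i < 100; i++) {
  if (fw.allow("user:123", 61_000)) allowedAcrossBoundary++;
}
console.log(allowedAcrossBoundary); // 200
```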

Sliding Window

Stores each request as a timestamped entry in a Redis Sorted Set. At each request, removes entries older than the window size, then checks the count.

Tradeoff: Higher memory usage (one entry per request vs. one counter). More accurate than Fixed Window.

 now - 60s                         now
 ──────────────────────────────────┤
         [req][req][req][req][req] │← only these count

Use when: You need accurate per-user throttling with no boundary loophole. Most rate limiting use cases.
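
The same scenario against a sliding window shows the loophole closing. This is an in-memory sketch, assuming an array of timestamps in place of the Redis Sorted Set and assuming that only allowed requests are recorded (the project may also record rejected ones). SlidingWindowModel and its method names are hypothetical.

```typescript
// In-memory model of a sliding-window log (an array stands in for the sorted set).
class SlidingWindowModel {
  private log = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop entries older than the window (the role ZREMRANGEBYSCORE plays in Redis).
    const recent = (this.log.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    // Record the request only when it is allowed (the role ZADD plays in Redis).
    if (allowed) recent.push(nowMs);
    this.log.set(key, recent);
    return allowed;
  }
}

// Limit 100 per 60s: 100 requests at second 59, then 100 more at second 61.
const sw = new SlidingWindowModel(100, 60_000);
let firstBurst = 0;
let secondBurst = 0;
for (let i = 0; i < 100; i++) {
  if (sw.allow("user:123", 59_000)) firstBurst++;
}
for (let i = 0; i < 100; i++) {
  if (sw.allow("user:123", 61_000)) secondBurst++;
}
console.log(firstBurst, secondBurst); // 100 0
```

The second burst is rejected in full because the first 100 entries are still inside the trailing 60 seconds.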

Token Bucket

Each user has a bucket that refills tokens at a fixed rate. Requests consume one token. Burst traffic is allowed up to bucket capacity.

Tradeoff: More complex to implement correctly (refill calculation must be atomic). Better for APIs where short bursts should be tolerated.

 capacity: 100 tokens
 refill: 10 tokens/second

 burst of 100 requests → allowed immediately (bucket is now empty)
 further requests → allowed at 10/second as tokens refill

Use when: Clients legitimately burst (mobile apps, batch jobs) and you want to absorb that without rejecting requests.
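
The refill arithmetic can be sketched as a pure function. This is an assumption-laden illustration: BucketState, takeToken, and the choice to start new keys with a full bucket are not taken from the project, and in Redis this logic would need to run inside a single Lua script to stay atomic.

```typescript
interface BucketState {
  tokens: number;       // tokens currently available (may be fractional)
  lastRefillMs: number; // time of the last refill calculation
}

// Refills based on elapsed time, then tries to consume one token.
function takeToken(
  state: BucketState | undefined,
  capacity: number,
  refillPerSec: number,
  nowMs: number,
): { allowed: boolean; state: BucketState } {
  // Assumption: a key seen for the first time starts with a full bucket.
  const prev = state ?? { tokens: capacity, lastRefillMs: nowMs };
  const elapsedSec = Math.max(0, nowMs - prev.lastRefillMs) / 1000;
  // Refill never exceeds capacity.
  const tokens = Math.min(capacity, prev.tokens + elapsedSec * refillPerSec);
  const allowed = tokens >= 1;
  return {
    allowed,
    state: { tokens: allowed ? tokens - 1 : tokens, lastRefillMs: nowMs },
  };
}

// Capacity 100, refill 10 tokens/second: fire 120 requests at t=0.
let bucket: BucketState | undefined;
let burstAllowed = 0;
for (let i = 0; i < 120; i++) {
  const r = takeToken(bucket, 100, 10, 0);
  bucket = r.state;
  if (r.allowed) burstAllowed++;
}
console.log(burstAllowed); // 100

// One second later, 10 tokens have refilled: fire 20 more requests.
let laterAllowed = 0;
for (let i = 0; i < 20; i++) {
  const r = takeToken(bucket, 100, 10, 1000);
  bucket = r.state;
  if (r.allowed) laterAllowed++;
}
console.log(laterAllowed); // 10
```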

Leaky Bucket

Requests drain from the bucket at a constant output rate, regardless of input rate. Smooths bursty traffic into a steady stream.

Tradeoff: Even small bursts get queued/rejected if the output rate is already saturated. Less forgiving than Token Bucket.

Use when: You need to protect a downstream system that can't handle any variance in request rate (e.g., a payment processor).
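
A minimal sketch of the leaky bucket as a meter, assuming the rejecting variant (requests that would overflow the bucket are blocked rather than queued). LeakyState, offer, and the parameters are hypothetical names chosen for illustration.

```typescript
interface LeakyState {
  level: number;      // how full the bucket is (may be fractional)
  lastLeakMs: number; // time of the last leak calculation
}

// Drains the bucket for the elapsed time, then tries to add one request.
function offer(
  state: LeakyState | undefined,
  capacity: number,
  leakPerSec: number,
  nowMs: number,
): { allowed: boolean; state: LeakyState } {
  const prev = state ?? { level: 0, lastLeakMs: nowMs };
  const elapsedSec = Math.max(0, nowMs - prev.lastLeakMs) / 1000;
  // The bucket drains at a constant rate and never goes below empty.
  const level = Math.max(0, prev.level - elapsedSec * leakPerSec);
  const allowed = level + 1 <= capacity;
  return {
    allowed,
    state: { level: allowed ? level + 1 : level, lastLeakMs: nowMs },
  };
}

// Capacity 5, drain 1 request/second: 6 requests at t=0, then 3 more at t=2s.
let leaky: LeakyState | undefined;
const results: boolean[] = [];
for (const t of [0, 0, 0, 0, 0, 0, 2000, 2000, 2000]) {
  const r = offer(leaky, 5, 1, t);
  leaky = r.state;
  results.push(r.allowed);
}
console.log(results.filter(Boolean).length); // 7
```

The sixth request at t=0 and the third request at t=2s are rejected: only two slots drained in the two seconds between the bursts.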

Algorithm Comparison

Algorithm	Burst Handling	Memory	Boundary Accuracy	Complexity
Fixed Window	None	O(1)	Low	Low
Sliding Window	None	O(requests)	High	Medium
Token Bucket	Yes	O(1)	High	Medium
Leaky Bucket	No (smoothed)	O(1)	High	Medium

Features

4 rate limiting algorithms — Fixed Window, Sliding Window, Token Bucket, Leaky Bucket
Lua atomic scripts — every algorithm runs as a single Redis command. No race conditions.
Strategy pattern — swap algorithms via environment variable without touching middleware
Dynamic per-tier limits — free/pro/enterprise quotas stored in Redis, resolved per request
Runtime limit updates — change tier limits via admin API without restarting servers
Fail-open / fail-closed — configurable behavior when Redis is unreachable
Proper HTTP headers — X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After
Prometheus metrics — request counts, block counts, Redis operation latency histograms
Structured logging — pino JSON logs with per-request context (userId, tier, algorithm, allowed, latencyMs)
Horizontal scaling — 3 Node instances behind Nginx, all sharing one Redis
Integration tests — Vitest + testcontainers (real Redis, not mocks)
CI pipeline — GitHub Actions: lint → type check → unit tests → integration tests → build
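
The strategy pattern listed above might look roughly like this. It is a sketch under stated assumptions: the RateLimiter shape, the Decision type, pickLimiter, and handle are illustrative names, the check is synchronous for brevity (a Redis-backed limiter would return a Promise), and only one toy strategy is registered.

```typescript
interface Decision {
  allowed: boolean;
  remaining: number;
}

// Every algorithm implements the same contract, so callers never branch on type.
interface RateLimiter {
  check(key: string, limit: number, nowMs: number): Decision;
}

// Toy strategy: per-minute counter held in memory instead of Redis.
class ToyFixedWindow implements RateLimiter {
  private counts = new Map<string, number>();

  check(key: string, limit: number, nowMs: number): Decision {
    const bucket = `${key}:${Math.floor(nowMs / 60_000)}`;
    const count = (this.counts.get(bucket) ?? 0) + 1;
    this.counts.set(bucket, count);
    return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
  }
}

// Registry keyed by the RATE_LIMIT_ALGORITHM value.
const registry = new Map<string, () => RateLimiter>([
  ["fixed_window", () => new ToyFixedWindow()],
  // sliding_window, token_bucket, and leaky_bucket would register here.
]);

function pickLimiter(name: string): RateLimiter {
  const factory = registry.get(name);
  if (!factory) throw new Error(`Unknown or unregistered algorithm: ${name}`);
  return factory();
}

// Middleware-style caller: depends only on the interface, not on the algorithm.
function handle(limiter: RateLimiter, userId: string, limit: number, nowMs: number): number {
  return limiter.check(`rl:${userId}`, limit, nowMs).allowed ? 200 : 429;
}

const limiter = pickLimiter("fixed_window");
console.log(handle(limiter, "user:123", 1, 0)); // 200
console.log(handle(limiter, "user:123", 1, 0)); // 429
```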

Tech Stack

Layer	Tool	Why
Runtime	Node.js 20 LTS	Stable, async I/O fits this workload
Language	TypeScript 5	Type-safe algorithm interface, catches config errors at compile time
Framework	Express 5	Minimal, middleware-first
Redis client	ioredis	Lua scripting API, pipeline support, cluster-ready
Metrics	prom-client	Prometheus standard; works with Grafana out of the box
Logging	pino	Structured JSON, ~8x faster than winston
Validation	zod	Runtime schema validation for env config and request inputs
Container	Docker + Compose	Reproducible multi-service local environment
Load balancer	Nginx	Round-robin upstream, production-realistic setup
Testing	Vitest	Fast, native ESM, works with testcontainers
Integration	testcontainers	Spins up a real Redis container in CI — no mocks for Redis behavior
Load testing	autocannon	Node-native, scriptable, produces p99 latency stats
CI	GitHub Actions	Automated lint + test + build on every push

Project Structure

distributed-rate-limiter/
│
├── src/
│   ├── algorithms/
│   │   ├── base.ts               # RateLimiter abstract interface
│   │   ├── FixedWindow.ts
│   │   ├── SlidingWindow.ts
│   │   ├── TokenBucket.ts
│   │   └── LeakyBucket.ts
│   │
│   ├── middleware/
│   │   └── rateLimiter.ts        # Strategy picker + header injection
│   │
│   ├── services/
│   │   ├── redis.ts              # ioredis client, retry logic, error events
│   │   ├── limitsConfig.ts       # Tier config loader from Redis
│   │   └── metrics.ts            # prom-client counters and histograms
│   │
│   ├── scripts/lua/
│   │   ├── fixedWindow.lua
│   │   ├── slidingWindow.lua
│   │   └── tokenBucket.lua
│   │
│   ├── routes/
│   │   ├── stats.ts              # /stats, /blocked, /limits
│   │   ├── health.ts             # /health — liveness + Redis ping
│   │   └── metrics.ts            # /metrics — Prometheus scrape endpoint
│   │
│   ├── config/
│   │   └── index.ts              # Zod-validated env config
│   │
│   ├── utils/
│   │   ├── logger.ts             # pino instance
│   │   └── keyBuilder.ts         # Centralized Redis key namespacing
│   │
│   ├── app.ts                    # Express setup (no listen)
│   └── server.ts                 # Port binding, startup
│
├── tests/
│   ├── unit/
│   │   ├── FixedWindow.test.ts
│   │   ├── SlidingWindow.test.ts
│   │   └── TokenBucket.test.ts
│   └── integration/
│       └── rateLimiter.test.ts   # Real Redis via testcontainers
│
├── scripts/
│   ├── loadTest.ts               # autocannon load test
│   └── seedTiers.ts              # Seed Redis with tier configs
│
├── infra/
│   ├── Dockerfile
│   ├── docker-compose.yml
│   └── nginx.conf
│
├── .github/workflows/ci.yml
├── .env.example
├── tsconfig.json
├── vitest.config.ts
├── package.json
└── README.md

Getting Started

Prerequisites

Docker and Docker Compose
Node.js 20+ (for local development without Docker)

Run with Docker Compose (recommended)

git clone https://github.com/your-username/distributed-rate-limiter.git
cd distributed-rate-limiter

cp .env.example .env

docker compose up --build

This starts:

3 Node.js app instances (ports 3001, 3002, 3003)
Redis 7 on port 6379
Nginx load balancer on port 80

Test it:

curl -i http://localhost/api/test -H "x-user-id: user:123"

Run locally

npm install

# Requires a local Redis instance
redis-server

# Seed tier configs
npx tsx scripts/seedTiers.ts

npm run dev

Configuration

All config is environment-driven and validated at startup with Zod. If a required variable is missing or has the wrong type, the process exits with a clear error — not a runtime crash 10 requests in.

# .env.example

PORT=3000
REDIS_URL=redis://localhost:6379

# Algorithm: fixed_window | sliding_window | token_bucket | leaky_bucket
RATE_LIMIT_ALGORITHM=sliding_window

# What to do when Redis is unreachable: fail_open | fail_closed
REDIS_FAILURE_MODE=fail_open

# Default limits (overridden by per-tier Redis config if set)
DEFAULT_LIMIT=100
DEFAULT_WINDOW_MS=60000

LOG_LEVEL=info
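
To illustrate the fail-fast idea, here is a dependency-free stand-in for the validation step. The project itself uses Zod; this hand-rolled parseConfig is not its schema, and the Config field names and the choice to treat every variable as required are assumptions.

```typescript
const ALGORITHMS = ["fixed_window", "sliding_window", "token_bucket", "leaky_bucket"] as const;
const FAILURE_MODES = ["fail_open", "fail_closed"] as const;

interface Config {
  port: number;
  redisUrl: string;
  algorithm: (typeof ALGORITHMS)[number];
  failureMode: (typeof FAILURE_MODES)[number];
  defaultLimit: number;
  defaultWindowMs: number;
}

// Collects every problem, then throws once so startup fails with a full report.
function parseConfig(env: Record<string, string | undefined>): Config {
  const errors: string[] = [];

  const positiveInt = (name: string): number => {
    const n = Number(env[name]);
    if (!Number.isInteger(n) || n <= 0) errors.push(`${name} must be a positive integer`);
    return n;
  };

  const oneOf = <T extends string>(name: string, allowed: readonly T[]): T => {
    const v = env[name] ?? "";
    if (!(allowed as readonly string[]).includes(v)) {
      errors.push(`${name} must be one of: ${allowed.join(", ")}`);
    }
    return v as T;
  };

  const redisUrl = env.REDIS_URL ?? "";
  if (redisUrl === "") errors.push("REDIS_URL is required");

  const config: Config = {
    port: positiveInt("PORT"),
    redisUrl,
    algorithm: oneOf("RATE_LIMIT_ALGORITHM", ALGORITHMS),
    failureMode: oneOf("REDIS_FAILURE_MODE", FAILURE_MODES),
    defaultLimit: positiveInt("DEFAULT_LIMIT"),
    defaultWindowMs: positiveInt("DEFAULT_WINDOW_MS"),
  };

  if (errors.length > 0) throw new Error(`Invalid configuration:\n${errors.join("\n")}`);
  return config;
}

const config = parseConfig({
  PORT: "3000",
  REDIS_URL: "redis://localhost:6379",
  RATE_LIMIT_ALGORITHM: "sliding_window",
  REDIS_FAILURE_MODE: "fail_open",
  DEFAULT_LIMIT: "100",
  DEFAULT_WINDOW_MS: "60000",
});
console.log(config.algorithm, config.defaultLimit); // sliding_window 100
```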

Tier limits (stored in Redis)

# Set tier quotas
HSET rl:tiers free 100 pro 1000 enterprise 10000

# Assign a user to a tier
SET rl:user:123:tier pro

Or use the seed script:

npx tsx scripts/seedTiers.ts
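
Resolving a limit at request time could then work along these lines. This is a sketch: two Maps stand in for the rl:tiers hash and the per-user tier keys, resolveLimit is a hypothetical helper, and the fallback to the default limit for unknown users or tiers is an assumption.

```typescript
// `tiers` mirrors the rl:tiers hash; `userTiers` mirrors the rl:<userId>:tier keys.
function resolveLimit(
  userId: string,
  tiers: Map<string, number>,
  userTiers: Map<string, string>,
  defaultLimit: number,
): { tier: string | null; limit: number } {
  const tier = userTiers.get(`rl:${userId}:tier`) ?? null;
  const tierLimit = tier !== null ? tiers.get(tier) : undefined;
  // Assumption: users without a tier, or with an unknown tier, get the default.
  return { tier, limit: tierLimit ?? defaultLimit };
}

const tiers = new Map([
  ["free", 100],
  ["pro", 1000],
  ["enterprise", 10000],
]);
const userTiers = new Map([["rl:user:123:tier", "pro"]]);

const known = resolveLimit("user:123", tiers, userTiers, 100);
const unknown = resolveLimit("user:999", tiers, userTiers, 100);
console.log(known.tier, known.limit);     // pro 1000
console.log(unknown.tier, unknown.limit); // null 100
```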

API Reference

Rate-limited endpoint

GET /api/test
Headers:
  x-user-id: <string>   # identifies the rate-limited entity

Response headers on every request:

X-RateLimit-Limit:      1000
X-RateLimit-Remaining:  847
X-RateLimit-Reset:      1718123456

On limit exceeded (429):

{
  "error": "Too Many Requests",
  "retryAfter": 34,
  "limit": 1000,
  "windowMs": 60000
}
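
The headers and the 429 body can be derived from a single limiter result. The sketch below assumes a LimitResult shape with a resetAtMs field and treats X-RateLimit-Reset as Unix seconds, which matches the example value above; the function names are illustrative.

```typescript
interface LimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAtMs: number; // when the window or bucket resets, in epoch milliseconds
}

// Whole seconds until the client may retry, never less than 1.
function retryAfterSeconds(r: LimitResult, nowMs: number): number {
  return Math.max(1, Math.ceil((r.resetAtMs - nowMs) / 1000));
}

function rateLimitHeaders(r: LimitResult, nowMs: number): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(r.limit),
    "X-RateLimit-Remaining": String(Math.max(0, r.remaining)),
    "X-RateLimit-Reset": String(Math.ceil(r.resetAtMs / 1000)), // Unix seconds
  };
  // Retry-After is only meaningful on a blocked request.
  if (!r.allowed) headers["Retry-After"] = String(retryAfterSeconds(r, nowMs));
  return headers;
}

function tooManyRequestsBody(r: LimitResult, windowMs: number, nowMs: number) {
  return {
    error: "Too Many Requests",
    retryAfter: retryAfterSeconds(r, nowMs),
    limit: r.limit,
    windowMs,
  };
}

const blocked: LimitResult = {
  allowed: false,
  limit: 1000,
  remaining: 0,
  resetAtMs: 1718123456000,
};
const blockedNowMs = 1718123422000; // 34 seconds before the reset
console.log(rateLimitHeaders(blocked, blockedNowMs)["Retry-After"]);     // 34
console.log(tooManyRequestsBody(blocked, 60000, blockedNowMs).retryAfter); // 34
```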

Admin API

`GET /health`

Liveness check with Redis connectivity status.

{
  "status": "ok",
  "redis": "connected",
  "degraded": false,
  "uptime": 3824,
  "timestamp": "2025-06-15T10:34:00Z"
}

When Redis is down:

{
  "status": "ok",
  "redis": "disconnected",
  "degraded": true,
  "failureMode": "fail_open"
}

`GET /stats`

Aggregate request statistics.

{
  "allowedRequests": 184523,
  "blockedRequests": 4821,
  "blockRate": "2.54%",
  "algorithm": "sliding_window",
  "uptime": 3824
}

`GET /limits`

Current tier configuration.

{
  "tiers": {
    "free": 100,
    "pro": 1000,
    "enterprise": 10000
  },
  "windowMs": 60000,
  "algorithm": "sliding_window"
}

`POST /limits/:tier`

Update a tier's limit at runtime without restarting servers. Change takes effect on the next request.

curl -X POST http://localhost/limits/pro \
  -H "Content-Type: application/json" \
  -d '{"limit": 2000}'

{
  "tier": "pro",
  "previousLimit": 1000,
  "newLimit": 2000,
  "effectiveAt": "2025-06-15T10:35:00Z"
}

`GET /blocked`

Keys currently at or over their limit, with TTL remaining.

{
  "count": 2,
  "keys": [
    { "key": "rl:user:456", "ttlSeconds": 28 },
    { "key": "rl:user:789", "ttlSeconds": 51 }
  ]
}

`GET /metrics`

Prometheus-compatible text format. Scrape with any Prometheus-compatible collector.

# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{status="200",algorithm="sliding_window"} 184523
http_requests_total{status="429",algorithm="sliding_window"} 4821

# HELP rate_limit_blocked_total Requests blocked by rate limiter
# TYPE rate_limit_blocked_total counter
rate_limit_blocked_total{tier="free",algorithm="sliding_window"} 3201
rate_limit_blocked_total{tier="pro",algorithm="sliding_window"} 1620

# HELP redis_operation_duration_seconds Redis Lua script execution latency
# TYPE redis_operation_duration_seconds histogram
redis_operation_duration_seconds_bucket{le="0.001"} 178432
redis_operation_duration_seconds_bucket{le="0.005"} 184501

Observability

Logs

Every request produces a structured JSON log line via pino:

{
  "level": "info",
  "time": "2025-06-15T10:34:00.000Z",
  "userId": "user:123",
  "tier": "pro",
  "algorithm": "sliding_window",
  "allowed": true,
  "remaining": 847,
  "latencyMs": 1.4,
  "requestId": "a3f9b2c1"
}

On a block:

{
  "level": "warn",
  "userId": "user:456",
  "tier": "free",
  "allowed": false,
  "remaining": 0,
  "retryAfterMs": 28000,
  "latencyMs": 0.9
}

Metrics

The /metrics endpoint exposes Prometheus-format metrics. Wire it to a Prometheus + Grafana stack to get dashboards tracking:

Request throughput (allowed vs. blocked)
Block rate by tier and algorithm
Redis operation p50/p95/p99 latency
Active rate-limited keys over time

Fault Tolerance

What happens when Redis goes down? This system has a configurable answer, set via the REDIS_FAILURE_MODE environment variable:

`fail_open` (default)

When Redis is unreachable, all requests are allowed through. The system degrades gracefully — rate limiting stops temporarily but the application continues serving traffic.

Use when: Availability matters more than strict enforcement. Most user-facing APIs.

`fail_closed`

When Redis is unreachable, all requests are blocked with a 503. Nothing gets through.

Use when: The rate limiter exists to protect a downstream system that would be overwhelmed without it. Strict enforcement required.

In both modes:

redis_errors_total Prometheus counter increments
Every request logs a redis_failure: true field
/health returns degraded: true so your alerting fires
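
The two modes reduce to one decision taken when the Redis call throws. A minimal sketch, assuming a limiter check that rejects its Promise when Redis is unreachable; Outcome, onRedisFailure, and checkWithFallback are hypothetical names.

```typescript
type FailureMode = "fail_open" | "fail_closed";

interface Outcome {
  allowed: boolean;
  rejectStatus?: 429 | 503; // set only when the middleware itself rejects
  redisFailure: boolean;    // mirrors the redis_failure log field
}

// What to do with a request when Redis could not be consulted.
function onRedisFailure(mode: FailureMode): Outcome {
  return mode === "fail_open"
    ? { allowed: true, redisFailure: true }
    : { allowed: false, rejectStatus: 503, redisFailure: true };
}

// Wraps any limiter check; `check` is assumed to reject when Redis is down.
async function checkWithFallback(
  check: () => Promise<boolean>,
  mode: FailureMode,
): Promise<Outcome> {
  try {
    const allowed = await check();
    return allowed
      ? { allowed: true, redisFailure: false }
      : { allowed: false, rejectStatus: 429, redisFailure: false };
  } catch {
    // A real implementation would also increment redis_errors_total here.
    return onRedisFailure(mode);
  }
}

console.log(onRedisFailure("fail_open").allowed);        // true
console.log(onRedisFailure("fail_closed").rejectStatus); // 503
```

A caller would use it as `await checkWithFallback(() => limiter.check(key), mode)` and respond with `rejectStatus` whenever `allowed` is false.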

Testing

Unit tests

Each algorithm is tested in isolation with a mocked Redis client. Tests cover:

Requests exactly at the limit (should allow)
Request one over the limit (should block)
Window reset behavior (counter should clear after TTL)
Token refill rate (Token Bucket)
Boundary bursting (Fixed Window edge case)

npm run test:unit

Integration tests

Vitest + testcontainers spins up a real Redis 7 container for each test suite. No mocks for Redis behavior — actual INCR, ZADD, Lua script execution.

npm run test:integration

All tests

npm test

Load Test Results

Tested with autocannon: 50 concurrent connections, 30-second duration, Sliding Window algorithm, 3 app instances behind Nginx.

Running 30s test @ http://localhost/api/test
50 connections

┌─────────┬───────┬───────┬───────┬───────┬──────────┬─────────┬───────┐
│ Stat    │ 2.5%  │ 50%   │ 97.5% │ 99%   │ Avg      │ Stdev   │ Max   │
├─────────┼───────┼───────┼───────┼───────┼──────────┼─────────┼───────┤
│ Latency │ 2 ms  │ 3 ms  │ 6 ms  │ 8 ms  │ 3.12 ms  │ 1.8 ms  │ 42 ms │
└─────────┴───────┴───────┴───────┴───────┴──────────┴─────────┴───────┘

Req/Bytes counts sampled once per second.
45k requests in 30.01s, 12.4 MB read
Requests/sec: 1499.8
Errors: 0
Non-2xx or 3xx responses: 8432 (429s — expected, within limit enforcement)

p99 latency: 8ms including Nginx, middleware, Lua script execution, and Redis round-trip.

Design Decisions

Why Lua scripts instead of MULTI/EXEC transactions?

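A MULTI/EXEC transaction queues commands and runs them together, but it cannot branch on a value read inside the transaction. A check-then-increment therefore needs WATCH, which is optimistic locking: if another client modifies the watched key before EXEC, the transaction aborts and the client must retry. Under high concurrency, this causes retries and complexity.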

Lua scripts execute atomically on the Redis server. No other command can run while the script is executing. The entire rate limit check-and-increment is a single operation.

This is the correct solution to the race condition:

Server A reads 99          ─┐
Server B reads 99          ─┤  Both see count < limit (100)
Server A increments to 100 ─┤  Both allow
Server B increments to 101 ─┘  Count is now 101, limit broken

With Lua, the INCR and TTL check happen in one atomic script, so the interleaving above cannot occur.
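
The difference can be modeled without Redis. In this sketch a local variable plays the role of the counter key, racyRun replays the interleaving from the diagram, and atomicRun treats check-and-increment as one indivisible step, which is the guarantee a Lua script provides. All names are illustrative.

```typescript
const LIMIT = 100;

// Non-atomic: read and write are separate steps, so two servers can interleave.
function racyRun(start: number): { allowed: number; count: number } {
  let count = start;
  const readA = count; // Server A reads
  const readB = count; // Server B reads before A has written
  let allowed = 0;
  if (readA < LIMIT) { count += 1; allowed++; } // A increments
  if (readB < LIMIT) { count += 1; allowed++; } // B increments on a stale read
  return { allowed, count };
}

// Atomic: each check-and-increment completes before the next one starts.
function atomicRun(start: number): { allowed: number; count: number } {
  let count = start;
  const checkAndIncrement = (): boolean => {
    if (count >= LIMIT) return false;
    count += 1;
    return true;
  };
  let allowed = 0;
  if (checkAndIncrement()) allowed++; // Server A
  if (checkAndIncrement()) allowed++; // Server B sees A's write
  return { allowed, count };
}

const racy = racyRun(99);
const atomic = atomicRun(99);
console.log(racy.allowed, racy.count);     // 2 101
console.log(atomic.allowed, atomic.count); // 1 100
```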

Why ioredis over the official `redis` package?

Lua scripting: ioredis has a cleaner .defineCommand() API for named Lua scripts
Pipeline support: batching multiple commands in one round-trip
Cluster mode support: if this project extends to Redis Cluster, ioredis handles key slot routing automatically

Why pino over winston?

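Benchmarks show pino at roughly 8x the throughput of winston, though the exact ratio depends on workload and configuration. Pino keeps per-call overhead low with fast JSON serialization, and its transports run in worker threads, so log processing and shipping stay off the main event loop. For a rate limiter where every microsecond of middleware overhead matters, this is a practical concern, not just an aesthetic preference.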

Why testcontainers instead of a mock Redis client?

Mocking Redis means mocking a database — you end up testing your mock, not your code. Lua scripts, key expiry behavior, and sorted set operations can't be accurately simulated in a mock. testcontainers runs the actual Redis binary in a container, tears it down after the suite, and gives you real confidence.

Future Extensions

Redis Cluster / Redis Sentinel

For true high availability, Redis Cluster distributes keys across shards. ioredis supports this natively. The key namespacing in keyBuilder.ts uses {userId} hash tags to ensure related keys land on the same shard.

Multi-region rate limiting

A truly global rate limiter would need consensus across data centers. One approach: CRDT-based counters (each region tracks its own count, periodically syncs). This sacrifices strict accuracy for availability, a deliberate tradeoff.

gRPC control plane

The admin API is currently HTTP/JSON. A gRPC control plane would let you push limit config changes to all instances simultaneously rather than having each instance poll Redis.

Rate limit by endpoint, not just user

The current implementation keys by userId. Extending it to key by userId:endpoint (e.g., rl:user:123:POST:/api/orders) allows per-route policies and is a trivial extension to keyBuilder.ts.
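
A sketch of what that extension might look like. buildKey is a hypothetical function, not the contents of keyBuilder.ts, and it follows the plain rl:user:123:POST:/api/orders format from the example rather than any hash-tag scheme.

```typescript
// Builds a per-user key, or a per-user-per-route key when a route is given.
function buildKey(userId: string, route?: { method: string; path: string }): string {
  const base = `rl:${userId}`;
  return route ? `${base}:${route.method.toUpperCase()}:${route.path}` : base;
}

console.log(buildKey("user:123"));
// rl:user:123
console.log(buildKey("user:123", { method: "post", path: "/api/orders" }));
// rl:user:123:POST:/api/orders
```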

WebSocket / SSE notifications

Push a real-time event to clients when they approach their limit (e.g., at 80% consumed), rather than only signaling via response headers.

License

MIT
