🚀 feat(queue): implement BullMQ + Redis for async event & webhook processing #19

@abhishek-nexgen-dev

Introduce a production-grade queue system using BullMQ + Redis to handle all asynchronous workloads in CommDesk:

  • Event processing
  • Webhook delivery
  • Retry handling
  • Background jobs

This replaces synchronous execution with a scalable, fault-tolerant pipeline.


🎯 Problem

Current (sync):

Event → Webhook → Response

Issues:

  • API latency & timeouts
  • No retry mechanism
  • Poor scalability
  • Failure = data loss risk

✅ Proposed Solution

Event
→ Queue (BullMQ)
→ Redis
→ Worker
→ Delivery Engine
→ Logs + Retry

🧱 Tech Stack

  • Queue: BullMQ (open-source, production-proven)
  • Broker: Redis
  • Runtime: Node.js + Express

📦 Scope

1️⃣ Queue Setup

Path:

src/queue/

Files:

connection.ts
queues.ts
jobs.ts

Example:

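import { Queue } from "bullmq";

// Fall back to a local Redis when the variables are unset;
// Number(undefined) would otherwise yield NaN for the port.
export const webhookQueue = new Queue("webhook", {
  connection: {
    host: process.env.REDIS_HOST ?? "localhost",
    port: Number(process.env.REDIS_PORT ?? 6379),
  },
});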

2️⃣ Job Types

  • WEBHOOK_DELIVERY
  • EVENT_PROCESSING
  • RETRY_DELIVERY

3️⃣ Worker Implementation

Path:

src/workers/

Files:

webhook.worker.ts
event.worker.ts

Responsibilities:

Fetch job
→ Load event
→ Find matching webhooks
→ Trigger delivery
→ Log result
→ Retry if needed
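
The steps above could be sketched as a processor function whose collaborators are injected, so it can be unit-tested without Redis. Every name here (`WorkerDeps`, `loadEvent`, `findWebhooks`, `deliver`, `log`, `processEventJob`) is hypothetical and not taken from the codebase. The function has the general shape of a BullMQ processor: it receives the job and throws on failure, which is what lets the queue record the failure and schedule a retry.

```typescript
// Hypothetical collaborators; none of these names come from the issue.
interface WebhookTarget { id: string; url: string }
interface LoadedEvent { id: string; type: string }
interface WorkerDeps {
  loadEvent(eventId: string): Promise<LoadedEvent | null>;
  findWebhooks(eventType: string): Promise<WebhookTarget[]>;
  deliver(target: WebhookTarget, event: LoadedEvent): Promise<boolean>;
  log(message: string): void;
}

// Load event -> find matching webhooks -> deliver -> log -> throw to retry.
export async function processEventJob(
  job: { id?: string; data: { eventId: string } },
  deps: WorkerDeps,
): Promise<number> {
  deps.log(`job started: ${job.id}`);
  const event = await deps.loadEvent(job.data.eventId);
  if (event === null) throw new Error(`event ${job.data.eventId} not found`);

  const targets = await deps.findWebhooks(event.type);
  const failed: string[] = [];
  for (const target of targets) {
    const ok = await deps.deliver(target, event);
    deps.log(`delivery ${ok ? "succeeded" : "failed"}: ${target.id}`);
    if (!ok) failed.push(target.id);
  }

  // Throwing marks the job as failed so the queue can schedule a retry.
  if (failed.length > 0) {
    throw new Error(`delivery failed for: ${failed.join(", ")}`);
  }
  deps.log(`job completed: ${job.id}`);
  return targets.length;
}
```

In a real worker this would presumably be wired up along the lines of `new Worker("webhook", (job) => processEventJob(job, deps), { connection, concurrency: 10 })`. One caveat of this shape: retrying the whole job re-delivers to webhooks that already succeeded, which is a reason to fan out into one job per webhook.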

4️⃣ Job Payload

{
  eventId: string,
  webhookId: string,
  attempt: number
}
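
Because job data comes back from Redis as plain JSON, a runtime check on the payload may be worth having. The sketch below mirrors the three fields listed above; the names `WebhookJobPayload` and `isWebhookJobPayload` are assumptions, as are the rules that ids are non-empty and that `attempt` is a 1-based integer.

```typescript
// Mirrors the payload fields listed above; the type name is an assumption.
export interface WebhookJobPayload {
  eventId: string;
  webhookId: string;
  attempt: number;
}

// Runtime type guard. Assumed rules: ids are non-empty strings and
// attempt is an integer starting at 1.
export function isWebhookJobPayload(value: unknown): value is WebhookJobPayload {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.eventId === "string" && v.eventId.length > 0 &&
    typeof v.webhookId === "string" && v.webhookId.length > 0 &&
    typeof v.attempt === "number" && Number.isInteger(v.attempt) && v.attempt >= 1
  );
}
```

A worker could call the guard first and reject malformed jobs before attempting any delivery.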

5️⃣ Retry Strategy (Required)

1 → immediate
2 → 30 sec
3 → 2 min
4 → 10 min
5 → fail → Dead Letter Queue
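
One way to encode this schedule is a small lookup function. This sketch reads the table as "delay before attempt N" and assumes that step 5 means the job is dead-lettered after the fourth attempt fails rather than delivered a fifth time; if a fifth delivery is intended, the table needs one more entry. The function names are hypothetical.

```typescript
// Delay in milliseconds before each attempt, taken from the schedule above.
// Index 0 is attempt 1 (immediate); there is no entry for attempt 5.
const RETRY_DELAYS_MS = [0, 30_000, 120_000, 600_000];

// Returns the delay before the given 1-based attempt, or null when the job
// should go to the Dead Letter Queue instead of being attempted again.
export function delayBeforeAttempt(attempt: number): number | null {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError(`attempt must be a positive integer, got ${attempt}`);
  }
  return RETRY_DELAYS_MS[attempt - 1] ?? null;
}

// Adapter in the shape of a custom backoff strategy: it takes the number of
// attempts already made and returns the delay before the next one,
// or -1 to signal that no further retry should happen.
export function backoffStrategy(attemptsMade: number): number {
  const delay = delayBeforeAttempt(attemptsMade + 1);
  return delay === null ? -1 : delay;
}
```

As far as I understand the BullMQ docs, a function like `backoffStrategy` can be passed in the worker options under `settings.backoffStrategy`, with jobs added using `{ attempts: 4, backoff: { type: "custom" } }`. Treat that wiring as an assumption to verify against the installed BullMQ version.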

6️⃣ Dead Letter Queue

  • Store permanently failed jobs
  • Support manual retry via API
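
BullMQ does not, to my knowledge, ship a dedicated dead-letter primitive, so a common pattern is to copy exhausted jobs into a separate queue or table from the worker's `failed` event. The sketch below covers only the pure parts: deciding when a job is exhausted and building the record to store. `DeadLetterRecord`, `isExhausted`, and `toDeadLetterRecord` are hypothetical names.

```typescript
// Hypothetical record stored in a separate dead-letter queue or table.
interface DeadLetterRecord {
  originalJobId: string;
  payload: unknown;
  reason: string;
  attemptsMade: number;
  failedAt: string; // ISO 8601 timestamp
}

// A job is dead-lettered once it has used every allowed attempt.
function isExhausted(attemptsMade: number, maxAttempts: number): boolean {
  return attemptsMade >= maxAttempts;
}

// Captures enough context to inspect the failure and re-enqueue it later.
function toDeadLetterRecord(
  job: { id: string; data: unknown; attemptsMade: number },
  error: Error,
  now: Date = new Date(),
): DeadLetterRecord {
  return {
    originalJobId: job.id,
    payload: job.data,
    reason: error.message,
    attemptsMade: job.attemptsMade,
    failedAt: now.toISOString(),
  };
}
```

A manual-retry API endpoint could then read a record and add its `payload` back to the original queue as a fresh job.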

7️⃣ Rate Limiting

  • Per webhook → 10 req/sec
  • Per community → configurable
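
BullMQ's worker-level `limiter` option applies, as far as I know, to the queue as a whole, so per-webhook and per-community limits probably need a keyed check of their own. Below is a minimal in-memory fixed-window limiter to illustrate the idea. `KeyedRateLimiter` is a hypothetical name, and the sketch works within a single process only: with several workers the counters would have to live in Redis.

```typescript
// In-memory fixed-window limiter keyed by webhook (or community) id.
// Single-process only; shared state would be needed across workers.
class KeyedRateLimiter {
  private readonly max: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true and counts the request if the key is under its limit.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < this.max) {
      current.count += 1;
      return true;
    }
    return false;
  }
}
```

For the limit above, `new KeyedRateLimiter(10, 1000)` would allow 10 requests per second per webhook id. A worker that is refused could delay the job instead of failing it, so that throttling does not consume retry attempts.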

8️⃣ Concurrency

concurrency: 10

9️⃣ Logging Integration (MANDATORY)

Must integrate with the logging system defined in the docs:

👉

Log events:

  • job queued
  • job started
  • job completed
  • job failed
  • retry triggered

🔟 Failure Handling

Handle:

  • timeouts
  • network failures
  • invalid webhook URLs
  • external API errors
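
Not every failure in this list benefits from a retry: a timeout may succeed later, while an invalid URL will not. The sketch below shows one possible classification. The `DeliveryFailure` shape and the HTTP status policy (408, 429, and 5xx are treated as transient) are assumptions, not requirements from this issue.

```typescript
// Hypothetical failure shapes covering the four cases listed above.
type DeliveryFailure =
  | { kind: "timeout" }
  | { kind: "network"; code: string }
  | { kind: "invalid_url"; url: string }
  | { kind: "http"; status: number };

// Retry transient problems; fail fast on ones a retry cannot fix.
function isRetryable(failure: DeliveryFailure): boolean {
  switch (failure.kind) {
    case "timeout":
    case "network":
      return true;
    case "invalid_url":
      return false;
    case "http":
      // Assumed policy: 408, 429 and 5xx are transient, other statuses are not.
      return failure.status === 408 || failure.status === 429 || failure.status >= 500;
  }
}

// Accepts only absolute http(s) URLs.
function isValidWebhookUrl(raw: string): boolean {
  try {
    const url = new URL(raw);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}
```

For non-retryable cases, BullMQ exports an `UnrecoverableError` that a processor can throw to skip the remaining attempts. Checking the URL before enqueueing would keep invalid webhooks out of the queue entirely.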

📊 Observability

Track:

  • queue size
  • jobs/sec
  • failure rate
  • retry count

Alert on:

  • high failure rate
  • queue backlog spike
  • worker crash
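
The backlog and failure-rate alerts could be derived from per-state job counts, which BullMQ exposes through `queue.getJobCounts(...)`. The sketch below is a pure function over such counts. The interface names and thresholds are assumptions, and the failure rate is only meaningful if completed and failed jobs are retained long enough to be counted.

```typescript
// Per-state counts, shaped like the result of a job-count query.
interface QueueCounts {
  waiting: number;
  active: number;
  delayed: number;
  completed: number;
  failed: number;
}

// Hypothetical thresholds; real values would come from configuration.
interface AlertThresholds {
  maxFailureRate: number; // fraction between 0 and 1
  maxBacklog: number;     // jobs not yet picked up
}

function summarizeQueue(counts: QueueCounts, thresholds: AlertThresholds) {
  // Backlog here means jobs that have not started yet.
  const backlog = counts.waiting + counts.delayed;
  const finished = counts.completed + counts.failed;
  const failureRate = finished === 0 ? 0 : counts.failed / finished;

  const alerts: string[] = [];
  if (failureRate > thresholds.maxFailureRate) alerts.push("high failure rate");
  if (backlog > thresholds.maxBacklog) alerts.push("queue backlog spike");
  return { backlog, failureRate, alerts };
}
```

Jobs per second and retry count are rates rather than snapshots, so they would need counters incremented from worker events rather than a single count query.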

📁 Folder Structure

src/
 ├── queue/
 ├── workers/
 ├── modules/
 ├── logs/

🧪 Testing

  • job enqueue works
  • worker executes jobs
  • retry logic triggers correctly
  • failure scenarios handled

🧨 Edge Cases

  • duplicate jobs
  • Redis downtime
  • worker crash
  • retry storms
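
For the duplicate-jobs case, one option is a deterministic job id: BullMQ ignores an `add` whose `jobId` matches a job that still exists in the queue. The helper below is a sketch with a hypothetical name. It percent-encodes both parts so the id contains no colon (which BullMQ uses in its Redis keys) and so the `|` separator cannot appear inside either part.

```typescript
// Deterministic id: enqueueing the same event/webhook pair twice yields the
// same id, which lets the queue drop the duplicate.
function dedupJobId(eventId: string, webhookId: string): string {
  // encodeURIComponent escapes ":" and "|", so the separator stays unambiguous.
  return `${encodeURIComponent(eventId)}|${encodeURIComponent(webhookId)}`;
}
```

It would be passed as `queue.add(name, payload, { jobId: dedupJobId(eventId, webhookId) })`. Note that this only deduplicates while the earlier job is still stored: once a job is removed, the same id can be enqueued again.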

⚡ Performance Targets

  • Handle 10k+ jobs/min
  • Non-blocking API
  • Horizontally scalable workers

🌍 Environment

docker run -d -p 6379:6379 redis
REDIS_HOST=localhost
REDIS_PORT=6379

✅ Acceptance Criteria

  • Queue integrated with event system
  • Workers process jobs correctly
  • Retry + backoff implemented
  • Dead Letter Queue implemented
  • Logs captured for all job states
  • Rate limiting enforced
  • System resilient to failures

🚫 Constraints

  • Do NOT process webhooks synchronously
  • Do NOT implement custom queue logic
  • Use BullMQ best practices only

🔥 Impact

After implementation:

Scalable ✔
Reliable ✔
Production-ready ✔

Without this:

System fails under load ❌
