Skip to content

feat(redis-worker): add oldest-message-age queue gauge#4086

Open
0ski wants to merge 2 commits into
mainfrom
oskar/feat-redis-worker-oldest-message-age
Open

feat(redis-worker): add oldest-message-age queue gauge#4086
0ski wants to merge 2 commits into
mainfrom
oskar/feat-redis-worker-oldest-message-age

Conversation

@0ski

@0ski 0ski commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator

Adds a redis_worker.queue.oldest_message_age_ms observable gauge (labeled
worker_name) and SimpleQueue.oldestMessageAge(), reporting the age of the
oldest overdue message in each queue. Generic queue-stall signal: 0 while a
queue drains healthily, rising only when due work sits undrained (blocked
dequeue, dead consumer, backpressure) — even when no items are being processed.

Adds a `redis_worker.queue.oldest_message_age_ms` observable gauge (labeled
`worker_name`) and `SimpleQueue.oldestMessageAge()`, reporting the age of the
oldest overdue message in each queue. Generic queue-stall signal: 0 while a
queue drains healthily, rising only when due work sits undrained (blocked
dequeue, dead consumer, backpressure) — even when no items are being processed.
@changeset-bot

changeset-bot Bot commented Jun 30, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: e8aae8a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 28 packages
Name Type
@trigger.dev/redis-worker Patch
@internal/run-engine Patch
@internal/schedule-engine Patch
@trigger.dev/build Patch
@trigger.dev/core Patch
@trigger.dev/plugins Patch
@trigger.dev/python Patch
@trigger.dev/react-hooks Patch
@trigger.dev/rsc Patch
@trigger.dev/schema-to-json Patch
@trigger.dev/sdk Patch
@trigger.dev/database Patch
@trigger.dev/otlp-importer Patch
@trigger.dev/rbac Patch
@trigger.dev/sso Patch
trigger.dev Patch
@internal/dashboard-agent Patch
@internal/cache Patch
@internal/clickhouse Patch
@internal/llm-model-catalog Patch
@internal/redis Patch
@internal/replication Patch
@internal/run-store Patch
@internal/testcontainers Patch
@internal/tracing Patch
@internal/tsql Patch
@internal/zod-worker Patch
@internal/sdk-compat-tests Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@0ski 0ski marked this pull request as ready for review June 30, 2026 14:25
@0ski 0ski self-assigned this Jun 30, 2026
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5b592479-892a-4dd4-94f7-3f1b675e7fbc

📥 Commits

Reviewing files that changed from the base of the PR and between a50bb5d and e8aae8a.

📒 Files selected for processing (4)
  • .changeset/redis-worker-oldest-message-age.md
  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
  • packages/redis-worker/src/worker.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/redis-worker-oldest-message-age.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/redis-worker/src/worker.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout. (28)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (6, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (7, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (8, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (3, 12)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (11, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (10, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (9, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (4, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (12, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (5, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (1, 12)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (2, 12)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e-webapp / 🧪 E2E Tests: Webapp
  • GitHub Check: packages / 🧪 Unit Tests: Packages (2, 3)
  • GitHub Check: packages / 🧪 Unit Tests: Packages (1, 3)
  • GitHub Check: packages / 🧪 Unit Tests: Packages (3, 3)
  • GitHub Check: Build and publish previews
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use vitest for all tests in the Trigger.dev repository

Files:

  • packages/redis-worker/src/queue.test.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs}: Use pnpm run typecheck for changes in apps (apps/*) and internal packages (internal-packages/*), and never use build to verify those changes.
Use Vitest for tests, and never mock anything; use testcontainers instead.
Prefer static imports over dynamic import(), and only use dynamic imports for unresolved circular dependencies, genuine code-splitting needs, or conditional runtime loading.

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
packages/**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs}

📄 CodeRabbit inference engine (CLAUDE.md)

Use pnpm run build to verify changes in public packages (packages/*).

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
**/*.{test,spec}.{ts,tsx,js,jsx,mts,cts,mjs,cjs}

📄 CodeRabbit inference engine (CLAUDE.md)

Place test files next to the source files they cover (for example, MyService.ts -> MyService.test.ts).

Files:

  • packages/redis-worker/src/queue.test.ts
**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs,md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Always import from @trigger.dev/sdk when writing Trigger.dev tasks; never use @trigger.dev/sdk/v3 or deprecated client.defineJob.

Files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
**/*.{test,spec}.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{test,spec}.{ts,tsx,js,jsx}: Write unit tests with Vitest, keep test files beside the files under test, and use descriptive describe and it blocks.
Avoid mocks or stubs in tests; when Redis or Postgres are needed, use the helpers from @internal/testcontainers.

Files:

  • packages/redis-worker/src/queue.test.ts
packages/redis-worker/**/*@(job|queue|worker|background).{ts,tsx}

📄 CodeRabbit inference engine (packages/redis-worker/CLAUDE.md)

Use @trigger.dev/redis-worker for all new background job implementations, replacing graphile-worker and zodworker

Files:

  • packages/redis-worker/src/queue.ts
🧠 Learnings (12)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma error P1001 ("Can't reach database server") in TypeScript, don’t assume a single error shape. Prisma can surface P1001 via two different error classes/fields: `PrismaClientKnownRequestError` exposes it as `err.code === "P1001"` (common during mid-query connection drops), while `PrismaClientInitializationError` exposes it as `err.errorCode === "P1001"` (common on client startup failure). Therefore, predicates should use `err.code === "P1001" || err.errorCode === "P1001"`. Do not flag `err.code === "P1001"` as “unreachable/never matches,” as it is expected in production.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma errors for P1001 ("Can't reach database server"), do not assume it only appears under a single property name. Prisma may surface P1001 via either `PrismaClientKnownRequestError` (`err.code === "P1001"`, e.g., mid-query connection drops) or `PrismaClientInitializationError` (`err.errorCode === "P1001"`, e.g., client startup connection failure). To reliably detect the condition, check `err.code === "P1001" || err.errorCode === "P1001"`, and avoid review rules that would incorrectly flag `err.code === "P1001"` as unreachable/never-matching.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-06-13T19:53:13.759Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3937
File: packages/trigger-sdk/skills/realtime-and-frontend/SKILL.md:258-260
Timestamp: 2026-06-13T19:53:13.759Z
Learning: When reviewing code that uses `trigger.dev/react-hooks`’s `useRealtimeRun`, preserve the call signature where the first argument is the full realtime handle object (not `handle.id`). This is intentional to maintain type-safety and is consistent with the official docs; do not suggest changing the first argument from the handle object to `handle.id`.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-06-17T17:13:49.929Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3948
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.bulk-actions.$bulkActionParam/route.tsx:48-62
Timestamp: 2026-06-17T17:13:49.929Z
Learning: In triggerdotdev/trigger.dev, within `dashboardLoader`/`dashboardAction` (or similar context resolver code) whenever you resolve an organization ID from an organization slug for RBAC/enterprise authorization scope, always read from the primary Prisma client (`prisma`), not `$replica`. Using `$replica` can hit replica-lag and cause the RBAC lookup/authorization to run without the correct org scope (bypassing intended role enforcement). Implement the slug→org lookup with `prisma.organization.findFirst(...)` (or equivalent primary-client query) and add an inline comment documenting why the primary client is required (replica lag could lead to unscoped RBAC checks).

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-06-23T13:04:21.413Z
Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4023
File: apps/webapp/app/services/upsertBranch.server.ts:14-18
Timestamp: 2026-06-23T13:04:21.413Z
Learning: In TypeScript, it’s valid to `import { type X }` and then use `typeof X` in a type-only position, e.g. `type Alias = z.infer<typeof X>`. The `type` modifier suppresses the runtime import, but the type checker still has the full exported type so `z.infer<typeof X>` can resolve correctly. In code reviews, don’t flag this as a TypeScript compile error as long as `typeof X` is used in a type context (e.g., with `z.infer`, `type` aliases, generics), not as a runtime value.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-05-18T14:40:02.173Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3658
File: packages/core/src/v3/realtimeStreams/manager.test.ts:1-147
Timestamp: 2026-05-18T14:40:02.173Z
Learning: In this repo’s trigger.dev codebase, the “never mock — use testcontainers” guideline should only be applied to integration tests that talk to real external services (e.g., Redis, Postgres, S2). For unit tests that validate in-memory logic (e.g., deduplication/cache behavior in StandardRealtimeStreamsManager and similar module-boundary call counting), it is allowed to use Vitest mocks like `vi.fn()` and to stub/mock `ApiClient` objects to count calls or simulate in-process collaborators. Do not flag `vi.fn()`-based mocks as policy violations in these unit-test scenarios; reserve the rule for true external-service integration tests.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
📚 Learning: 2026-05-18T14:40:02.173Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3658
File: packages/core/src/v3/realtimeStreams/manager.test.ts:1-147
Timestamp: 2026-05-18T14:40:02.173Z
Learning: In the triggerdotdev/trigger.dev repo, the policy “Never mock anything — use testcontainers instead” should only be enforced for integration tests that interact with real external services (e.g., Redis, Postgres) via actual infrastructure. For unit tests that exercise pure in-memory logic (e.g., cache semantics) it is OK to stub collaborators such as `ApiClient` using Vitest (`vi.fn()`) to assert call counts or control behavior. Do not flag `vi.fn()`-based `ApiClient` stubs in unit tests as violations of the testcontainers policy.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
📚 Learning: 2026-06-04T18:16:35.386Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3836
File: apps/supervisor/src/backpressure/backpressureMonitor.ts:3-5
Timestamp: 2026-06-04T18:16:35.386Z
Learning: When reviewing TypeScript in this repo, apply the rule “prefer type aliases over interfaces” only to data/object shapes and union/intersection type modeling. If an interface is being used as a behavioral contract for collaborators to implement (e.g., method-shape interfaces that define required behavior, such as `BackpressureLogger` / `BackpressureSignalSource` in `apps/supervisor/src/backpressure/backpressureMonitor.ts`), keep it as an `interface` and do not flag it as a type-alias-vs-interface violation.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-06-09T17:58:04.699Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3879
File: apps/webapp/app/models/vercelIntegration.server.ts:619-630
Timestamp: 2026-06-09T17:58:04.699Z
Learning: In this codebase, outbound raw `fetch` calls should typically rely on Node/undici’s default request timeout (about ~300s) rather than adding a per-call `AbortController` + `setTimeout` wrapper inside individual functions (e.g. in files like `apps/webapp/app/models/vercelIntegration.server.ts`). During code review, do not flag the absence of a per-call timeout on a single `fetch` as an issue; if per-call timeouts are needed, they should be implemented via a codebase-wide convention (e.g., a shared fetch wrapper or documented pattern) rather than ad-hoc per-function changes.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
  • packages/redis-worker/src/queue.ts
📚 Learning: 2026-06-16T09:19:47.637Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3960
File: apps/webapp/test/prismaInfrastructureErrorCapture.test.ts:0-0
Timestamp: 2026-06-16T09:19:47.637Z
Learning: In this repo’s Vitest setup, `vitest.config.ts` uses `globals: true`, so identifiers like `vi`, `describe`, `it`, and `expect` are available as globals in Vitest test files. During code review, do not flag missing `vi`/`describe`/`it`/`expect` imports as a runtime error or correctness issue when they’re used in `*.test.ts/tsx` or `*.spec.ts/tsx` files. Explicit imports are still preferred for consistency, but they’re not required for runtime behavior.

Applied to files:

  • packages/redis-worker/src/queue.test.ts
🔇 Additional comments (4)
packages/redis-worker/src/queue.ts (3)

523-545: The orphan-resolution against the items hash and the bounded LIMIT 0 100 scan were already raised and confirmed addressed in a previous review, so no new concern here.


322-341: LGTM!


759-766: LGTM!

packages/redis-worker/src/queue.test.ts (1)

218-284: LGTM!


Walkthrough

A new oldestMessageAge(): Promise<number> method is added to SimpleQueue in packages/redis-worker/src/queue.ts. It checks Redis for the oldest due queue entry with an existing payload and returns its age in milliseconds, or 0 for empty, future-only, orphan-only, or error cases. The worker registers a new observable gauge for this value and records it with the worker_name label. Tests cover empty, future-scheduled, overdue, orphaned, and post-dequeue states, and a changeset entry is added.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description omits the required template sections, including Closes #issue, checklist, Testing, Changelog, and Screenshots. Add the missing template sections and fill in the issue reference, checklist items, testing steps, changelog note, and screenshots placeholders.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding an oldest-message-age queue gauge for redis-worker.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch oskar/feat-redis-worker-oldest-message-age

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@pkg-pr-new

pkg-pr-new Bot commented Jun 30, 2026

Copy link
Copy Markdown

Open in StackBlitz

@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@e8aae8a

trigger.dev

npm i https://pkg.pr.new/trigger.dev@e8aae8a

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@e8aae8a

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@e8aae8a

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@e8aae8a

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@e8aae8a

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@e8aae8a

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@e8aae8a

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@e8aae8a

commit: e8aae8a

coderabbitai[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

- Drop the _ms suffix from the metric name (keep unit "ms") so the OTLP to
  Prometheus exporter doesn't double-suffix it to _ms_milliseconds.
- Resolve the oldest-overdue candidate against the items hash (new getOldestDueScore
  Lua) so an orphaned queue entry can't report a phantom stall for work that can't
  be dequeued.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant