feat: add security hardening and monitoring updates #4708
What:
- Strengthened authorization boundaries across router/service/WSS layers: service access, assigned-server access, organization scoping, permission checks, and safer deploy-source ownership checks.
- Added reusable access helpers for Docker/local server, monitoring targets, deploy-source credentials, destination binding, backup/schedule targets, and provider ownership.
- Hardened command/path safety: shell argument quoting, Git clone/build commands, custom compose command validation, Dockerfile/build command boundaries, bind/restore path normalization, and SSRF-aware URL checks.
- Added secret redaction for API responses, nested service reads, notification/error paths, Git providers, SSH keys, registry data, deployment/rollback context, and logs.
- Closed Git provider/webhook gaps: GitHub App OAuth state signing, provider read/update access checks, webhook payload binding, and safer custom Git URL handling.
- Bound schedule/deployment worker jobs with signed metadata: scoped queue payloads, immutable job claims, stale/foreign job rejection, and an explicit package export for the production schedules build.
- Improved monitoring UX: resource breakdown by container, process parsing, swap reporting, Docker Disk Usage details, image/container/volume context, visible detail limits, and responsive layout.
- Added safe application env upsert with dry-run metadata, revision checks, secret-aware result summaries, and conflict protection.
- Updated DB/schema/tooling surface for SSO/Gitea scoping, refreshed dependencies, TypeScript/Next build path, and server package export switch scripts with stable final-newline output.
- Added and updated regression coverage across auth, backup, deploy, Docker, Git providers, monitoring, schedule signing, permissions, redaction, safe paths, and webhooks.

Why:
- For the Dokploy owner, this lowers the risk of IDOR, cross-server access, SSRF, command injection, unsafe path access, leaked secrets, and stale worker job execution.
- For maintainer review, this turns a large set of security and UX gaps into reviewable helpers, negative fixtures, focused tests, and repeatable package/build boundaries.
- For operators, Monitoring becomes clearer: it shows which containers, images, volumes, and processes use resources, and Docker Disk Usage can expand without losing context.
- For integrations, safe env upsert provides a lower-risk way to patch application environment variables without silent overwrite.

Risks:
- The PR commit remains large and squashed; the maintainer may ask to split the security, monitoring, and dependency/tooling parts into separate PRs.
- The dependency/tooling refresh is broad, so upstream CI after push remains the final source of truth.
- Browser smoke was a local unauthenticated boot/render smoke: login page, `/` HTTP 200, and screenshot; authenticated dashboard flows were not checked locally.
- The local environment used Node v24.18.0 while `.nvmrc` is 24.4.0; the engine range accepts 24.x, but this differs from the exact project recommendation.

Checks:
- Commands and results:
  - After rebasing onto `upstream/canary` at `8b6481501e6e379b9ce32c4da4201fcb7a65364a`, `git diff --check $(git merge-base upstream/canary HEAD)..HEAD` passed.
  - `CI=true corepack pnpm run format-and-lint` passed with 37 existing warnings and 26 infos.
  - `CI=true corepack pnpm run typecheck` passed for server, dokploy, api, and schedules.
  - `CI=true corepack pnpm run server:build` passed; `corepack pnpm run server:script` restored source exports.
  - The first `CI=true corepack pnpm run build` hit a stale ignored Next/Webpack cache failure at `WasmHash._updateWithBuffer`; after moving `apps/dokploy/.next` to `/tmp/dokploy-next-cache-rebase-20260630095440`, `CI=true corepack pnpm --dir apps/dokploy run build-next` passed and the full `CI=true corepack pnpm run build` passed; `corepack pnpm run server:script` restored source exports after the build.
  - `CI=true corepack pnpm run test` passed: 152 files / 1331 tests / 1 skipped.
  - Browser smoke started the built app with dist exports, ran migrations, returned 200 for `HEAD /` and `GET /`, and saved a readable 1440x900 screenshot at `output/playwright/agenthits-pr-ready-smoke/home-1440x900.png`.
  - An independent Agent Flow `reviewer.qa` returned pass-with-risks with no local blocker for a force-with-lease push after the PR body refresh.
- Limitations: upstream CI, maintainer review, authenticated browser flows, and production deployment were not run locally.
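The "shell argument quoting" item in the list above can be illustrated with a minimal sketch. It assumes POSIX `sh` semantics; the names `quoteShellArg` and `buildCloneCommand` are hypothetical and are not taken from the PR, whose actual helpers may differ.

```typescript
// Hypothetical helper (not from the PR): quote one value for a POSIX shell.
function quoteShellArg(value: string): string {
  if (value.includes("\0")) {
    // A NUL byte cannot be represented in a shell argument at all.
    throw new Error("NUL bytes cannot be passed as shell arguments");
  }
  // Inside single quotes the shell treats every character literally, so the
  // only character needing care is the single quote itself: close the quoted
  // string, emit an escaped quote, then reopen the quoted string.
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical usage: build a clone command from untrusted inputs.
function buildCloneCommand(repoUrl: string, targetDir: string): string {
  // `--` ends option parsing, so a value starting with "-" is not read as a flag.
  return ["git", "clone", "--", quoteShellArg(repoUrl), quoteShellArg(targetDir)].join(" ");
}
```

Quoting only neutralizes shell metacharacters; it does not decide whether a URL or path is acceptable, so it would complement the URL and path checks listed above rather than replace them.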
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f89a8a8ac4
"./utils/schedules/signed-job": {
  "import": "./src/utils/schedules/signed-job.ts",
  "require": "./dist/utils/schedules/signed-job.js"
}
Export the deployment signing module
This `exports` map adds only the schedules signer, but the commit also imports `@dokploy/server/utils/deployments/signed-job` from the deployment API/client paths. Because `@dokploy/server` has an `exports` map and no source-mode entry for `./utils/deployments/signed-job`, Node rejects that subpath at runtime even though TypeScript path aliases still typecheck it, so the signed deployment API/dev server fails before it can process cloud deployment jobs. Add the matching deployment signer export and mirror it in the switch scripts.
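To make the failure mode concrete, here is a simplified model of subpath resolution against an `exports` map. It only handles exact-match keys with `import`/`require` conditions, so it is a sketch rather than Node's full algorithm. The function `resolveSubpath` is hypothetical, and the deployment entry's target paths are an assumption made by analogy with the schedules entry.

```typescript
type ExportTarget = { import?: string; require?: string };
type ExportsMap = Record<string, ExportTarget>;

// Simplified model: exact keys only, no wildcard patterns or nested conditions.
function resolveSubpath(
  exportsMap: ExportsMap,
  subpath: string,
  condition: "import" | "require",
): string {
  const target = exportsMap[subpath]?.[condition];
  if (target === undefined) {
    // Node reports this situation with the error code ERR_PACKAGE_PATH_NOT_EXPORTED.
    throw new Error(`ERR_PACKAGE_PATH_NOT_EXPORTED: ${subpath}`);
  }
  return target;
}

// The entry shown in the diff above.
const exportsMap: ExportsMap = {
  "./utils/schedules/signed-job": {
    import: "./src/utils/schedules/signed-job.ts",
    require: "./dist/utils/schedules/signed-job.js",
  },
};

// Assumed fix: a mirrored entry for the deployment signer (paths are a guess).
const fixedExportsMap: ExportsMap = {
  ...exportsMap,
  "./utils/deployments/signed-job": {
    import: "./src/utils/deployments/signed-job.ts",
    require: "./dist/utils/deployments/signed-job.js",
  },
};
```

With the original map, `resolveSubpath(exportsMap, "./utils/deployments/signed-job", "import")` throws, while the same call against `fixedExportsMap` resolves. A TypeScript path alias bypasses this lookup entirely, which is why typechecking can pass while the runtime import fails.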
@@ -2,10 +2,23 @@ ALTER TABLE "schedule" DROP CONSTRAINT "schedule_userId_user_id_fk";
--> statement-breakpoint
ALTER TABLE "schedule" ADD COLUMN "organizationId" text;--> statement-breakpoint
ALTER TABLE "schedule" ADD CONSTRAINT "schedule_organizationId_organization_id_fk" FOREIGN KEY ("organizationId") REFERENCES "public"."organization"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
WITH owner_memberships AS (
Move the schedule backfill into a new migration
This changes migration 0169 even though later migrations already exist; databases that have already applied 0169 have it recorded as applied and will never execute these new `UPDATE` statements. Upgrades from an existing canary install therefore keep the old `dokploy-server` schedule `organizationId`/`enabled` state, while only fresh installs get the corrected backfill. Put this corrective data migration in a new migration instead.
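The reasoning can be sketched with a toy migration runner. This is a simplified model, not the project's real migrator: it assumes only that a migration recorded as applied is never executed again, whatever bookkeeping (tag, hash, or timestamp) the real tool uses. The tags `0170` and `0171_schedule_backfill` and the placeholder statements are hypothetical.

```typescript
type Migration = { tag: string; statements: string[] };

// Toy runner: executes only migrations whose tag is not yet recorded as applied.
function runPending(
  migrations: Migration[],
  applied: Set<string>,
  execute: (sql: string) => void,
): string[] {
  const ran: string[] = [];
  for (const migration of migrations) {
    if (applied.has(migration.tag)) {
      continue; // Already recorded: later edits to this file are never seen.
    }
    for (const statement of migration.statements) {
      execute(statement);
    }
    applied.add(migration.tag);
    ran.push(migration.tag);
  }
  return ran;
}

// An existing install that applied 0169 and a later migration before the edit.
const appliedOnExistingInstall = new Set(["0169", "0170"]);
const executed: string[] = [];

// Editing 0169 in place: nothing runs on the existing install.
const ranAfterEdit = runPending(
  [
    { tag: "0169", statements: ["-- original DDL", "-- new backfill UPDATE"] },
    { tag: "0170", statements: ["-- later migration"] },
  ],
  appliedOnExistingInstall,
  (sql) => executed.push(sql),
);

// Adding the backfill as a new migration: it runs exactly once.
const ranAfterNewMigration = runPending(
  [
    { tag: "0169", statements: ["-- original DDL"] },
    { tag: "0170", statements: ["-- later migration"] },
    { tag: "0171_schedule_backfill", statements: ["-- backfill UPDATE"] },
  ],
  appliedOnExistingInstall,
  (sql) => executed.push(sql),
);
```

In this model `ranAfterEdit` is empty, while `ranAfterNewMigration` contains only the new tag, which matches the reviewer's point that the corrective backfill reaches existing installs only as a separate migration.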
What is this PR about?
This PR collects security hardening, monitoring UX, safe env update, and regression coverage work into one upstream review branch for `Dokploy/dokploy:canary`.
Fresh readiness pass:
- Base: `upstream/canary` at `8b6481501e6e379b9ce32c4da4201fcb7a65364a`
- Reviewed commit: `f89a8a8ac446993b7d55b1d1327eba36ce5177b1`
- Target branch: `canary`
- Smoke screenshot: `output/playwright/agenthits-pr-ready-smoke/home-1440x900.png`
Checklist
Before submitting this PR, please make sure that:
`canary` branch.
Issues related (if applicable)
Not linked.
Screenshots (if applicable)
No generated screenshot files are added to this upstream diff. A local readable browser-smoke screenshot was captured at `output/playwright/agenthits-pr-ready-smoke/home-1440x900.png` and shows the Dokploy sign-in page rendered from the built app.
Detailed Changes
Checks:
- `git diff --check $(git merge-base upstream/canary HEAD)..HEAD` passed.
- `CI=true corepack pnpm run format-and-lint` passed with 37 existing warnings and 26 infos.
- `CI=true corepack pnpm run typecheck` passed for server, dokploy, api, and schedules.
- `CI=true corepack pnpm run server:build` passed; `corepack pnpm run server:script` restored source exports.
- The first `CI=true corepack pnpm run build` hit a stale ignored Next/Webpack cache failure at `WasmHash._updateWithBuffer`; after moving `apps/dokploy/.next` to `/tmp/dokploy-next-cache-rebase-20260630095440`, `CI=true corepack pnpm --dir apps/dokploy run build-next` passed and the full `CI=true corepack pnpm run build` passed.
- `CI=true corepack pnpm run test` passed: 152 files / 1331 tests passed / 1 skipped.
- Browser smoke started the built app with dist exports, ran migrations, returned 200 for `HEAD /` and `GET /`, and saved a readable 1440x900 login-page screenshot.
Risks:
- The local environment used Node v24.18.0 while `.nvmrc` is 24.4.0; the project engine range accepts 24.x, but this differs from the exact project recommendation.
What:
Why: