# Deployment Guide

This document covers the full deployment pipeline for BlockbotX, including Docker builds, production infrastructure, CI/CD, database migrations, and monitoring.
## Docker Production Build

The production image uses a multi-stage Dockerfile to minimize the final image size while keeping all necessary runtime dependencies.

### Stage 1: `deps`

Installs dependencies.

- Base image: `node:20-alpine`
- Installs system packages: `libc6-compat` and `openssl`
- Enables corepack and activates the pinned `pnpm` version
- Copies `package.json`, `pnpm-lock.yaml`, and the `drizzle/` directory
- Runs `pnpm install --frozen-lockfile` for reproducible installs
### Stage 2: `builder`

Compiles the Next.js application with standalone output.

- Copies `node_modules` from the deps stage
- Copies the full source tree
- Disables Next.js telemetry (`NEXT_TELEMETRY_DISABLED=1`)
- Provides minimal build-time environment variables (overridden at runtime):
  - `DATABASE_URL` (placeholder connection string)
  - `NEXTAUTH_SECRET` (placeholder secret)
  - `NEXTAUTH_URL` (localhost)
- Runs `pnpm build` to produce the standalone output
### Stage 3: `runner`

Minimal production runtime with a non-root user.

- Creates a system group `nodejs` (GID 1001) and user `nextjs` (UID 1001)
- Enables corepack with the pinned `pnpm` version
- Copies the following from the builder stage (owned by `nextjs:nodejs`):
  - `public/` -- static assets
  - `.next/standalone/` -- standalone Next.js server
  - `.next/static/` -- static build output
  - `node_modules/` -- runtime dependencies
  - `drizzle/` -- schema and migrations
  - `server.ts` -- custom server with Socket.io
  - `package.json` -- package metadata
  - `lib/` -- runtime library code (needed by the custom server via tsx)
  - `docker-entrypoint.sh` -- startup script
- Creates the `/app/logs` directory for Winston logging
- Switches to the `nextjs` user before running
- Exposes port 3000
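Putting the three stages together, the Dockerfile skeleton looks roughly like the following. This is a sketch reconstructed from the description above: the exact pnpm activation (the pinned version is not stated here), placeholder values, and `COPY` destinations are assumptions, not the project's verbatim Dockerfile.

```dockerfile
# Stage 1: install dependencies with a frozen lockfile
FROM node:20-alpine AS deps
RUN apk add --no-cache libc6-compat openssl
RUN corepack enable
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
COPY drizzle ./drizzle
RUN pnpm install --frozen-lockfile

# Stage 2: build the standalone Next.js output
FROM node:20-alpine AS builder
WORKDIR /app
RUN corepack enable
COPY --from=deps /app/node_modules ./node_modules
COPY . .
ENV NEXT_TELEMETRY_DISABLED=1
# Placeholder build-time values, overridden at runtime
ENV DATABASE_URL="postgresql://placeholder:placeholder@localhost:5432/placeholder"
ENV NEXTAUTH_SECRET="build-time-placeholder"
ENV NEXTAUTH_URL="http://localhost:3000"
RUN pnpm build

# Stage 3: minimal runtime with a non-root user
FROM node:20-alpine AS runner
WORKDIR /app
RUN corepack enable
RUN addgroup --system --gid 1001 nodejs \
 && adduser --system --uid 1001 nextjs
COPY --from=builder --chown=nextjs:nodejs /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
COPY --from=builder --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nextjs:nodejs /app/drizzle ./drizzle
COPY --from=builder --chown=nextjs:nodejs /app/lib ./lib
COPY --from=builder --chown=nextjs:nodejs /app/server.ts /app/package.json /app/docker-entrypoint.sh ./
RUN mkdir -p /app/logs && chown nextjs:nodejs /app/logs
USER nextjs
EXPOSE 3000
ENTRYPOINT ["./docker-entrypoint.sh"]
```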
### Health Check

```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=90s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/api/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
```

The 90-second start period allows time for database migrations to complete before health checks begin failing.
### Entrypoint

The container uses `docker-entrypoint.sh` as its entrypoint. This script:

- Waits for the database to be reachable (up to 30 attempts, 2 seconds apart)
- Runs `pnpm db:migrate` to apply any pending migrations
- Starts the application with `tsx server.ts`
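The entrypoint logic above can be sketched as follows. This is a minimal sketch, not the actual script: the `DB_HOST`/`DB_PORT` variable names and the use of `nc` for the reachability probe are assumptions.

```shell
#!/bin/sh
# Hedged sketch of docker-entrypoint.sh; variable names are assumptions.
set -e

# Retry a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  max="$1"; delay="$2"; shift 2
  n=0
  until "$@"; do
    n=$((n + 1))
    [ "$n" -ge "$max" ] && return 1
    sleep "$delay"
  done
}

# 30 attempts, 2 seconds apart, matching the behavior described above.
wait_for_db() {
  retry 30 2 nc -z "${DB_HOST:-postgres}" "${DB_PORT:-5432}"
}

# Real entrypoint flow (commented out so this sketch stays side-effect free):
# wait_for_db || { echo "database unreachable" >&2; exit 1; }
# pnpm db:migrate
# exec tsx server.ts
```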
## Docker Compose Production

The production stack is defined in `docker-compose.prod.yml` (which extends `docker-compose.yml`).
### Services

| Service | Image | Purpose |
|---|---|---|
| postgres | `postgres:16-alpine` | Primary database |
| pgbouncer | `bitnami/pgbouncer:1` | Connection pooler (transaction mode) |
| redis | `redis:7-alpine` | Caching and pub/sub |
| app | Built from Dockerfile | Next.js application |
| db-backup | `postgres:16-alpine` | Scheduled database backups |
### PostgreSQL

- Credentials sourced from environment variables: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`
- Data persisted to the `postgres_data` named volume
- Health check: `pg_isready` every 10 seconds
### PgBouncer

- Pool mode: `transaction`
- Max client connections: 200
- Default pool size: 20, minimum: 5, reserve: 5
- Listens on port 6432
- Depends on PostgreSQL being healthy
### Redis

- Runs with append-only file persistence (`--appendonly yes`)
- Password-protected via the `REDIS_PASSWORD` environment variable
- Memory limit: `--maxmemory 256mb`
- Eviction policy: `--maxmemory-policy allkeys-lru`
- Data persisted to the `redis_data` named volume
### App Service

- Resource limits: 1G memory, 1.0 CPU
- `restart: always` policy
- `DATABASE_URL` points to PgBouncer on port 6432
- Additional environment loaded from `.env.production`
- Depends on PgBouncer and Redis being healthy
### Database Backup Service

- Uses `postgres:16-alpine` with `dcron` installed at runtime
- Mounts `./scripts/backup` as read-only for the backup script
- Stores backups in the `db-backups` named volume
- Cron schedule: daily at 02:00 UTC
- Runs `/scripts/backup/backup-database.sh --local-only`
- 30-day retention (`BACKUP_RETENTION_DAYS=30`)
- Depends on PostgreSQL being healthy
### Networking

All services communicate over a custom bridge network named `blockbotx-network`.
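A condensed sketch of what `docker-compose.prod.yml` likely contains, assembled from the service descriptions above. Treat it as illustrative only: the Bitnami PgBouncer environment variable names, health check commands, and volume mount paths are assumptions not confirmed by this guide.

```yaml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 10s
    networks: [blockbotx-network]

  pgbouncer:
    image: bitnami/pgbouncer:1
    environment:                          # assumption: Bitnami env names
      PGBOUNCER_POOL_MODE: transaction
      PGBOUNCER_MAX_CLIENT_CONN: "200"
      PGBOUNCER_DEFAULT_POOL_SIZE: "20"
    depends_on:
      postgres:
        condition: service_healthy
    networks: [blockbotx-network]

  redis:
    image: redis:7-alpine
    command: >
      redis-server --appendonly yes
      --requirepass ${REDIS_PASSWORD}
      --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    networks: [blockbotx-network]

  app:
    build: .
    restart: always
    env_file: .env.production
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: "1.0"
    depends_on:
      pgbouncer:
        condition: service_healthy
      redis:
        condition: service_healthy
    networks: [blockbotx-network]

volumes:
  postgres_data:
  redis_data:

networks:
  blockbotx-network:
    driver: bridge
```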
## Environment Variables for Production

The following environment variables must be configured for a production deployment. Secrets must never be committed to source control.
### Core Application

| Variable | Description |
|---|---|
| `NODE_ENV` | Must be set to `production` |
| `PORT` | Application port (default: 3000) |
| `HOSTNAME` | Bind address (default: `0.0.0.0`) |
### Database

| Variable | Description |
|---|---|
| `DATABASE_URL` | Connection string pointing to PgBouncer (e.g., `postgresql://user:pass@pgbouncer:6432/dbname?schema=public`) |
| `DIRECT_URL` | Connection string pointing directly to PostgreSQL (used for migrations; bypasses PgBouncer) |
| `POSTGRES_USER` | PostgreSQL username |
| `POSTGRES_PASSWORD` | PostgreSQL password |
| `POSTGRES_DB` | PostgreSQL database name |
### Redis

| Variable | Description |
|---|---|
| `REDIS_HOST` | Redis hostname (e.g., `redis`) |
| `REDIS_PORT` | Redis port (default: 6379) |
| `REDIS_PASSWORD` | Redis authentication password |
### Authentication and Security

| Variable | Description |
|---|---|
| `JWT_SECRET` | Strong random value for signing JWTs (minimum 64 characters) |
| `NEXTAUTH_SECRET` | Strong random value for NextAuth session signing |
| `NEXTAUTH_URL` | Public-facing application URL (e.g., `https://app.blockbotx.com`) |
| `ENCRYPTION_KEY` | Exactly 32 characters for AES-256-GCM encryption of API keys |
### Stripe

| Variable | Description |
|---|---|
| `STRIPE_SECRET_KEY` | Stripe live secret key (`sk_live_...`) |
| `STRIPE_WEBHOOK_SECRET` | Stripe webhook signing secret (`whsec_...`) for HMAC-SHA256 verification |
### Monitoring

| Variable | Description |
|---|---|
| `SENTRY_DSN` | Sentry project DSN for error tracking |
| `SENTRY_AUTH_TOKEN` | Sentry auth token for source map uploads |
| `SENTRY_ORG` | Sentry organization slug |
| `SENTRY_PROJECT` | Sentry project slug |
### Public Client Variables

| Variable | Description |
|---|---|
| `NEXT_PUBLIC_APP_URL` | Public application URL for client-side code |
| `NEXT_PUBLIC_WS_URL` | WebSocket URL for Socket.io connections |
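Taken together, a `.env.production` skeleton might look like the following. Every value here is a placeholder, and the database name (`blockbotx`) is an assumption; substitute real secrets from a secret manager, never from source control.

```shell
# Example .env.production skeleton -- all values are placeholders.
NODE_ENV=production
PORT=3000
HOSTNAME=0.0.0.0

DATABASE_URL=postgresql://user:pass@pgbouncer:6432/blockbotx?schema=public
DIRECT_URL=postgresql://user:pass@postgres:5432/blockbotx?schema=public
POSTGRES_USER=user
POSTGRES_PASSWORD=change-me
POSTGRES_DB=blockbotx

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=change-me

JWT_SECRET=<random value, 64+ characters>
NEXTAUTH_SECRET=<random value>
NEXTAUTH_URL=https://app.blockbotx.com
ENCRYPTION_KEY=<exactly 32 characters>

STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...

SENTRY_DSN=<project DSN>
SENTRY_AUTH_TOKEN=<auth token>
SENTRY_ORG=<org slug>
SENTRY_PROJECT=<project slug>

NEXT_PUBLIC_APP_URL=https://app.blockbotx.com
NEXT_PUBLIC_WS_URL=wss://app.blockbotx.com
```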
## CI/CD Pipeline

### Test Pipeline (`test.yml`)

Triggered on pushes and pull requests to the `main` and `develop` branches. Runs 7 parallel jobs, all of which must pass for the `all-tests-passed` gate to succeed.
#### Job 1: Unit Tests

- Services: PostgreSQL 16 + Redis 7
- Steps:
  - Install pnpm and Node.js 20 with caching
  - `pnpm install --frozen-lockfile`
  - Set up test environment variables
  - `pnpm db:push` to create the schema
  - `pnpm test:unit` -- run all unit tests
  - Run Decimal precision tests (`__tests__/models/decimal-precision.test.ts`) -- critical for financial accuracy
  - Generate coverage report (`pnpm test:unit --coverage`)
  - Upload coverage to Codecov
  - Comment coverage on the PR (via `lcov-reporter-action`)
#### Job 2: API Tests

- Services: PostgreSQL 16 + Redis 7
- Steps:
  - Install dependencies and set up the environment
  - `pnpm db:push`
  - `pnpm build` -- full Next.js production build
  - Upload Sentry source maps (main branch only, if `SENTRY_AUTH_TOKEN` is set)
  - Start the server (`pnpm start`) and wait for `/api/health` (up to 60 seconds)
  - `pnpm test:api --testTimeout=30000` with a 20-minute CI timeout
  - Stop the application
#### Job 3: E2E Tests

- Services: PostgreSQL 16 + Redis 7
- Steps:
  - Install dependencies
  - `npx playwright install --with-deps chromium` -- Chromium only in CI
  - Set up the environment (including `PLAYWRIGHT_BASE_URL`)
  - Build and start the application
  - `pnpm test:e2e` with a 120-minute CI timeout
  - Upload the Playwright HTML report and test results as artifacts (30-day retention)
#### Job 4: Code Quality Checks

- Set up a dummy `DATABASE_URL` for type checking
- `pnpm tsc --noEmit` -- TypeScript type checking (zero errors expected)
- `pnpm lint` -- ESLint
#### Job 5: Security Audit

- `pnpm audit --production --audit-level=high` -- npm dependency audit
- `npx snyk test --severity-threshold=high` -- Snyk scan (optional, runs if `SNYK_TOKEN` is set)
#### Job 6: Build Check

- `pnpm build` with the production environment
- Verifies the `.next/` directory exists after the build
#### Job 7: Docker Build

- Depends on the build-check job
- `docker build` with `NEXT_TELEMETRY_DISABLED=1`
- Runs the container with test environment variables
- Verifies the container starts (logs are printed regardless of status)
- Cleans up: stops the container, removes the image
#### Gate: all-tests-passed

- Runs after all 7 jobs complete
- Checks that every job reported `success`
- If any job failed, this gate fails and blocks the merge
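In GitHub Actions workflow syntax, a gate job like this is typically expressed with `needs` plus an `always()` condition so it runs even when upstream jobs fail. The sketch below illustrates the pattern; the individual job IDs are assumptions, since `test.yml` is not reproduced in this guide.

```yaml
  all-tests-passed:
    name: All Tests Passed
    runs-on: ubuntu-latest
    if: always()   # run even when upstream jobs fail, so the gate can report
    needs:
      [unit-tests, api-tests, e2e-tests, code-quality,
       security-audit, build-check, docker-build]
    steps:
      - name: Fail unless every job succeeded
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'skipped')
        run: exit 1
```

Marking this single job as a required status check then blocks merges whenever any of the seven jobs fails.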
### Deploy Pipeline (`deploy.yml`)

Triggered by:

- `workflow_dispatch` (manual trigger)
- Successful completion of the Test Suite workflow on the `main` branch

It uses a concurrency group (`deploy-production`) with `cancel-in-progress: false` to prevent overlapping deployments.
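In workflow syntax, a concurrency setting like the one described above corresponds to:

```yaml
concurrency:
  group: deploy-production
  cancel-in-progress: false
```

With `cancel-in-progress: false`, a newly triggered deploy queues behind the running one instead of interrupting it mid-rollout.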
#### Build and Push

- Checks out the repository
- Logs into GitHub Container Registry (`ghcr.io`) using `GITHUB_TOKEN`
- Extracts Docker metadata for image tagging
- Builds and pushes the Docker image with:
  - Tags: commit SHA + `latest`
  - Cache: GitHub Actions layer cache (`type=gha`)
  - Build args: `NEXT_TELEMETRY_DISABLED=1`
#### Deploy to VPS

Uses SSH (`appleboy/ssh-action@v1`) to connect to the production server:

```shell
cd /opt/blockbotx
echo "$GITHUB_TOKEN" | docker login ghcr.io -u $ACTOR --password-stdin
docker pull ghcr.io/$IMAGE:latest
docker compose -f docker-compose.prod.yml up -d --no-deps app
sleep 30
curl -sf http://localhost:3000/api/health?quick=true || \
  (docker compose -f docker-compose.prod.yml logs --tail=50 app && exit 1)
```
The deploy:

- Pulls the latest image from GHCR
- Restarts only the `app` service (`--no-deps` avoids restarting the database or Redis)
- Waits 30 seconds for the application to start and run migrations
- Performs a health check smoke test against `/api/health?quick=true`
- If the health check fails, prints the last 50 lines of app logs and exits with failure

Required secrets for deployment:

- `VPS_HOST` -- server hostname or IP
- `VPS_USER` -- SSH username
- `VPS_SSH_KEY` -- SSH private key
## Database Migrations

### Startup Migration Flow

The `docker-entrypoint.sh` script handles migrations automatically on container startup:

- **Wait for database**: attempts to connect via TCP up to 30 times (2-second intervals)
- **Apply migrations**: runs `pnpm db:migrate` to apply all pending migrations
- **Start server**: launches the application via `tsx server.ts`
### Important Notes

- Always use `DIRECT_URL` (pointing directly to PostgreSQL) for migrations. PgBouncer in transaction mode does not support the advisory locks used during migration.
- The schema is defined in `drizzle/schema/*.ts` files, with migrations generated to `drizzle/migrations/`.
- New migrations should be created in development with `pnpm db:generate` and committed to the repository.
## Health Checks and Monitoring

### /api/health Endpoint

The application exposes a health check endpoint at `/api/health` that verifies the application is running and can handle requests. The `?quick=true` parameter is used during deployment for a lightweight check.
### Docker HEALTHCHECK

The Dockerfile includes a built-in health check:

- Interval: 30 seconds
- Timeout: 10 seconds
- Start period: 90 seconds (allows time for migrations)
- Retries: 3
- Command: HTTP GET to `http://localhost:3000/api/health`, expecting status 200

Docker marks the container as unhealthy if 3 consecutive checks fail after the start period.
### Sentry Error Tracking

The application integrates Sentry for error tracking across three runtimes:

- **Client**: browser-side errors
- **Server**: Node.js server errors
- **Edge**: Edge runtime errors (proxy/middleware)

Source maps are uploaded to Sentry during the API tests CI job on the main branch, enabling readable stack traces in production error reports.
### Winston Structured Logging

The application uses Winston for structured logging:

- Logs are written to the `/app/logs` directory inside the container
- A `redactFormat` filter automatically removes sensitive fields (`key`, `secret`, `password`, `token`) from log output
- In Docker, logs are persisted via the `app_logs` volume
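The redaction behavior described above might look roughly like the following. This is a hedged sketch of the general technique: the project's actual `redactFormat` is a Winston format whose internals are not shown in this guide, so the core logic is expressed here as a plain function that runs without Winston installed.

```javascript
// Hedged sketch of field redaction for structured log entries.
// Substring matching on field names is an assumption about redactFormat.
const SENSITIVE = ['key', 'secret', 'password', 'token'];

function redact(entry) {
  if (entry === null || typeof entry !== 'object') return entry;
  const out = Array.isArray(entry) ? [] : {};
  for (const [k, v] of Object.entries(entry)) {
    // Redact any field whose name contains a sensitive substring.
    if (SENSITIVE.some((s) => k.toLowerCase().includes(s))) {
      out[k] = '[REDACTED]';
    } else {
      out[k] = redact(v); // recurse into nested objects and arrays
    }
  }
  return out;
}
```

Wrapped with `winston.format((info) => redact(info))`, logic like this becomes a drop-in format in a Winston logger's format chain.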
## Database Backups

### Backup Schedule

The `db-backup` service in `docker-compose.prod.yml` handles automated backups:

- Engine: `dcron` (installed in the `postgres:16-alpine` container)
- Schedule: daily at 02:00 UTC (`0 2 * * *`)
- Script: `/scripts/backup/backup-database.sh --local-only`
- Storage: backups are stored in the `db-backups` named Docker volume, mounted at `/backups`
### Retention

Old backups are automatically cleaned up based on the `BACKUP_RETENTION_DAYS` environment variable, which defaults to 30 days.
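A retention cleanup like the one described above is commonly implemented with `find -mtime`. The sketch below illustrates the technique; the `*.sql.gz` naming and the directory argument are assumptions, not details confirmed by this guide.

```shell
# Hedged sketch of retention cleanup as the backup script might perform it.
cleanup_old_backups() {
  dir="$1"
  retention_days="${2:-30}"   # mirrors BACKUP_RETENTION_DAYS defaulting to 30
  # Delete backup files last modified outside the retention window.
  find "$dir" -type f -name '*.sql.gz' -mtime "+$retention_days" -delete
}
```

In the `db-backup` container this would run against `/backups` at the end of each scheduled backup.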
### Manual Backup

To trigger a manual backup:

```shell
docker compose -f docker-compose.prod.yml exec db-backup \
  /bin/bash /scripts/backup/backup-database.sh --local-only
```