Self-Hosted Health Checks & Probes

Self-hosted Databrain exposes built-in health endpoints that you can use for Kubernetes probes, Docker health checks, and external monitoring. This guide explains:

- Which endpoints to use
- How to configure liveness and readiness probes (Kubernetes)
- How to configure Docker health checks
- How graceful shutdown (SIGTERM) works in the backend and forecast services

This is a DevOps configuration guide, not an API reference.

Health endpoints overview

Databrain services expose simple JSON health endpoints:

Backend (Express API) – default port 3000
- GET /health/live – liveness
- GET /health/ready – readiness (checks Hasura, Postgres via Hasura, and Keycloak if configured)
Forecast service (FastAPI) – default port 8082
- GET /health/live
- GET /health/ready
Hasura
- GET /healthz – built-in Hasura health
App (frontend)
- GET / – static app root

For self-hosted deployments, replace host and ports with your actual values. If you run the backend behind a prefix (e.g. /api), the health paths become /api/health/live and /api/health/ready.
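
As an illustration of the prefix case, backend probes might look like the sketch below. The /api prefix and port 3000 come from the text above; everything else is an assumption to adapt to your manifests.

```yaml
# Sketch: backend probes when the API itself is served under the /api prefix.
# Assumption: the prefix is applied by the backend process. If it is added by an
# ingress or reverse proxy in front of the pod, the kubelet probes the container
# directly and the unprefixed /health/* paths still apply.
livenessProbe:
  httpGet:
    path: /api/health/live
    port: 3000
readinessProbe:
  httpGet:
    path: /api/health/ready
    port: 3000
```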

Kubernetes configuration

This section shows typical probe configuration for a Kubernetes deployment. Adjust ports and paths to match your manifests.

Hasura deployment

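Use the built-in /healthz endpoint for both liveness and readiness. Set port to the container port Hasura actually listens on: the example below uses 8082, while the stock hasura/graphql-engine image listens on 8080 unless reconfigured.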

livenessProbe:
  httpGet:
    path: /healthz
    port: 8082
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /healthz
    port: 8082
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

Note: when evaluating Postgres health, the backend calls Hasura’s /healthz?strict=true internally. Do not add query strings such as ?strict=true to Kubernetes httpGet probes: some Kubernetes versions URL-encode the ?, which can break routing. Use plain /healthz in the probe and rely on the backend’s /health/ready for strict DB checks.

Backend deployment (Express API)

The backend mounts health endpoints at /health. Recommended probes:

livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 20
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

# Pod-level field: set it on the pod spec, not inside the container entry
terminationGracePeriodSeconds: 35

Liveness (/health/live)
- Returns { "status": "live" } with HTTP 200 when the process is running.
- Does not perform dependency checks.
Readiness (/health/ready)
- Returns HTTP 200 and { "status": "ready", "checks": { ... } } when:
  - Hasura is reachable (/healthz)
  - Postgres is healthy via Hasura strict health
  - Keycloak is healthy (if configured)
- Returns HTTP 503 and { "status": "not_ready", "checks": { ... } } when any check is failing.

Forecast deployment (FastAPI)

The forecast service is a separate FastAPI app with its own health endpoints:

livenessProbe:
  httpGet:
    path: /health/live
    port: 8082
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8082
  initialDelaySeconds: 20
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

# Pod-level field: set it on the pod spec, not inside the container entry
terminationGracePeriodSeconds: 35

GET /health/live returns { "status": "live" }.
GET /health/ready returns { "status": "ready" } when the app is initialized and can accept requests.

App deployment (frontend)

For the frontend container, it’s usually enough to probe the root URL:

livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

Docker / Docker Compose health checks

For non-Kubernetes self-hosted setups, use Docker health checks with the same endpoints. The examples below run curl inside the container, so each image must include it.

Backend (Express API)

services:
  backend:
    image: your-backend-image
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://localhost:3000/health/ready || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 20s
    stop_grace_period: 35s

Forecast service (FastAPI)

services:
  forecast:
    image: your-forecast-image
    ports:
      - "8082:8082"
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://localhost:8082/health/ready || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 20s
    stop_grace_period: 35s

Hasura

services:
  hasura:
    image: hasura/graphql-engine
    ports:
      - "8082:8080"  # host:container; choose another host port if the forecast service also publishes 8082
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://localhost:8080/healthz || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 20s

App (frontend)

services:
  app:
    image: your-frontend-image
    ports:
      - "80:80"
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://localhost/ || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 20s
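
Because the backend’s readiness depends on Hasura, these health checks can also drive startup order in Compose. The sketch below is one possible arrangement, not a required setup: it uses the long form of depends_on so the backend container starts only after Hasura reports healthy. The service and image names are the same placeholders used in the fragments above.

```yaml
# Sketch: start the backend only after Hasura's health check passes.
services:
  hasura:
    image: hasura/graphql-engine
    healthcheck:
      test: ["CMD-SHELL", "curl -fsS http://localhost:8080/healthz || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 20s
  backend:
    image: your-backend-image
    depends_on:
      hasura:
        # Waits for the healthcheck above to succeed, not just for the container to start
        condition: service_healthy
```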

Graceful shutdown & SIGTERM behavior

Recent backend changes added graceful shutdown handling for the Express API and improved lifecycle hooks for the forecast service.

Backend (Express API)

Behavior (from serverless/express/src/index.ts):

On SIGTERM or SIGINT:
- Logs: <signal> signal received: closing HTTP server.
- Calls server.close():
  - Stops accepting new connections.
  - Allows in-flight requests to complete.
- When the server is closed:
  - Logs HTTP server closed.
  - Calls process.exit(0).
- In parallel, a 30-second timeout is started. If shutdown is not complete within 30 seconds:
  - Logs Could not close connections in time, forcefully shutting down.
  - Calls process.exit(1).

Kubernetes recommendation

Set terminationGracePeriodSeconds to at least 35 seconds for the backend pod.
- This covers the backend’s 30-second forced-exit timeout plus a small buffer, so the app can drain connections and exit on its own after SIGTERM, before Kubernetes sends SIGKILL.
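
To show where these settings sit relative to each other, here is a sketch of a backend Deployment. The metadata name, labels, and image are hypothetical placeholders; only the probe paths, port 3000, probe timings, and the 35-second grace period come from this guide.

```yaml
# Sketch of a backend Deployment; names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: databrain-backend            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: databrain-backend
  template:
    metadata:
      labels:
        app: databrain-backend
    spec:
      # Pod-level: longer than the backend's 30-second forced-exit timeout.
      terminationGracePeriodSeconds: 35
      containers:
        - name: backend
          image: your-backend-image
          ports:
            - containerPort: 3000
          # Container-level: probes belong to the container entry.
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3000
            initialDelaySeconds: 20
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
```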

Forecast service (FastAPI)

Behavior (from forecast-timeseries/api.py):

Uses a FastAPI lifespan context manager:
- On startup: prints "Starting up Forecast service...".
- On shutdown: prints "Shutting down Forecast service...".
- This is where you can extend logic to close external resources (LLM clients, DB connections, etc.).

Kubernetes / Docker

Use the same terminationGracePeriodSeconds / stop_grace_period (35s) pattern as the backend.

Summary

- Use /health/live for liveness, /health/ready for readiness.
- For strict database checks, rely on the backend’s /health/ready, which already queries Hasura and Postgres.
- Configure Kubernetes and Docker health checks against these endpoints.
- Ensure termination grace periods are ≥ 35s so graceful shutdown on SIGTERM can complete without forced kills.
