# Kubernetes / Railway

> Production deployment on Railway or on your own Kubernetes cluster.

## Railway (recommended)

Railway is the recommended production host for Cognisafe. Each service is deployed as an independent Railway service from this monorepo, giving you per-service scaling, logs, and metrics.

### One-time setup

<Steps>
  <Step title="Create a Railway project">
    Go to [railway.app](https://railway.app) and create a new project.
  </Step>

  <Step title="Add managed services">
    In your Railway project, add:

    * **PostgreSQL** — use the TimescaleDB plugin
    * **Redis** — standard Redis service

    Copy the connection strings Railway provides — you'll need them for the service env vars below.
  </Step>

  <Step title="Create the four application services">
    Create one Railway service per component. Use the settings from the table below:

    | Service         | Root directory | Dockerfile          | Notes                                    |
    | --------------- | -------------- | ------------------- | ---------------------------------------- |
    | `api`           | `api/`         | `Dockerfile`        | Pre-deploy: `alembic upgrade head`       |
    | `proxy`         | `proxy/`       | `Dockerfile`        | —                                        |
    | `web`           | `web/`         | `Dockerfile`        | —                                        |
    | `safety_worker` | `api/`         | `Dockerfile.worker` | Start: `python workers/safety_scorer.py` |
  </Step>

  <Step title="Set environment variables">
    Configure the env vars for each service as shown in the sections below.
  </Step>

  <Step title="Connect your GitHub repo">
    In Railway: **Settings** → **Source** → connect your GitHub repository. Set the target branch (`main` or `platform`). Railway redeploys all services on every push to that branch.
  </Step>
</Steps>

<Warning>
  The `api` service must run `alembic upgrade head` **before** it starts serving traffic. This is configured as a pre-deploy command in `api/railway.toml`. Do not remove or skip it — the API will crash on startup if migrations have not been applied.
</Warning>

### Environment variables per service

#### api

```bash theme={null}
POSTGRES_URL=postgresql+asyncpg://<railway-postgres-url>
REDIS_URL=redis://<railway-redis-url>
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...
STRIPE_PRICE_PRO=price_...
STRIPE_PRICE_TEAM=price_...
INTERNAL_API_SECRET=<long-random-secret>
```
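`INTERNAL_API_SECRET` must be identical on the `api` and `web` services, so it helps to generate it once and paste the same value into both. The following is a minimal sketch, assuming any sufficiently long random string is accepted; the 32-byte length and the `generate_secret` helper name are illustrative choices, not something the services are known to require.

```bash
# Hypothetical helper: print 32 random bytes as 64 lowercase hex characters.
# `openssl rand -hex 32` gives equivalent output where OpenSSL is installed.
generate_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

INTERNAL_API_SECRET="$(generate_secret)"
echo "INTERNAL_API_SECRET=${INTERNAL_API_SECRET}"
```

Copy the printed value into the variables of both services rather than running the snippet twice, since each run produces a different secret.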

#### proxy

```bash theme={null}
UPSTREAM_URL=https://api.openai.com
API_BACKEND_URL=https://<your-api-railway-domain>
PROXY_API_KEY=<a Cognisafe API key>
```

#### web

```bash theme={null}
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_live_...
CLERK_SECRET_KEY=sk_live_...
INTERNAL_API_SECRET=<same value as api service>
API_URL=https://<your-api-railway-domain>
NEXT_PUBLIC_API_URL=https://<your-api-railway-domain>
```

#### safety\_worker

```bash theme={null}
POSTGRES_URL=<same as api>
REDIS_URL=<same as api>
OPENAI_API_KEY=sk-...
SCORER_MODEL=gpt-4o-mini
```
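Because a single missing variable can stop a service from starting, a small preflight check can be useful before deploying. The sketch below is not part of Cognisafe: `required_vars` and `check_env` are hypothetical helpers, and the variable lists simply mirror the blocks above (whether every variable is strictly required is an assumption).

```bash
# Hypothetical preflight helper; the variable lists mirror the per-service blocks above.
required_vars() {
  case "$1" in
    api) echo "POSTGRES_URL REDIS_URL STRIPE_SECRET_KEY STRIPE_WEBHOOK_SECRET STRIPE_PRICE_PRO STRIPE_PRICE_TEAM INTERNAL_API_SECRET" ;;
    proxy) echo "UPSTREAM_URL API_BACKEND_URL PROXY_API_KEY" ;;
    web) echo "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY CLERK_SECRET_KEY INTERNAL_API_SECRET API_URL NEXT_PUBLIC_API_URL" ;;
    safety_worker) echo "POSTGRES_URL REDIS_URL OPENAI_API_KEY SCORER_MODEL" ;;
    *) return 2 ;;
  esac
}

# Report each unset or empty variable for SERVICE on stderr.
# Returns 0 if none are missing, 1 if any are, 2 for an unknown service name.
check_env() {
  vars="$(required_vars "$1")" || { echo "unknown service: $1" >&2; return 2; }
  missing=0
  for name in $vars; do
    # Look up the variable whose name is stored in $name (POSIX-compatible indirection).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as, for example, `check_env api` in a shell where the service's variables have been loaded.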

### Custom domains

In Railway: **Settings** → **Networking** → **Generate Domain** (or add your own CNAME).

Recommended mapping:

| Domain               | Railway service |
| -------------------- | --------------- |
| `cognisafe.uk`       | `web`           |
| `api.cognisafe.uk`   | `api`           |
| `proxy.cognisafe.uk` | `proxy`         |

### Scaling

Scale the `safety_worker` service by increasing its replica count in Railway's service settings. The workers are stateless — each reads independently from the Redis queue.

***

## Bare Kubernetes (Helm)

A Helm chart is available for teams running their own Kubernetes cluster.

```bash theme={null}
helm repo add cognisafe https://charts.cognisafe.uk
helm repo update

helm install cognisafe cognisafe/cognisafe \
  --namespace cognisafe \
  --create-namespace \
  --set api.postgresUrl="postgresql+asyncpg://..." \
  --set api.redisUrl="redis://..." \
  --set proxy.upstreamUrl="https://api.openai.com" \
  --set api.stripeSecretKey="sk_live_..." \
  --set safetyWorker.openaiApiKey="sk-..."
```

The chart deploys all four application services as separate `Deployment` resources, plus `Service` and `Ingress` objects. It does not deploy PostgreSQL or Redis: provide them externally, either as managed cloud databases or as separate Helm releases.
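Secrets passed with `--set` end up in shell history, so you may prefer a values file. The sketch below writes the same keys used by the `--set` flags above into a file and installs from it. The file name `cognisafe-values.yaml` and the `install_cognisafe` wrapper are arbitrary choices, and the `...` placeholders are left for you to fill in.

```bash
# Write the same keys the --set flags use into a values file (file name is arbitrary).
cat > cognisafe-values.yaml <<'EOF'
api:
  postgresUrl: "postgresql+asyncpg://..."
  redisUrl: "redis://..."
  stripeSecretKey: "sk_live_..."
proxy:
  upstreamUrl: "https://api.openai.com"
safetyWorker:
  openaiApiKey: "sk-..."
EOF
# Restrict the file to the current user, since it holds secrets.
chmod 600 cognisafe-values.yaml

# Wrapped in a function so nothing touches the cluster until you call it.
# `helm upgrade --install` installs the release if it is absent and upgrades it otherwise.
install_cognisafe() {
  helm upgrade --install cognisafe cognisafe/cognisafe \
    --namespace cognisafe \
    --create-namespace \
    --values cognisafe-values.yaml
}
```

Call `install_cognisafe` once the placeholders are replaced, and keep the values file out of version control.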

### Values reference

| Value                       | Description                  | Default                  |
| --------------------------- | ---------------------------- | ------------------------ |
| `api.postgresUrl`           | asyncpg connection string    | —                        |
| `api.redisUrl`              | Redis connection string      | —                        |
| `api.replicaCount`          | API pod replicas             | `2`                      |
| `proxy.upstreamUrl`         | LLM provider upstream URL    | `https://api.openai.com` |
| `proxy.replicaCount`        | Proxy pod replicas           | `2`                      |
| `safetyWorker.replicaCount` | Worker pod replicas          | `3`                      |
| `safetyWorker.openaiApiKey` | OpenAI key for PyRIT scorers | —                        |
| `safetyWorker.scorerModel`  | PyRIT scoring model          | `gpt-4o-mini`            |
| `web.clerkPublishableKey`   | Clerk publishable key        | —                        |
| `web.clerkSecretKey`        | Clerk secret key             | —                        |

### Running migrations on Kubernetes

The Helm chart includes a pre-upgrade `Job` that runs `alembic upgrade head`:

```bash theme={null}
helm upgrade cognisafe cognisafe/cognisafe ...
# The migration job runs automatically before the API Deployment rolls out
```

To run migrations manually:

```bash theme={null}
kubectl run alembic-migrate \
  --image=ghcr.io/cognisafe/api:latest \
  --env="POSTGRES_URL=postgresql+asyncpg://..." \
  --restart=Never \
  -- alembic upgrade head
```
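The manual command above returns as soon as the pod is created, so you may also want to wait for the migration to finish, read its output, and remove the pod. A minimal sketch of that flow follows. It assumes the pod should run in the `cognisafe` namespace, that `POSTGRES_URL` is set in your shell, and that your `kubectl` is version 1.23 or newer (needed for `--for=jsonpath`); `run_migration` is a hypothetical helper name.

```bash
# Hypothetical helper: run the migration pod, wait for it, print logs, then clean up.
run_migration() {
  if [ -z "${POSTGRES_URL:-}" ]; then
    echo "POSTGRES_URL must be set" >&2
    return 1
  fi
  kubectl run alembic-migrate \
    --namespace cognisafe \
    --image=ghcr.io/cognisafe/api:latest \
    --env="POSTGRES_URL=${POSTGRES_URL}" \
    --restart=Never \
    -- alembic upgrade head || return 1

  # Block until the pod reports phase Succeeded, or give up after five minutes.
  kubectl wait pod/alembic-migrate --namespace cognisafe \
    --for=jsonpath='{.status.phase}'=Succeeded --timeout=300s
  status=$?

  # Show the Alembic output and remove the finished pod whether or not it succeeded.
  kubectl logs alembic-migrate --namespace cognisafe
  kubectl delete pod alembic-migrate --namespace cognisafe
  return "$status"
}
```

A non-zero return means the pod did not reach `Succeeded` within the timeout; the printed logs are the first place to look.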
