
# Ollama (self-hosted)

> Using Cognisafe with Ollama and local models.

## How it works

Ollama exposes an OpenAI-compatible HTTP API under the `/v1` path, by default on port 11434. Because the Cognisafe proxy speaks the same protocol, you can observe all Ollama calls by pointing the proxy's `UPSTREAM_URL` at your Ollama instance. Ollama itself needs no changes.

This is particularly useful in **air-gapped environments**: all traffic stays on your internal network, and if you use local scoring, the data Cognisafe scores never leaves your infrastructure either.
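
To make "OpenAI-compatible" concrete, the sketch below builds (but does not send) the kind of chat completion request that reaches Ollama. `build_chat_request` is a hypothetical helper written for this page, not part of the Cognisafe SDK; the `/v1/chat/completions` path is Ollama's OpenAI-compatible route.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Ollama ignores the key, but OpenAI clients always send one.
            "Authorization": "Bearer ollama",
        },
        method="POST",
    )


req = build_chat_request("http://localhost:11434", "llama3.2", "Why is the sky blue?")
print(req.full_url)  # http://localhost:11434/v1/chat/completions
```

To actually send the request, pass `req` to `urllib.request.urlopen` and parse the JSON response.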

## Proxy configuration

Set `UPSTREAM_URL` on the Cognisafe proxy to your Ollama instance:

```bash
# Proxy service env vars
UPSTREAM_URL=http://localhost:11434
```

If Ollama runs on a different host:

```bash
UPSTREAM_URL=http://ollama-host.internal:11434
```
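
Before starting the proxy, it can help to confirm that the value you plan to use for `UPSTREAM_URL` is well formed and reachable. The following is a minimal sketch using only the standard library; `normalize_upstream` and `upstream_is_reachable` are hypothetical helper names, and the check relies on Ollama answering a plain `GET /` with HTTP 200.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def normalize_upstream(url: str) -> str:
    """Validate an UPSTREAM_URL value and strip any trailing slash."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"UPSTREAM_URL must be an http(s) URL, got {url!r}")
    return url.rstrip("/")


def upstream_is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on the Ollama root URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(normalize_upstream(url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


# Example (performs a network call, so it is left commented out):
# print(upstream_is_reachable("http://ollama-host.internal:11434"))
```

Run the check from the same host or container as the proxy. If the proxy runs in a container, `localhost` refers to the container itself, not to the machine running Ollama.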

## SDK setup

Use `patch_openai()`. Because Ollama's API is OpenAI-compatible, the standard OpenAI client works unchanged:

```python
import cognisafe
from openai import OpenAI

cognisafe.configure(
    api_key="csk_your_key_here",
    project_id="my-app",
    proxy_url="http://localhost:8080",  # Cognisafe proxy
)
cognisafe.patch_openai()

# No real OpenAI API key needed — Ollama ignores it
client = OpenAI(api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",   # any model you have pulled in Ollama
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response.choices[0].message.content)
```

## Air-gapped safety scoring

By default, Cognisafe's safety worker uses `gpt-4o-mini` (via `OPENAI_API_KEY`) to score requests. In an air-gapped environment, you have two options:

1. **Disable scoring**: leave `OPENAI_API_KEY` unset. The worker then skips scoring and records `score_label: "unscored"`. Requests are still logged, and cost and latency data are still captured.

2. **Use a local scoring model**: point the safety worker at Ollama's local OpenAI-compatible endpoint:

   ```bash
   # safety_worker env vars
   OPENAI_API_KEY=ollama        # any non-empty value
   SCORER_MODEL=llama3.2        # local model name
   # Override the OpenAI base URL used by PyRIT:
   OPENAI_BASE_URL=http://localhost:11434/v1
   ```

<Tip>
  Models like `llama3.2` or `mistral` pulled into Ollama work well as scoring models for `content_safety` and `pii_detection`. For `jailbreak_detection`, larger models (70B+) produce more reliable results.
</Tip>
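
The two options above amount to a small decision on the worker's environment. The sketch below illustrates that decision; it is not Cognisafe's actual worker code. `resolve_scorer_config` is a hypothetical name, and the assumption that `SCORER_MODEL` defaults to `gpt-4o-mini` is inferred from the default described above. `https://api.openai.com/v1` is the OpenAI client's standard base URL.

```python
from typing import Mapping, Optional

DEFAULT_MODEL = "gpt-4o-mini"                   # assumed default for SCORER_MODEL
DEFAULT_BASE_URL = "https://api.openai.com/v1"  # OpenAI client's standard base URL


def resolve_scorer_config(env: Mapping[str, str]) -> Optional[dict]:
    """Return scorer settings, or None when scoring is disabled (no API key)."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # Option 1: requests are logged with score_label "unscored".
        return None
    return {
        "api_key": api_key,
        "model": env.get("SCORER_MODEL", DEFAULT_MODEL),
        "base_url": env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/"),
    }


# Option 2: the env vars from the air-gapped example above.
local = resolve_scorer_config({
    "OPENAI_API_KEY": "ollama",
    "SCORER_MODEL": "llama3.2",
    "OPENAI_BASE_URL": "http://localhost:11434/v1",
})
print(local["base_url"])          # http://localhost:11434/v1
print(resolve_scorer_config({}))  # None
```

In an air-gapped deployment, the resolved `base_url` should always be an internal address. If it still shows the public OpenAI URL, `OPENAI_BASE_URL` was not picked up.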

## Supported Ollama models

Any model available in the [Ollama library](https://ollama.com/library) works — Cognisafe does not constrain the model field. The proxy passes `model` through to Ollama unchanged.

```bash
# Pull a model before use
ollama pull llama3.2
ollama pull mistral
ollama pull phi4
```
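
Because `model` is passed through unchanged, a request for a model that has not been pulled fails at Ollama rather than at the proxy. A minimal sketch of a pre-flight check follows, assuming you have fetched the JSON from Ollama's native `GET /api/tags` endpoint, which lists pulled models. The helper names and the sample payload are illustrative only.

```python
def installed_models(tags_payload: dict) -> set:
    """Extract model names from an Ollama /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def is_model_available(tags_payload: dict, model: str) -> bool:
    """True if `model` is pulled; a name without a tag means ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in installed_models(tags_payload)


# Illustrative payload in the shape returned by /api/tags (other fields omitted).
sample = {"models": [{"name": "llama3.2:latest"}, {"name": "mistral:7b"}]}

print(is_model_available(sample, "llama3.2"))  # True
print(is_model_available(sample, "phi4"))      # False
```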