
# Severity scale

> How Cognisafe rates threat severity on a 1–5 Likert scale.

## The 1–5 Likert scale

Cognisafe uses a five-point severity scale for all safety scores. Likert scoring lets the safety worker express degrees of concern rather than a binary safe/unsafe flag, which is more useful for triage and alerting.

| Score | Label    | Dashboard colour | Meaning                                                                         |
| ----- | -------- | ---------------- | ------------------------------------------------------------------------------- |
| 1     | Benign   | Green            | No evidence of the threat. Normal traffic.                                      |
| 2     | Low      | Blue             | Ambiguous signal; unlikely to be a genuine threat. Monitor if volume increases. |
| 3     | Medium   | Yellow           | Probable match. Warrants review. May be a false positive in some contexts.      |
| 4     | High     | Orange           | Strong match. Likely a genuine threat. Review and consider action.              |
| 5     | Critical | Red              | Definitive match. Immediate attention required.                                 |
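
If you consume scores outside the dashboard, the table above can be mirrored as a small lookup. This is a minimal sketch: `SeverityLevel`, `SEVERITY_LEVELS`, and `describe` are illustrative names and are not part of any Cognisafe API.

```python
from typing import NamedTuple


class SeverityLevel(NamedTuple):
    label: str
    colour: str


# Mirrors the severity table; names are hypothetical, for illustration only.
SEVERITY_LEVELS = {
    1: SeverityLevel("Benign", "Green"),
    2: SeverityLevel("Low", "Blue"),
    3: SeverityLevel("Medium", "Yellow"),
    4: SeverityLevel("High", "Orange"),
    5: SeverityLevel("Critical", "Red"),
}


def describe(score: int) -> SeverityLevel:
    """Return the label and dashboard colour for a 1-5 severity score."""
    if score not in SEVERITY_LEVELS:
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    return SEVERITY_LEVELS[score]
```

For example, `describe(4)` returns the `High` / `Orange` entry, and any value outside 1–5 raises `ValueError`.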

## How scores are generated

Each scorer sends the prompt or response text to the scoring model (default: `gpt-4o-mini`) with a structured evaluation prompt. The model returns a numeric score and a natural-language rationale explaining the rating.

The scoring model is configured via the `SCORER_MODEL` environment variable on the `safety_worker` service:

```bash
SCORER_MODEL=gpt-4o-mini   # default: fast and cost-effective
# SCORER_MODEL=gpt-4o      # alternative: higher accuracy, higher cost
```

PyRIT wraps the scoring model call and normalises the output into a structured `SafetyScore` object with:

* `score_value`: integer 1–5, or `null` when scoring is skipped
* `score_label`: `safe` | `unsafe` | `unscored`
* `rationale`: free-text explanation from the scoring model
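
As a rough picture of that shape, the fields above could be modelled as follows. This is a sketch under assumptions, not the actual class definition: only the three field names and their allowed values come from this page, while the dataclass layout and the validation are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed labels, as listed above.
SCORE_LABELS = ("safe", "unsafe", "unscored")


@dataclass(frozen=True)
class SafetyScore:
    """Hypothetical model of the structured score; the real class may differ."""

    score_value: Optional[int]  # 1-5, or None when scoring was skipped
    score_label: str            # one of SCORE_LABELS
    rationale: str              # free-text explanation from the scoring model

    def __post_init__(self) -> None:
        # Validation here is an assumption added for illustration.
        if self.score_value is not None and not 1 <= self.score_value <= 5:
            raise ValueError(f"score_value out of range: {self.score_value!r}")
        if self.score_label not in SCORE_LABELS:
            raise ValueError(f"unknown score_label: {self.score_label!r}")
```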

## Fallback behaviour

If `OPENAI_API_KEY` is not set on the safety worker, PyRIT skips the scoring call and returns an unscored result instead of raising an error:

* `score_value`: `null`
* `score_label`: `unscored`
* `rationale`: `"Scoring skipped: no OPENAI_API_KEY configured"`

This ensures the worker never crashes because of missing credentials: requests continue to be logged and observed even without scoring.
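
The fallback described above amounts to a guard that runs before any scoring call. The sketch below illustrates the idea only: `fallback_result` is a hypothetical helper, and treating an empty `OPENAI_API_KEY` the same as an unset one is an assumption.

```python
import os
from typing import Mapping, Optional

SKIP_RATIONALE = "Scoring skipped: no OPENAI_API_KEY configured"


def fallback_result(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return the unscored placeholder if no API key is configured, else None.

    Hypothetical helper; an empty key is treated as missing (an assumption).
    """
    if env is None:
        env = os.environ
    if env.get("OPENAI_API_KEY"):
        return None  # a key is present, so normal scoring can proceed
    return {
        "score_value": None,
        "score_label": "unscored",
        "rationale": SKIP_RATIONALE,
    }
```

A caller would check this result first and only invoke the scoring model when it is `None`.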

## Alerting thresholds

The dashboard allows you to configure alert thresholds per scorer. For example: send a Slack notification when any `jailbreak_detection` score reaches 4 or above, or when the rolling average `content_safety` score for a project exceeds 2.5.

Alert configuration is available on the Pro tier and above.
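
To make the two example rules concrete, here is a minimal sketch of the underlying checks. The function names and the `window` parameter are hypothetical; this page does not specify how the dashboard defines its rolling window, so the sketch simply averages the most recent `window` scores.

```python
from typing import Sequence


def any_at_or_above(scores: Sequence[int], threshold: int = 4) -> bool:
    """True if any single score reaches the threshold (e.g. jailbreak_detection >= 4)."""
    return any(score >= threshold for score in scores)


def rolling_average_exceeds(
    scores: Sequence[int], window: int, threshold: float = 2.5
) -> bool:
    """True if the mean of the last `window` scores is strictly above the threshold."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = list(scores)[-window:]
    if not recent:
        return False  # nothing scored yet, so nothing to alert on
    return sum(recent) / len(recent) > threshold
```

With scores `[1, 1, 3, 3, 3]` and `window=3`, the recent mean is 3.0, so the second check fires; with `[1, 2, 3, 4]` and `window=4`, the mean is exactly 2.5, which does not exceed the threshold.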

## Interpreting scores

<Tip>
  A single score-4 event is not necessarily cause for alarm — it may reflect an edge case in the scoring prompt or an ambiguous input. Look for patterns: repeated high scores from the same user, a spike in 4–5 scores over a short window, or consistently elevated scores on a particular endpoint.
</Tip>

The `rationale` field from the scoring model is visible in the dashboard on the per-request detail view. It explains why the model assigned that score, which helps distinguish genuine threats from scoring artefacts.
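
One of the patterns mentioned in the tip, repeated high scores from the same user, can be checked with a simple count over exported events. This is a sketch under assumptions: the `(user_id, score)` event shape, the function name, and the default thresholds are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def users_with_repeated_high_scores(
    events: Iterable[Tuple[str, int]],
    min_score: int = 4,
    min_events: int = 3,
) -> List[str]:
    """Return users with at least `min_events` scores at or above `min_score`.

    `events` is an iterable of hypothetical (user_id, score) pairs.
    """
    counts = Counter(user for user, score in events if score >= min_score)
    return sorted(user for user, n in counts.items() if n >= min_events)
```

A user with a single score-4 event is not returned, which matches the guidance to look for patterns before acting on an isolated high score.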
