# Sample-to-Alarm Bridge: Mathematical Derivation

This document derives the analytic bounds implemented in
`scitex_seizure_metrics.bridge.sample_to_alarm` and its inverse
`bridge.alarm_to_sample`. The bounds let a per-window classification
report (sensitivity, specificity, prevalence) be translated into a
per-seizure alarm report (alarm sensitivity, false-positive rate per
hour) under an explicit alarm policy — and vice versa.

The motivation, after Andrade et al. 2024, is that the two evaluation
regimes can give very different verdicts on the same model on the same
data, and a paper that reports only one is not comparable to a paper
that reports only the other. This bridge gives the *analytic* envelope
that the other regime must live inside, given what the first regime
reported, without re-running anything.

## Setup and notation

Let a classifier emit a binary prediction $\hat{y} \in \{0, 1\}$ for
each fixed-length prediction window, with cadence $\Delta$ seconds
between consecutive predictions. Ground truth $y \in \{0, 1\}$ labels
each window as pre-ictal ($y = 1$) or non-pre-ictal ($y = 0$).

Define the per-window classification quantities:

$$
s = \Pr(\hat{y} = 1 \mid y = 1) \quad \text{(sample sensitivity)}
$$

$$
\alpha = \Pr(\hat{y} = 1 \mid y = 0) = 1 - \text{specificity} \quad \text{(per-window FPR)}
$$

$$
\pi = \Pr(y = 1) \quad \text{(prevalence)}
$$

The alarm policy fixes:

| Symbol | Meaning | SSM field |
| ------ | ------- | --------- |
| $\Delta$ | seconds between prediction windows | `cadence_seconds` |
| $\text{SOP}$ | Seizure Occurrence Period (s) | `sop_seconds` |
| $R$ | refractory: minimum gap between alarms (s) | `refractory_seconds` |
| $T$ | total observation time (s) | (input to `evaluate`) |

We say "an alarm fires for seizure $i$" if at least one prediction
window inside the seizure's SOP is predicted positive ($\hat{y} = 1$).

## Sample → Alarm

### Step 1. Number of distinct prediction windows per SOP

In one SOP of duration $\text{SOP}$ seconds with prediction cadence
$\Delta$ seconds, the number of distinct prediction windows that could
fire an alarm for that seizure is

$$
K = \left\lceil \frac{\text{SOP}}{\Delta} \right\rceil .
$$

### Step 2. Prevalence-adjusted effective K

Low-prevalence streams may not actually *contain* $K$ pre-ictal-labelled
windows inside a given SOP: many of the $K$ windows may carry no
pre-ictal label. The bridge therefore uses an effective $K$ that
adjusts for prevalence:

$$
K_{\text{eff}} = \min\!\Big( K,\ \max(1, \operatorname{round}(K \cdot \pi)) \Big) .
$$

When $\pi = 1$ (every window inside the SOP is pre-ictal) this reduces
to $K_{\text{eff}} = K$. When $\pi$ is small (most windows inside the
SOP carry no pre-ictal label), $K_{\text{eff}}$ shrinks toward 1.
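
Steps 1 and 2 can be sketched in a few lines of Python. The function
name `effective_windows` is hypothetical and is not the package's API.
The sketch also assumes Python's built-in `round`, which rounds exact
halves to the nearest even integer, so the packaged implementation may
differ on ties such as $K \cdot \pi = 2.5$.

```python
import math


def effective_windows(
    sop_seconds: float, cadence_seconds: float, prevalence: float
) -> tuple[int, int]:
    """Return (K, K_eff) as defined in Steps 1 and 2 (illustrative helper)."""
    if sop_seconds <= 0 or cadence_seconds <= 0:
        raise ValueError("sop_seconds and cadence_seconds must be positive")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    # Step 1: number of prediction windows that fit in one SOP.
    k = math.ceil(sop_seconds / cadence_seconds)
    # Step 2: scale by prevalence, but keep at least one candidate window.
    # Assumption: ties are rounded half-to-even, as Python's round() does.
    k_eff = min(k, max(1, round(k * prevalence)))
    return k, k_eff


print(effective_windows(1800, 30, 0.5))   # balanced windows
print(effective_windows(1800, 30, 0.01))  # low prevalence: K_eff floors at 1
```

The `max(1, ...)` floor is what keeps the later exponent well defined
when $K \cdot \pi$ rounds to zero.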

### Step 3. Alarm sensitivity bounds

Let "alarm fires for a given seizure" be the event that at least one of
the $K_{\text{eff}}$ candidate windows is predicted positive ($\hat{y} = 1$). Under the
**independent-errors** assumption — the optimistic envelope — the
probability that none of the $K_{\text{eff}}$ windows fires is
$(1 - s)^{K_{\text{eff}}}$, so

$$
\boxed{\ \text{alarm\_sens}_{\text{upper}} = 1 - (1 - s)^{K_{\text{eff}}} \ }
$$

Under **fully-clustered errors** — the pessimistic envelope, where the
classifier's window-level decisions inside one SOP are perfectly
correlated, so either all $K_{\text{eff}}$ fire or none does — the
alarm probability collapses to the per-window sensitivity:

$$
\boxed{\ \text{alarm\_sens}_{\text{lower}} = s \ }
$$

Each bound is attained by its limiting case (fully independent or
fully clustered errors), and any real classifier whose window-level
errors have positive but partial correlation will fall between them.
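
As a minimal sketch of Step 3, the two envelopes can be computed
directly from $s$ and $K_{\text{eff}}$. The helper name
`alarm_sensitivity_bounds` is an assumption made for illustration and
does not mirror the package's signature.

```python
def alarm_sensitivity_bounds(s: float, k_eff: int) -> tuple[float, float]:
    """Return (lower, upper) alarm sensitivity for per-window sensitivity s."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    if k_eff < 1:
        raise ValueError("k_eff must be at least 1")
    lower = s                          # fully clustered errors
    upper = 1.0 - (1.0 - s) ** k_eff   # independent errors
    return lower, upper


# The band widens quickly as K_eff grows, even for a modest s.
for k_eff in (1, 2, 5, 30):
    lower, upper = alarm_sensitivity_bounds(0.6, k_eff)
    print(f"K_eff={k_eff:>2}: [{lower:.3f}, {upper:.6f}]")
```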

### Step 4. False-positive rate per hour

The naive predictions-per-hour budget is

$$
N_{\text{preds/h}} = \frac{3600}{\Delta} .
$$

Of these, a fraction $(1 - \pi)$ are non-pre-ictal-labelled, and the
classifier mis-fires on each of them with probability $\alpha$. So the
expected naive FP count per hour is

$$
\text{FP/h}_{\text{naive}} = \alpha \cdot N_{\text{preds/h}} \cdot (1 - \pi) .
$$

A refractory period $R > 0$ caps the maximum number of distinct alarms
per hour at $3600 / R$. The **upper bound** is the minimum of the two:

$$
\boxed{\ \text{FP/h}_{\text{upper}} = \min\!\Big(\, \alpha \cdot \tfrac{3600}{\Delta} \cdot (1 - \pi),\ \tfrac{3600}{R}\, \Big) \ }
$$

(When $R = 0$ the cap is dropped and the upper bound is just the naive
expression.)

The **lower bound** is reported as

$$
\boxed{\ \text{FP/h}_{\text{lower}} = 0 \ }
$$

by convention: a non-trivial lower bound requires an FP-correlation
parameter (autocorrelation length, burst statistics) that the per-window
metrics alone do not pin down.
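
Step 4 can be sketched as follows. The function name and argument
names are hypothetical, chosen to echo the policy fields in the
notation table, and the $R = 0$ branch follows the parenthetical rule
stated above.

```python
def fp_per_hour_bounds(
    alpha: float,
    prevalence: float,
    cadence_seconds: float,
    refractory_seconds: float = 0.0,
) -> tuple[float, float]:
    """Return (lower, upper) false positives per hour (illustrative helper)."""
    if cadence_seconds <= 0:
        raise ValueError("cadence_seconds must be positive")
    preds_per_hour = 3600.0 / cadence_seconds
    # Expected false positives per hour before any refractory suppression.
    naive = alpha * preds_per_hour * (1.0 - prevalence)
    if refractory_seconds > 0:
        upper = min(naive, 3600.0 / refractory_seconds)
    else:
        upper = naive  # R = 0: no cap applies
    return 0.0, upper  # lower bound is 0 by convention


print(fp_per_hour_bounds(0.15, 0.5, 30, refractory_seconds=1800))
print(fp_per_hour_bounds(0.15, 0.5, 30))  # same classifier, no refractory cap
```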

## Alarm → Sample (inverse)

Given a published alarm-based pair $(\text{alarm\_sens}, \text{FP/h})$,
the same constraint runs in reverse to give the feasible per-window
metric ranges.

### Sensitivity bounds

Inverting Step 3's upper bound (independent errors) gives the smallest
per-window $s$ consistent with the reported alarm sensitivity:

$$
s_{\text{lower}} = 1 - (1 - \text{alarm\_sens})^{1 / K_{\text{eff}}}
$$

The trivial upper bound is obtained from Step 3's lower bound (fully
clustered errors), where $s = \text{alarm\_sens}$:

$$
s_{\text{upper}} = \min(1, \text{alarm\_sens}) .
$$
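
A minimal sketch of the inverse sensitivity bounds, under the same
definitions. The helper name `sample_sensitivity_bounds` is
hypothetical, and the alarm sensitivity of 0.9 used in the example is
an illustrative input, not a reported result.

```python
def sample_sensitivity_bounds(alarm_sens: float, k_eff: int) -> tuple[float, float]:
    """Return (s_lower, s_upper) consistent with a reported alarm sensitivity."""
    if not 0.0 <= alarm_sens <= 1.0:
        raise ValueError("alarm_sens must lie in [0, 1]")
    if k_eff < 1:
        raise ValueError("k_eff must be at least 1")
    # Independent errors: solve 1 - (1 - s)^K_eff = alarm_sens for s.
    s_lower = 1.0 - (1.0 - alarm_sens) ** (1.0 / k_eff)
    # Fully clustered errors: s equals the alarm sensitivity.
    s_upper = min(1.0, alarm_sens)
    return s_lower, s_upper


s_lower, s_upper = sample_sensitivity_bounds(0.9, 30)
print(f"s in [{s_lower:.4f}, {s_upper:.4f}]")
```

With many candidate windows per SOP, a high alarm sensitivity is
compatible with a very low per-window sensitivity, which is why the
lower end of this range is so small.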

### Specificity bounds

To invert the naive $\text{FP/h}$ expression, first apply the
refractory cap to the reported value (the cap is skipped when $R = 0$):

$$
\text{FP/h}_{\text{eff}} = \min\!\Big(\, \text{FP/h},\ \tfrac{3600}{R}\, \Big)
$$

Then solve for the smallest $\alpha$ consistent with it:

$$
\alpha_{\min} = \frac{\text{FP/h}_{\text{eff}}}{\, N_{\text{preds/h}} \cdot (1 - \pi)\, } .
$$

The upper bound on specificity is therefore

$$
\text{specificity}_{\text{upper}} = 1 - \alpha_{\min} ,
$$

and we report $\text{specificity}_{\text{lower}} = 0$ for the same
correlation-uncertainty reason as the FP/h lower bound.
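
The specificity inversion can be sketched as below. The function name
is hypothetical. Clipping $\alpha_{\min}$ to 1 is a defensive addition
of this sketch and is not part of the derivation above.

```python
def specificity_bounds(
    fp_per_hour: float,
    prevalence: float,
    cadence_seconds: float,
    refractory_seconds: float = 0.0,
) -> tuple[float, float]:
    """Return (lower, upper) specificity consistent with a reported FP/h."""
    if cadence_seconds <= 0:
        raise ValueError("cadence_seconds must be positive")
    if not 0.0 <= prevalence < 1.0:
        raise ValueError("prevalence must lie in [0, 1)")
    preds_per_hour = 3600.0 / cadence_seconds
    # Apply the refractory cap to the reported value (skipped when R = 0).
    if refractory_seconds > 0:
        fp_eff = min(fp_per_hour, 3600.0 / refractory_seconds)
    else:
        fp_eff = fp_per_hour
    alpha_min = fp_eff / (preds_per_hour * (1.0 - prevalence))
    alpha_min = min(1.0, alpha_min)  # sketch-only guard, see lead-in
    return 0.0, 1.0 - alpha_min


spec_lower, spec_upper = specificity_bounds(2.0, 0.5, 30, refractory_seconds=1800)
print(f"specificity in [{spec_lower:.3f}, {spec_upper:.4f}]")
```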

## Properties and edge cases

- **Width of the alarm-sensitivity band.** The gap
  $\text{alarm\_sens}_{\text{upper}} - \text{alarm\_sens}_{\text{lower}}$
  grows with $K_{\text{eff}}$. For $K_{\text{eff}} = 1$ the bounds
  coincide at $s$ (one chance per seizure → no opportunity for
  independent retries). For large $K_{\text{eff}}$ the upper bound
  approaches 1 even for modest $s$, which is exactly the Andrade-2024
  observation that the two regimes can disagree.

- **Refractory dominance.** If $\alpha \cdot N_{\text{preds/h}} \cdot
  (1 - \pi) > 3600 / R$, the refractory cap binds and the classifier's
  $\alpha$ becomes invisible to the FP/h reporter: any further
  degradation in specificity is absorbed by the cap. The bridge
  surfaces this case via the `notes` field on
  `SampleToAlarmBounds`.

- **Prevalence assumptions.** $\pi$ is a single per-window scalar in
  this bridge. Real streams may have time-varying prevalence (e.g.,
  the diurnal seizure-rate periodicity documented by Karoly 2017). The
  bridge does not model that variation. A conservative practice is to
  evaluate the bridge under both the empirical $\pi$ and a low-$\pi$
  alternative ($\pi = 0.01$ for fully-streaming evaluation) and
  report the broader of the two envelopes.

- **`fp_per_hour_lower = 0` is a convention, not a theorem.** A
  classifier whose errors are independent will sit close to
  $\text{FP/h}_{\text{upper}}$; one whose errors are tightly clustered
  can produce far fewer distinct alarms. Without an extra parameter
  describing the clustering, "0" is the only lower bound we can
  justify analytically.
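
The refractory-dominance condition above can be rearranged into a
threshold on $\alpha$: for $\pi < 1$ and $R > 0$, the cap binds
whenever $\alpha > \Delta / (R\,(1 - \pi))$. A minimal sketch, with a
hypothetical helper name:

```python
import math


def refractory_binding_alpha(
    cadence_seconds: float, refractory_seconds: float, prevalence: float
) -> float:
    """Per-window FPR above which the refractory cap determines FP/h upper."""
    if cadence_seconds <= 0:
        raise ValueError("cadence_seconds must be positive")
    if refractory_seconds <= 0 or prevalence >= 1.0:
        # No cap (R = 0) or no negative windows: the cap can never bind.
        return math.inf
    return cadence_seconds / (refractory_seconds * (1.0 - prevalence))


threshold = refractory_binding_alpha(30, 1800, 0.5)
print(f"cap binds for alpha > {threshold:.4f}")
```

A threshold above 1 means no admissible $\alpha$ can trigger the cap
under that policy.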

## Worked example

Suppose a PAC-based classifier reports $s = 0.6$, specificity $0.85$
($\alpha = 0.15$), prevalence $\pi = 0.5$ (balanced seizure /
interictal-control windows per ADR-0007 of the consuming project), with
$\text{SOP} = 1800\,\text{s}$, $\Delta = 30\,\text{s}$,
$R = 1800\,\text{s}$.

- $K = \lceil 1800 / 30 \rceil = 60$.
- $K_{\text{eff}} = \min(60,\ \max(1,\ \operatorname{round}(60 \cdot 0.5))) = 30$.
- $\text{alarm\_sens}_{\text{upper}} = 1 - 0.4^{30} \approx 1.000$.
- $\text{alarm\_sens}_{\text{lower}} = 0.6$.
- $N_{\text{preds/h}} = 120$.
- $\text{FP/h}_{\text{naive}} = 0.15 \cdot 120 \cdot 0.5 = 9.0$.
- $\text{FP/h}_{\text{cap}} = 3600 / 1800 = 2.0$.
- $\text{FP/h}_{\text{upper}} = \min(9.0,\ 2.0) = 2.0$ — *refractory
  dominates*.
- $\text{FP/h}_{\text{lower}} = 0$.

The bridge therefore reports alarm sensitivity in $[0.6,\ \approx 1.0]$
and $\text{FP/h}$ in $[0,\ 2.0]$. The 40-percentage-point gap in alarm
sensitivity is the regime-disagreement region, which is exactly what
Andrade et al. 2024 say any honest report needs to surface.
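
The numbers above can be reproduced, and then pushed back through the
inverse direction, with the self-contained sketch below. `Policy`,
`forward`, and `inverse` are hypothetical stand-ins and not the
package's `AlarmPolicy`, `sample_to_alarm`, or `alarm_to_sample`. The
alarm-side pair fed to `inverse` (alarm sensitivity 0.9, 2.0 FP/h) is
an assumed observation chosen from inside the forward envelope.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Illustrative policy container (not the package's AlarmPolicy)."""
    cadence_seconds: float
    sop_seconds: float
    refractory_seconds: float


def k_eff(policy: Policy, prevalence: float) -> int:
    k = math.ceil(policy.sop_seconds / policy.cadence_seconds)
    return min(k, max(1, round(k * prevalence)))


def fp_cap(policy: Policy) -> float:
    # R = 0 means no cap, represented here as +infinity.
    r = policy.refractory_seconds
    return 3600.0 / r if r > 0 else math.inf


def forward(s, alpha, prevalence, policy):
    """Sample -> alarm: (sens_lower, sens_upper, fp_lower, fp_upper)."""
    naive = alpha * (3600.0 / policy.cadence_seconds) * (1.0 - prevalence)
    sens_upper = 1.0 - (1.0 - s) ** k_eff(policy, prevalence)
    return s, sens_upper, 0.0, min(naive, fp_cap(policy))


def inverse(alarm_sens, fp_per_hour, prevalence, policy):
    """Alarm -> sample: (s_lower, s_upper, spec_lower, spec_upper)."""
    s_lower = 1.0 - (1.0 - alarm_sens) ** (1.0 / k_eff(policy, prevalence))
    fp_eff = min(fp_per_hour, fp_cap(policy))
    alpha_min = fp_eff / ((3600.0 / policy.cadence_seconds) * (1.0 - prevalence))
    return s_lower, min(1.0, alarm_sens), 0.0, 1.0 - alpha_min


policy = Policy(cadence_seconds=30, sop_seconds=1800, refractory_seconds=1800)
fwd = forward(s=0.6, alpha=0.15, prevalence=0.5, policy=policy)
inv = inverse(alarm_sens=0.9, fp_per_hour=2.0, prevalence=0.5, policy=policy)
print("forward:", fwd)
print("inverse:", inv)
```

The inverse ranges contain the original $s = 0.6$ and specificity
$0.85$, as they should if the two directions are mutually consistent.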

## References

- Andrade et al. 2024 — *Sample- vs alarm-based perspectives on
  seizure-prediction performance.*
- Mormann et al. 2007 — *Seizure prediction: the long and winding
  road.* (false-prediction rate definition).
- Code: `src/scitex_seizure_metrics/bridge.py::sample_to_alarm` /
  `::alarm_to_sample`.
- Companion: `src/scitex_seizure_metrics/policy.py::AlarmPolicy` (the
  policy object that fixes $\Delta$, SOP, $R$, denominator
  convention).