Conformal Prediction

Latest recommended article published 2024-05-21 23:16:53

FoerKent

Tags: algorithms, artificial intelligence, machine learning

Article link: https://blog.csdn.net/Falcont/article/details/136914496

Conformal Prediction in Classification

Conformal Coverage Guarantee

Given a calibration data set ${\left \{\left ( X_i, {Y}_{i}^{*}\right ) \right \}}_{i=1}^{n}$ and a pretrained model $\hat{f}\left ( \cdot\right )$ with $\hat{f}\left ( X_i\right ) \in {\left [ 0, 1\right ]}^{K}$ (a probability vector over $K$ classes), the probability (or confidence) assigned to the true label is ${\hat{f}\left ( X_i\right ) }_{{Y}_{i}^{*}}$ .

Calculate the conformal scores $s_i= s\left ( X_i, {Y}_{i}^{*}\right ) =1-{\hat{f}\left ( X_i\right ) }_{{Y}_{i}^{*}}$ and sort them so that $s_1 \leq \cdots \leq s_n$ .

Obtain the $\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n}$ quantile of ${\left \{ s_i\right \}}_{i=1}^{n}$ : $\hat{q}=\inf \left \{ q:\frac{\left | \left \{ i:s_i \leq q\right \}\right |}{n} \geq \frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n} \right \} = {s}_{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}$ .

Construct the prediction set of $\left ( {X}_{test}, {Y}_{test}^{*}\right )$ : $\mathcal{C}\left ( {X}_{test}\right )=\left \{ y: {\hat{f}\left ( {X}_{test}\right )}_{y} \geq 1-\hat{q} \right \}=\left \{ y: s\left ( {X}_{test}, y\right )\leq \hat{q}\right \}$ .
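The calibration and prediction steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function names `conformal_quantile` and `prediction_set`, the toy probability vectors, and the choice to return `math.inf` when $\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil > n$ are all assumptions made for the example.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank into the sorted scores
    if rank > n:
        # Assumption: with too few calibration points, fall back to +inf,
        # which makes the prediction set contain every label.
        return math.inf
    return sorted(scores)[rank - 1]

def prediction_set(probs, q_hat):
    """All labels y whose score 1 - probs[y] is at most q_hat."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= q_hat]

# Hypothetical calibration data: model outputs for K = 3 classes and true labels.
cal_probs = [
    [0.5, 0.25, 0.25],
    [0.125, 0.75, 0.125],
    [0.25, 0.375, 0.375],
    [0.625, 0.25, 0.125],
]
cal_labels = [0, 1, 2, 1]

# Conformal score: one minus the probability assigned to the true label.
cal_scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels)]
q_hat = conformal_quantile(cal_scores, alpha=0.25)

# Prediction set for a hypothetical test output.
test_set = prediction_set([0.5, 0.375, 0.125], q_hat)
print(q_hat, test_set)
```

With $n=4$ and $\alpha=0.25$ the rank is $\left \lceil 3.75\right \rceil=4$ , so $\hat{q}$ is the largest calibration score; with realistic calibration sizes the rank falls well inside the sorted list.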

The event $\left \{ {Y}_{test}^{*} \in \mathcal{C}\left ( {X}_{test}\right ) \right \}$ is equivalent to $\left \{ s\left ( {X}_{test}, {Y}_{test}^{*}\right )\leq \hat{q}\right \}$ .

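Write ${s}_{test}=s\left ( {X}_{test}, {Y}_{test}^{*}\right )$ . By the exchangeability of $\left ( X_1, {Y}_{1}^{*}\right ), \cdots ,\left ( X_n, {Y}_{n}^{*}\right ), \left ( {X}_{test}, {Y}_{test}^{*}\right )$ , the rank of ${s}_{test}$ among the $n+1$ scores is uniformly distributed, so (ignoring ties) $\mathcal{P}\left ( {s}_{test} \leq s_i \right )=\frac{i}{n+1}$ for the sorted scores $s_1 \leq \cdots \leq s_n$ .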

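Then we get the probability of conformal coverage: $\mathcal{P}\left ( {Y}_{test}^{*} \in \mathcal{C}\left ( {X}_{test}\right ) \right )=\mathcal{P}\left ( {s}_{test} \leq \hat{q} \right )=\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n+1}$ .
Since $\left ( n+1\right )\left ( 1-\alpha \right ) \leq \left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil < \left ( n+1\right )\left ( 1-\alpha \right )+1$ , the coverage is at least $1-\alpha$ and less than $1-\alpha + \frac{1}{n+1}$ . If ties among the scores are possible, the equality becomes $\geq$ and only the lower bound $1-\alpha$ is guaranteed.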
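To see how tight these bounds are, the coverage value $\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n+1}$ can be evaluated exactly for a few calibration sizes. The sketch below is an illustration only: the helper name `exact_coverage` and the chosen values of $n$ are assumptions, and exact fractions are used simply to avoid floating-point rounding inside the ceiling.

```python
import math
from fractions import Fraction

def exact_coverage(n, alpha):
    """ceil((n + 1) * (1 - alpha)) / (n + 1), computed as an exact fraction."""
    return Fraction(math.ceil((n + 1) * (1 - alpha)), n + 1)

alpha = Fraction(1, 10)
for n in (19, 50, 99, 1000):
    cov = exact_coverage(n, alpha)
    # The value always lies in [1 - alpha, 1 - alpha + 1 / (n + 1)).
    assert 1 - alpha <= cov < 1 - alpha + Fraction(1, n + 1)
    print(n, cov, float(cov))
```

The gap above $1-\alpha$ vanishes whenever $\left ( n+1\right )\left ( 1-\alpha \right )$ is an integer and shrinks as $n$ grows.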

Classification with Adaptive Prediction Set

Let ${\left \{ {\pi}_{k}\left ( X_i\right )\right \}}_{k=1}^{K}$ be the permutation of ${\left \{ k\right \}}_{k=1}^{K}$ that sorts $\hat{f}\left ( X_i\right )$ in descending order, i.e., ${\hat{f}\left ( X_i\right )}_{{\pi}_{1}\left ( X_i\right )} \geq \cdots \geq {\hat{f}\left ( X_i\right )}_{{\pi}_{K}\left ( X_i\right )}$ .

Calculate the conformal scores: $s_i= s\left ( X_i, {Y}_{i}^{*}\right ) =\textstyle\sum_{j=1}^{k}{\hat{f}\left ( X_i\right )}_{{\pi}_{j}\left ( X_i\right )}$ , where $k$ is the rank of the true label, i.e., ${\pi}_{k}\left ( X_i\right )={Y}_{i}^{*}$ .

Obtain the $\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n}$ quantile of ${\left \{ s_i\right \}}_{i=1}^{n}$ : $\hat{q}$ .

Prediction set: $\mathcal{C}\left ( {X}_{test}\right )=\left \{ {\pi}_{1}\left ( {X}_{test} \right ), \cdots , {\pi}_{k}\left ( {X}_{test} \right )\right \}$ , where $k=\sup \left \{ {k}^{\prime}:\textstyle\sum_{j=1}^{{k}^{\prime}} {\hat{f}\left ({X}_{test} \right )}_{{\pi}_{j}\left ( {X}_{test}\right )} < \hat{q} \right \} + 1$ .
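The adaptive score and prediction set can be sketched as follows. This is a minimal, non-randomized illustration under stated assumptions: the function names `ranking`, `aps_score` and `aps_prediction_set`, the toy data, and the tie-breaking by class index are hypothetical, and when no partial sum reaches $\hat{q}$ the sketch simply returns all $K$ classes.

```python
import math

def ranking(probs):
    """Class indices sorted by descending probability (pi_1, ..., pi_K)."""
    return sorted(range(len(probs)), key=lambda k: -probs[k])

def aps_score(probs, label):
    """Cumulative probability of all classes ranked at or above the true label."""
    total = 0.0
    for k in ranking(probs):
        total += probs[k]
        if k == label:
            return total
    raise ValueError("label is not a valid class index")

def aps_prediction_set(probs, q_hat):
    """Top-ranked classes up to and including the first whose partial sum reaches q_hat."""
    chosen, total = [], 0.0
    for k in ranking(probs):
        chosen.append(k)
        total += probs[k]
        if total >= q_hat:
            break
    return chosen

# Hypothetical calibration data for K = 3 classes.
cal_probs = [
    [0.5, 0.25, 0.25],
    [0.125, 0.75, 0.125],
    [0.125, 0.375, 0.5],
    [0.625, 0.25, 0.125],
]
cal_labels = [0, 1, 1, 2]
alpha = 0.5

cal_scores = [aps_score(p, y) for p, y in zip(cal_probs, cal_labels)]
rank = math.ceil((len(cal_scores) + 1) * (1 - alpha))  # assumed to be <= n here
q_hat = sorted(cal_scores)[rank - 1]

uncertain = aps_prediction_set([0.5, 0.375, 0.125], q_hat)
confident = aps_prediction_set([0.875, 0.0625, 0.0625], q_hat)
print(q_hat, uncertain, confident)
```

The set grows for flat probability vectors and shrinks to a single class for confident ones, which is the adaptivity the score is designed for.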

(proof of bound not completed)

LLMs with Conformal Factuality Guarantees

Given an input ${X}_{test} \in \mathcal{X}$ , we get an output $\mathcal{L}\left ( {X}_{test}\right ) \in \mathcal{Y}$ . The goal is $\mathcal{P}\left ( \mathcal{L}\left ( {X}_{test}\right ) \; is \; correct \right ) \geq 1-\alpha$ .

The correctness of $\mathcal{L} \left ( {X}_{test}\right )$ is equivalent to the entailment relation ${Y}_{test}^{*}\Rightarrow \mathcal{L}\left ( {X}_{test}\right )$ .

Define the entailment set of $\mathcal{L}\left ( {X}_{test}\right ) $: $\mathcal{E}\left ( \mathcal{L}\left ( {X}_{test}\right ) \right ) = \left \{ y\in\mathcal{Y}:y\Rightarrow \mathcal{L}\left ( {X}_{test}\right ) \right \}$ , then $\left\{ {Y}_{test}^{*}\Rightarrow \mathcal{L}\left ( {X}_{test}\right )\right\}$ is equivalent to $\left\{{Y}_{test}^{*} \in \mathcal{E}\left ( \mathcal{L}\left ( {X}_{test}\right ) \right )\right\}$ .

Construct ${\left \{ {\mathcal{F}}_{t}\left ( X_i\right )\right \}}_{t\in \mathcal{T}}$ satisfying the nested property, i.e., $\forall t_1,t_2 \in \mathcal{T}, t_1 \leq t_2\rightarrow {\mathcal{F}}_{t_1}\left ( X_i\right )\subseteq {\mathcal{F}}_{t_2}\left ( X_i\right )$ . Here ${\mathcal{F}}_{t}\left ( X_i\right )=\mathcal{E}\left ( {F}_{t}\left ( X_i, \mathcal{L}\left ( X_i\right )\right )\right )$ is the entailment set of ${F}_{t}\left ( X_i, \mathcal{L}\left ( X_i\right )\right )$ , the output obtained from the base output $\mathcal{L}\left ( X_i\right )$ by the "back-off" function $F_t(\cdot)$ with safe threshold $t$ , which removes unreliable sub-claims. At the extremes, ${F}_{\sup \mathcal{T}}\left ( X_i, \mathcal{L}\left ( X_i\right )\right )=\varnothing$ and $F_0\left(X_i, \mathcal{L}\left( X_i\right) \right)=\mathcal{L}\left(X_i \right)$ .

Define the conformal scores: $r\left (X_i,{Y}_{i}^{*} \right )=\inf \left \{t:\forall j\geq t,{Y}_{i}^{*} \in\mathcal{F}_j\left ( X_i\right ) \right \}=\inf \left \{t:\forall j\geq t,{Y}_{i}^{*} \Rightarrow F_j\left (X_i,\mathcal{L}\left ( X_i \right ) \right ) \right \}$
(the minimal safe threshold, i.e., the smallest threshold from which the entailment set always contains the true label).

Obtain the $\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n}$ quantile of ${\left \{ r\left (X_i,{Y}_{i}^{*} \right )\right \}}_{i=1}^{n}$ : $\hat{q}={r}_{\left \lceil \left ( n+1\right )\left ( 1-\alpha\right )\right \rceil}$ (sort $r_1 \leq \cdots \leq r_n$ ).

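Write ${r}_{test}=r\left ( {X}_{test}, {Y}_{test}^{*}\right )$ . By the definition of $r$ and the nested property, $\left \{ {r}_{test} \leq \hat{q} \right \}$ is equivalent to $\left \{ {Y}_{test}^{*}\Rightarrow {F}_{\hat{q}}\left ( {X}_{test}, \mathcal{L}\left ( {X}_{test}\right )\right )\right \}$ (i.e., $\hat{q}$ is a safe threshold).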

Then, by the same exchangeability argument as in the classification case, $\mathcal{P}\left ( {F}_{\hat{q}}\left ( {X}_{test}, \mathcal{L}\left ( {X}_{test}\right )\right ) \; is \; correct\right )=\mathcal{P}\left ( {r}_{test} \leq \hat{q}\right )=\frac{\left \lceil \left ( n+1\right )\left ( 1-\alpha \right )\right \rceil}{n+1} \in \left[ 1-\alpha, 1-\alpha + \frac{1}{n+1} \right]$ (if ties among the scores are possible, only the lower bound $1-\alpha$ is guaranteed).

The entailment of the output at any threshold $t$ can be evaluated by checking each sub-claim of the base output only once and then computing the supremum (the safe threshold) over the sub-claims that remain in the current output.

For the sub-claims ${\left \{ c_m\right \}}_{m=1}^{{M}_{it}}$ of $F_t\left ( X_i,\mathcal{L}\left ( X_i\right )\right )$ and ${Y}_{i}^{*} \in \mathcal{Y}$ , $\left \{ {Y}_{i}^{*}\Rightarrow F_t\left ( X_i,\mathcal{L}\left ( X_i\right )\right ) \right \}\Leftrightarrow \left \{\forall m \in \left \{ 1, \cdots , {M}_{it}\right \}, {Y}_{i}^{*}\Rightarrow c_m \right \}$ , where ${M}_{it}$ is the number of sub-claims of the $i$ -th output that are retained (accepted) at threshold $t$ .

Then the conformal scores are: $r\left (X_i,{Y}_{i}^{*} \right )=\inf \left \{t:\forall j\geq t, \forall c\in F_j\left (X_i,\mathcal{L}\left ( X_i\right ) \right ),{Y}_{i}^{*} \Rightarrow c \right \}$ .

As for the entailment set, $\mathcal{E}\left ( F_t\left ( X_i, \mathcal{L}\left ( X_i\right )\right )\right )=\textstyle\bigcap_{m=1}^{{M}_{it}}\mathcal{E}\left ( c_m\right )$ , and then the conformal scores are $r\left (X_i,{Y}_{i}^{*} \right )=\inf \left \{t:\forall j\geq t, {Y}_{i}^{*} \in \textstyle\bigcap_{m=1}^{{M}_{ij}}\mathcal{E}\left ( c_m\right ) \right \}$ .
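The sub-claim formulation can be made concrete with a small sketch. Everything specific here is an assumption made for illustration and is not stated in the text above: each sub-claim carries a confidence score in $[0, 1)$ , the back-off function keeps the sub-claims whose score is at least $t$ , $\mathcal{T}$ is a finite grid, and each calibration sub-claim has been annotated once as entailed or not. The names `THRESHOLDS`, `claim`, `back_off` and `conformal_score` are hypothetical.

```python
import math

# Hypothetical finite threshold grid T; confidence scores are assumed to lie in [0, 1),
# so the output at the largest threshold is empty.
THRESHOLDS = [0.0, 0.25, 0.5, 0.75, 1.0]

def claim(score, entailed):
    """A calibration sub-claim: confidence score plus a one-off entailment annotation."""
    return {"score": score, "entailed": entailed}

def back_off(claims, t):
    """Hypothetical F_t: keep the sub-claims whose confidence score is >= t."""
    return [c for c in claims if c["score"] >= t]

def conformal_score(claims):
    """Smallest t in T such that every F_j with j >= t keeps only entailed sub-claims."""
    r = THRESHOLDS[-1]  # the empty output is trivially entailed
    for t in reversed(THRESHOLDS):
        if not all(c["entailed"] for c in back_off(claims, t)):
            break
        r = t
    return r

# Hypothetical calibration outputs, each split into annotated sub-claims.
calibration = [
    [claim(0.875, True), claim(0.625, True), claim(0.125, False)],
    [claim(0.875, True), claim(0.375, True)],
    [claim(0.625, False), claim(0.875, True)],
    [claim(0.375, False), claim(0.5, True)],
]
alpha = 0.5

cal_r = [conformal_score(c) for c in calibration]
rank = math.ceil((len(cal_r) + 1) * (1 - alpha))  # assumed to be <= n here
q_hat = sorted(cal_r)[rank - 1]

# Apply the calibrated threshold to a new (unannotated) output.
test_claims = [
    {"text": "A", "score": 0.875},
    {"text": "B", "score": 0.5},
    {"text": "C", "score": 0.25},
]
kept = [c["text"] for c in back_off(test_claims, q_hat)]
print(cal_r, q_hat, kept)
```

Each sub-claim is annotated only once; the score for every threshold then follows from filtering, which is the saving described above.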

Instead of guaranteeing full factuality, we may only require a fraction $a \in \left [ 0,1\right ]$ of the accepted sub-claims (i.e., those in ${F}_{t}\left ( {X}_{i}, \mathcal{L}\left ( {X}_{i}\right )\right )$ ) to be factual. Partial entailment keeps the minimal safe threshold small, which mitigates the issue of $\hat{q}$ being so large that the accepted sub-claims are uninformative or even empty.

Then the conformal scores with acceptable entailment level $a \in \left [ 0,1\right ]$ are: $r_a\left (X_i, {Y}_{i}^{*}\right )=\inf \left \{t \in \mathcal{T}:\forall j\geq t, {\mathcal{M}}_{{Y}_{i}^{*}}\left ( F_j\left ( X_i, \mathcal{L}\left ( X_i\right )\right )\right ) \geq a \right \}$ , where ${\mathcal{M}}_{{Y}_{i}^{*}}\left ( F_j\left ( X_i, \mathcal{L}\left ( X_i\right )\right )\right )=\frac{1}{{M}_{ij}}\textstyle\sum_{m=1}^{{M}_{ij}}{\textbf{1}}_{{Y}_{i}^{*}\Rightarrow c_m}$ , and ${M}_{ij}$ is the number of sub-claims of the $i$ -th output that are accepted at threshold $j$ .

The event $\left \{ r_a\left ( {X}_{test}, {Y}_{test}^{*} \right ) \leq \hat{q}\right \}$ implies $\left \{ {\mathcal{M}}_{{Y}_{test}^{*}}\left ( {F}_{\hat{q}}\left ( {X}_{test}, \mathcal{L}\left ( {X}_{test}\right )\right )\right ) \geq a\right \}$ (not equivalent).
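A small sketch can illustrate both the partial score and why the implication above does not reverse. As before, the setup is assumed rather than taken from the text: sub-claims carry confidence scores in $[0, 1)$ , the back-off function keeps scores of at least $t$ , $\mathcal{T}$ is a finite grid, and the entailed fraction of an empty output is taken to be 1. The names `THRESHOLDS`, `back_off`, `entailed_fraction` and `partial_score` are hypothetical.

```python
# Hypothetical finite threshold grid T; confidence scores are assumed to lie in [0, 1).
THRESHOLDS = [0.0, 0.25, 0.5, 0.75, 1.0]

def back_off(claims, t):
    """Hypothetical F_t: keep the sub-claims whose confidence score is >= t."""
    return [c for c in claims if c["score"] >= t]

def entailed_fraction(claims):
    """Share of sub-claims entailed by the reference (assumed to be 1.0 for an empty output)."""
    if not claims:
        return 1.0
    return sum(c["entailed"] for c in claims) / len(claims)

def partial_score(claims, a):
    """Smallest t in T such that every F_j with j >= t has entailed fraction >= a."""
    r = THRESHOLDS[-1]
    for t in reversed(THRESHOLDS):
        if entailed_fraction(back_off(claims, t)) < a:
            break
        r = t
    return r

# One low-confidence false sub-claim: partial entailment lowers the score.
mostly_right = [
    {"score": 0.875, "entailed": True},
    {"score": 0.625, "entailed": True},
    {"score": 0.125, "entailed": False},
]
full = partial_score(mostly_right, a=1.0)     # full factuality
relaxed = partial_score(mostly_right, a=0.5)  # half of the sub-claims suffice

# The most confident sub-claim is false: the fraction is not monotone in t.
confident_error = [
    {"score": 0.875, "entailed": False},
    {"score": 0.625, "entailed": True},
    {"score": 0.375, "entailed": True},
    {"score": 0.125, "entailed": True},
]
r_a = partial_score(confident_error, a=0.5)
frac_at_half = entailed_fraction(back_off(confident_error, 0.5))
print(full, relaxed, r_a, frac_at_half)
```

In the second example the accepted output at threshold 0.5 meets the level $a=0.5$ , yet $r_a$ is larger than 0.5 because the output at threshold 0.75 fails it. This is one way the converse of the implication can break.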
