Content
-
Preliminaries
Background material; I did not fully follow it, but it does not seem to affect the later sections.
-
Techniques for Conditional Indistinguishability
-
Counterfactual Epistemic Operators
Introduces two operators, mainly in preparation for formalizing the fairness property.
-
Conditional Indistinguishability via Counterfactual Knowledge
How to express "Conditional Indistinguishability" using the two operators described above.
-
-
Formal Model for Statistical Classification
-
Statistical Classification Problems
Some definitions are given:
- $C: D \rightarrow L$: the classifier, where $L$ is a finite set of class labels and $D$ is the finite set of input data (called feature vectors) that we want to classify.
- $f: D \times L \rightarrow \mathbb{R}$: a scoring function that gives a score $f(v, \ell)$ for predicting the class of an input datum (feature vector) $v$ as a label $\ell$.
- $H(v) = \ell$: represents that the label $\ell$ maximizes $f(v, \ell)$.
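The definitions above can be sketched in code: a scoring function $f$ induces the classifier $H$ by taking the label with the maximum score. The labels, weights, and the linear form of $f$ below are made up for illustration; the paper leaves $f$ abstract.

```python
import numpy as np

# Hypothetical setup: three labels and 2-dimensional feature vectors.
# The linear scoring function is only an illustrative choice of f.
L = ["cat", "dog", "bird"]
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])  # one weight row per label

def f(v, ell):
    """Score f(v, ell) of predicting feature vector v as label ell."""
    return float(W[L.index(ell)] @ v)

def H(v):
    """The derived classifier H: pick a label maximizing f(v, .)."""
    return max(L, key=lambda ell: f(v, ell))

v = np.array([2.0, 1.0])
print(H(v))  # "cat": f(v, "cat") = 2.0 beats 1.0 ("dog") and 1.5 ("bird")
```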
-
Modeling the Behaviours of Classifiers
Two formulas are given:
$s \models \psi(x, y)$ iff $C(\sigma_{s}(x)) = \sigma_{s}(y)$
$s \models h(x, y)$ iff $H(\sigma_{s}(x)) = \sigma_{s}(y)$
$\psi(x, y)$ represents that $C$ classifies a given input $x$ as a class $y$.
$h(x, y)$ represents that $y$ is the actual class of an input $x$.
-
-
Formalizing the Classification Performance
-
Formalizing correctness
-
-
true positive: $s \models \psi_{\ell}(x) \wedge h_{\ell}(x)$.
-
the precision being within an interval I is given by:
$\Pr\left[v \stackrel{\$}{\leftarrow} \sigma_{w_{\mathrm{real}}}(x) : H(v)=\ell \mid C(v)=\ell\right] \in I$ or $\Pr\left[s \stackrel{\$}{\leftarrow} w_{\mathrm{real}} : s \models h_{\ell}(x) \mid s \models \psi_{\ell}(x)\right] \in I$.
-
$\mathrm{Precision}_{\ell, I}(x) \stackrel{\mathrm{def}}{=} \psi_{\ell}(x) \supset \mathbb{P}_{I}\, h_{\ell}(x)$, corresponding to $\mathrm{precision} = \frac{tp}{tp+fp}$.
-
$\mathrm{Recall}_{\ell, I}(x) \stackrel{\mathrm{def}}{=} h_{\ell}(x) \supset \mathbb{P}_{I}\, \psi_{\ell}(x)$, corresponding to $\mathrm{recall} = \frac{tp}{tp+fn}$.
-
$\mathrm{Accuracy}_{\ell, I}(x) \stackrel{\mathrm{def}}{=} \mathbb{P}_{I}(\mathrm{tp}(x) \vee \mathrm{tn}(x))$, corresponding to $\mathrm{accuracy} = \frac{TP+TN}{TP+TN+FP+FN}$.
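The frequentist counterparts of the three formulas can be sketched directly from confusion-matrix counts; the counts and the interval $I$ below are made up for illustration.

```python
# Minimal sketch of precision = tp/(tp+fp), recall = tp/(tp+fn),
# and accuracy = (tp+tn)/(tp+tn+fp+fn), with hypothetical counts.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Checking "the precision lies within the interval I", as in Precision_{l,I}:
I = (0.7, 1.0)
p = precision(tp=8, fp=2)      # 0.8
print(I[0] <= p <= I[1])       # True: the precision lies in I
```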
-
-
Formalizing the Robustness of Classifiers
- Probabilistic Robustness against Targeted Attacks
- Definition: when a robustness attack aims at misclassifying an input as a specific target label, it is called a targeted attack.
- $\mathrm{K}_{\varepsilon}^{D}\, \varphi$ represents that the classifier $C$ is confident that $\varphi$ is true as long as it classifies test data perturbed by a level $\varepsilon$ of noise.
- $D$ is defined by $D(\sigma_{w}(x) \| \sigma_{w'}(x)) = \max_{v, v'} \|v - v'\|_{p}$, where $v$ and $v'$ range over the datasets $\mathrm{supp}(\sigma_{w}(x))$ and $\mathrm{supp}(\sigma_{w'}(x))$ respectively.
- The formulas given are:
- $h_{\mathrm{panda}}(x) \supset K_{\varepsilon}^{D}\, \mathbb{P}_{0}\, \psi_{\mathrm{gibbon}}(x)$, which represents that a panda's photo $x$ will not be recognized as a gibbon at all after the photo is perturbed by noise.
- $\mathrm{TargetRobust}_{\mathrm{panda}, \delta}(x, \mathrm{gibbon}) \stackrel{\mathrm{def}}{=} K_{\varepsilon}^{D}\left(h_{\mathrm{panda}}(x) \supset \mathbb{P}_{[0, \delta]}\, \psi_{\mathrm{gibbon}}(x)\right)$.
- Probabilistic Robustness against Non-Targeted Attacks
- $\mathrm{TotalRobust}_{\ell, I}(x) \stackrel{\mathrm{def}}{=} K_{\varepsilon}^{D}\left(h_{\ell}(x) \supset \mathbb{P}_{I}\, \psi_{\ell}(x)\right) = K_{\varepsilon}^{D}\, \mathrm{Recall}_{\ell, I}(x)$.
- Conclusions:
- $\mathrm{TotalRobust}_{\mathrm{panda}, I}(x)$ implies $\mathrm{TargetRobust}_{\mathrm{panda}, \delta}(x, \mathrm{gibbon})$.
- Robustness can be regarded as recall in the presence of perturbation noise.
-
Formalizing the Fairness of Classifiers
- Notation:
- $s \models \eta_{G}(x)$ iff $\sigma_{s}(x) \in G$.
- $w \models \xi_{d}$ iff $\sigma_{w}(x) = d$.
- Group Fairness (Statistical Parity)
- Definition: the property that the output distributions of the classifier are identical for different groups.
- $\mathcal{R}_{\varepsilon} \stackrel{\mathrm{def}}{=} \left\{(w, w') \in \mathcal{W} \times \mathcal{W} \mid D(\sigma_{w}(y) \| \sigma_{w'}(y)) \leq \varepsilon\right\}$.
- $\mathfrak{M}, w \models \overline{\mathrm{P}_{\varepsilon}}\, \varphi$ iff there exists a $w'$ s.t. $(w, w') \notin \mathcal{R}_{\varepsilon}$ and $\mathfrak{M}, w' \models \varphi$.
- $\mathrm{GrpFair}(x, y) \stackrel{\mathrm{def}}{=} \left(\eta_{G_{0}}(x) \wedge \psi(x, y)\right) \supset \neg \overline{\mathrm{P}_{\varepsilon}^{\mathrm{tv}}}\, \mathbb{P}_{1}\left(\xi_{d} \wedge \eta_{G_{1}}(x) \wedge \psi(x, y)\right)$.
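An empirical check of statistical parity can be sketched with the total variation distance $D^{\mathrm{tv}}$ between the classifier's output distributions on the two groups; the group outputs and the threshold $\varepsilon$ below are made-up data.

```python
from collections import Counter

def tv_distance(pred_a, pred_b):
    """Total variation distance between two empirical label distributions."""
    ca, cb = Counter(pred_a), Counter(pred_b)
    labels = set(ca) | set(cb)
    return 0.5 * sum(abs(ca[l] / len(pred_a) - cb[l] / len(pred_b))
                     for l in labels)

group0 = ["hire", "hire", "reject", "hire"]    # classifier outputs on G0
group1 = ["hire", "reject", "reject", "hire"]  # classifier outputs on G1
eps = 0.3
# D^tv = 0.5 * (|0.75-0.5| + |0.25-0.5|) = 0.25 <= eps
print(tv_distance(group0, group1) <= eps)  # True: parity holds on this data
```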
- Individual Fairness (as Lipschitz Property)
- the property that the classifier outputs similar labels given similar inputs.
- $\mathcal{R}_{\varepsilon}^{r, D} \stackrel{\mathrm{def}}{=} \left\{(w, w') \in \mathcal{W} \times \mathcal{W} \mid D(\sigma_{w}(y) \| \sigma_{w'}(y)) \leq \varepsilon \cdot r(v, v') \text{ for } v \in \mathrm{supp}(\sigma_{w}(x)),\, v' \in \mathrm{supp}(\sigma_{w'}(x))\right\}$.
- $\mathrm{IndFair}(x, y) \stackrel{\mathrm{def}}{=} \psi(x, y) \supset \neg \overline{\mathrm{P}_{\varepsilon}^{r, D}}\, \mathbb{P}_{1}\left(\xi_{d} \wedge \psi(x, y)\right)$.
- Equal Opportunity
- the property that the recall (true positive rate) is the same for all the groups.
- $\mathrm{EqOpp}(x) \stackrel{\mathrm{def}}{=} \left(\eta_{G}(x) \wedge \psi(x, y)\right) \supset \neg \overline{\mathrm{P}_{0}^{\mathrm{tv}}}\, \mathbb{P}_{1}\left(\xi_{d} \wedge \neg \eta_{G}(x) \wedge \psi(x, y)\right)$.
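Equal opportunity can be sketched by computing the recall (true positive rate) separately for the group $G$ and its complement and comparing them; the formula uses the strict $\varepsilon = 0$ case, i.e. the two rates must coincide. The records below are made-up data.

```python
# Sketch: equal opportunity holds when recall on the positive class is the
# same inside and outside the protected group G (hypothetical data).
def group_recall(records, group):
    """records: (group, actual, predicted) triples; recall for class 'pos'."""
    pos = [rec for rec in records if rec[0] == group and rec[1] == "pos"]
    tp = sum(1 for rec in pos if rec[2] == "pos")
    return tp / len(pos)

data = [
    ("G",    "pos", "pos"), ("G",    "pos", "neg"),
    ("notG", "pos", "pos"), ("notG", "pos", "neg"),
    ("G",    "neg", "neg"), ("notG", "neg", "pos"),
]
rG, rN = group_recall(data, "G"), group_recall(data, "notG")
print(rG, rN)        # 0.5 0.5
print(rG == rN)      # True: EqOpp holds on this toy data
```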