Chapter 1.5: Decision Theory
PRML, Oxford University Deep Learning Course, Machine Learning, Pattern Recognition
Christopher M. Bishop, PRML, Chapter 1 Introduction
1. The three theories PRML builds on:
- Probability theory: provides us with a consistent mathematical framework for quantifying and manipulating uncertainty.
- Decision theory: allows us to make optimal decisions in situations involving uncertainty such as those encountered in pattern recognition.
- Information theory: quantifies how much information is carried by data and probability distributions (developed later in the book, in Section 1.6).
Inference step & Decision step
- The joint probability distribution p(x,t) provides a complete summary of the uncertainty associated with these variables. Determination of p(x,t) from a set of training data is an example of inference and is typically a very difficult problem whose solution forms the subject of much of this book.
- In a practical application, however, we must often make a specific prediction for the value of t, or more generally take a specific action based on our understanding of the values t is likely to take, and this aspect is the subject of decision theory.
2. An example
Problem Description:
Consider, for example, a medical diagnosis problem in which we have taken an X-ray image of a patient, and we wish to determine whether the patient has cancer or not.
- Representation: choose t to be a binary variable such that t = 0 corresponds to class C1 and t = 1 corresponds to class C2.
- Inference Step: The general inference problem then involves determining the joint distribution p(x, Ck), or equivalently p(x, t), which gives us the most complete probabilistic description of the situation.
- Decision Step: In the end we must decide either to give treatment to the patient or not, and we would like this choice to be optimal in some appropriate sense. This is the decision step, and it is the subject of decision theory to tell us how to make optimal decisions given the appropriate probabilities.
How to predict?
Using Bayes’ theorem, these probabilities can be expressed in the form
$$
\underbrace{p(C_k \mid x)}_{\text{Posterior}}
= \frac{\text{Likelihood} \cdot \text{Prior}}{\text{Evidence}}
= \frac{p(x \mid C_k)\, p(C_k)}{p(x)}
= \frac{p(x \mid C_k)\, p(C_k)}{\sum_{j=1}^{2} p(x \mid C_j)\, p(C_j)}
= \frac{p(x \mid C_k)\, p(C_k)}{p(x \mid C_1)\, p(C_1) + p(x \mid C_2)\, p(C_2)}
$$
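A minimal numeric sketch of this formula for the two-class cancer example. All likelihood and prior values below are illustrative assumptions, not figures from the text:

```python
# Bayes' theorem for two classes: p(C_k | x) = p(x | C_k) p(C_k) / p(x).

def posterior(likelihoods, priors):
    """Return p(C_k | x) for each class, given p(x | C_k) and p(C_k)."""
    # Evidence p(x) = sum_j p(x | C_j) p(C_j), the normalizing constant.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Assumed values: C1 = cancer (rare prior), C2 = no cancer.
# p(x | C1) = 0.9, p(x | C2) = 0.2, p(C1) = 0.01, p(C2) = 0.99.
post = posterior([0.9, 0.2], [0.01, 0.99])
print(post)  # posteriors are normalized to sum to 1
```

Note that even with a high likelihood p(x | C1), the small prior p(C1) keeps the cancer posterior modest, which is exactly why the decision step below matters.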
If our aim is to minimize the chance of assigning x to the wrong class Ck, k = 1, 2, then intuitively we would choose the class having the higher posterior probability. We now show that this intuition is correct, and we also discuss more general criteria for making decisions. Our objectives vary among the following:
- Minimizing the misclassification rate;
- Minimizing the expected loss.

Supplement: criteria for making decisions [Ref-1]
1) Minimizing the misclassification rate.
2) Minimizing the expected loss: the consequences of the two kinds of error may differ. For example, diagnosing cancer as "no cancer" is far more serious than diagnosing "no cancer" as cancer; likewise, classifying a legitimate email as spam is more costly than letting a spam email through. In such cases, avoiding the first kind of error matters more than avoiding the second, so we introduce a loss function to quantify the cost of each kind of error.
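This asymmetric-cost idea can be sketched as a minimum-expected-loss decision rule. The loss matrix entries below are illustrative assumptions chosen to make missing a cancer far more costly than a false alarm:

```python
# Loss matrix: loss[k][j] = cost of deciding class j when the truth is C_k.
# Rows/columns: index 0 = cancer, index 1 = normal (assumed convention).
loss = [[0, 1000],   # missing a cancer is assumed very costly
        [1,    0]]   # a false alarm costs comparatively little

def decide(posteriors, loss):
    """Pick the class j minimizing the expected loss sum_k loss[k][j] p(C_k | x)."""
    expected = [sum(loss[k][j] * posteriors[k] for k in range(len(posteriors)))
                for j in range(len(loss[0]))]
    return min(range(len(expected)), key=expected.__getitem__)

# Even with only a 5% cancer posterior, the asymmetric loss favours "cancer":
print(decide([0.05, 0.95], loss))  # -> 0
```

With a symmetric 0-1 loss, this rule reduces to picking the class with the highest posterior, i.e. the misclassification-rate criterion in 1) above.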
Let the set A = {