Probability theory: the difference between p(x|theta) and p(x;theta)

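While working through maximum likelihood estimation, I noticed that two notations are in use: p(x|theta) and p(x;theta).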

Why the two notations exist

p(x|theta) does not always denote a conditional probability. When it does not, it is equivalent to p(x;theta).

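In general:

A vertical bar, as in p(x|theta), denotes a conditional probability, and theta is treated as a random variable.

A semicolon, as in p(x; theta), marks theta as a parameter to be estimated (fixed, just currently unknown). The expression can be read simply as p(x); the ";" is there to indicate that the distribution involves a parameter theta. p(x; theta) means the probability that the random variable X takes the value x. In Bayesian theory this is also called the prior probability of X = x.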

The mixed form P(y|x; theta)

The two notations reflect the disagreement between the frequentist and Bayesian schools

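Frequentists regard a parameter as a fixed value: in the real world, the parameter simply is some definite number.

Bayesians regard a parameter as a random variable: it takes each possible value with some probability.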
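The contrast between the two views can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than part of the original discussion: the data are made up, the model is a simple 0/1 (Bernoulli) outcome with success probability theta, and the Bayesian side uses a uniform prior over a coarse grid of theta values.

```python
from fractions import Fraction

# Made-up 0/1 observations (illustration only).
data = [1, 0, 1, 1, 0, 1, 1, 1]
k, n = sum(data), len(data)

# Frequentist view: theta is one fixed, unknown number.
# The maximum likelihood estimate of a Bernoulli parameter is k / n.
theta_mle = Fraction(k, n)

# Bayesian view: theta is a random variable with its own distribution.
# Assumed prior: uniform over the grid 0, 0.1, ..., 1.
grid = [Fraction(i, 10) for i in range(11)]
prior = {t: Fraction(1, len(grid)) for t in grid}

# Posterior p(theta | data) is proportional to p(data | theta) * p(theta).
unnormalized = {t: prior[t] * t**k * (1 - t) ** (n - k) for t in grid}
evidence = sum(unnormalized.values())
posterior = {t: w / evidence for t, w in unnormalized.items()}

print("point estimate:", theta_mle)
for t in grid:
    print(f"p(theta={float(t):.1f} | data) = {float(posterior[t]):.4f}")
```

In the first half theta only ever appears as a single number, which is what p(x; theta) expresses. In the second half theta carries probabilities of its own, so conditioning on it with a bar, p(x | theta), is meaningful in the strict sense.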

I believe the origin of this is the likelihood paradigm (though I have not checked the actual historical correctness of the below, it is a reasonable way of understanding how it came to be).

Let's say that in a regression setting you have a distribution p(Y | x, beta), which means: the distribution of Y if you know (that is, conditional on) the x and beta values.

If you want to estimate the betas, you maximize the likelihood: L(beta; y, x) = p(Y | x, beta). Essentially, you are now looking at the expression p(Y | x, beta) as a function of the betas, but apart from that there is no difference (for mathematically correct expressions that you can properly derive, this distinction is a necessity, although in practice no one bothers).
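To make "the same expression, read as a function of beta" concrete, here is a minimal sketch. The specifics are assumptions for illustration and are not from the quoted answer: a single coefficient with no intercept, Gaussian noise with known standard deviation 1, three made-up data points, and a crude grid search instead of a proper optimizer.

```python
import math

def density(y, x, beta, sigma=1.0):
    """p(y | x, beta): normal density of y with mean beta * x (assumed model)."""
    z = (y - beta * x) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def likelihood(beta, ys, xs):
    """L(beta; y, x): the same densities, now viewed as a function of beta."""
    return math.prod(density(y, x, beta) for y, x in zip(ys, xs))

# Made-up observations (illustration only).
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

# Crude grid search over beta in [0, 4] with step 0.01.
grid = [i / 100 for i in range(401)]
beta_hat = max(grid, key=lambda b: likelihood(b, ys, xs))

# For this model the maximizer has a closed form: sum(x*y) / sum(x*x).
closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(beta_hat, closed_form)
```

Note that `density` and `likelihood` evaluate exactly the same formula. The only change is which argument is held fixed (the data) and which is varied (beta).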

Then, in Bayesian settings, the difference between parameters and other variables soon fades, so people started to use both notations interchangeably.

So, in essence: there is no actual difference: they both indicate the conditional distribution of the thing on the left, conditional on the thing(s) on the right.

Example:

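P(y=1|x;θ) is the probability that y=1 given x and θ; the semicolon sets the parameter apart from the variable being conditioned on.
For p(x;θ), the meaning is: when the parameter Θ takes the value θ, the probability that X=x. In other words, p(x;θ) is the distribution (or probability density) of x, which depends on the parameter.
p(x;2) is the probability that X=x when the parameter equals 2.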
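For instance:
There are 10 balls, θ of them labeled 1 and 10-θ labeled 0.
Draw one ball at random, and let x be its label. Then
p(x|θ) = x·θ/10 + (1-x)·(10-θ)/10
that is, for x=1, p = θ/10
and for x=0, p = (10-θ)/10
As θ changes, the probability of the same value of x changes with it.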
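The ball example can be checked numerically with a short Python sketch. The function name `p` and the `n_balls` argument are hypothetical helpers introduced only for this illustration; the formula itself is the one given above.

```python
from fractions import Fraction

def p(x, theta, n_balls=10):
    """p(x; theta): probability of drawing a ball labeled x (0 or 1)
    when theta of the n_balls balls are labeled 1."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0 <= theta <= n_balls:
        raise ValueError("theta must be between 0 and n_balls")
    # Same formula as in the text: x*theta/10 + (1-x)*(10-theta)/10
    return Fraction(x * theta + (1 - x) * (n_balls - theta), n_balls)

# The same x gets a different probability as theta changes.
for theta in (2, 5, 8):
    print(f"theta={theta}: p(1)={p(1, theta)}, p(0)={p(0, theta)}")
```

For each θ the two probabilities sum to 1, and p(1; θ) grows from 1/5 to 1/2 to 4/5 as θ goes from 2 to 5 to 8, which is the dependence on θ described above.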