时间:2020-12-02
目的:掌握图像分割的基础算法
1. 算法原理
图像分割,顾名思义将图像中的目标和背景进行区分。通常我们使用固定的阈值进行二值化,但是阈值如何迭代筛选。这个过程可不可以使用程序帮我们迭代呢?迭代条件又是什么呢?
大津算法,其根据直方图(假定前景和背景在直方图上呈现出两峰的情况),计算能将两类分开的最佳阈值(前景和背景类间方差最大),然后根据求得的最佳阈值对图像进行全局二值化。大津算法满足我们前述的疑问,只要找到一个直方图基本符合双峰情况,且存在满足类间方差最大的灰阶即可。
方差(英语:Variance),应用数学里的专有名词。在概率论和统计学中,一个随机变量的方差描述的是它的离散程度,也就是该变量离其期望值的距离。一个实随机变量的方差也称为它的二阶矩或二阶中心矩,恰巧也是它的二阶累积量。这里把复杂说白了,就是将各个误差之平方(而非取绝对值,使之肯定为正数),相加之后再除以总数,透过这样的方式来算出各个数据分布、零散(相对中心点)的程度。
首先,对后续的公式变量进行说明。
ω 0 \omega_{0} ω0:前景类概率, ω 1 \omega_{1} ω1:背景类概率
μ 0 \mu_{0} μ0:前景类均值, μ 1 \mu_{1} μ1:背景类均值
σ w \sigma_{w} σw:类方差, t t t:直方图阈值
L L L:灰阶数( 0 ∼ 255 0\sim255 0∼255)
首先,定义两个类的方差加权和:
σ
w
2
(
t
)
=
ω
0
(
t
)
σ
0
2
(
t
)
+
ω
1
(
t
)
σ
1
2
(
t
)
\sigma_{w}^{2}(t)=\omega_{0}(t) \sigma_{0}^{2}(t)+\omega_{1}(t) \sigma_{1}^{2}(t)
σw2(t)=ω0(t)σ02(t)+ω1(t)σ12(t)
然后,给出前景的类概率,即前景像素点个数:
ω
0
(
t
)
=
∑
i
=
0
t
−
1
p
(
i
)
\omega_{0}(t)=\sum_{i=0}^{t-1} p(i)
ω0(t)=i=0∑t−1p(i)
然后,给出背景的类概率,即背景像素点个数:
ω
1
(
t
)
=
∑
i
=
t
L
−
1
p
(
i
)
\omega_{1}(t)=\sum_{i=t}^{L-1} p(i)
ω1(t)=i=t∑L−1p(i)
将公式(2~3)代入公式(1)有类间方差公式:
σ
b
2
(
t
)
=
σ
2
−
σ
w
2
(
t
)
=
ω
0
(
μ
0
−
μ
T
)
2
+
ω
1
(
μ
1
−
μ
T
)
2
=
ω
0
(
t
)
ω
1
(
t
)
[
μ
0
(
t
)
−
μ
1
(
t
)
]
2
\begin{aligned} \sigma_{b}^{2}(t) &=\sigma^{2}-\sigma_{w}^{2}(t)=\omega_{0}\left(\mu_{0}-\mu_{T}\right)^{2}+\omega_{1}\left(\mu_{1}-\mu_{T}\right)^{2} \\ &=\omega_{0}(t) \omega_{1}(t)\left[\mu_{0}(t)-\mu_{1}(t)\right]^{2} \end{aligned}
σb2(t)=σ2−σw2(t)=ω0(μ0−μT)2+ω1(μ1−μT)2=ω0(t)ω1(t)[μ0(t)−μ1(t)]2
最后,补充类均值计算公式,即前景(背景或者全局)对应像素点的灰度均值:
μ
0
(
t
)
=
∑
i
=
0
t
−
1
i
p
(
i
)
ω
0
(
t
)
μ
1
(
t
)
=
∑
i
=
t
L
−
1
i
p
(
i
)
ω
1
(
t
)
\begin{aligned} \mu_{0}(t) &=\frac{\sum_{i=0}^{t-1} i p(i)}{\omega_{0}(t)} \\ \mu_{1}(t) &=\frac{\sum_{i=t}^{L-1} i p(i)}{\omega_{1}(t)} \end{aligned}
μ0(t)μ1(t)=ω0(t)∑i=0t−1ip(i)=ω1(t)∑i=tL−1ip(i)
需要注意,灰阶:
L
:
0
∼
255
L:{0\sim 255}
L:0∼255 不一定是固定在该范围,需要根据实际情况进行迭代。典型诸如图像所在全局范围为:
50
∼
220
50\sim 220
50∼220,则循环时要注意换算到
0
∼
255
0\sim 255
0∼255。
2. 算法流程
- 计算每个强度级的直方图和概率
- 设置 ω i , μ i \omega_{i}, \mu_{i} ωi,μi 的初始值;
- 遍历所有可能的阈值(完整灰阶
L
:
t
=
m
i
n
∼
m
a
x
L: t = min \sim max
L:t=min∼max)
- 更新 ω i , μ i \omega_{i}, \mu_{i} ωi,μi;
- 计算 σ b 2 ( t ) \sigma_{b}^{2}(t) σb2(t);
- 所需的阈值对应于最大的 σ b 2 ( t ) \sigma_{b}^{2}(t) σb2(t) 的灰阶 t t t,即最终的阈值;
- 如果算力允许,可以求出最大和次大的 σ b 2 ( t ) \sigma_{b}^{2}(t) σb2(t) 对应的 t m a x 、 t m a x − 1 t_{max}、t_{max-1} tmax、tmax−1;
- 然后再取两者的均值即可。
3. 代码实现
代码参考:
WIKE Code
typedef struct img_size_s {
unsigned int width;
unsigned int height;
unsigned int channels;
unsigned int stride;
} img_size_t;
/**
* Cal image histogram
* \param: input image.
* \param: image's hist.
* \param: input image's param.
*/
int calImageHist(
unsigned char *src,
unsigned int *hist,
img_size_t *src_para
) {
// param and buffer check.
if (!src || !hist || (src_para->width == 0 || src_para->height == 0)) {
return -1;
}
long int img_pixel_num = src_para->height * src_para->width;
int i = 0;
// init.
for (i = 0; i < MAX_INTENSITY; ++i)
hist[i] = 0;
if (src_para->channels == 3) {
unsigned char *y_src = (unsigned char *)malloc(img_pixel_num * sizeof(unsigned char));
img_size_t y_para;
y_para.channels = 1;
y_para.width = src_para->width;
y_para.height = src_para->height;
y_para.stride = src_para->width;
// BGR: colorStyle = 1; RGB: colorStyle = 2.
rgb2Y(src, y_src, src_para, &y_para, 1);
for (i = 0; i < img_pixel_num; ++i) {
hist[y_src[i]]++;
}
free(y_src);
} else if (src_para->channels == 1) {
// normal process.
for (i = 0; i < img_pixel_num; ++i) {
hist[src[i]]++;
}
} else {
return -1;
}
return 0;
}
/**
* Binary image by threshold.
* \param: OTSU generate a std threshold.
*/
int segmentImg(
unsigned char *src,
unsigned char *dst,
img_size_t *src_para,
unsigned char *threshold
) {
long int num_pixels = src_para->width * src_para->height;
if (src_para->channels != 1 && src_para->channels != 3) {
return -1;
}
if (src_para->channels == 3) {
img_size_t dst_para;
dst_para.channels = 1;
dst_para.height = src_para->height;
dst_para.width = src_para->width;
dst_para.stride = src_para->width;
// convert rgb to Y.
rgb2Y(src, dst, src_para, &dst_para, 1);
for (int i = 0; i < num_pixels; ++i) {
if (dst[i] > *threshold) {
dst[i] = MAX_INTENSITY;
} else {
dst[i] = 0;
}
}
} else {
for (int i = 0; i < num_pixels; ++i) {
if (src[i] > *threshold) {
dst[i] = MAX_INTENSITY;
} else {
dst[i] = 0;
}
}
}
return 0;
}
/**
* OTSU algorithm to cal threshold.
* foreground: w1(t) = \sum_{i}^{t}{p(i)}
* background: w2(t) = \sum_{i}^{t}{p(i)} = img_w * img_h - w1(t)
* mean value of foreground: u1(t) = [\sum_{i}^{t}{p(i)} * i] / w1
* mean value of background: u2(t) = [\sum_{i}^{t}{p(i)} * i] / w2
* delta: g = w1 * w2 * (u1 - u2) ^ 2
* iter i, ensure g is max.
*/
int calOtsuSegmentation(
unsigned char *src,
unsigned char *dst,
unsigned int *hist,
img_size_t *src_para,
unsigned char *threshold
) {
long int num_pixels = src_para->height * src_para->width;
int th = 0; // iter threshold.
if (*threshold != 0) {
th = *threshold;
}
// init param: w1, w2, u1, u2
float sum = 0;
float sumF = 0;
int w1 = 0;
int w2 = 0;
float varMax = 0;
for (int i = 0; i <= MAX_INTENSITY; ++i) {
sum += i * ((int)hist[i]);
}
for (int i = 0; i <= MAX_INTENSITY; ++i) {
w1 += hist[i]; // cal w1(t)
if (w1 == 0)
continue;
w2 = num_pixels - w1; // cal w2(t)
if (w2 == 0)
break;
sumF += (float)(i * ((int)hist[i]));
float u1 = sumF / w1; // cal u1(t)
float u2 = (sum - sumF) / w2; // cal u2(t)
float varDelta = ((float)w1 * (float)w2 * (u1 - u2) * (u1 - u2));
if (varDelta > varMax) {
varMax = varDelta;
*threshold = i;
}
}
segmentImg(src, dst, src_para, threshold);
return 0;
}