Image Algorithms, Part 9: Gaussian Mixture Model (GMM)

Latest recommended article published on 2022-08-23 11:44:01

少林达摩祖师

Category: Machine Learning

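I. Principle
Mixture-of-Gaussians background modeling represents the background through per-pixel sample statistics. It describes the background with statistics of each pixel's probability density, estimated from a large number of samples collected over a long period (for example the number of modes and the mean and standard deviation of each mode), and then classifies target pixels with a statistical difference test such as the 3σ rule. It can model complex, dynamic backgrounds, at the cost of a relatively heavy computational load.
The mixture-of-Gaussians background model assumes that the color information of different pixels is uncorrelated, so every pixel is processed independently. For each pixel in the video, the change of its value over the image sequence is treated as a random process that keeps producing pixel values. In other words, Gaussian distributions describe how the color of each pixel behaves, either with a single mode (unimodal) or with several modes (multimodal).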
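The unimodal case with the 3σ test can be sketched in a few lines. This is a minimal illustration for one grayscale pixel, not the author's or OpenCV's implementation; the names `PixelGaussian`, `isBackground3Sigma` and `updateModel`, and the running-average update rule with learning rate `alpha`, are assumptions introduced here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-pixel model: a single Gaussian over a grayscale intensity.
struct PixelGaussian {
    double mean;
    double variance;
};

// 3-sigma rule: the sample counts as background if it lies within
// three standard deviations of the modeled mean.
inline bool isBackground3Sigma(const PixelGaussian& g, double x) {
    return std::fabs(x - g.mean) <= 3.0 * std::sqrt(g.variance);
}

// Assumed running-average update: blend the new sample into the mean and
// variance with learning rate alpha in [0, 1].
inline void updateModel(PixelGaussian& g, double x, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0);
    const double d = x - g.mean;
    g.mean += alpha * d;
    g.variance = (1.0 - alpha) * g.variance + alpha * d * d;
}
```

A single Gaussian like this only handles one background appearance per pixel; the multimodal model keeps several such Gaussians per pixel.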

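In the multimodal Gaussian model, every pixel is modeled as a weighted superposition of several Gaussian distributions. Each Gaussian corresponds to one state that could produce the color observed at that pixel, and the weights and distribution parameters of the Gaussians are updated over time. For color images, the R, G and B channels of a pixel are assumed to be mutually independent and to share the same variance. Given the observations {x1, x2, …, xN} of a random variable X, where xt = (rt, gt, bt) is the pixel sample at time t, a single sample xt follows the mixture-of-Gaussians probability density function:

$$P(x_t) = \sum_{i=1}^{k} \omega_{i,t}\,\eta(x_t, \mu_{i,t}, \tau_{i,t})$$

$$\eta(x_t, \mu_{i,t}, \tau_{i,t}) = \frac{1}{(2\pi)^{3/2}\,|\tau_{i,t}|^{1/2}} \exp\left(-\frac{1}{2}(x_t-\mu_{i,t})^{T}\,\tau_{i,t}^{-1}\,(x_t-\mu_{i,t})\right), \qquad \tau_{i,t} = \delta_{i,t}^{2} I$$

Here k is the total number of modes, η(xt, μi,t, τi,t) is the i-th Gaussian distribution at time t, μi,t is its mean, τi,t is its covariance matrix, δi,t² is its variance, I is the 3×3 identity matrix, and ωi,t is the weight of the i-th Gaussian at time t.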
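To make the density concrete, here is a small sketch that evaluates the mixture for one RGB sample, assuming the isotropic covariance (one shared variance for R, G and B) described above. The type and function names (`Rgb`, `Component`, `gaussianDensity`, `mixtureDensity`) are illustrative and do not come from any library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Rgb = std::array<double, 3>;

// One mode of the per-pixel mixture (names are illustrative).
struct Component {
    double weight;    // omega_{i,t}
    Rgb mean;         // mu_{i,t}
    double variance;  // delta_{i,t}^2, shared by the three channels
};

// eta(x; mu, delta^2 I) for a three-dimensional sample.
inline double gaussianDensity(const Rgb& x, const Component& c) {
    assert(c.variance > 0.0);
    const double pi = 3.14159265358979323846;
    double dist2 = 0.0;  // squared Euclidean distance to the mean
    for (int ch = 0; ch < 3; ++ch) {
        const double d = x[ch] - c.mean[ch];
        dist2 += d * d;
    }
    // (2*pi)^{3/2} * |delta^2 I|^{1/2} equals (2*pi*delta^2)^{3/2}.
    const double norm = std::pow(2.0 * pi * c.variance, -1.5);
    return norm * std::exp(-0.5 * dist2 / c.variance);
}

// P(x) = sum_i omega_i * eta(x; mu_i, delta_i^2 I).
inline double mixtureDensity(const Rgb& x, const std::vector<Component>& mix) {
    double p = 0.0;
    for (const Component& c : mix) p += c.weight * gaussianDensity(x, c);
    return p;
}
```

Because the covariance is a scalar multiple of the identity, no matrix inversion is needed: the exponent reduces to the squared Euclidean distance divided by the variance.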

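Detailed algorithm:

Each GMM consists of k Gaussian distributions, each of which is called a "component". These components are combined linearly to form the probability density function of the GMM:

$$p(x) = \sum_{i=1}^{k} \omega_i\,\eta(x, \mu_i, \tau_i), \qquad \omega_i \ge 0, \quad \sum_{i=1}^{k} \omega_i = 1$$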

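According to the formula above, drawing a random point from the GMM can be split into two steps. First, pick one of the k components at random, where the probability of picking a component is simply its coefficient (its weight). Then draw a point from that component's own distribution. At this stage we are back to an ordinary Gaussian distribution, which is a problem we already know how to solve.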
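The two-step sampling procedure can be sketched with the C++ standard library as follows. This is a one-dimensional illustration; `Component1D` and `sampleGmm` are names made up for this example.

```cpp
#include <cassert>
#include <random>
#include <vector>

// One-dimensional component (illustrative name).
struct Component1D {
    double weight;
    double mean;
    double stddev;
};

// Draw one sample from the mixture in two steps.
inline double sampleGmm(const std::vector<Component1D>& mix, std::mt19937& rng) {
    assert(!mix.empty());
    // Step 1: choose a component with probability proportional to its weight.
    std::vector<double> weights;
    for (const Component1D& c : mix) weights.push_back(c.weight);
    std::discrete_distribution<int> pick(weights.begin(), weights.end());
    const Component1D& chosen = mix[pick(rng)];
    // Step 2: sample from the chosen component's ordinary Gaussian.
    std::normal_distribution<double> gauss(chosen.mean, chosen.stddev);
    return gauss(rng);
}
```

With two well-separated components, the fraction of samples landing near each mean approaches that component's weight as the number of draws grows.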

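So how can a GMM be used for clustering? It is quite simple: we have the data and assume it was generated by a GMM, so all we need to do is infer the GMM's probability distribution from the data. The k components of the GMM then correspond to k clusters. Inferring a probability density from data is usually called density estimation. In particular, when the form of the probability density function is known (or assumed) and only its parameters have to be estimated, the process is called "parameter estimation".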
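The text does not name an estimation method. A common choice for fitting a GMM is expectation-maximization (EM), and the sketch below shows one EM iteration for a one-dimensional mixture under that assumption. `Gauss1D`, `normalPdf`, `emStep` and `clusterOf` are hypothetical names, and the variance floor of 1e-6 is an arbitrary safeguard.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Gauss1D {
    double weight;
    double mean;
    double variance;
};

inline double normalPdf(double x, double mean, double variance) {
    const double pi = 3.14159265358979323846;
    const double d = x - mean;
    return std::exp(-0.5 * d * d / variance) / std::sqrt(2.0 * pi * variance);
}

// One EM iteration. Updates mix in place and returns the responsibilities
// resp[j][i] = P(component i | data[j]) computed in the E-step.
inline std::vector<std::vector<double>> emStep(const std::vector<double>& data,
                                               std::vector<Gauss1D>& mix) {
    assert(!data.empty() && !mix.empty());
    const std::size_t n = data.size(), k = mix.size();
    std::vector<std::vector<double>> resp(n, std::vector<double>(k, 0.0));
    // E-step: posterior probability of each component for each sample.
    for (std::size_t j = 0; j < n; ++j) {
        double total = 0.0;
        for (std::size_t i = 0; i < k; ++i) {
            resp[j][i] = mix[i].weight * normalPdf(data[j], mix[i].mean, mix[i].variance);
            total += resp[j][i];
        }
        for (std::size_t i = 0; i < k; ++i)
            resp[j][i] = total > 0.0 ? resp[j][i] / total : 1.0 / k;
    }
    // M-step: re-estimate weight, mean and variance from the responsibilities.
    for (std::size_t i = 0; i < k; ++i) {
        double mass = 0.0, sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            mass += resp[j][i];
            sum += resp[j][i] * data[j];
        }
        if (mass <= 1e-12) continue;  // component owns no data; leave it unchanged
        const double mean = sum / mass;
        double sq = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            const double d = data[j] - mean;
            sq += resp[j][i] * d * d;
        }
        mix[i].weight = mass / n;
        mix[i].mean = mean;
        mix[i].variance = std::max(sq / mass, 1e-6);  // floor avoids a collapsed variance
    }
    return resp;
}

// Hard cluster label: the component with the largest responsibility.
inline std::size_t clusterOf(const std::vector<double>& responsibilities) {
    return static_cast<std::size_t>(
        std::max_element(responsibilities.begin(), responsibilities.end()) -
        responsibilities.begin());
}
```

Calling `emStep` repeatedly until the parameters stop changing gives the fitted mixture, and `clusterOf` turns the soft responsibilities into cluster labels.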

II. OpenCV Implementation
OpenCV implements two versions of the Gaussian Mixture-based Background/Foreground Segmentation Algorithm. The calling interface is straightforward and the results are good.

BackgroundSubtractorMOG usage example

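#include <opencv2/opencv.hpp>
using namespace cv;

int main(){
    VideoCapture video("1.avi");
    if(!video.isOpened()) return -1;
    Mat frame,mask;
    int frameNum = 0;  // frame counter (was used without being declared)
    // OpenCV 2.4 API: history=20, nmixtures=10, backgroundRatio=0.5, noiseSigma=0
    BackgroundSubtractorMOG bgSubtractor(20,10,0.5,0);
    while(video.read(frame)){  // stop when the video ends
        ++frameNum;
        bgSubtractor(frame,mask,0.001);  // learningRate = 0.001
        imshow("mask",mask);
        waitKey(10);
    }
    return 0;
}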
Either the default constructor or the constructor with parameters can be used:

BackgroundSubtractorMOG::BackgroundSubtractorMOG()
BackgroundSubtractorMOG::BackgroundSubtractorMOG(int history, int nmixtures,
double backgroundRatio, double noiseSigma=0)
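Here history is the number of history frames used, nmixtures is the number of Gaussians in the mixture, backgroundRatio is the background ratio, and noiseSigma is the noise strength.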
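One common reading of a "background ratio" comes from the classic Stauffer and Grimson rule: sort a pixel's modes by weight divided by standard deviation, then treat the first modes whose cumulative weight exceeds the ratio as background. The sketch below illustrates that rule only; it is an assumption about what the parameter means, not a copy of OpenCV's code, and `Mode` and `countBackgroundModes` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One mode of a per-pixel mixture (illustrative name).
struct Mode {
    double weight;
    double variance;
};

// Sorts modes by weight / sigma (descending) and returns how many leading
// modes are needed for the cumulative weight to exceed backgroundRatio.
inline std::size_t countBackgroundModes(std::vector<Mode>& modes, double backgroundRatio) {
    assert(backgroundRatio > 0.0 && backgroundRatio <= 1.0);
    std::sort(modes.begin(), modes.end(), [](const Mode& a, const Mode& b) {
        return a.weight / std::sqrt(a.variance) > b.weight / std::sqrt(b.variance);
    });
    double cumulative = 0.0;
    for (std::size_t i = 0; i < modes.size(); ++i) {
        cumulative += modes[i].weight;
        if (cumulative > backgroundRatio) return i + 1;
    }
    return modes.size();
}
```

Under this rule, a larger ratio admits more modes into the background, which helps with multimodal backgrounds such as swaying leaves but makes it easier for a slow-moving object to be absorbed.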
The only calling interface is the overloaded operator():

void BackgroundSubtractorMOG::operator()(InputArray image, OutputArray fgmask, double learningRate=0)
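Here image is the current frame, fgmask is the output foreground mask, and learningRate is the background learning rate.
Below is a screenshot of foreground/background detection with BackgroundSubtractorMOG.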

BackgroundSubtractorMOG2 usage example

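#include <iostream>
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main(){
    VideoCapture video("1.avi");
    if(!video.isOpened()) return -1;
    Mat frame,mask;
    int frameNum = 0;  // frame counter (was used without being declared)
    // OpenCV 2.4 API: history=20, varThreshold=16, shadow detection enabled
    BackgroundSubtractorMOG2 bgSubtractor(20,16,true);

    while(video.read(frame)){  // stop when the video ends
        ++frameNum;
        bgSubtractor(frame,mask,0.001);
        cout<<frameNum<<endl;
        //imshow("mask",mask);
        //waitKey(10);
    }
    return 0;
}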

Likewise, either the default constructor or the constructor with parameters can be used:

BackgroundSubtractorMOG2::BackgroundSubtractorMOG2()
BackgroundSubtractorMOG2::BackgroundSubtractorMOG2(int history,
float varThreshold, bool bShadowDetection=true )
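history is the same as above. varThreshold is the threshold on the squared Mahalanobis distance used to decide whether a pixel belongs to the background (this value does not affect the background update rate). bShadowDetection enables shadow detection (when enabled, shadows are marked with the value 127 in the mask).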
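As an illustration of the varThreshold test, the sketch below checks a pixel against one background mode using the squared Mahalanobis distance, assuming an isotropic covariance (one variance shared by all three channels). `matchesBackground` is a hypothetical helper, not an OpenCV function.

```cpp
#include <array>
#include <cassert>

// With covariance variance * I, the squared Mahalanobis distance is the
// squared Euclidean distance divided by the variance. The pixel matches the
// mode when that distance is below varThreshold.
inline bool matchesBackground(const std::array<double, 3>& pixel,
                              const std::array<double, 3>& mean,
                              double variance, double varThreshold) {
    assert(variance > 0.0);
    double dist2 = 0.0;
    for (int ch = 0; ch < 3; ++ch) {
        const double d = pixel[ch] - mean[ch];
        dist2 += d * d;
    }
    // Equivalent to dist2 / variance < varThreshold, without the division.
    return dist2 < varThreshold * variance;
}
```

With varThreshold = 16, a pixel matches only if it lies within 4 standard deviations of the mode's mean, which is where the value 16 = 4*4 comes from.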
Per-frame detection is invoked through the overloaded operator():

void BackgroundSubtractorMOG2::operator()(InputArray image, OutputArray fgmask, double learningRate=-1)
The parameters have the same meaning as in BackgroundSubtractorMOG's operator().
BackgroundSubtractorMOG2 also provides getBackgroundImage() to return the background image:

void BackgroundSubtractorMOG2::getBackgroundImage(OutputArray backgroundImage)
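The OpenCV reference manual also says that other model-related parameters can be modified after the object is created. The catch is that OpenCV declares these parameters as protected and provides no accessors, so changing them requires modifying the source file to add accessors yourself.
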
protected:
    Size frameSize;
    int frameType;
    Mat bgmodel;
    Mat bgmodelUsedModes;//keep track of number of modes per pixel
    int nframes;
    int history;
    int nmixtures;
    //! here it is the maximum allowed number of mixture components.
    //! Actual number is determined dynamically per pixel
    double varThreshold;
    // threshold on the squared Mahalanobis distance to decide if it is well described
    // by the background model or not. Related to Cthr from the paper.
    // This does not influence the update of the background. A typical value could be 4 sigma
    // and that is varThreshold=4*4=16; Corresponds to Tb in the paper.
    /////////////////////////
    // less important parameters - things you might change but be careful

    float backgroundRatio;
    // corresponds to fTB=1-cf from the paper
    // TB - threshold when the component becomes significant enough to be included into
    // the background model. It is the TB=1-cf from the paper. So I use cf=0.1 => TB=0.9
    // For alpha=0.001 it means that the mode should exist for approximately 105 frames before
    // it is considered foreground
    // float noiseSigma;
    float varThresholdGen;
    //corresponds to Tg - threshold on the squared Mahalan. dist. to decide
    //when a sample is close to the existing components. If it is not close
    //to any a new component will be generated. I use 3 sigma => Tg=3*3=9.
    //Smaller Tg leads to more generated components and higher Tg might
    //lead to a small number of components but they can grow too large
    float fVarInit;
    float fVarMin;
    float fVarMax;
    //initial variance  for the newly generated components.
    //It will influence the speed of adaptation. A good guess should be made.
    //A simple way is to estimate the typical standard deviation from the images.
    //I used here 10 as a reasonable value
    // min and max can be used to further control the variance
    float fCT;//CT - complexity reduction prior
    //this is related to the number of samples needed to accept that a component
    //actually exists. We use CT=0.05 of all the samples. By setting CT=0 you get
    //the standard Stauffer&Grimson algorithm (maybe not exact but very similar)
    //shadow detection parameters
    bool bShadowDetection;//default 1 - do shadow detection
    unsigned char nShadowDetection;//do shadow detection - insert this value as the detection result - 127 default value
    float fTau;
    // Tau - shadow threshold. The shadow is detected if the pixel is darker
    //version of the background. Tau is a threshold on how much darker the shadow can be.
    //Tau= 0.5 means that if pixel is more than 2 times darker then it is not shadow
    //See: Prati,Mikic,Trivedi,Cucchiarra,"Detecting Moving Shadows...",IEEE PAMI,2003.
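The Tau comment above can be illustrated with a simplified shadow test: a pixel looks like a shadow if it is a darker, similarly colored version of the background. This is a rough sketch under stated assumptions, not OpenCV's actual check. `looksLikeShadow` and its `maxColorDist2` parameter (a tolerance on the remaining color difference) are hypothetical.

```cpp
#include <array>
#include <cassert>

// Simplified shadow test for one RGB pixel against one background color.
inline bool looksLikeShadow(const std::array<double, 3>& pixel,
                            const std::array<double, 3>& background,
                            double tau, double maxColorDist2) {
    assert(tau > 0.0 && tau < 1.0);
    double num = 0.0, den = 0.0;
    for (int ch = 0; ch < 3; ++ch) {
        num += pixel[ch] * background[ch];
        den += background[ch] * background[ch];
    }
    if (den <= 0.0) return false;  // black background: ratio undefined
    // Brightness ratio: the scale a that best maps the background onto the pixel.
    const double a = num / den;
    // Tau bounds how much darker a shadow may be; brighter pixels are not shadows.
    if (a < tau || a > 1.0) return false;
    // The color left after removing the brightness change must be small.
    double dist2 = 0.0;
    for (int ch = 0; ch < 3; ++ch) {
        const double d = pixel[ch] - a * background[ch];
        dist2 += d * d;
    }
    return dist2 <= maxColorDist2;
}
```

With tau = 0.5, a pixel at 70% of the background brightness passes the brightness check, while one at 40% is rejected as too dark to be a shadow.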
Below are the foreground and background detected with BackgroundSubtractorMOG2:

Tracking vehicles with GMM foreground extraction:
#include <opencv2/opencv.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/legacy/legacy.hpp>

using namespace cv;
int main()
{
   VideoCapture capture("parking.avi");
   if (!capture.isOpened())
       return 0;
   Mat frame;
   Mat foreground;
   // Delay between frames in ms; fall back to 40 ms if the FPS is unknown
   double fps = capture.get(CV_CAP_PROP_FPS);
   int delay = fps > 0 ? cvRound(1000. / fps) : 40;
   namedWindow("Original Video");
   namedWindow("Extracted Foreground");
   BackgroundSubtractorMOG mog;  // default parameters (OpenCV 2.4 API)
   bool stop(false);
   while (!stop)
   {
       if (!capture.read(frame))
           break;
       imshow("Original Video", frame);
       // Update the background model and get the foreground mask
       mog(frame, foreground, 0.01);
       // Invert the mask
       threshold(foreground, foreground, 128, 255, THRESH_BINARY_INV);
       imshow("Extracted Foreground",foreground);
       // Any key press stops playback
       if (waitKey(delay) >= 0)
           stop = true;
   }
   return 0;
}
Result: