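Preface
The official EMGU documentation includes several machine learning examples. However, because EMGU tracks OpenCV as it is upgraded, several methods used by the EM example on the official page Expectation-Maximization in CSharp have since been rewritten or deprecated, so that example cannot be reproduced with newer versions of EMGU. This article follows the EM algorithm sample from OpenCV, modifies parts of the source code, and has been tested successfully.
I. Test environment
- EMGU version: 3.4.1
- .NET Framework: 4.7.1
II. Steps
1. Import the libraries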
using Emgu.CV;
using Emgu.CV.ML;
using Emgu.CV.Structure;
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
2. Create a console application
Base method:
/// <summary>
/// Machine learning with the EM (expectation-maximization) algorithm: basic example
/// </summary>
private static void EmBase()
{
    int N = 4; // number of clusters
    int N1 = (int)Math.Sqrt(N); // clusters per row/column of the grid of centres
    // Define four colours, one per cluster
    Bgr[] colors = new Bgr[] {
        new Bgr(0, 0, 255),
        new Bgr(0, 255, 0),
        new Bgr(255, 255, 0),
        new Bgr(255, 0, 255)};
    int nSamples = 100; // 100 sample points
    Matrix<float> samples = new Matrix<float>(nSamples, 2); // sample matrix: 100 rows x 2 columns, i.e. 100 coordinate points
    Matrix<Int32> labels = new Matrix<int>(nSamples, 1); // cluster labels (an output, not known in advance)
    Image<Bgr, Byte> img = new Image<Bgr, byte>(500, 500); // test data: every pixel coordinate is one point to classify
    Matrix<float> sample = new Matrix<float>(1, 2); // a single point to classify
    Matrix<float> means0 = new Matrix<float>(N, 2); // initial means
    Matrix<float> probs0 = new Matrix<float>(nSamples, 1); // output matrix holding the posterior probabilities of the latent variables
    CvInvoke.cvReshape(samples.Ptr, samples.Ptr, 2, 0);
    // Generate the samples for the four clusters: 100 samples in total, 25 per cluster
    for (int i = 0; i < N; i++)
    {
        Matrix<float> rows = samples.GetRows(i * nSamples / N, (i + 1) * nSamples / N, 1);
        double scaleX = ((i % N1) + 1.0) / (N1 + 1);
        double scaleY = ((i / N1) + 1.0) / (N1 + 1);
        // Set the mean
        MCvScalar mean = new MCvScalar(scaleX * img.Width, scaleY * img.Height);
        Console.WriteLine($"mean = {mean.V0}");
        // Set the standard deviation
        MCvScalar sigma = new MCvScalar(30, 30);
        Console.WriteLine($"sigma = {sigma.V0}");
        ulong seed = (ulong)DateTime.Now.Ticks; // only printed; it is not passed to Randn
        Console.WriteLine($"seed = {seed}");
        // Randomly generate 25 normally distributed points from the mean and standard deviation
        CvInvoke.Randn(rows, mean, sigma);
    }
    CvInvoke.cvReshape(samples.Ptr, samples.Ptr, 1, 0);
    using (EM emModel1 = new EM())
    {
        emModel1.ClustersNumber = N;
        emModel1.CovarianceMatrixType = EM.CovarianMatrixType.Spherical;
        emModel1.TermCriteria = new MCvTermCriteria(300, 0.000001);
        //emModel1.Train(samples,Emgu.CV.ML.MlEnum.DataLayoutType.RowSample, labels);
        emModel1.trainE(samples, means0, null, null, null, labels, probs0);
        emModel1.TrainM(samples, probs0, null, labels, probs0);
        #region Classify every image pixel
        for (int i = 0; i < img.Height; i++)
            for (int j = 0; j < img.Width; j++)
            {
                sample.Data[0, 0] = i;
                sample.Data[0, 1] = j;
                MCvPoint2D64f mCvPoint2D64F = emModel1.Predict(sample, null);
                // Debug output: inspect the predicted class
                //Console.WriteLine($"{j},{i}|{mCvPoint2D64F.X},{mCvPoint2D64F.Y}");
                int response = (int)(mCvPoint2D64F.X);
                //Console.WriteLine($"response = {response}");
                Bgr color = colors[response];
                // Paint the pixel with the cluster colour at half intensity
                img.Draw(
                    new CircleF(new PointF(i, j), 1),
                    new Bgr(color.Blue * 0.5, color.Green * 0.5, color.Red * 0.5),
                    //color,
                    0);
            }
        #endregion
        #region draw the clustered samples
        for (int i = 0; i < nSamples; i++)
        {
            img.Draw(new CircleF(new PointF(samples.Data[i, 0], samples.Data[i, 1]), 1), colors[labels.Data[i, 0]], 0);
        }
        #endregion
        Emgu.CV.UI.ImageViewer.Show(img, "EM Image Result");
    }
}
Calling it:
static void Main(string[] args)
{
EmBase();
}
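To make the scaleX / scaleY arithmetic in EmBase easier to follow, here is a minimal sketch in plain C# (no EMGU dependency) that evaluates the same formula for the four clusters on a 500 x 500 image. The class name ClusterCenters and the method Center are hypothetical helpers introduced only for this illustration; they are not part of EMGU or of the code above.

```csharp
using System;

public static class ClusterCenters
{
    // Hypothetical helper: returns { x, y } for the centre of cluster i when
    // n clusters are laid out on a sqrt(n) x sqrt(n) grid inside the image.
    // The formula is the one used for scaleX and scaleY in EmBase.
    public static double[] Center(int i, int n, int width, int height)
    {
        int side = (int)Math.Sqrt(n);                     // clusters per row/column
        double scaleX = ((i % side) + 1.0) / (side + 1);  // column position as a fraction of the width
        double scaleY = ((i / side) + 1.0) / (side + 1);  // row position as a fraction of the height
        return new double[] { scaleX * width, scaleY * height };
    }

    public static void Main()
    {
        for (int i = 0; i < 4; i++)
        {
            double[] c = Center(i, 4, 500, 500);
            Console.WriteLine("cluster " + i + ": x = " + c[0] + ", y = " + c[1]);
        }
    }
}
```

With N = 4 the grid is 2 x 2, so the four centres sit at one third and two thirds of the image width and height. With a standard deviation of 30 pixels, the four groups of points should therefore be well separated.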
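Results
Open question
According to the explanation in "What is an Expectation-Maximization Classifier", the EM iteration alternates between an expectation (E) step and a maximization (M) step, which correspond to the trainE and TrainM methods of the EM class. The EM class is declared as EM : UnmanagedObject, IStatModel, IAlgorithm and overrides the Train method of IStatModel, which combines trainE and TrainM, so in theory calling emModel1.Train alone should be enough. In my tests, however, calling Train directly did not return the cluster labels of the samples, so the sample points drawn afterwards were distributed incorrectly. That is why the training is done in separate steps here. Corrections are welcome!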
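For readers who want to see what the alternating E and M steps look like, the following is a minimal sketch in plain C# for a one-dimensional Gaussian mixture. It is not EMGU's or OpenCV's implementation and does not explain the behaviour of Train; the names EmSketch, Fit and Pdf are hypothetical, and the starting variance of 1.0 and the equal starting weights are assumptions made for the illustration.

```csharp
using System;

public static class EmSketch
{
    // Gaussian probability density with mean mu and variance var.
    static double Pdf(double x, double mu, double var)
    {
        double d = x - mu;
        return Math.Exp(-d * d / (2 * var)) / Math.Sqrt(2 * Math.PI * var);
    }

    // Runs a fixed number of EM iterations for a 1-D mixture of Gaussians.
    // 'means' holds the initial means and is updated in place.
    public static double[] Fit(double[] x, double[] means, int iterations)
    {
        int n = x.Length, k = means.Length;
        double[] vars = new double[k];
        double[] weights = new double[k];
        for (int c = 0; c < k; c++) { vars[c] = 1.0; weights[c] = 1.0 / k; } // assumed starting values
        double[,] resp = new double[n, k];

        for (int it = 0; it < iterations; it++)
        {
            // E step: posterior probability of each component for each sample.
            for (int i = 0; i < n; i++)
            {
                double sum = 0;
                for (int c = 0; c < k; c++)
                {
                    resp[i, c] = weights[c] * Pdf(x[i], means[c], vars[c]);
                    sum += resp[i, c];
                }
                for (int c = 0; c < k; c++)
                    resp[i, c] = sum > 0 ? resp[i, c] / sum : 1.0 / k; // guard against underflow
            }

            // M step: re-estimate weight, mean and variance of each component.
            for (int c = 0; c < k; c++)
            {
                double nk = 0, mu = 0, v = 0;
                for (int i = 0; i < n; i++) { nk += resp[i, c]; mu += resp[i, c] * x[i]; }
                if (nk <= 0) continue; // empty component: keep its previous parameters
                mu /= nk;
                for (int i = 0; i < n; i++) { double d = x[i] - mu; v += resp[i, c] * d * d; }
                means[c] = mu;
                vars[c] = Math.Max(v / nk, 1e-6); // keep the variance strictly positive
                weights[c] = nk / n;
            }
        }
        return means;
    }

    public static void Main()
    {
        double[] data = { 0.9, 1.0, 1.1, 4.9, 5.0, 5.1 };
        double[] means = Fit(data, new double[] { 0.0, 6.0 }, 20);
        Console.WriteLine(means[0] + ", " + means[1]);
    }
}
```

In this sketch the initial means play the role that means0 plays for trainE, and the responsibility table plays the role of probs0 for TrainM: an E-first start needs initial parameters, while an M-first start needs initial probabilities.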