版权声明:作者:金良(golden1314521@gmail.com) csdn博客: http://blog.csdn.net/u012176591
作者:金良(golden1314521@gmail.com) csdn博客: http://blog.csdn.net/u012176591
需要整理后的代码文件和数据请移步 http://download.csdn.net/detail/u012176591/8748673
1.高斯分布公式及图像示例
定义在D-维连续空间的高斯分布概率密度的表达式
N(x|μ,Σ)=1(2π)D/21|Σ|1/2exp{−12(x−μ)TΣ−1(x−μ)}
其等高线所形成的形状与协方差矩阵 Σ 密切相关,如下所示,后面的代码中有各个图像的对应的高斯分布的参数。
2. 高斯分布概率密度热力图
代码如下:
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">fig,axes = plt.subplots(nrows=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">3</span>,ncols=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>,figsize=(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">4</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">12</span>)) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 标准圆形</span> mean = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] cov = [[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>], [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]] x,y = np.random.multivariate_normal(mean,cov,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5000</span>).T axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>].plot(x,y,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'x'</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>].set_xlim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>].set_ylim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 椭圆,椭圆的轴向与坐标平行</span> mean = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] cov = [[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.5</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>], [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">3</span>]] x,y = np.random.multivariate_normal(mean,cov,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5000</span>).T axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>].plot(x,y,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'x'</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>].set_xlim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>].set_ylim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 椭圆,但是椭圆的轴与坐标轴不一定平行</span> mean = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] cov = [[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2.3</span>], [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2.3</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.4</span>]] x,y = np.random.multivariate_normal(mean,cov,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5000</span>).T axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>].plot(x,y,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'x'</span>); plt.axis(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'equal'</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>].set_xlim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) axes[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>].set_ylim(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>) </code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li></ul>
我们在下面的高斯混合模型中采用用第三种协方差矩阵,即概率密度的等高线是椭圆,且轴向不一定与坐标轴平行。
下图是高斯密度函数的热图:
以下是作图代码
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 自定义的高维高斯分布概率密度函数</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">gaussian</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(x,mean,cov)</span>:</span> dim = np.shape(cov)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#维度</span> covdet = np.linalg.det(cov+np.eye(dim)*<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.01</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#协方差矩阵的秩</span> covinv = np.linalg.inv(cov+np.eye(dim)*<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.01</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#协方差矩阵的逆</span> xdiff = x - mean <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#概率密度</span> prob = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/np.power(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>*np.pi,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>)/np.sqrt(np.abs(covdet))*np.exp(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>*np.dot(np.dot(xdiff,covinv),xdiff)) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> prob <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#作二维高斯概率密度函数的热力图</span> mean = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] cov = [[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2.3</span>], [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2.3</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.4</span>]] x,y = np.random.multivariate_normal(mean,cov,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5000</span>).T cov = np.cov(x,y) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#由真实数据计算得到的协方差矩阵,而不是自己任意设定</span> n=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span> x = np.linspace(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,n) y = np.linspace(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">6</span>,n) xx,yy = np.meshgrid(x, y) zz = np.zeros((n,n)) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n): zz[i][j] = gaussian(np.array([xx[i][j],yy[i][j]]),mean,cov) gci = plt.imshow(zz,origin=<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'lower'</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 选项origin='lower' 防止tuixan图像颠倒</span> plt.xticks([<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">195</span>],[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>]) plt.yticks([<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">195</span>],[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>]) plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'高斯函数的热力图'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li></ul>
3.高斯混合模型实现代码:
下面是几个功能函数,在主函数中被调用
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 计算概率密度,</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 参数皆为array类型,过程中参数不变</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">gaussian</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(x,mean,cov)</span>:</span> dim = np.shape(cov)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#维度</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#之所以加入单位矩阵是为了防止行列式为0的情况</span> covdet = np.linalg.det(cov+np.eye(dim)*<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.01</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#协方差矩阵的行列式</span> covinv = np.linalg.inv(cov+np.eye(dim)*<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.01</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#协方差矩阵的逆</span> xdiff = x - mean <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#概率密度</span> prob = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/np.power(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>*np.pi,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*dim/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>)/np.sqrt(np.abs(covdet))*np.exp(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>*np.dot(np.dot(xdiff,covinv),xdiff)) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> prob <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#获取初始协方差矩阵</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">getconvs</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(data,K)</span>:</span> convs = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]*K <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 初始的协方差矩阵源自于原始数据的协方差矩阵,且每个簇的初始协方差矩阵相同</span> convs[i] = np.cov(data.T) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> convs <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">isdistinct</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(means,criter=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.03</span>)</span>:</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#检测初始中心点是否靠得过近</span> K = len(means) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(i+<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>,K): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> criter > np.linalg.norm(means[i]-means[j]): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">True</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#获取初始聚簇中心</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">getmeans</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(data,K,criter)</span>:</span> means = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]*K dim = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>] minmax = [] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#各个维度的极大极小值</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(dim): minmax.append(np.array([min(data[:,i]),max(data[:,i])])) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">True</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#生成初始点的坐标</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): means[i] = [] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(dim): means[i].append(np.random.random()*(minmax[j][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-minmax[j][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])+minmax[j][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]) means[i] = np.array(means[i]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> isdistinct(means,criter): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> means <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># k-means算法的实现函数。</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#用K-means算法输出的聚类中心,作为高斯混合模型的输入</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">kmeans</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(data,K)</span>:</span> N = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#样本数目</span> dim = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#维度</span> means = getmeans(data,K,criter=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">15</span>) means_old = [np.zeros(dim) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> np.sum([np.linalg.norm(means_old[k]-means[k]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)]) > <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.01</span>: means_old = cp.deepcopy(means) numlog = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]*K sumlog = [np.zeros(dim) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N): distlog = [np.linalg.norm(data[n]-means[k]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)] toK = distlog.index(np.min(distlog)) numlog[toK] += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span> sumlog[toK] += data[n] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): means[k] = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/numlog[k]*sumlog[k] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> means <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#对程序结果进行可视化,注意这里的K只能取2,否则该函数运行出错</span> <span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">visualresult</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(data,gammas,K)</span>:</span> N = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#样本数目</span> dim = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#维度</span> minmax = [] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#各个维度的极大极小值</span> xy = [] n=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(dim): delta = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.05</span>*(np.max(data[:,i])-np.min(data[:,i])) xy.append(np.linspace(np.min(data[:,i])-delta,np.max(data[:,i])+delta,n)) xx,yy = np.meshgrid(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>], xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]) zz = np.zeros((n,n)) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n): zz[i][j] = np.sum(gaussian(np.array([xx[i][j],yy[i][j]]),means[k],convs[k]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)) gci = plt.imshow(zz,origin=<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'lower'</span>,alpha = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.8</span>) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 选项origin='lower' 防止tuixan图像颠倒</span> plt.xticks([<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,len(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>],[xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>],xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]]) plt.yticks([<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>,len(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>],[xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>],xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N): <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> gammas[i][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] ><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.5</span>: plt.plot((data[i][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-np.min(data[:,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]))/(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]),(data[i][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-np.min(data[:,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]))/(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]),<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'r.'</span>) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span>: plt.plot((data[i][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-np.min(data[:,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]))/(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]),(data[i][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-np.min(data[:,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]))/(xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]),<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'k.'</span>) deltax = xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] deltay = xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] plt.plot((means[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])/deltax,(means[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])/deltay,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'*r'</span>,markersize=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">15</span>) plt.plot((means[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])/deltax,(means[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]-xy[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>][<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>])/deltay,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'*k'</span>,markersize=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">15</span>) plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'高斯混合模型图'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li><li style="box-sizing: border-box; padding: 0px 5px;">48</li><li style="box-sizing: border-box; padding: 0px 5px;">49</li><li style="box-sizing: border-box; padding: 0px 5px;">50</li><li style="box-sizing: border-box; padding: 0px 5px;">51</li><li style="box-sizing: border-box; padding: 0px 5px;">52</li><li style="box-sizing: border-box; padding: 0px 5px;">53</li><li style="box-sizing: border-box; padding: 0px 5px;">54</li><li style="box-sizing: border-box; padding: 0px 5px;">55</li><li style="box-sizing: border-box; padding: 0px 5px;">56</li><li style="box-sizing: border-box; padding: 0px 5px;">57</li><li style="box-sizing: border-box; padding: 0px 5px;">58</li><li style="box-sizing: border-box; padding: 0px 5px;">59</li><li style="box-sizing: border-box; padding: 0px 5px;">60</li><li style="box-sizing: border-box; padding: 0px 5px;">61</li><li style="box-sizing: border-box; padding: 0px 5px;">62</li><li style="box-sizing: border-box; padding: 0px 5px;">63</li><li style="box-sizing: border-box; padding: 0px 5px;">64</li><li style="box-sizing: border-box; padding: 0px 5px;">65</li><li style="box-sizing: border-box; padding: 0px 5px;">66</li><li style="box-sizing: border-box; padding: 0px 5px;">67</li><li style="box-sizing: border-box; padding: 0px 5px;">68</li><li style="box-sizing: border-box; padding: 0px 5px;">69</li><li style="box-sizing: border-box; padding: 0px 5px;">70</li><li style="box-sizing: border-box; padding: 0px 5px;">71</li><li style="box-sizing: border-box; padding: 0px 5px;">72</li><li style="box-sizing: border-box; padding: 0px 5px;">73</li><li style="box-sizing: border-box; padding: 0px 5px;">74</li><li style="box-sizing: border-box; padding: 0px 5px;">75</li><li style="box-sizing: border-box; padding: 0px 5px;">76</li><li style="box-sizing: border-box; padding: 0px 5px;">77</li><li style="box-sizing: border-box; padding: 0px 5px;">78</li><li style="box-sizing: border-box; padding: 0px 5px;">79</li><li style="box-sizing: border-box; padding: 0px 5px;">80</li><li style="box-sizing: border-box; padding: 0px 5px;">81</li><li style="box-sizing: border-box; padding: 0px 5px;">82</li><li style="box-sizing: border-box; padding: 0px 5px;">83</li><li style="box-sizing: border-box; padding: 0px 5px;">84</li><li style="box-sizing: border-box; padding: 0px 5px;">85</li><li style="box-sizing: border-box; padding: 0px 5px;">86</li><li style="box-sizing: border-box; padding: 0px 5px;">87</li><li style="box-sizing: border-box; padding: 0px 5px;">88</li><li style="box-sizing: border-box; padding: 0px 5px;">89</li><li style="box-sizing: border-box; padding: 0px 5px;">90</li><li style="box-sizing: border-box; padding: 0px 5px;">91</li><li style="box-sizing: border-box; padding: 0px 5px;">92</li><li style="box-sizing: border-box; padding: 0px 5px;">93</li><li style="box-sizing: border-box; padding: 0px 5px;">94</li><li style="box-sizing: border-box; padding: 0px 5px;">95</li><li style="box-sizing: border-box; padding: 0px 5px;">96</li><li style="box-sizing: border-box; padding: 0px 5px;">97</li><li style="box-sizing: border-box; padding: 0px 5px;">98</li><li style="box-sizing: border-box; padding: 0px 5px;">99</li><li style="box-sizing: border-box; padding: 0px 5px;">100</li><li style="box-sizing: border-box; padding: 0px 5px;">101</li><li style="box-sizing: border-box; padding: 0px 5px;">102</li><li style="box-sizing: border-box; padding: 0px 5px;">103</li><li style="box-sizing: border-box; padding: 0px 5px;">104</li><li style="box-sizing: border-box; padding: 0px 5px;">105</li><li style="box-sizing: border-box; padding: 0px 5px;">106</li><li style="box-sizing: border-box; padding: 0px 5px;">107</li><li style="box-sizing: border-box; padding: 0px 5px;">108</li><li style="box-sizing: border-box; padding: 0px 5px;">109</li><li style="box-sizing: border-box; padding: 0px 5px;">110</li></ul>
高斯混合模型的主函数
<code class="language-python hljs has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">N = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#样本数目</span> dim = np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#维度</span> K = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 聚簇的个数</span> means = kmeans(data,K) convs = getconvs(data,K) pis = [<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/K]*K gammas = [np.zeros(K) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N)] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#*N 注意不能用 *N,否则N个array只指向一个地址</span> loglikelyhood = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span> oldloglikelyhood = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> np.abs(loglikelyhood - oldloglikelyhood)> <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.0001</span>: oldloglikelyhood = loglikelyhood <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># E_step</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N): respons = [pis[k]*gaussian(data[n],means[k],convs[k]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)] sumrespons = np.sum(respons) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): gammas[n][k] = respons[k]/sumrespons <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># M_step</span> <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K): nk = np.sum([gammas[n][k] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N)]) means[k] = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/nk * np.sum([gammas[n][k]*data[n] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N)],axis=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>) xdiffs = data - means[k] convs[k] = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/nk * np.sum([gammas[n][k]*xdiffs[n].reshape(dim,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>)*xdiffs[n] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N)],axis=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>) pis[k] = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*nk/N <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 计算似然函数值</span> loglikelyhood =np.sum( [np.log(np.sum([pis[k]*gaussian(data[n],means[k],convs[k]) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> k <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(K)])) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> n <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(N) ]) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#print means</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#print loglikelyhood</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#print '=='*10</span> visualresult(data,gammas,K)</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li></ul>
4.高斯混合模型聚簇效果图
5.参考文献:
- K-means聚类和EM思想
http://www.cnblogs.com/jerrylead/archive/2011/04/06/2006910.html - (EM算法)The EM Algorithm
http://www.cnblogs.com/jerrylead/archive/2011/04/06/2006936.html - 从决策树学习谈到贝叶斯分类算法、EM、HMM
http://blog.csdn.net/v_july_v/article/details/7577684 - EM算法学习(Expectation Maximization Algorithm)
http://www.vjianke.com/XUHV3.clip - EM算法——理论与应用
http://blog.sina.com.cn/s/blog_a8fead9b01014p6k.html - EM算法 一个简单的例子
http://blog.sina.com.cn/s/blog_a7da5cda010158b3.html - 高斯混合模型(GMM)
http://www.cnblogs.com/mindpuzzle/archive/2013/04/24/3036447.html - 高斯混合模型
http://www.cnblogs.com/zhangchaoyang/articles/2624882.html - CS229 Lecture notes,Andrew Ng讲义
http://cs229.stanford.edu/notes/cs229-notes7b.pdf - https://github.com/lzhang10/maxent/blob/master/doc/manual.pdf
- http://nlp.stanford.edu/software/classifier.shtml
- https://github.com/juandavm/em4gmm
用最大期望算法求解高斯混合模型,dat文件夹里的压缩文件是数据 - http://insideourminds.net/python-unsupervised-learning-using-em-algorithm-implementation/
Python实现的期望最大化算法,数据是鸢尾花数据