"Machine Learning for OpenCV" Study Notes: Dimensionality Reduction

These are study notes for "Machine Learning for OpenCV", focusing on why dimensionality reduction matters and on three main methods: PCA (Principal Component Analysis), ICA (Independent Component Analysis), and NMF (Non-negative Matrix Factorization). PCA improves classifier performance by mapping the data onto a lower-dimensional space, ICA focuses on the statistical independence of the data's components, and NMF looks for a factorization of non-negative data. The notes also show how to implement these methods in OpenCV and scikit-learn.

I. Dimensionality Reduction

1. Why reduce dimensionality

A classifier that was not trained on data points spread across the whole feature space will not know how to classify a new point that lies far away from anything it has seen before. However, as the number of dimensions grows, the number of data points needed to fill the space grows exponentially. With a fixed amount of training data, classifier performance therefore starts to degrade once the number of features grows past some optimum ("performance" is an abstract term here; it can be made concrete in many ways). So we need to look for an optimal number of dimensions (features), and that is what dimensionality reduction is about. The short sketch below gives a feel for the exponential growth.
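As a rough, illustrative back-of-the-envelope sketch (my own numbers, not the book's): if each axis is discretized into 10 bins and we want an average of 10 samples per bin, the required sample count grows as 10 · 10^d with the dimension d.

# Illustrative sketch (hypothetical numbers): samples needed to keep
# an average of 10 samples per cell when every axis has 10 bins
for d in range(1, 6):
    n_cells = 10 ** d          # cells in a d-dimensional grid
    n_samples = 10 * n_cells   # 10 samples per cell on average
    print(f"d={d}: about {n_samples} samples needed")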

2. Methods for dimensionality reduction

2.1 PCA (Principal Component Analysis)

2.2 ICA (Independent Component Analysis)

2.3 NMF (Non-negative Matrix Factorization)

II. PCA

2.1 How PCA works

The idea behind PCA is to map N-dimensional features onto K dimensions (K < N). The K dimensions are not obtained by simply dropping (N - K) of the original features; instead, K new features are constructed from the original N, each one a linear combination of all of them. The sketch below illustrates this construction.
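As a minimal numpy sketch of this construction (my illustration, not the book's code): center the data, diagonalize its covariance matrix, and project onto the K eigenvectors with the largest eigenvalues.

import numpy as np

# Minimal PCA sketch: project (n_samples, N) data onto K new features
def pca_project(X, K):
    X_centered = X - X.mean(axis=0)          # subtract the mean
    cov = np.cov(X_centered, rowvar=False)   # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh handles symmetric matrices
    order = np.argsort(eigvals)[::-1]        # largest variance first
    W = eigvecs[:, order[:K]]                # N x K projection matrix
    return X_centered @ W                    # each new feature mixes all N originals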

2.2 Implementing PCA in OpenCV

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import cv2

plt.style.use('ggplot')

mean = [20, 20]  # mean of the 2-D Gaussian
# covariance matrices must be symmetric positive semi-definite
cov = [[5, 10], [10, 25]]
x, y = np.random.multivariate_normal(mean, cov, 1000).T

# stack the two feature vectors into one feature matrix X, shape (1000, 2)
X = np.vstack((x, y)).T

# mu: the mean that is subtracted before projecting;
# eig: the eigenvectors of the covariance matrix (the principal axes)
mu, eig = cv2.PCACompute(X, np.array([]))

# rotate the data onto the principal axes with cv2.PCAProject
X2 = cv2.PCAProject(X, mu, eig)

# visualize the data before and after the projection
plt.scatter(X[:, 0], X[:, 1], label='original')
plt.scatter(X2[:, 0], X2[:, 1], label='projected')
plt.legend()
plt.show()
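For comparison, the same projection can be obtained with scikit-learn's decomposition.PCA class; a minimal sketch on the same kind of synthetic data:

import numpy as np
from sklearn import decomposition

mean = [20, 20]
cov = [[5, 10], [10, 25]]  # symmetric positive-definite
x, y = np.random.multivariate_normal(mean, cov, 1000).T
X = np.vstack((x, y)).T

# with n_components=2 no information is discarded; the data is
# simply rotated onto its principal axes, as with cv2.PCAProject
pca = decomposition.PCA(n_components=2)
X2 = pca.fit_transform(X)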



III. ICA

3.1 How ICA works

ICA performs the same kind of mathematical decomposition as PCA, but it chooses the components so that, after the decomposition, they are as statistically independent of one another as possible.

3.2 Implementing ICA in scikit-learn

In scikit-learn, ICA is implemented by the decomposition.FastICA class.

# -*- coding: utf-8 -*-
import numpy as np
from sklearn import decomposition

mean = [20, 20]
# covariance matrices must be symmetric positive semi-definite
cov = [[5, 10], [10, 25]]
x, y = np.random.multivariate_normal(mean, cov, 1000).T
X = np.vstack((x, y)).T

# fit FastICA to the feature matrix and transform the data into
# components that are as independent of one another as possible
ica = decomposition.FastICA()
X2 = ica.fit_transform(X)

IV. NMF

4.1 How NMF works

NMF performs the same kind of mathematical decomposition as PCA and ICA, with one extra constraint: the data being decomposed must be non-negative. The data matrix X is factorized into two non-negative matrices W and H such that X ≈ WH.

4.2 Implementing NMF in scikit-learn

In scikit-learn, NMF is implemented by the decomposition.NMF class.

# -*- coding: utf-8 -*-
import numpy as np
from sklearn import decomposition

mean = [20, 20]
# covariance matrices must be symmetric positive semi-definite
cov = [[5, 10], [10, 25]]
x, y = np.random.multivariate_normal(mean, cov, 1000).T
X = np.vstack((x, y)).T

# NMF requires every entry of X to be non-negative; the Gaussian
# samples above are almost always positive, but clip to be safe
X = np.clip(X, 0, None)

nmf = decomposition.NMF()
X2 = nmf.fit_transform(X)
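As a quick sanity check of the X ≈ WH factorization from Section 4.1 (my addition, continuing the snippet above): fit_transform returns W, and the learned H is available as the estimator's components_ attribute.

# continuing from above: X2 is W, and H is stored in nmf.components_
W = X2
H = nmf.components_
X_approx = W @ H                    # should approximately reconstruct X
print(np.abs(X - X_approx).max())   # reconstruction error should be small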

V. References

[1] Michael Beyeler, Machine Learning for OpenCV, https://github.com/mbeyeler/opencv-machine-learning

VI. Miscellaneous

Chapter 1, A Taste of Machine Learning, will gently introduce you to the different subfields of machine learning, and explain how to install OpenCV and other essential tools in the Python Anaconda environment.

Chapter 2, Working with Data in OpenCV and Python, will show you what a typical machine learning workflow looks like, and where data comes in to play. I will explain the difference between training and test data, and show you how to load, store, manipulate, and visualize data with OpenCV and Python.

Chapter 3, First Steps in Supervised Learning, will introduce you to the topic of supervised learning by reviewing some core concepts, such as classification and regression. You will learn how to implement a simple machine learning algorithm in OpenCV, how to make predictions about the data, and how to evaluate your model.

Chapter 4, Representing Data and Engineering Features, will teach you how to get a feel for some common and well-known machine learning datasets and how to extract the interesting stuff from your raw data.

Chapter 5, Using Decision Trees to Make a Medical Diagnosis, will show you how to build decision trees in OpenCV, and use them in a variety of classification and regression problems.

Chapter 6, Detecting Pedestrians with Support Vector Machines, will explain how to build support vector machines in OpenCV, and how to apply them to detect pedestrians in images.

Chapter 7, Implementing a Spam Filter with Bayesian Learning, will introduce you to probability theory, and show you how you can use Bayesian inference to classify emails as spam or not.

Chapter 8, Discovering Hidden Structures with Unsupervised Learning, will talk about unsupervised learning algorithms such as k-means clustering and Expectation-Maximization, and show you how they can be used to extract hidden structures in simple, unlabeled datasets.

Chapter 9, Using Deep Learning to Classify Handwritten Digits, will introduce you to the exciting field of deep learning. Starting with the perceptron and multi-layer perceptrons, you will learn how to build deep neural networks in order to classify handwritten digits from the extensive MNIST database.

Chapter 10, Combining Different Algorithms into an Ensemble, will show you how to effectively combine multiple algorithms into an ensemble in order to overcome the weaknesses of individual learners, resulting in more accurate and reliable predictions.

Chapter 11, Selecting the Right Model with Hyper-Parameter Tuning, will introduce you to the concept of model selection, which allows you to compare different machine learning algorithms in order to select the right tool for the task at hand.

Chapter 12, Wrapping Up, will conclude the book by giving you some useful tips on how to approach future machine learning problems on your own, and where to find information on more advanced topics.