[Machine Learning] Notes on Statistical Learning Methods, Part 1: Basic Concepts of Statistical Learning

Foundations of Statistical Learning

What is learning

Herbert A. Simon's definition: if a system can improve its performance by carrying out some process, that is learning.

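What is statistical learning

Statistical learning is the discipline concerned with how computers build probabilistic and statistical models from data and then use those models to predict and analyze data. It is also called statistical machine learning.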

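Characteristics of statistical learning
  • It uses computers and networks as its platform
  • It takes data as its object of study and is a data-driven discipline
  • Its goal is to predict and analyze data
  • It is method-centered: it builds models and applies them to prediction and analysis
  • It is an interdisciplinary field drawing on probability theory, statistics, information theory, computational theory, optimization theory, and computer science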
Types of statistical learning
  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Reinforcement learning
  • And others
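Definition of a statistical learning method for supervised learning

Start from a given, finite set of training data used for learning, and assume the data are independent and identically distributed (i.i.d.). Also assume that the model to be learned belongs to some set of functions, called the hypothesis space. Apply an evaluation criterion to select an optimal model from the hypothesis space, so that under that criterion the model gives the best predictions on both the known training data and the unknown test data. The selection of the optimal model is carried out by an algorithm.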
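
One common way to make the evaluation criterion concrete is to choose the model with the lowest average loss on the training data. This is an illustrative assumption, not something these notes specify. The symbols below are notation introduced only for this sketch: $\mathcal{F}$ is the hypothesis space, $L$ is a loss function, $N$ is the number of training samples, and $(x_i, y_i)$ are the training pairs.

```latex
\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr)
```

Under this reading, the hypothesis space fixes what $f$ may be, the criterion is the averaged loss, and the algorithm is whatever procedure computes the $\arg\min$.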

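The three elements of a statistical learning method
  • Model: the hypothesis space of the model (a set of functions)
  • Strategy: the criterion for model selection
  • Algorithm: the algorithm that learns the model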
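
To show how the three elements fit together, here is a minimal sketch in plain Python. The choices are assumptions made for illustration and do not come from the notes: the model is the family of lines through the origin, the strategy is mean squared error, and the algorithm is the closed-form least-squares solution. The function names and the toy data are hypothetical.

```python
from typing import Callable, List, Tuple

Data = List[Tuple[float, float]]  # (x, y) pairs


# Model: the hypothesis space is { f_w(x) = w * x } for real w (assumed family).
def make_model(w: float) -> Callable[[float], float]:
    return lambda x: w * x


# Strategy: mean squared error on the data (assumed criterion).
def empirical_risk(w: float, data: Data) -> float:
    f = make_model(w)
    return sum((y - f(x)) ** 2 for x, y in data) / len(data)


# Algorithm: closed-form least squares for this one-parameter family,
# w = sum(x * y) / sum(x * x).
def fit(data: Data) -> float:
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / sxx


# Made-up training data that lies exactly on y = 2x.
train: Data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_hat = fit(train)
print(w_hat, empirical_risk(w_hat, train))
```

Swapping any one element (a different function family, a different loss, or an iterative solver) gives a different learning method while the other two stay in place.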
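Steps of a statistical learning method:
  • Obtain a finite set of training data
  • Determine the hypothesis space containing all candidate models, that is, the set of models to learn from
  • Determine the criterion for model selection, that is, the learning strategy
  • Implement an algorithm that solves for the optimal model, that is, the learning algorithm
  • Select the optimal model using the learning method
  • Use the learned optimal model to predict and analyze new data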
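
The steps above can be sketched end to end on a toy classification problem. Everything concrete here is an assumption for illustration: the data are made up, the hypothesis space is a small finite set of threshold classifiers, the criterion is the training error rate, and the algorithm is exhaustive search. All names are hypothetical.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]  # (feature, label in {0, 1})

# Step 1: a finite training set (made-up values).
train: List[Sample] = [(0.5, 0), (1.0, 0), (1.5, 0), (2.5, 1), (3.0, 1), (3.5, 1)]

# Step 2: hypothesis space = threshold classifiers f_t(x) = 1 if x >= t else 0,
# restricted to a small finite set of candidate thresholds.
thresholds: List[float] = [0.0, 1.0, 2.0, 3.0, 4.0]


def classifier(t: float) -> Callable[[float], int]:
    return lambda x: 1 if x >= t else 0


# Step 3: selection criterion = fraction of misclassified samples (0-1 loss).
def error_rate(f: Callable[[float], int], data: List[Sample]) -> float:
    return sum(1 for x, y in data if f(x) != y) / len(data)


# Step 4: algorithm = exhaustive search over the finite hypothesis space.
def select_threshold(candidates: List[float], data: List[Sample]) -> float:
    return min(candidates, key=lambda t: error_rate(classifier(t), data))


# Step 5: run the learning method to select the best model.
best_t = select_threshold(thresholds, train)
best_model = classifier(best_t)

# Step 6: use the selected model on new, unseen inputs.
new_points = [0.8, 2.9]
predictions = [best_model(x) for x in new_points]
print(best_t, predictions)
```

Exhaustive search only works because the hypothesis space here is tiny; with an infinite family of functions the algorithm step is usually an optimization procedure instead.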

References

[1] Li Hang. (2012). 统计学习方法 (Statistical Learning Methods).
