Statistical Machine Learning 1: Fundamentals of Statistical Machine Learning

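I. What Statistical Machine Learning Studies

Networks, algorithms, machines, optimization, probability, statistics
Data, matrices, information, models, inference
Knowledge is gained through learning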

We are drowning in information and starving for knowledge. -John Naisbitt

Data -> Model -> Knowledge

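II. Machine Learning vs. Applied Statistics

| ML                    | Statistics                     | Notes                                   |
|-----------------------|--------------------------------|-----------------------------------------|
| networks, graphs      | models                         | networks, graphs / models               |
| weights               | parameters                     | weights / parameters                    |
| learning              | fitting or estimating          | learning / fitting, estimation          |
| generalization        | test set                       | generalization / reliability            |
| supervised learning   | regression / classification    | regression, classification              |
| unsupervised learning | density estimation, clustering | clustering                              |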

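III. Three Core Skills for Data Science

  • infrastructure (the underlying architecture)

  • coding (programming ability)

  • math (the ability to solve problems)

Statistical Machine Learning (SML):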

A field that bridges computation and statistics, with ties to information theory,
signal processing, algorithms, control theory, and optimization theory.

SML = Matrix + Optimization + Algorithm + Statistics

At its core, SML is an optimization problem.

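There are N data points, each with P features:

$$
X = \begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1P} \\
X_{21} & X_{22} & \cdots & X_{2P} \\
\vdots & \vdots & \ddots & \vdots \\
X_{N1} & X_{N2} & \cdots & X_{NP}
\end{pmatrix}
$$

The first data point is the first row: $X_1 = (X_{11}, X_{12}, \ldots, X_{1P})$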
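As a concrete illustration of this layout, here is a minimal NumPy sketch. The numbers are made up for illustration; only the shape convention (N rows, P columns) comes from the notes.

```python
import numpy as np

# Hypothetical toy data: N = 4 data points, P = 3 features each.
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 2.9, 4.3],
    [5.9, 3.0, 5.1],
])

N, P = X.shape   # number of data points, number of features
X1 = X[0]        # first data point: (X_11, X_12, ..., X_1P)
```

Storing one data point per row is also the convention most Python numerical libraries expect.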

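1. Dimensionality reduction: map each data point from $\mathbb{R}^P$ to $\mathbb{R}^Q$, reducing it from P dimensions to Q dimensions (Q < P).

Linear dimensionality reduction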
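The notes do not name a specific method. One common linear technique is principal component analysis (PCA), sketched below with NumPy under that assumption. The function name `pca_project` and the synthetic data are illustrative only.

```python
import numpy as np

def pca_project(X, Q):
    """Project the rows of X (shape N x P) onto the top-Q principal directions."""
    X_centered = X - X.mean(axis=0)          # remove the per-feature mean
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    W = Vt[:Q].T                             # P x Q linear projection matrix
    return X_centered @ W                    # N x Q reduced representation

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                # synthetic data: N = 100, P = 5
Z = pca_project(X, Q=2)                      # each point reduced from 5 to 2 dimensions
```

The reduction is linear because every reduced point is the centered original point multiplied by the same matrix `W`.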

2. Clustering
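The notes do not say which clustering algorithm is meant. As one possible example, the sketch below implements k-means (Lloyd's algorithm) in NumPy. The function name `kmeans`, its parameters, and the two synthetic blobs are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iters=50, seed=0):
    """Lloyd's algorithm: alternate between assigning points and updating centers."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared distance from every point to every center, shape (N, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its points; keep it if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(1)
# Two synthetic, well-separated blobs in P = 2 dimensions.
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])
labels, centers = kmeans(X, k=2)
```

No labels are used anywhere in the fit, which is what makes clustering an unsupervised task.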

3. Classification

Binary case:
x1 -> input
x2 -> output

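In a classification problem, the data is split into three sets:
1. Training data

Model + parameters: the parameters are fitted on the training data using an objective of the form
$e(y, f(x, a)) + c\,P(b)$
where $e$ measures how far the prediction $f(x, a)$ is from the label $y$, and $c$ weights the second term.

2. Validation data

The validation data is used to estimate c.

3. Test data (inputs only)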
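The three-way split can be sketched as follows. This is only an illustration under several assumptions not stated in the notes: the error term is taken to be squared error, the second term is taken to be a squared-norm penalty on the parameters a (ridge regression), and the data, the split sizes, and the candidate values of c are all synthetic. The helper names `fit_ridge` and `mse` are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, c):
    """Minimize ||y - X a||^2 + c * ||a||^2 (assumed objective), in closed form."""
    P = X.shape[1]
    return np.linalg.solve(X.T @ X + c * np.eye(P), X.T @ y)

def mse(X, y, a):
    """Mean squared error of the linear predictions X a."""
    return float(np.mean((y - X @ a) ** 2))

rng = np.random.default_rng(0)
N, P = 300, 10
X = rng.normal(size=(N, P))
a_true = rng.normal(size=P)
y = X @ a_true + rng.normal(scale=0.5, size=N)

# Split into training (60%), validation (20%), and test (20%) sets.
X_train, y_train = X[:180], y[:180]
X_val, y_val = X[180:240], y[180:240]
X_test = X[240:]          # at prediction time only the inputs are used

candidates = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
# For each c, fit the parameters on the training set and score on the validation set.
val_errors = {c: mse(X_val, y_val, fit_ridge(X_train, y_train, c)) for c in candidates}
best_c = min(val_errors, key=val_errors.get)

a_hat = fit_ridge(X_train, y_train, best_c)
y_pred = X_test @ a_hat   # predictions for the test inputs
```

The parameters a are only ever fitted on the training set, and c is only ever chosen on the validation set, so the test inputs play no role in either choice.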

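4. Regression

$y \in \mathbb{R}$
Regression is a special kind of classification problem in which the output y is a real number rather than a discrete label.

5. Ranking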

IV. Basic Approaches to Machine Learning:

1. Frequentist approach
The frequentist approach views the model parameters as unknown
constants and estimates them by matching the model to the training data
using an appropriate metric.

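Given training pairs $(X_i, Y_i)$, $i = 1, \ldots, N$:
Least squares estimation chooses the parameter vector $a$ that minimizes

$$
\sum_{i=1}^{N} \left( Y_i - X_i^\top a \right)^2
$$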
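The least squares objective above can be minimized directly with `np.linalg.lstsq`. The sketch below uses synthetic data, and the "true" parameter vector `a_true` is an assumption used only to generate it.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 3
X = rng.normal(size=(N, P))                      # row i is the data point X_i
a_true = np.array([2.0, -1.0, 0.5])              # hypothetical ground truth
Y = X @ a_true + rng.normal(scale=0.1, size=N)   # noisy responses Y_i

# Solve min_a sum_i (Y_i - X_i^T a)^2.
a_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
residual_ss = float(np.sum((Y - X @ a_hat) ** 2))
```

At the minimizer the residual vector is orthogonal to every column of `X`, which is the normal-equation condition `X.T @ (Y - X @ a_hat) = 0`.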

Maximum likelihood estimation

Gaussian distribution
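The two ideas connect as follows, assuming independent Gaussian noise with variance b^2 around a linear prediction. The log-likelihood is then -(N/2) log(2 pi b^2) - (1 / (2 b^2)) * sum_i (y_i - x_i^T a)^2, so maximizing it over a is the same as minimizing the sum of squared errors. The sketch below checks this numerically on synthetic data. The helper name `gaussian_nll` and the data are illustrative assumptions.

```python
import numpy as np

def gaussian_nll(a, b, X, y):
    """Negative log-likelihood of independent y_i ~ N(x_i^T a, b^2)."""
    r = y - X @ a
    return 0.5 * len(y) * np.log(2 * np.pi * b**2) + np.sum(r**2) / (2 * b**2)

rng = np.random.default_rng(0)
N, P = 200, 3
X = rng.normal(size=(N, P))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=N)

# The least squares solution is also the maximum likelihood estimate of a.
a_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
# The maximum likelihood estimate of b is the root mean squared residual.
b_mle = np.sqrt(np.mean((y - X @ a_ls) ** 2))

nll_at_ls = gaussian_nll(a_ls, b_mle, X, y)
# Any perturbation of a away from the least squares solution raises the NLL.
nll_perturbed = [gaussian_nll(a_ls + 0.1 * rng.normal(size=P), b_mle, X, y)
                 for _ in range(5)]
```

Every value in `nll_perturbed` is larger than `nll_at_ls`, because moving a away from the least squares solution can only increase the sum of squared residuals.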

2. Bayesian approach
$y \sim \mathcal{N}(X^\top a,\; b^2)$
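In the Bayesian view the parameters a are treated as random variables with a prior distribution, and the data update that prior into a posterior. The notes give only the Gaussian likelihood, so the sketch below adds an assumption of its own: a zero-mean Gaussian prior a ~ N(0, tau^2 I) with known b and tau. Under that assumption the posterior over a is Gaussian with a closed form. The function name `posterior` and the synthetic data are illustrative.

```python
import numpy as np

def posterior(X, y, b, tau):
    """Posterior N(mean, cov) of a, for y ~ N(X a, b^2 I) and assumed prior a ~ N(0, tau^2 I)."""
    P = X.shape[1]
    precision = X.T @ X / b**2 + np.eye(P) / tau**2   # posterior precision matrix
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / b**2
    return mean, cov

rng = np.random.default_rng(0)
N, P = 50, 3
X = rng.normal(size=(N, P))
a_true = np.array([1.0, -2.0, 0.5])                   # hypothetical ground truth
b = 0.3
y = X @ a_true + rng.normal(scale=b, size=N)

mean, cov = posterior(X, y, b=b, tau=10.0)
```

With this prior the posterior mean equals the ridge regression solution with penalty weight b^2 / tau^2, which links the Bayesian view back to the penalized frequentist objective.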
