Machine Learning Concepts

References: Machine Learning Concepts; Zhou Zhihua's "watermelon book", 《机器学习》 (Machine Learning).

Machine learning (ML) can help you use historical data to make better business decisions. ML algorithms discover patterns in data, and construct mathematical models using these discoveries. Then you can use the models to make predictions on future data. For example, one possible application of a machine learning model would be to predict how likely a customer is to purchase a particular product based on their past behavior.

Building a Machine Learning Application

Building ML applications is an iterative process that involves a sequence of steps. To build an ML application, follow these general steps:

  1. Frame the core ML problem(s) in terms of what is observed and what answer you want the model to predict.

  2. Collect, clean, and prepare data to make it suitable for consumption by ML model training algorithms. Visualize and analyze the data to run sanity checks, validate its quality, and understand what it contains.

  3. The raw data (input variables) and answer (target) are often not represented in a way that can be used to train a highly predictive model, so you should typically try to construct more predictive input representations, or features, from the raw variables.

  4. Feed the resulting features to the learning algorithm to build models and evaluate the quality of the models on data that was held out from model building.

  5. Use the model to generate predictions of the target answer for new data instances.

[Diagram] data → learning algorithm → mathematical model
[Diagram] new data → mathematical model → predicted result
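The five steps and the two flows above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records are synthetic toy data, the `engagement` feature and its weight of 4 are hypothetical, and the "learning algorithm" is a simple threshold search rather than any particular library's method.

```python
# Step 1: frame the problem. Observed: a customer's past behavior.
# Target: whether the customer buys the product (1) or not (0).

# Step 2: collected data as (site_visits, past_orders, bought).
# Synthetic toy records, made up for illustration only.
TRAIN = [(2, 0, 0), (5, 0, 0), (3, 1, 0), (12, 0, 0),
         (8, 2, 1), (15, 1, 1), (4, 3, 1), (18, 0, 1)]
HELD_OUT = [(6, 0, 0), (10, 2, 1), (1, 1, 0), (16, 1, 1)]


def engagement(visits, past_orders):
    """Step 3: combine raw variables into one (hypothetical) feature."""
    return visits + 4 * past_orders


def train_threshold(records):
    """Step 4a: pick the score threshold with the fewest training errors."""
    scored = [(engagement(v, p), y) for v, p, y in records]
    candidates = sorted({s for s, _ in scored})

    def errors(t):
        # A record is misclassified when "score > t" disagrees with its label.
        return sum((s > t) != bool(y) for s, y in scored)

    return min(candidates, key=errors)


def predict(threshold, visits, past_orders):
    """Step 5: predict 1 (will buy) when the score exceeds the threshold."""
    return int(engagement(visits, past_orders) > threshold)


def accuracy(threshold, records):
    """Step 4b: fraction of records whose prediction matches the label."""
    hits = sum(predict(threshold, v, p) == y for v, p, y in records)
    return hits / len(records)


threshold = train_threshold(TRAIN)             # data -> learning algorithm -> model
print("learned threshold:", threshold)
print("held-out accuracy:", accuracy(threshold, HELD_OUT))
print("new customer:", predict(threshold, 9, 1))   # new data -> model -> prediction
```

The held-out records are never passed to `train_threshold`, so the accuracy printed for them is an estimate of how the model behaves on data it has not seen.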
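
Concept: Meaning
data set (数据集)
instance (示例), sample (样本), feature vector (特征向量): A single record in the data set
attribute (属性), feature (特征)
attribute space (属性空间), sample space (样本空间), input space (输入空间)
dimensionality (维度)
learning (学习), training (训练): The process of learning a model from data by running a learning algorithm
training data (训练数据): The data used during training
training sample (训练样本): A single sample in the training data
training set (训练集): The set of training samples
hypothesis (假设)
ground-truth (真相, 真实)
prediction (预测)
label (标记)
example (样例): An instance that has label information
label space (标记空间), output space (输出空间)
testing (测试): The process of using the learned model to make predictions
testing sample (测试样本), testing instance (测试示例): A sample used for testing
generalization (泛化): Applying the learned model to new samples
induction (归纳): The process of generalizing from specific cases to general rules
deduction (演绎)
specialization (特殊化)
inductive learning (归纳学习)
concept (概念)
concept learning (概念学习), concept formation (概念形成): Inductive learning in the narrow sense
version space (版本空间): The set of hypotheses consistent with the training set
inductive bias (归纳偏好)
Occam’s razor (奥卡姆剃刀): Choose the simplest hypothesis that is consistent with the observations
error rate (错误率): The most commonly used performance measure for classification
accuracy (精度): 1 - error rate
error (误差)
training error (训练误差), empirical error (经验误差)
generalization error (泛化误差)
overfitting (过拟合)
underfitting (欠拟合)
model selection (模型选择): Choosing the learning algorithm and its parameter settings
testing set (测试集)
testing error (测试误差)
hold-out (留出法)
sampling (采样)
stratified sampling (分层采样): Sampling that preserves class proportions
fidelity (保真性): How closely a model trained on the training set matches a model trained on the full data set
cross validation (交叉验证法)
Leave-One-Out (留一法)
bootstrapping (自助法)
parameter tuning (调参)
validation set (验证集)
performance measure (性能度量)
MSE, mean squared error (均方误差): The most commonly used performance measure for regression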
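
Several glossary entries describe small computations, so a sketch may make them more concrete. The code below illustrates error rate, accuracy, and MSE, plus index-based versions of stratified hold-out, cross validation (with leave-one-out as the special case k = n), and bootstrapping. All function names and parameters here are assumptions chosen for illustration, not taken from the referenced book or any library, and the sample labels are made up.

```python
import random
from collections import defaultdict


def error_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def accuracy(y_true, y_pred):
    """Accuracy is defined as 1 - error rate."""
    return 1 - error_rate(y_true, y_pred)


def mse(y_true, y_pred):
    """Mean squared error between true and predicted real values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def stratified_hold_out(labels, test_fraction, rng):
    """Hold-out split that keeps class proportions roughly equal in both parts."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                      # sample within each class
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def k_fold(n, k):
    """Yield (train, test) index lists for k folds; k == n is leave-one-out."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield sorted(train), test


def bootstrap(n, rng):
    """Draw n indices with replacement; indices never drawn form the test set."""
    train = [rng.randrange(n) for _ in range(n)]
    picked = set(train)
    test = [i for i in range(n) if i not in picked]
    return train, test


if __name__ == "__main__":
    rng = random.Random(0)                    # fixed seed for repeatability
    labels = [0] * 6 + [1] * 4                # made-up class labels
    print(stratified_hold_out(labels, 0.5, rng))
    print(list(k_fold(4, 4)))                 # leave-one-out on 4 samples
    print(bootstrap(10, rng))
    print(error_rate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
    print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

The split functions return indices rather than copies of the data, so the same helpers can be reused with any data set stored as a list or array.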