Basic Machine Learning Terminology

1. Key ML Terminology

(1) Label

A label is the thing we're predicting: the y variable in simple linear regression.

(2) Features

A feature is an input variable describing our data: the x variable in simple linear regression.

In the spam detector example, the features could include the following:

  • words in the email text
  • sender's address
  • time of day the email was sent
  • email contains the phrase "one weird trick."

Any of the items above can serve as a feature in the spam detection example.
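To make the idea of a feature concrete, here is a minimal Python sketch that turns one raw email into the features listed above. The `Email` class, its field names, and the `extract_features` helper are all hypothetical names chosen for illustration; they are not part of any real spam detection library.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Email:
    # Hypothetical container for a raw email; the field names are assumptions.
    sender: str
    body: str
    sent_at: datetime


def extract_features(email: Email) -> dict:
    """Turn one raw email into a dictionary of input features (x)."""
    text = email.body.lower()
    return {
        "words": text.split(),                               # words in the email text
        "sender_address": email.sender,                      # sender's address
        "hour_sent": email.sent_at.hour,                     # time of day the email was sent
        "has_one_weird_trick": "one weird trick" in text,    # phrase check
    }


email = Email(
    sender="promo@example.com",
    body="Lose weight with this One Weird Trick",
    sent_at=datetime(2024, 1, 5, 3, 30),
)
features = extract_features(email)
print(features["has_one_weird_trick"])
```

A real system would probably encode the words and the sender numerically before training; the dictionary here only shows which pieces of information play the role of x.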

(3) Example

An example is a particular instance of data, x. In the spam detector, each email being checked is an example.

  3.1 Labeled example

A labeled example includes both feature(s) and the label, and is used to train the model. Labeled examples are generally supplied by users.

labeled examples: {features, label}: (x, y)
In the spam detector, labeled examples are the emails that users have explicitly marked as "spam" or "not spam".

For example, the following table shows 5 labeled examples from a data set containing information about housing prices in California:

housingMedianAge   totalRooms   totalBedrooms   medianHouseValue
(feature)          (feature)    (feature)       (label)
15                 5612         1283            66900
19                 7650         1901            80100
17                 720          174             85700
14                 1501         337             73400
20                 1454         326             65500
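
These are five typical labeled examples, each containing features and a label. A model trained on a large number of labeled examples can then predict the label.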
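The {features, label} pairing can be written down directly in code. The sketch below stores the five rows of the table above as (x, y) pairs; the `Features` and `LabeledExample` type names are assumptions made for this illustration.

```python
from typing import NamedTuple


class Features(NamedTuple):
    housing_median_age: float
    total_rooms: float
    total_bedrooms: float


class LabeledExample(NamedTuple):
    x: Features   # the features
    y: float      # the label: medianHouseValue


labeled_examples = [
    LabeledExample(Features(15, 5612, 1283), 66900),
    LabeledExample(Features(19, 7650, 1901), 80100),
    LabeledExample(Features(17, 720, 174), 85700),
    LabeledExample(Features(14, 1501, 337), 73400),
    LabeledExample(Features(20, 1454, 326), 65500),
]

# Separate the inputs from the targets, which is the shape
# most training routines expect.
xs = [example.x for example in labeled_examples]
ys = [example.y for example in labeled_examples]
print(len(xs), ys[0])
```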

  3.2 Unlabeled example

An unlabeled example contains features but not the label.
 unlabeled examples: {features, ?}: (x, ?)

Once we've trained our model with labeled examples, we use that model to predict the label on unlabeled examples. In the spam detector, unlabeled examples are new emails that humans haven't yet labeled.
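One way to picture the difference between (x, y) and (x, ?) is a record whose label may be missing. This is only a sketch: the `Email` type and the sample messages are invented for illustration.

```python
from typing import NamedTuple, Optional


class Email(NamedTuple):
    text: str
    label: Optional[str]  # "spam", "not spam", or None if nobody has labeled it yet


inbox = [
    Email("Win a free cruise now", "spam"),
    Email("Meeting moved to 3pm", "not spam"),
    Email("You will not believe this one weird trick", None),
]

labeled = [e for e in inbox if e.label is not None]   # (x, y): usable for training
unlabeled = [e for e in inbox if e.label is None]     # (x, ?): the model predicts these
print(len(labeled), len(unlabeled))
```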

(4) Model

A model defines the relationship between features and label.

For example, a spam detection model might associate certain features strongly with "spam".

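In the email example, the model captures which kinds of email content (features) lead to an email being marked as spam (label); it encodes the relationship between the two.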

 Let's highlight two phases of a model's life:

  • Training means creating or learning the model. That is, you show the model labeled examples and enable the model to gradually learn the relationships between features and label.

  • Inference means applying the trained model to unlabeled examples. That is, you use the trained model to make useful predictions (y'). For example, during inference, you can predict medianHouseValue for new unlabeled examples.
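The two phases can be sketched with simple linear regression, the setting used in the definitions of label and feature. The code below is an illustration under strong assumptions: it uses only one feature (totalRooms), fits y' = w * x + b by ordinary least squares, and trains on just the five rows from the housing table, far too few for a useful model. The function names `train` and `infer` and the 3000-room district are hypothetical.

```python
def train(examples):
    """Training: learn w and b for y' = w * x + b from labeled (x, y) pairs."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    w = covariance / variance      # least-squares slope
    b = mean_y - w * mean_x        # least-squares intercept
    return w, b


def infer(model, x):
    """Inference: apply the trained model to an unlabeled example to get y'."""
    w, b = model
    return w * x + b


# Labeled examples: (totalRooms, medianHouseValue) from the housing table.
labeled = [(5612, 66900), (7650, 80100), (720, 85700), (1501, 73400), (1454, 65500)]
model = train(labeled)

# An unlabeled example: a hypothetical district with 3000 rooms.
prediction = infer(model, 3000)
print(round(prediction))
```

Training runs once over the labeled data and returns the learned parameters; inference reuses those parameters on inputs whose label is unknown.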

(5) Regression vs. classification

A regression model predicts continuous values.

For example, regression models make predictions that answer questions like the following:

  • What is the value of a house in California? (predicting a house price)

  • What is the probability that a user will click on this ad? (predicting a click probability)

A classification model predicts discrete values.

For example, classification models make predictions that answer questions like the following:

  • Is a given email message spam or not spam? (predicting whether an email is spam)

  • Is this an image of a dog, a cat, or a hamster? (predicting the type of image)
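The difference shows up in the return type of the prediction. In the sketch below, the regression-style function returns a continuous number, while the classification-style function returns one value from a fixed set of labels. Both functions, the input score, and the 0.5 threshold are assumptions made for illustration, not a description of any particular model.

```python
import math


def predict_click_probability(score: float) -> float:
    """Regression-style output: a continuous value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def classify_email(spam_probability: float, threshold: float = 0.5) -> str:
    """Classification-style output: one of a fixed set of discrete labels."""
    return "spam" if spam_probability >= threshold else "not spam"


print(predict_click_probability(0.0))   # 0.5
print(classify_email(0.92))             # spam
print(classify_email(0.08))             # not spam
```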

Key Terms
classification model    example
feature                 inference
label                   model
regression              training