machine learning week1-concept

1机器学习是什么?

第一个机器学习的定义来自于Arthur Samuel。 他定义机器学习为:在进行特定编程的情况下, 给予计算机学习能力的领域。

另一个年代近一点的定义,由Tom Mitchell提出,来自卡内基梅隆大学,他认为:一个程序被认为能从经验E中学习,解决任务 T,达到 性能度量值P,当且仅当,有了经验E后,经过P评判, 程序在处理 T 时的性能有所提升。

e:在西洋棋那例子中,经验e 就是 程序上万次的自我练习的经验 

t:任务 t 就是下棋

p:性能度量值 p呢, 就是它在与一些新的对手比赛时,赢得比赛的概率

2.机器学习目前主要存在两种不同的算法

监督学习、无监督学习 

 

What is Machine Learning?

Two definitions of Machine Learning are offered. Arthur Samuel described it as: "the field of study that gives computers the ability to learn without being explicitly programmed." This is an older, informal definition.

Tom Mitchell provides a more modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Example: playing checkers.

E = the experience of playing many games of checkers

T = the task of playing checkers.

P = the probability that the program will win the next game.

In general, any machine learning problem can be assigned to one of two broad classifications:

Supervised learning and Unsupervised learning

3监督学习(Supervised Learing)

 

 

术语监督学习,意指给出一个算法, 需要部分数据集已经有正确答案。

用更术语的方式来定义, 监督学习又叫回归问题,(应该是回归属于监督中的一种) 意指要预测一个连续值的输出,比如房价

 在分类问题中,有时会有超过两个的值, 输出的值可能超过两种。

 事实上,等我们介绍一种叫支持向量机的算法时, 就知道存在一个简洁的数学方法,能让电脑处理无限多的特征。

监督学习其基本思想是,监督学习中,对于数据集中的每个数据, 都有相应的正确答案,(训练集) 算法就是基于这些来做出预测。

后面介绍了回归问题。 即通过回归来预测一个连续值输出。 我们还谈到了分类问题, 目标是预测离散值输出。

Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression(回归) problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output(离散输出). In other words, we are trying to map input variables into discrete categories.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.

 

4.无监督学习(Unsupervised Learning)

 

 

 

 因此 对于监督学习中的每一个样本 我们已经被清楚地告知了 什么是所谓的正确答案 即它们是良性还是恶性 .

在无监督学习中 我们用的数据会和监督学习里的看起来有些不一样 在无监督学习中 没有属性或标签这一概念 也就是说所有的数据 都是一样的 没有区别 所以在无监督学习中 我们只有一个数据集 没人告诉我们该怎么做 我们也不知道 每个数据点究竟是什么意思 相反 它只告诉我们 现在有一个数据集 你能在其中找到某种结构吗?

对于给定的数据集 无监督学习算法可能判定 该数据集包含两个不同的聚类 你看 这是第一个聚类 然后这是另一个聚类 你猜对了 无监督学习算法 会把这些数据分成两个不同的聚类 所以这就是所谓的聚类算法

 

Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值