Machine learning course note 1


What is ML?

When do we use it?

Samuel (1959): the ability to learn without being explicitly programmed

Tom Mitchell

For the checkers-playing example, the experience E is the experience of having the program play tens of thousands of games against itself. The task T is playing checkers. The performance measure P is the probability that it wins the next game of checkers against some new opponent.


Supervised learning, unsupervised learning

Others: Reinforcement learning, recommender systems

Official summary

What is Machine Learning?

Two definitions of Machine Learning are offered. Arthur Samuel described it as: "the field of study that gives computers the ability to learn without being explicitly programmed." This is an older, informal definition.

Tom Mitchell provides a more modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Example: playing checkers.

E = the experience of playing many games of checkers

T = the task of playing checkers.

P = the probability that the program will win the next game.

In general, any machine learning problem can be assigned to one of two broad classifications:

Supervised learning and Unsupervised learning.



Supervised learning



Data set of house prices

Linear fit

Or a quadratic fit? -> use a learning algorithm

Supervised learning: the data set comes with the correct answers, and the algorithm learns to predict the answer for new inputs

Regression problem: predict a continuous-valued output, such as a house price
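The choice between a linear and a quadratic fit can be sketched with NumPy's least-squares polynomial fitting. The house sizes and prices below are invented for illustration, and comparing training error is only one rough way to look at the two fits, not necessarily how the course selects a model.

```python
import numpy as np

# Hypothetical data: house sizes (square feet) and prices (thousands of dollars).
sizes = np.array([500.0, 750.0, 1000.0, 1250.0, 1500.0, 2000.0, 2500.0])
prices = np.array([110.0, 160.0, 200.0, 235.0, 260.0, 300.0, 320.0])

# Rescale sizes to thousands of square feet to keep the fit well conditioned.
x = sizes / 1000.0

# Least-squares fits: degree 1 is a straight line, degree 2 is a quadratic.
linear_coeffs = np.polyfit(x, prices, deg=1)
quadratic_coeffs = np.polyfit(x, prices, deg=2)

def mean_squared_error(coeffs):
    """Average squared gap between the fitted curve and the known prices."""
    predictions = np.polyval(coeffs, x)
    return float(np.mean((predictions - prices) ** 2))

linear_mse = mean_squared_error(linear_coeffs)
quadratic_mse = mean_squared_error(quadratic_coeffs)

# Predict the price of a 1750 sq ft house with each model.
new_size = 1.75
linear_prediction = float(np.polyval(linear_coeffs, new_size))
quadratic_prediction = float(np.polyval(quadratic_coeffs, new_size))

print(f"linear    MSE={linear_mse:.1f}  prediction={linear_prediction:.1f}")
print(f"quadratic MSE={quadratic_mse:.1f}  prediction={quadratic_prediction:.1f}")
```

On the training data the quadratic can never fit worse than the line, because a line is a special case of a quadratic. A lower training error alone therefore does not prove the quadratic is the better model for new houses.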

Classification Example

One feature example: age

Two features: age and tumour size, used to decide whether a tumour is benign or malignant

The algorithm is expected to fit a line that separates the two classes and decide on which side of it a new tumour falls
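Fitting a separating line on two features can be sketched as follows. The patient data is invented, and the perceptron update rule is used here only because it is short; the notes do not say which algorithm the course uses for this example.

```python
# Hypothetical training data: (age in years, tumour size in cm) -> label,
# where 0 = benign and 1 = malignant. The numbers are invented for illustration.
patients = [
    ((25, 1.0), 0), ((30, 1.5), 0), ((35, 1.2), 0), ((40, 2.0), 0), ((45, 1.0), 0),
    ((50, 4.0), 1), ((55, 3.5), 1), ((60, 4.5), 1), ((65, 3.0), 1), ((70, 5.0), 1),
]

def scale(features):
    """Bring both features to a similar range so neither dominates the updates."""
    age, size = features
    return (age / 100.0, size / 5.0)

def predict(weights, bias, features):
    """Return 1 (malignant) on the positive side of the line, else 0 (benign)."""
    x1, x2 = scale(features)
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(data, learning_rate=0.1, max_epochs=1000):
    """Nudge the line towards every misclassified point until none remain."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)  # -1, 0 or +1
            if error != 0:
                x1, x2 = scale(features)
                weights[0] += learning_rate * error * x1
                weights[1] += learning_rate * error * x2
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return weights, bias

weights, bias = train_perceptron(patients)
training_accuracy = sum(
    predict(weights, bias, features) == label for features, label in patients
) / len(patients)
print("training accuracy:", training_accuracy)
print("new patient (age 60, size 3.25 cm):", predict(weights, bias, (60, 3.25)))
```

The learned line is `weights[0] * x1 + weights[1] * x2 + bias = 0` in the scaled feature space. The perceptron rule only settles on such a line when the two classes can be separated by one, which holds for this invented data set.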

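Quiz: regression or classification?

(Quiz image not preserved. Problem one: predict how many items will sell. Problem two: decide whether an account has been hacked.)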

Explanation:

For problem one, I would treat this as a regression problem: if I have thousands of items, I would treat the number of items I sell as a real, continuous value. For the second problem, I would treat it as a classification problem, because I can set the value I want to predict to 0 to denote an account that has not been hacked and to 1 to denote an account that has been hacked. Just as with breast cancer, where 0 is benign and 1 is malignant, the algorithm tries to predict one of these two discrete values. Because there is only a small number of discrete values, I would treat it as a classification problem.

So: regression -> continuous values; classification -> discrete values (binary in the simplest case)

Summary

Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
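A minimal sketch of that conversion, using invented listings: the same sale prices serve as a continuous regression target, or are reduced to a binary label by comparing them with the asking price.

```python
# Hypothetical listings: (asking price, final sale price), in thousands of dollars.
listings = [(300, 310), (450, 430), (250, 250), (520, 545)]

# Regression target: the sale price itself, a continuous value.
regression_targets = [sale for _, sale in listings]

# Classification target: 1 if the house sold for more than the asking price, else 0.
# Treating "exactly at asking" as 0 is an arbitrary choice made for this sketch.
classification_targets = [1 if sale > asking else 0 for asking, sale in listings]

print(regression_targets)       # [310, 430, 250, 545]
print(classification_targets)   # [1, 0, 0, 1]
```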

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.


Unsupervised learning

Dataset has no labels


Given a data set, can the algorithm find structure in it?

Example: Google News

Grouping news stories: stories on the same theme are grouped together

DNA example: group people who have a similar type of DNA

So unsupervised ML finds patterns and groups similar things together

Applications

Clustering:

Organise computing clusters;

social network analysis;

market segmentation;

astronomical data analysis
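As a sketch of what clustering does in an application such as market segmentation, the following runs k-means on invented, unlabeled customer data. K-means is one common clustering algorithm, chosen here as an assumption; the notes do not name a specific one at this point.

```python
import math

# Hypothetical, unlabeled customer data: (monthly visits, average basket in dollars).
customers = [(2, 15), (3, 18), (2, 20), (4, 16), (20, 80), (22, 85), (19, 90), (21, 78)]

def closest(point, centroids):
    """Index of the centroid nearest to the point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def mean(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, centroids, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    for _ in range(max_iterations):
        assignments = [closest(p, centroids) for p in points]
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean(members) if members else old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignments, centroids

# Starting centroids are picked by hand here; real implementations usually
# pick them at random and rerun several times.
assignments, centroids = k_means(customers, [customers[0], customers[-1]])
print(assignments)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(centroids)    # [(2.75, 17.25), (20.5, 83.25)]
```

No labels are given to the algorithm. The two groups emerge only from how close the points are to each other.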

Cocktail party problem: voices overlap, yet people can still follow a conversation

Cocktail party algorithm

Given recordings that mix 2 audio sources, separate them (filter one of them out)

Octave, MATLAB

SVD
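The notes list SVD without saying how it is used. One standard role of SVD in source separation is whitening the mixed recordings (decorrelating them and scaling them to unit variance), sketched below in NumPy rather than Octave. The sources and the mixing matrix are invented. This is only a preprocessing step, not the full cocktail party algorithm: a further rotation, found for example by independent component analysis, is still needed to recover the individual sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 2000
t = np.linspace(0.0, 1.0, n_samples)

# Two hypothetical sources: a sine tone and uniform noise standing in for speech.
sources = np.vstack([np.sin(2 * np.pi * 5 * t), rng.uniform(-1.0, 1.0, n_samples)])

# Each "microphone" records a different weighted mix of the two sources.
# The mixing matrix is invented for this sketch.
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
recordings = mixing @ sources  # shape (2, n_samples)

# Centre each recording, then take the SVD of the centred data matrix.
centred = recordings - recordings.mean(axis=1, keepdims=True)
U, singular_values, _ = np.linalg.svd(centred, full_matrices=False)

# Whitening: rotate onto the principal directions and rescale to unit variance.
whitening = np.diag(np.sqrt(n_samples) / singular_values) @ U.T
whitened = whitening @ centred

# The covariance of the whitened signals should be close to the identity matrix.
covariance = whitened @ whitened.T / n_samples
print(np.round(covariance, 3))
```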

Quiz (image not preserved)

Summary

Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The "Cocktail Party Algorithm" allows you to find structure in a chaotic environment (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).
