Machine Learning Study Notes (Week 1)

Week 1 Introduction

Introduction

Welcome

Machine learning is everywhere.

The aim of machine learning is to build machines that are as intelligent as humans.

Two goals:

  • know the algorithms & math
  • implement each algorithm

Why machine learning is successful today

  • grew out of work in AI
  • New capability for computers

Examples:

  • Database mining
  • applications that can’t be programmed by hand

handwriting recognition, NLP, CV

  • Self-customizing programs

Recommendations

  • understanding human learning (brain, real AI)

What is Machine Learning

  • Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
  • Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Machine learning algorithms:

  • Supervised learning

  • Unsupervised learning

    Others: reinforcement learning, recommender systems

Supervised Learning

Example:

  • housing price prediction

    “right answers” given

    Regression: Predict continuous valued output (price)

  • Breast cancer (malignant, benign)

    Classification: Discrete valued output (0 or 1)

    two input features (Age, Tumor Size) used to predict the class (malignant or benign)
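As a rough sketch (hypothetical numbers, not the course's dataset), the two kinds of supervised training data might look like this in Python; the toy nearest-neighbor rule is only an illustration of discrete-valued prediction, not a method from this lecture:

```python
# Hypothetical labeled data illustrating the supervised-learning split
# between regression (continuous output) and classification (discrete output).

# Regression: housing, a continuous target (price) paired with a feature (size).
housing = [(2104, 460), (1416, 232), (1534, 315)]  # (size in ft^2, price in $1000s)

# Classification: tumors, a discrete target (0 = benign, 1 = malignant)
# predicted from two input features (age, tumor size in cm).
tumors = [((45, 1.2), 0), ((62, 3.8), 1), ((51, 0.7), 0), ((70, 4.5), 1)]

def nearest_neighbor_label(query, examples):
    """Toy classifier: predict the label of the closest training example."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(examples, key=lambda e: dist2(e[0], query))[1]

print(nearest_neighbor_label((68, 4.0), tumors))  # → 1 (malignant)
```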

Unsupervised Learning

no labels; the algorithm clusters the data on its own

no feedback based on the prediction results

Examples:
  • Clustering in 2-D data

    • Age, Tumor Size data
    • Genes vs. individuals clustering
  • Organize computing clusters

  • Social network analysis

  • Market segmentation

  • Astronomical data analysis

  • cocktail party problem

    More than one person speaking at once; separate the individual voices from the mixed microphone recordings

Week 1 Linear Regression with One Variable

Model and Cost function

Model Representation

Linear regression
  • Features

    supervised, regression

  • notation

    training set

    m = Number of training examples

    x’s = “input” variable / features

    y’s = “output” variable / “target” variable

    (x, y): one training example

    $(x^{(i)}, y^{(i)})$: the $i$-th training example
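The notation above maps directly to code; a small sketch with made-up housing numbers (assumed sample values, not from the course):

```python
# The course notation mapped to plain Python lists (hypothetical numbers).
x = [2104, 1416, 1534, 852]   # x's: "input" variable / feature (size in ft^2)
y = [460, 232, 315, 178]      # y's: "output" / "target" variable (price in $1000s)

m = len(x)                    # m = number of training examples

# (x^(i), y^(i)) is the i-th training example; the course indexes from 1,
# so Python index i-1 holds example i.
def example(i):
    return (x[i - 1], y[i - 1])

print(m, example(1))  # → 4 (2104, 460)
```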

ML workflow


$h_\theta(x)=\theta_0+\theta_1x$

For historical reasons, this function h is called a hypothesis. When the target variable that we’re trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.

What machine learning learns is $h$, i.e., a hypothesis. This hypothesis (mathematically, the parameters $\theta$) captures the relationship between the input data and the output: given an input $X$, it tells us what result $y$ to expect.

Cost Function

Using linear regression as example:

$h_\theta(x)=\theta_0+\theta_1x$

Idea: Choose $\theta_0, \theta_1$ so that $h_\theta(x)$ is close to $y$ for our training examples $(x, y)$

We want to $\underset{\theta_0,\theta_1}{\text{minimize}}\; \frac{1}{2m} \displaystyle\sum_{i=1}^m \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$,

the cost function $J(\theta_0,\theta_1) = \frac{1}{2m} \displaystyle\sum_{i=1}^m \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

which is half of the mean squared error; the factor $\frac{1}{2}$ is included for convenience, since it cancels the 2 that appears when differentiating the square during gradient descent.
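The cost function can be written down directly from the formula; a minimal pure-Python sketch with toy data (assumed numbers, not from the course):

```python
# Minimal sketch of the linear-regression cost J(theta0, theta1),
# transcribed directly from the formula above (no NumPy needed).

def h(theta0, theta1, x):
    """Hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def J(theta0, theta1, xs, ys):
    """Half the mean squared error over the m training examples."""
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Toy data lying exactly on y = 2x: the cost is zero at (theta0, theta1) = (0, 2).
xs = [1, 2, 3]
ys = [2, 4, 6]
print(J(0, 2, xs, ys))  # → 0.0
print(J(0, 0, xs, ys))  # (4 + 16 + 36) / 6 = 9.333...
```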

Cost Function Intuition

  • Hypothesis:

$h_\theta(x)=\theta_0+\theta_1x$

a function of $x$

  • Parameters:

    $\theta_0, \theta_1$

  • Cost Function:

    $J(\theta_0,\theta_1) = \frac{1}{2m} \displaystyle\sum_{i=1}^m \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

    a function of $\theta$

  • Goal:

    $\underset{\theta_0,\theta_1}{\text{minimize}}\; J(\theta_0,\theta_1)$

With one parameter, $J$ is a quadratic (bowl-shaped) curve; with two parameters, it can be visualized as a contour plot.
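The "quadratic in one parameter" picture can be checked numerically: fixing $\theta_0 = 0$ and sweeping $\theta_1$ traces a bowl whose bottom sits at the true slope (toy data assumed):

```python
# Sketch of the one-parameter intuition: fix theta0 = 0, sweep theta1,
# and J traces a bowl with its minimum at the data's true slope.

def J(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs, ys = [1, 2, 3], [1, 2, 3]   # data lying on y = x, so the true slope is 1
sweep = [(t / 10, J(0, t / 10, xs, ys)) for t in range(0, 21)]  # theta1 in [0, 2]
best_theta1 = min(sweep, key=lambda p: p[1])[0]
print(best_theta1)  # → 1.0
```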

Parameter Learning

Gradient Descent

Have some function $J(\theta)$

Want $\underset{\theta}{\text{min}}\; J(\theta)$

Outline:

  • Start with some $\theta$
  • Keep changing $\theta$ to reduce $J(\theta)$ until we hopefully end up at a minimum
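The outline above can be sketched as code. The partial derivatives used here are the standard ones for the squared-error cost (the course derives them later), and the learning rate $\alpha = 0.1$ and toy data are assumed values:

```python
# Minimal gradient-descent sketch for the two-parameter linear-regression
# cost J(theta0, theta1) defined earlier (assumed alpha = 0.1).

def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    m = len(xs)
    theta0, theta1 = 0.0, 0.0            # start with some theta
    for _ in range(iters):               # keep changing theta to reduce J
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        g0 = sum(errs) / m                               # dJ/dtheta0
        g1 = sum(e * x for e, x in zip(errs, xs)) / m    # dJ/dtheta1
        theta0 -= alpha * g0             # simultaneous update of both parameters
        theta1 -= alpha * g1
    return theta0, theta1

t0, t1 = gradient_descent([1, 2, 3], [3, 5, 7])  # data lying on y = 2x + 1
print(round(t0, 3), round(t1, 3))  # ≈ 1.0 2.0
```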