Clustering - Unsupervised learning introduction

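Abstract: This article is the original video transcript of lesson 108, "Unsupervised learning introduction", from chapter 14, "Unsupervised Learning", of Andrew Ng's Machine Learning course. I wrote it down while studying the videos and corrected it to make it more concise and easier to read, so that I can refer back to it later. I am now sharing it with everyone. If you find any errors, corrections are welcome, and I sincerely thank you for them. I also hope it is helpful for your own studies.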

In this video, I'd like to start to talk about clustering. This will be exciting because this is our first unsupervised learning algorithm, where we learn from unlabeled data instead of labeled data. So, what is unsupervised learning?

I briefly talked about unsupervised learning at the beginning of the class, but it's useful to contrast it with supervised learning. So, here's a typical supervised learning problem, where we're given a labeled training set and the goal is to find the decision boundary that separates the positively labeled examples from the negatively labeled examples. So the supervised learning problem in this case is: given a set of labeled examples, fit a hypothesis to it.

In contrast, in the unsupervised learning problem, we're given data that does not have any labels associated with it. So we're given data that looks like this. Here's a set of points, and they have no labels. And so our training set is written just as \{x^{(1)}, x^{(2)}, x^{(3)},..., x^{(m)}\}, and we don't get any labels y. That's why the points plotted on the figure don't have any labels with them. So in unsupervised learning, what we do is give this sort of unlabeled training set to an algorithm and just ask the algorithm to find some structure in the data for us. Given this data set, one type of structure we might have an algorithm find is that the data set has points grouped into two separate clusters. An algorithm that finds clusters like the ones I just circled is called a clustering algorithm. This will be our first type of unsupervised learning, although there will be other types of unsupervised learning algorithms that find other types of structure or patterns in the data besides clusters. We'll talk about those later; for now, we will talk about clustering. So, what is clustering good for?
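
To make the idea of "finding structure in unlabeled data" concrete, here is a minimal sketch in Python. The lecture does not name a specific algorithm at this point, so the sketch assumes a bare-bones k-means. The function name `kmeans`, the six sample points, and the choice to start the centroids at the first k points are all illustrative assumptions, not something taken from the course.

```python
import math

def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters; returns (centroids, assignments)."""
    # Assumption: start from the first k points so the run is deterministic.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments

# Unlabeled training set {x^(1), ..., x^(m)}: feature vectors only, no y values.
X = [(1.0, 1.2), (5.0, 5.1), (0.8, 1.0), (1.2, 0.9), (5.2, 4.8), (4.9, 5.3)]
centroids, assignments = kmeans(X, k=2)
print(assignments)  # cluster index found for each point
```

Note that the input `X` carries no labels at all; the cluster indices are produced by the algorithm, and which group gets index 0 or 1 is arbitrary.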

Early in this class I had already mentioned a few applications. One is market segmentation, where you may have a database of customers and want to group them into different market segments, so you can sell to them separately or serve your different market segments better. Another is social network analysis. There are actually groups that have done this: looking at a group of people on things like Facebook or Google+, or maybe at information about who are the people you email the most frequently and who are the people that they email the most frequently, to find coherent groups of people. So this would be another application of clustering, where you'd want to find the coherent groups of friends in a social network. Here's something that one of my friends actually worked on, which is using clustering to organize computer clusters or data centers better, because if you know which computers in the data center tend to work together, you can use that to reorganize your resources, how you lay out the network, and how you design your data center and communications. And lastly, something that I actually worked on: using clustering algorithms to understand galaxy formation, and using that to understand astronomical detail. So, that's clustering, which is our first example of an unsupervised learning algorithm. In the next video, we'll start to talk about a specific clustering algorithm.




