Computer Vision | Clustering Basics

Two major issues arise when thinking about clustering:
What is a good inter-cluster distance?
Single-link: Using the distance between the closest elements
Complete-link: Using the distance between the furthest elements
Group average: Using the average distance between the elements of the two clusters
How many clusters are there?
It can be given a priori, e.g., the number of desired regions
It can be inferred indirectly by setting a distance threshold that decides whether two points belong to the same cluster.
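
The three inter-cluster distances can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance between points and clusters stored as lists of coordinate tuples; the function names are hypothetical. Group average is taken here as the mean of all pairwise distances, which is the usual definition; a centroid-based variant would compare the two cluster means instead.

```python
import math
from itertools import product

def single_link(a, b):
    """Distance between the closest pair of elements, one from each cluster."""
    return min(math.dist(p, q) for p, q in product(a, b))

def complete_link(a, b):
    """Distance between the furthest pair of elements, one from each cluster."""
    return max(math.dist(p, q) for p, q in product(a, b))

def group_average(a, b):
    """Mean of all pairwise distances between the two clusters."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

# Two small example clusters on the x-axis (illustrative values only).
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_link(cluster_a, cluster_b))    # 3.0
print(complete_link(cluster_a, cluster_b))  # 6.0
print(group_average(cluster_a, cluster_b))  # 4.5
```

Single-link tends to produce long, chained clusters because one close pair is enough to merge, while complete-link favors compact clusters; group average sits between the two.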

Two problems to avoid
Under-fitting: too few clusters
Over-fitting: too many clusters

Two clustering strategies
Agglomerative clustering
Divisive clustering
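
As a rough illustration of the first strategy, the sketch below assumes the standard bottom-up procedure: every point starts as its own cluster, and the two closest clusters are merged repeatedly until no pair is closer than a threshold. The function names, the single-link distance, and the sample points are assumptions made for this example. Divisive clustering works in the opposite direction, starting from one cluster and splitting it.

```python
import math
from itertools import combinations, product

def single_link(a, b):
    """Closest-pair distance between clusters a and b (lists of point tuples)."""
    return min(math.dist(p, q) for p, q in product(a, b))

def agglomerative(points, threshold, linkage=single_link):
    """Merge the two closest clusters until no pair is within the threshold."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest inter-cluster distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break  # all remaining clusters are further apart than the threshold
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i remains valid
    return clusters

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agglomerative(points, threshold=2.0)
print(len(result))  # 3
```

The threshold plays the role described earlier: it decides indirectly how many clusters remain. Too large a threshold merges everything (too few clusters), too small a threshold leaves every point alone (too many clusters).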

Missing Data Problem: Example
Let us consider a missing data problem example.
Assume that people can be classified into three groups according to physical size: big, medium, and small.
Each group is characterized by its share of the population and a 2-D Gaussian describing the distribution of weight and height.
Gaussian distributions are used instead of hard thresholds because weight and height measurements carry uncertainty and error.
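
To make the model concrete, the sketch below draws synthetic people from such a three-group mixture. All numbers are hypothetical and chosen only for illustration, and the 2-D Gaussians are assumed to have independent axes (diagonal covariance) so that the standard library suffices.

```python
import random

# Hypothetical parameters (not from the source): population fraction,
# mean (weight in kg, height in cm), and per-axis standard deviations.
GROUPS = {
    "small":  {"fraction": 0.3, "mean": (50.0, 155.0), "std": (5.0, 5.0)},
    "medium": {"fraction": 0.5, "mean": (65.0, 170.0), "std": (6.0, 6.0)},
    "big":    {"fraction": 0.2, "mean": (85.0, 185.0), "std": (8.0, 6.0)},
}

def sample_person(rng):
    """Pick a group by its population fraction, then a noisy (weight, height)."""
    names = list(GROUPS)
    fractions = [GROUPS[n]["fraction"] for n in names]
    label = rng.choices(names, weights=fractions, k=1)[0]
    group = GROUPS[label]
    weight = rng.gauss(group["mean"][0], group["std"][0])
    height = rng.gauss(group["mean"][1], group["std"][1])
    return label, (weight, height)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [sample_person(rng) for _ in range(1000)]

# In the missing data setting, only the measurements are observed.
observations = [measurement for _, measurement in data]
```

Because the Gaussians overlap, a single measurement near a boundary could plausibly have come from more than one group, which is why a hard threshold would be a poor fit.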
Now you are given the statistics of a certain population, and you are given two tasks:
Estimate the model parameters for each class
Classify each data point into one of three classes

That means the class labels are missing in the data we have collected, and we need to find them.

Probabilistic Formulation

Prior probability: something you know before you even see the data or the observation. It is like your prior knowledge.
Likelihood function: a measure of how likely it is that a data sample was generated by a certain class. It is like your evidence.
Posterior probability: given what you see and what you know, the probability that a data sample y belongs to a certain class. It is like the estimate of the missing data.
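
The three quantities combine through Bayes' rule: the posterior of class k given y is the prior of k times the likelihood of y under k, divided by the sum of that product over all classes. The sketch below is one way to compute it, assuming the model parameters are already known. The parameter values are hypothetical, and the Gaussians are assumed to have diagonal covariance to keep the example short.

```python
import math

def gaussian_2d(y, mean, std):
    """Likelihood p(y | class) for a 2-D Gaussian with independent axes."""
    density = 1.0
    for value, m, s in zip(y, mean, std):
        z = (value - m) / s
        density *= math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    return density

def posterior(y, classes):
    """Bayes' rule: P(k | y) = P(k) p(y | k) / sum over j of P(j) p(y | j)."""
    joint = {
        name: c["prior"] * gaussian_2d(y, c["mean"], c["std"])
        for name, c in classes.items()
    }
    evidence = sum(joint.values())  # normalizer, so the posteriors sum to 1
    return {name: value / evidence for name, value in joint.items()}

# Hypothetical class parameters: prior, mean (kg, cm), per-axis std.
classes = {
    "small":  {"prior": 0.3, "mean": (50.0, 155.0), "std": (5.0, 5.0)},
    "medium": {"prior": 0.5, "mean": (65.0, 170.0), "std": (6.0, 6.0)},
    "big":    {"prior": 0.2, "mean": (85.0, 185.0), "std": (8.0, 6.0)},
}

post = posterior((66.0, 171.0), classes)
best = max(post, key=post.get)  # most probable class for this sample
print(best)  # medium
```

Choosing the class with the largest posterior addresses the second task (classification); estimating the class parameters themselves, the first task, is the harder part because the labels are missing.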