Python+OpenCV: K-Means Clustering

机器视觉001

Categories: Python, OpenCV. Tags: python, opencv, K-Means

Article link: https://blog.csdn.net/liubing8609/article/details/110437548

Goal

Learn to use the cv.kmeans() function in OpenCV for data clustering.

Understanding the Parameters

Input parameters:

samples : The input data. It should be of np.float32 data type, and each feature should be put in a single column.
nclusters(K) : The number of clusters required at the end.
bestLabels : Optional input/output array of labels. The examples below pass None.
criteria : The iteration termination criteria. When these criteria are satisfied, the algorithm stops iterating. It should be a tuple of 3 parameters, `( type, max_iter, epsilon )`:
1. type - The type of termination criteria. It has 3 flags as below:
  - cv.TERM_CRITERIA_EPS - stop the algorithm iteration if the specified accuracy, epsilon, is reached.
  - cv.TERM_CRITERIA_MAX_ITER - stop the algorithm after the specified number of iterations, max_iter.
  - cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER - stop the iteration when either of the above conditions is met.
2. max_iter - An integer specifying the maximum number of iterations.
3. epsilon - The required accuracy.
attempts : The number of times the algorithm is executed using different initial labellings. The algorithm returns the labels that yield the best compactness. This compactness is returned as an output.
flags : This flag specifies how the initial centers are chosen. Normally one of two flags is used:
cv.KMEANS_PP_CENTERS (use kmeans++ center initialization by Arthur and Vassilvitskii [Arthur2007]) or cv.KMEANS_RANDOM_CENTERS (select random initial centers in each attempt).
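
How max_iter and epsilon interact can be illustrated with a plain NumPy sketch of the assign/update loop. This is not OpenCV's implementation: the function name `kmeans_sketch`, the tiny data set and the fixed starting centers are all hypothetical. It assumes, as the OpenCV documentation describes, that epsilon is compared against how far the cluster centers move in one iteration.

```python
import numpy as np


def kmeans_sketch(samples, init_centers, max_iter, epsilon):
    """Plain Lloyd iteration with an EPS + MAX_ITER style stopping rule (illustrative only)."""
    centers = init_centers.astype(np.float32).copy()
    for iteration in range(1, max_iter + 1):
        # Assignment step: squared Euclidean distance from every sample to every center
        dists = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of the samples assigned to it
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = samples[labels == k]
            if len(members) > 0:
                new_centers[k] = members.mean(axis=0)
        # Largest distance any center moved in this iteration
        shift = np.sqrt(((new_centers - centers) ** 2).sum(axis=1)).max()
        centers = new_centers
        if shift < epsilon:
            # EPS criterion: the centers have (almost) stopped moving
            break
    # If the loop was never broken, the MAX_ITER criterion ended it
    return labels, centers, iteration


# Hypothetical one-feature data: three small values and three large ones
samples = np.float32([[30], [40], [50], [180], [200], [220]])
init = np.float32([[30], [40]])
labels, centers, n_iter = kmeans_sketch(samples, init, max_iter=10, epsilon=1.0)
print(labels, centers.ravel(), n_iter)
```

With these starting centers the loop stops on the epsilon test well before max_iter is reached; a smaller max_iter would cut it off earlier.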

Output parameters:

compactness : The sum of squared distances from each point to its corresponding center.
labels : The label array (the same as 'code' in the previous article), where each element is marked '0', '1', ...
centers : The array of cluster centers.

Now we will see how to apply the K-Means algorithm with three examples.

Data with Only One Feature

Suppose you have a set of data with only one feature, i.e. one-dimensional data.

For example, take the t-shirt problem, where only people's height is used to decide the t-shirt size.

####################################################################################################
# K-Means Clustering
import cv2 as lmc_cv  # lmc_cv is assumed to be the author's alias for cv2
import numpy as np
from matplotlib import pyplot


def lmc_cv_k_means_demo(method):
    """
        Function: method:
        0: Data with Only One Feature with K-Means Clustering in OpenCV.
    """

    # 0: Data with Only One Feature with K-Means Clustering in OpenCV.
    if 0 == method:
        # Two groups of 25 random values, stacked into one (50, 1) float32 column
        x = np.random.randint(25, 100, 25)
        y = np.random.randint(175, 255, 25)
        z = np.hstack((x, y))
        z = z.reshape((50, 1))
        z = np.float32(z)
        pyplot.figure('Data Histogram', figsize=(16, 9))
        pyplot.hist(z, 256, [0, 256])
        pyplot.show()

        # Define criteria = ( type, max_iter = 10 , epsilon = 1.0 )
        criteria = (lmc_cv.TERM_CRITERIA_EPS + lmc_cv.TERM_CRITERIA_MAX_ITER, 10, 1.0)
        # Set flags (just to avoid a line break in the code)
        flags = lmc_cv.KMEANS_RANDOM_CENTERS
        # Apply KMeans with K = 2 and 10 attempts
        compactness, labels, centers = lmc_cv.kmeans(z, 2, None, criteria, 10, flags)

        # Split the data into clusters depending on their labels.
        # labels has shape (50, 1), the same as z, so the boolean mask indexes z directly.
        cluster_a = z[labels == 0]
        cluster_b = z[labels == 1]

        # Plot 'A' in red, 'B' in blue, 'centers' in yellow
        pyplot.figure('Result', figsize=(16, 9))
        pyplot.hist(cluster_a, 256, [0, 256], color='r')
        pyplot.hist(cluster_b, 256, [0, 256], color='b')
        pyplot.hist(centers, 32, [0, 256], color='y')
        pyplot.show()

Data with Multiple Features

In the previous example, we took only height for the t-shirt problem. Here, we take both height and weight, i.e. two features.

Remember that in the previous case we arranged our data as a single column vector. In general, each feature is arranged in a column, while each row corresponds to an input test sample.

For example, in this case we build a test data set of size 50x2, holding the heights and weights of 50 people.

The first column corresponds to the heights of all 50 people and the second column to their weights.

The first row contains two elements: the height of the first person and that person's weight.

Similarly, the remaining rows correspond to the heights and weights of the other people.
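
The row/column layout described above can be sketched with a few made-up values. The heights and weights here are hypothetical and only show where each number ends up.

```python
import numpy as np

# Hypothetical measurements for three people
heights = np.array([170, 182, 165])
weights = np.array([65, 80, 55])

# One row per person, one column per feature, converted to float32 for cv.kmeans()
samples = np.float32(np.column_stack((heights, weights)))

print(samples.shape)  # (3, 2): 3 samples, 2 features
print(samples[0])     # first person: [height, weight]
print(samples[:, 0])  # first column: all heights
```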

The code below builds a data set with this layout and clusters it:

####################################################################################################
# K-Means Clustering
import cv2 as lmc_cv  # lmc_cv is assumed to be the author's alias for cv2
import numpy as np
from matplotlib import pyplot


def lmc_cv_k_means_demo(method):
    """
        Function: method:
        1: Data with Multiple Features with K-Means Clustering in OpenCV.
    """

    # 1: Data with Multiple Features with K-Means Clustering in OpenCV.
    if 1 == method:
        # Two groups of 25 samples with 2 features each, stacked into a (50, 2) array
        x = np.random.randint(25, 50, (25, 2))
        y = np.random.randint(60, 85, (25, 2))
        z = np.vstack((x, y))
        # Convert to np.float32
        z = np.float32(z)

        # Define criteria and apply kmeans()
        criteria = (lmc_cv.TERM_CRITERIA_EPS + lmc_cv.TERM_CRITERIA_MAX_ITER, 10, 1.0)
        ret, label, center = lmc_cv.kmeans(z, 2, None, criteria, 10, lmc_cv.KMEANS_RANDOM_CENTERS)

        # Now separate the data. Note the ravel(): label has shape (50, 1),
        # and a 1-D mask is needed to select whole rows of z.
        cluster_a = z[label.ravel() == 0]
        cluster_b = z[label.ravel() == 1]

        # Plot the data: cluster A in the default color, cluster B in red, centers as yellow squares
        pyplot.figure('Result', figsize=(16, 9))
        pyplot.scatter(cluster_a[:, 0], cluster_a[:, 1])
        pyplot.scatter(cluster_b[:, 0], cluster_b[:, 1], c='r')
        pyplot.scatter(center[:, 0], center[:, 1], s=80, c='y', marker='s')
        pyplot.xlabel('Height')
        pyplot.ylabel('Weight')
        pyplot.show()

Color Quantization

Color quantization is the process of reducing the number of colors in an image.

One reason to do so is to reduce memory usage. Also, some devices can produce only a limited number of colors.

In those cases too, color quantization is performed. Here we use k-means clustering for color quantization.

There is nothing new to explain here. There are 3 features: R, G and B. So we need to reshape the image to an array of size Mx3 (M is the number of pixels in the image).

After the clustering, we apply the centroid values (which are also R, G, B triples) to all pixels, so that the resulting image has the specified number of colors.

Finally, we need to reshape it back to the shape of the original image.
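
The reshape, centroid lookup and reshape back steps can be sketched in plain NumPy on a tiny array. The 2x2 "image" and the `labels` and `centers` values are hypothetical stand-ins for what cv.kmeans() would return with K = 2. No clustering is run here.

```python
import numpy as np

# Hypothetical 2x2 RGB image: two reddish pixels and two bluish pixels
image = np.array([[[250, 10, 10], [240, 20, 5]],
                  [[10, 10, 245], [5, 20, 250]]], dtype=np.uint8)

# Step 1: reshape to M x 3 (one row per pixel) and convert to float32
pixels = np.float32(image.reshape((-1, 3)))
print(pixels.shape)  # (4, 3)

# Stand-ins for the kmeans outputs (assumed values, K = 2)
labels = np.array([[0], [0], [1], [1]], dtype=np.int32)
centers = np.float32([[245, 15, 7.5], [7.5, 15, 247.5]])

# Step 2: replace every pixel by the centroid of its cluster
palette = np.uint8(centers)  # conversion to uint8 drops the fractional part
quantized_rows = palette[labels.flatten()]

# Step 3: reshape back to the shape of the original image
quantized = quantized_rows.reshape(image.shape)
print(quantized)
```

The result has the original shape but contains only as many distinct colors as there are centers.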

Below is the code:

####################################################################################################
# K-Means Clustering
import cv2 as lmc_cv  # lmc_cv is assumed to be the author's alias for cv2
import numpy as np
from matplotlib import pyplot


def lmc_cv_k_means_demo(method):
    """
        Function: method:
        2: Color Quantization with K-Means Clustering in OpenCV.
    """

    # 2: Color Quantization with K-Means Clustering in OpenCV.
    if 2 == method:
        stacking_images = []
        image_file_name = ['D:/99-Research/TestData/image/Castle01.jpg',
                           'D:/99-Research/TestData/image/Castle02.jpg',
                           'D:/99-Research/TestData/image/Castle03.jpg',
                           'D:/99-Research/TestData/image/Castle04.jpg']
        for file_name in image_file_name:
            image = lmc_cv.imread(file_name)
            if image is None:
                # imread returns None when the file cannot be read
                print('Could not read image: %s' % file_name)
                continue
            # OpenCV loads images as BGR; convert to RGB for matplotlib
            image = lmc_cv.cvtColor(image, lmc_cv.COLOR_BGR2RGB)
            stacking_image = image.copy()
            # One row per pixel, one column per channel (R, G, B)
            z = image.reshape((-1, 3))
            # Convert to np.float32
            z = np.float32(z)
            # Define criteria, number of clusters and apply kmeans()
            criteria = (lmc_cv.TERM_CRITERIA_EPS + lmc_cv.TERM_CRITERIA_MAX_ITER, 10, 1.0)
            for clusters_number in range(1, 4):
                # K = 2, 4, 8
                ret, label, center = lmc_cv.kmeans(z, 2 ** clusters_number, None, criteria, 10,
                                                   lmc_cv.KMEANS_RANDOM_CENTERS)
                # Convert the centers back to uint8 and rebuild an image of the original shape
                center = np.uint8(center)
                res = center[label.flatten()]
                result_image = res.reshape(image.shape)
                # Stack the images side by side
                stacking_image = np.hstack((stacking_image, result_image))

            # Keep the stacked strip for this input image
            stacking_images.append(stacking_image)

        # Display the images
        for i in range(len(stacking_images)):
            pyplot.figure('Color Quantization with K-Means Clustering %d' % (i + 1))
            pyplot.subplot(1, 1, 1)
            # The arrays are RGB, so no colormap is needed
            pyplot.imshow(stacking_images[i])
            pyplot.title('Color Quantization with K-Means Clustering: original, k=2, k=4, k=8')
            pyplot.xticks([])
            pyplot.yticks([])
            pyplot.savefig('%02d.png' % (i + 1))
        pyplot.show()

        # Release the figures once the windows have been closed.
        # (The original waited on lmc_cv.waitKey(0), but no OpenCV window is open here.)
        pyplot.close('all')
        return
