The Chinese Whispers Clustering Algorithm

Chinese Whispers is a graph clustering algorithm for cases where you do not know the number of clusters in advance. Its basic steps are:

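1. Give every node v_i its own initial class: class(v_i) = i.

2. Pick a node v_t at random, find all nodes adjacent to v_t, and score the classes those neighbors belong to by summing the weights of the connecting edges. For example, suppose node 1 has neighbors 2, 3, 4, 5, which belong to classes a, b, c, b respectively, and the edges 1-2, 1-3, 1-4, 1-5 all have weight 1. Then class a scores 1, class b scores 2, and class c scores 1.

3. Assign the highest-scoring class to v_t.

4. Go back to step 2, repeating until the labels stop changing or a fixed iteration budget runs out.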
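
The four steps above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not dlib's code: the `Graph` typedef and the names `score_classes` and `chinese_whispers_sketch` are hypothetical, the graph is given as an adjacency list, and the loop runs for a fixed number of random updates instead of testing for convergence.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Hypothetical graph type for this sketch: graph[v] holds (neighbor, weight) pairs.
typedef std::vector<std::vector<std::pair<std::size_t, double> > > Graph;

// Step 2: for node v, add up the edge weights per class of its neighbors.
std::map<std::size_t, double> score_classes(const Graph& graph,
                                            const std::vector<std::size_t>& labels,
                                            std::size_t v)
{
    assert(labels.size() == graph.size() && v < graph.size());
    std::map<std::size_t, double> scores;
    for (std::size_t k = 0; k < graph[v].size(); ++k)
        scores[labels[graph[v][k].first]] += graph[v][k].second;
    return scores;
}

// Steps 1-4, run for graph.size() * num_iterations random updates.
std::vector<std::size_t> chinese_whispers_sketch(const Graph& graph,
                                                 std::size_t num_iterations,
                                                 unsigned seed)
{
    // Step 1: every node starts in its own class.
    std::vector<std::size_t> labels(graph.size());
    for (std::size_t i = 0; i < labels.size(); ++i)
        labels[i] = i;
    if (graph.empty())
        return labels;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, graph.size() - 1);
    for (std::size_t iter = 0; iter < graph.size() * num_iterations; ++iter)
    {
        const std::size_t v = pick(rng);  // step 2: pick a random node
        const std::map<std::size_t, double> scores = score_classes(graph, labels, v);

        // Step 3: take the highest-scoring class. Ties go to the smallest
        // label, and a node without neighbors keeps its current label.
        double best_score = -std::numeric_limits<double>::infinity();
        std::size_t best_label = labels[v];
        for (std::map<std::size_t, double>::const_iterator it = scores.begin();
             it != scores.end(); ++it)
        {
            if (it->second > best_score)
            {
                best_score = it->second;
                best_label = it->first;
            }
        }
        labels[v] = best_label;
    }  // step 4: loop back to step 2
    return labels;
}
```

Calling `score_classes` on the example from step 2 (one node whose four neighbors carry classes a, b, c, b over unit-weight edges) gives b the largest score. Because a label can only spread along an edge, nodes in different connected components never end up in the same class.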

Below is dlib's implementation, annotated with comments:

    inline unsigned long chinese_whispers (
        const std::vector<ordered_sample_pair>& edges,
        std::vector<unsigned long>& labels,
        const unsigned long num_iterations,
        dlib::rand& rnd
    )
    {
        // make sure requires clause is not broken: the edge list passed in must already be sorted
        DLIB_ASSERT(is_ordered_by_index(edges),
                    "\t unsigned long chinese_whispers()"
                    << "\n\t Invalid inputs were given to this function"
        );

        labels.clear();
        if (edges.size() == 0)
            return 0;

        std::vector<std::pair<unsigned long, unsigned long> > neighbors;
        find_neighbor_ranges(edges, neighbors);

        // Initialize the labels, each node gets a different label.
        
        labels.resize(neighbors.size());
        for (unsigned long i = 0; i < labels.size(); ++i)
            labels[i] = i;


        for (unsigned long iter = 0; iter < neighbors.size()*num_iterations; ++iter)
        {
            // Pick a random node.
            const unsigned long idx = rnd.get_random_64bit_number()%neighbors.size();

            // Count how many times each label happens amongst our neighbors,
            // weighting each neighbor by the distance() value of its edge.
            std::map<unsigned long, double> labels_to_counts;
            const unsigned long end = neighbors[idx].second;
            for (unsigned long i = neighbors[idx].first; i != end; ++i)
            {
                labels_to_counts[labels[edges[i].index2()]] += edges[i].distance();
            }

            // find the most common label, i.e. the highest-scoring one, and assign it to this node.
            std::map<unsigned long, double>::iterator i;
            double best_score = -std::numeric_limits<double>::infinity();
            unsigned long best_label = labels[idx];
            for (i = labels_to_counts.begin(); i != labels_to_counts.end(); ++i)
            {
                if (i->second > best_score)
                {
                    best_score = i->second;
                    best_label = i->first;
                }
            }

            labels[idx] = best_label;
        }


        // Remap the labels into a contiguous range.  First we find the
        // mapping.  The labels that survive the loop above are not necessarily
        // 0,1,2,3,..., so they are renumbered consecutively.
        std::map<unsigned long,unsigned long> label_remap;
        for (unsigned long i = 0; i < labels.size(); ++i)
        {
            const unsigned long next_id = label_remap.size();
            if (label_remap.count(labels[i]) == 0)
                label_remap[labels[i]] = next_id;
        }
        // now apply the mapping to all the labels, so every node gets its final cluster id.
        for (unsigned long i = 0; i < labels.size(); ++i)
        {
            labels[i] = label_remap[labels[i]];
        }

        return label_remap.size();
    }
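
The inner loop from `neighbors[idx].first` to `neighbors[idx].second` only works because the edge list is sorted, so all edges leaving one node sit next to each other and can be described by a half-open index range. The sketch below illustrates that idea in standalone C++. It is not dlib's `find_neighbor_ranges`: the `Edge` struct and `neighbor_ranges_sketch` are hypothetical names, and giving an edgeless node the empty range (0, 0) is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a directed, weighted edge index1 -> index2.
struct Edge
{
    std::size_t index1;
    std::size_t index2;
    double distance;
};

// For each node i, return the half-open range [first, second) of positions in
// `edges` whose index1 equals i. Requires `edges` to be sorted by index1.
std::vector<std::pair<std::size_t, std::size_t> >
neighbor_ranges_sketch(const std::vector<Edge>& edges)
{
    // The node count is one more than the largest index that appears.
    std::size_t num_nodes = 0;
    for (std::size_t k = 0; k < edges.size(); ++k)
        num_nodes = std::max(num_nodes, std::max(edges[k].index1, edges[k].index2) + 1);

    // Assumption of this sketch: nodes with no outgoing edges get (0, 0).
    std::vector<std::pair<std::size_t, std::size_t> > ranges(
        num_nodes, std::pair<std::size_t, std::size_t>(0, 0));

    std::size_t begin = 0;
    while (begin < edges.size())
    {
        assert(begin == 0 || edges[begin - 1].index1 < edges[begin].index1);
        // Advance `end` past the run of edges that share the same source node.
        std::size_t end = begin;
        while (end < edges.size() && edges[end].index1 == edges[begin].index1)
            ++end;
        ranges[edges[begin].index1] = std::pair<std::size_t, std::size_t>(begin, end);
        begin = end;
    }
    return ranges;
}
```

With ranges like these, scoring the neighbors of a node costs only a scan over that node's own slice of the edge list, which is what the `for (unsigned long i = neighbors[idx].first; i != end; ++i)` loop in the listing does.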

Reference paper

"Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems"
