Stanford Algorithms Design and Analysis Part 2 week 2


               

Problem Set-2




Programming Assignment-2

Question 1

In this programming problem and the next you'll code up the clustering algorithm from lecture for computing a max-spacing k-clustering. Download the text file here. This file describes a distance function (equivalently, a complete graph with edge costs). It has the following format:

[number_of_nodes]
[edge 1 node 1] [edge 1 node 2] [edge 1 cost]
[edge 2 node 1] [edge 2 node 2] [edge 2 cost]
...
There is one edge (i,j) for each choice of 1≤i<j≤n, where n is the number of nodes. For example, the third line of the file is "1 3 5250", indicating that the distance between nodes 1 and 3 (equivalently, the cost of the edge (1,3)) is 5250. You can assume that distances are positive, but you should NOT assume that they are distinct.

Your task in this problem is to run the clustering algorithm from lecture on this data set, where the target number k of clusters is set to 4. What is the maximum spacing of a 4-clustering?

ADVICE: If you're not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
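One way to follow the advice above is to check the algorithm on a tiny hand-made instance first. The following is a minimal sketch, not the course's reference solution: the class name SpacingSketch, the helper maxSpacing, and the four-point test instance are all made up for illustration. It merges along the cheapest edges with a union-find structure and reports the cost of the first edge that would join two clusters once only k remain.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the assignment.
final class SpacingSketch {

    // Follows parent pointers to the root of x's cluster (with path halving).
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // edges[e] = {u, v, cost} with 0-based node ids.
    // Assumes 2 <= k <= n and that the edges connect all n nodes.
    static int maxSpacing(int n, int[][] edges, int k) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2]));
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int clusters = n;
        for (int[] e : sorted) {
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru == rv) continue;          // edge lies inside one cluster
            if (clusters == k) return e[2];  // cheapest edge between two clusters
            parent[ru] = rv;                 // merge the two clusters
            clusters--;
        }
        throw new IllegalArgumentException("need 2 <= k <= n and a connected graph");
    }

    public static void main(String[] args) {
        // Four points on a line at positions 0, 1, 5, 6 (nodes 0..3).
        int[][] edges = {
            {0, 1, 1}, {0, 2, 5}, {0, 3, 6},
            {1, 2, 4}, {1, 3, 5}, {2, 3, 1}
        };
        // With k = 2 the clusters are {0,1} and {2,3}, separated by 4.
        System.out.println(maxSpacing(4, edges, 2)); // prints 4
    }
}
```

Because two edges tie at cost 1 here, the instance also exercises the case where distances are not distinct.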

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;

/*
 * Question 1: run the clustering algorithm from lecture with k = 4 and
 * report the maximum spacing of the resulting 4-clustering.
 */
public class PS2Q1 {

    /*
     * parents[i] < 0:  node i is the root of its cluster and -parents[i] is the cluster size.
     * parents[i] >= 0: parents[i] is the index of node i's parent.
     */
    static int[] parents;

    static class Edge implements Comparable<Edge> {
        int i;
        int j;
        int cost;

        public Edge(int i, int j, int cost) {
            this.i = i;
            this.j = j;
            this.cost = cost;
        }

        @Override
        public int compareTo(Edge other) {
            // Ascending by cost; equal costs compare as 0, as the Comparable contract requires.
            return Integer.compare(this.cost, other.cost);
        }
    }

    public static int find(int i) {
        // Nodes are 0-based, so 0 is a valid parent index: test >= 0, not > 0.
        while (parents[i] >= 0) {
            i = parents[i];
        }
        return i;
    }

    public static void union(int i, int j) {
        int pi = find(i);
        int pj = find(j);
        if (pi == pj) return; // already in the same cluster, nothing to update
        if (parents[pi] < parents[pj]) {
            // Sizes are stored negated, so pi's cluster is the larger one.
            parents[pi] += parents[pj];
            parents[pj] = pi;
        } else {
            parents[pj] += parents[pi];
            parents[pi] = pj;
        }
    }

    public static int numClusters() {
        // Each negative entry in parents marks one cluster root.
        int c = 0;
        for (int i = 0; i < parents.length; i++)
            if (parents[i] < 0) c++;
        return c;
    }

    public static void main(String[] args) {
        ArrayList<Edge> edges = new ArrayList<Edge>();
        int k = 4;
        try (BufferedReader b = new BufferedReader(new FileReader(".//Algo2//clustering1.txt"))) {
            int n = Integer.parseInt(b.readLine().trim());
            parents = new int[n];
            for (int i = 0; i < n; i++) {
                parents[i] = -1; // every node starts as its own cluster of size 1
            }
            String str;
            while ((str = b.readLine()) != null) {
                if (str.trim().isEmpty()) continue;
                String[] parts = str.trim().split("\\s+");
                int i = Integer.parseInt(parts[0]);
                int j = Integer.parseInt(parts[1]);
                int v = Integer.parseInt(parts[2]);
                edges.add(new Edge(i - 1, j - 1, v)); // the file uses 1-based node ids
            }
            Collections.sort(edges);
            // Merge along the cheapest edges until exactly k clusters remain.
            for (Edge e : edges) {
                union(e.i, e.j);
                if (numClusters() == k) {
                    System.out.println("k clusters found ");
                    break;
                }
            }
            // The spacing is the minimum cost over edges whose endpoints lie in different clusters.
            int spacing = Integer.MAX_VALUE;
            for (Edge e : edges) {
                if (find(e.i) != find(e.j)) spacing = Math.min(spacing, e.cost);
            }
            System.out.println("max distance " + spacing);
        } catch (IOException | NumberFormatException e) {
            e.printStackTrace();
        }
    }
}

Question 2

In this question your task is again to run the clustering algorithm from lecture, but on a MUCH bigger graph. So big, in fact, that the distances (i.e., edge costs) are only defined implicitly, rather than being provided as an explicit list.

The data set is here. The format is:
[# of nodes] [# of bits for each node's label]
[first bit of node 1] ... [last bit of node 1]
[first bit of node 2] ... [last bit of node 2]
...
For example, the third line of the file "0 1 1 0 0 1 1 0 0 1 0 1 1 1 1 1 1 0 1 0 1 1 0 1" denotes the 24 bits associated with node #2.

The distance between two nodes u and v in this problem is defined as the Hamming distance (the number of differing bits) between the two nodes' labels. For example, the Hamming distance between the 24-bit label of node #2 above and the label "0 1 0 0 0 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 0 1" is 3 (since they differ in the 3rd, 7th, and 21st bits).
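As a quick check of that example, the Hamming distance can be computed by packing each label into an int and counting the set bits of the XOR. This is only a sketch: the class HammingSketch and its helpers parseLabel and hamming are hypothetical names, and packing into an int assumes labels of at most 31 bits (the example labels have 24).

```java
// Hypothetical helper class for illustration.
final class HammingSketch {

    // Packs a line such as "0 1 1 0" into an int, first bit most significant.
    // Assumes at most 31 bits per label.
    static int parseLabel(String line) {
        int label = 0;
        for (String bit : line.trim().split("\\s+")) {
            label = (label << 1) | Integer.parseInt(bit);
        }
        return label;
    }

    // Differing bits are exactly the set bits of a XOR b.
    static int hamming(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        int a = parseLabel("0 1 1 0 0 1 1 0 0 1 0 1 1 1 1 1 1 0 1 0 1 1 0 1");
        int b = parseLabel("0 1 0 0 0 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 0 1");
        System.out.println(hamming(a, b)); // prints 3
    }
}
```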

The question is: what is the largest value of k such that there is a k-clustering with spacing at least 3? That is, how many clusters are needed to ensure that no pair of nodes with all but 2 bits in common get split into different clusters?

NOTE: The graph implicitly defined by the data file is so big that you probably can't write it out explicitly, let alone sort the edges by cost. So you will have to be a little creative to complete this part of the question. For example, is there some way you can identify the smallest distances without explicitly looking at every pair of nodes?
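One possible answer to the hint, sketched under assumptions: instead of comparing all pairs, generate for each label every label at Hamming distance 1 or 2 by flipping bits, look each candidate up in a hash map, and merge on a hit. The number of clusters left is then the largest k with spacing at least 3. The class ImplicitClusteringSketch, the method countClusters, and the 5-bit test labels are hypothetical, and labels are assumed to fit in an int.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration.
final class ImplicitClusteringSketch {

    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        return x;
    }

    // labels: bit-packed node labels; numBits: bits per label (assumed <= 31).
    // Returns the number of clusters after merging every pair at distance <= 2.
    static int countClusters(int[] labels, int numBits) {
        Map<Integer, Integer> index = new HashMap<>();
        for (int label : labels) {
            index.putIfAbsent(label, index.size()); // duplicates share one id
        }
        int[] parent = new int[index.size()];
        for (int i = 0; i < parent.length; i++) parent[i] = i;
        int clusters = parent.length;

        for (Map.Entry<Integer, Integer> entry : index.entrySet()) {
            int label = entry.getKey();
            for (int i = 0; i < numBits; i++) {
                for (int j = i; j < numBits; j++) {
                    // j == i flips one bit, j > i flips two bits.
                    int neighbour = label ^ ((1 << i) | (1 << j));
                    Integer other = index.get(neighbour);
                    if (other == null) continue; // no node carries this label
                    int ra = find(parent, entry.getValue());
                    int rb = find(parent, other);
                    if (ra != rb) {
                        parent[ra] = rb;
                        clusters--;
                    }
                }
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        // 00000 and 00011 are 2 apart; 11100 is at least 3 from both.
        int[] labels = {0b00000, 0b00011, 0b11100, 0b00000};
        System.out.println(countClusters(labels, 5)); // prints 2
    }
}
```

For 24-bit labels this checks 24 + 276 = 300 candidate neighbours per distinct label, which is far fewer lookups than comparing every pair of nodes.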

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map.Entry;

/*
 * Question 2: find the largest k such that there is a k-clustering with spacing at least 3,
 * where the distance between two nodes is the Hamming distance between their labels.
 *
 * This problem is like PS2Q1, except that comparing every pair of nodes needs on the order
 * of n^2 comparisons, which is too many. Instead, for each label we generate the labels at
 * distance 1 or 2 and look them up in a hash map.
 */
public class PS2Q2 {

    // Maps each distinct label to its parent label; a label that maps to itself leads its cluster.
    static HashMap<BitSet, BitSet> clusters = new HashMap<BitSet, BitSet>();
    static int n;
    static int numBits;

    // Parses a line of space-separated bits; the first bit is stored at index numBits - 1.
    public static BitSet getBitSet(String str) {
        String[] bits = str.trim().split("\\s+");
        BitSet b = new BitSet(numBits);
        int j = numBits - 1;
        for (int i = 0; i < bits.length; i++) {
            if (Integer.parseInt(bits[i]) == 1) {
                b.set(j);
            }
            j--;
        }
        return b;
    }

    // Follows parent links until reaching a label that is its own parent.
    public static BitSet find(BitSet b) {
        while (!b.equals(clusters.get(b))) {
            b = clusters.get(b);
        }
        return b;
    }

    public static void union(BitSet a, BitSet b) {
        // Ideally the smaller cluster is merged into the bigger one, which requires tracking
        // cluster sizes. Here the leader of a is simply attached to the leader of b.
        BitSet pa = find(a);
        BitSet pb = find(b);
        if (!pa.equals(pb)) {
            clusters.put(pa, pb);
        }
    }

    // Returns every label present in the input at Hamming distance 1 or 2 from s.
    public static ArrayList<BitSet> getMembers(BitSet s) {
        ArrayList<BitSet> ret = new ArrayList<BitSet>();
        for (int i = 0; i < numBits; i++) {
            BitSet s1 = (BitSet) s.clone();
            s1.flip(i); // distance 1
            if (clusters.containsKey(s1)) ret.add(s1);
            for (int j = i + 1; j < numBits; j++) {
                BitSet s2 = (BitSet) s1.clone();
                s2.flip(j); // distance 2
                if (clusters.containsKey(s2)) ret.add(s2);
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        try (BufferedReader br = new BufferedReader(new FileReader(".//Algo2//clustering2.txt"))) {
            String[] header = br.readLine().trim().split("\\s+");
            n = Integer.parseInt(header[0]);
            numBits = Integer.parseInt(header[1]);
            String str;
            while ((str = br.readLine()) != null) {
                if (str.trim().isEmpty()) continue;
                BitSet b = getBitSet(str);
                clusters.put(b, b); // duplicate labels collapse into a single key
            }
            int isolated = 0;
            // Iterate over a snapshot of the keys because union() updates the map.
            for (BitSet s : new ArrayList<BitSet>(clusters.keySet())) {
                ArrayList<BitSet> members = getMembers(s);
                if (members.size() == 0) isolated++;
                for (BitSet m : members) {
                    union(s, m);
                }
            }
            System.out.println(" number of points with zero neighbours with <=2 distance " + isolated);
            // Each cluster has exactly one leader, i.e. one label that is its own parent.
            int count = 0;
            for (Entry<BitSet, BitSet> e : clusters.entrySet()) {
                if (e.getKey().equals(e.getValue())) {
                    count++;
                }
            }
            System.out.println(" num clusters " + count);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}




           
