Stanford Introduction to Algorithms, Week 6: Bloom Filter, Hash Function, Search Tree

Bloom Filter explanation: http://blog.csdn.net/jiaomeng/article/details/1495500
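
For readers who do not follow the link, here is a minimal sketch of a Bloom filter in C++. It is an illustration under my own assumptions, not the code from the linked article: the class name BloomFilter, the method names insert and mightContain, and the "#salt" double-hashing trick built on std::hash are all hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical minimal Bloom filter: a bit array plus numHashes derived hash positions.
class BloomFilter {
public:
    BloomFilter(std::size_t numBits, std::size_t numHashes)
        : bits_(numBits, false), numHashes_(numHashes) {
        assert(numBits > 0 && numHashes > 0);
    }

    // Sets every bit position derived from the key.
    void insert(const std::string& key) {
        for (std::size_t i = 0; i < numHashes_; ++i) {
            bits_[index(key, i)] = true;
        }
    }

    // false: the key was definitely never inserted.
    // true:  the key was probably inserted (false positives are possible).
    bool mightContain(const std::string& key) const {
        for (std::size_t i = 0; i < numHashes_; ++i) {
            if (!bits_[index(key, i)]) return false;
        }
        return true;
    }

private:
    // Assumption: derive the i-th position as h1 + i * h2 (double hashing).
    // The "#salt" suffix is just an illustrative way to get a second hash value.
    std::size_t index(const std::string& key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>{}(key);
        std::size_t h2 = std::hash<std::string>{}(key + "#salt") | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t numHashes_;
};
```

A filter built as BloomFilter(1024, 3) never reports an inserted key as absent, but it may report a key that was never inserted as present; that trade-off is what buys the small memory footprint.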


-----------------------------------------------------------------------------------------------------------------------

Programming Question - 6

Question 1

Download the text file here. (Right click and save link as).

The goal of this problem is to implement a variant of the 2-SUM algorithm (covered in the Week 5 lecture on hash table applications).

The file contains 500,000 positive integers (there might be some repetitions!). This is your array of integers, with the ith row of the file specifying the ith entry of the array.

Your task is to compute the number of target values t in the interval [2500,4000] (inclusive) such that there are distinct numbers x,y in the input file that satisfy x+y=t. (NOTE: ensuring distinctness requires a one-line addition to the algorithm from lecture.)

Write your numeric answer (an integer between 0 and 1501) in the space provided.

As an optional exercise, you might try implementing your own hash table for this question.
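
As a point of reference, the hash-table version of 2-SUM described in the question can be sketched with std::unordered_set. This is only a sketch under assumptions: the function name countTwoSumTargets and its parameters are hypothetical, and "distinct" is read here as x != y by value.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Counts the targets t in [lo, hi] for which some x and y in `values`
// satisfy x + y == t with x != y (distinct by value; an assumption).
int countTwoSumTargets(const std::vector<long long>& values, long long lo, long long hi) {
    assert(lo <= hi);
    // Putting the values into a hash set removes repetitions and gives O(1) expected lookups.
    std::unordered_set<long long> seen(values.begin(), values.end());
    int count = 0;
    for (long long t = lo; t <= hi; ++t) {
        for (long long x : seen) {
            long long y = t - x;
            // The "y != x" test is the one-line addition that enforces distinctness.
            if (y != x && seen.count(y) > 0) {
                ++count;
                break;  // one witness pair is enough for this target
            }
        }
    }
    return count;
}
```

For the values {1, 2, 3, 4} and the interval [3, 8], this returns 5: every target from 3 to 7 has a pair, while 8 would need 4 + 4, which is not distinct. On the real input the outer loop runs 1501 times over up to 500,000 values, so the early break matters.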


Question 2

Download the text file here.

The goal of this problem is to implement the "Median Maintenance" algorithm (covered in the Week 5 lecture on heap applications). The text file contains a list of the integers from 1 to 10000 in unsorted order; you should treat this as a stream of numbers, arriving one by one. Letting x_i denote the ith number of the file, the kth median m_k is defined as the median of the numbers x_1,…,x_k. (So, if k is odd, then m_k is the ((k+1)/2)th smallest number among x_1,…,x_k; if k is even, then m_k is the (k/2)th smallest number among x_1,…,x_k.)

In the box below you should type the sum of these 10000 medians, modulo 10000 (i.e., only the last 4 digits). That is, you should compute (m_1 + m_2 + m_3 + ⋯ + m_10000) mod 10000.

As an optional exercise, you might compare the performance achieved by heap-based and search-tree-based implementations of the algorithm.
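
For the search-tree side of that comparison, one possible sketch keeps all numbers in a std::multiset (a balanced search tree in common standard library implementations) and moves an iterator that always points at the current median. The function name medianSumWithTree and its parameters are hypothetical; this is an illustration, not the course's reference solution.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Returns (m_1 + ... + m_n) mod `modulus`, where m_k is the ceil(k/2)-th
// smallest of the first k stream elements, as defined in the question.
// Assumes non-negative inputs so the running remainder stays non-negative.
long long medianSumWithTree(const std::vector<int>& stream, long long modulus) {
    assert(modulus > 0);
    std::multiset<int> tree;
    std::multiset<int>::iterator median;  // points at the ceil(k/2)-th smallest element
    long long sum = 0;
    for (int x : stream) {
        tree.insert(x);  // existing iterators stay valid after insertion
        std::size_t k = tree.size();
        if (k == 1) {
            median = tree.begin();
        } else if (x < *median) {
            // New element landed to the left: the old median shifted right by one.
            if (k % 2 == 0) --median;
        } else {
            // New element landed to the right (equal keys are inserted after existing ones).
            if (k % 2 == 1) ++median;
        }
        sum = (sum + *median) % modulus;
    }
    return sum;
}
```

For the stream 5, 1, 3, 2, 4 the medians are 5, 1, 3, 2, 3, so the function returns 14 when the modulus is 10000. Each step costs one O(log k) insertion plus a constant-time iterator move.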
-----------------------------------------------------------------------------------------------------------------------
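Code for Question 1
#include <iostream>
#include <fstream>

// Largest target value. In a pair of positive integers summing to at most MIN,
// both numbers are strictly smaller than MIN.
#define MIN 4000

using namespace std;

// seen[v] > 0 means the value v appears in the input. The names "hash" and
// "count" are avoided because they can clash with std::hash and std::count
// under "using namespace std".
int seen[MIN + 1] = { 0 };
int pairCount = 0;

void readData() {
	ifstream fin("HashInt.txt");
	int temp = 0;
	while(fin>>temp) {
		// Only values in (0, MIN) can take part in a valid pair.
		if(temp > 0 && temp < MIN) seen[temp]++;
	}
}

// Returns true if n appears in the input.
bool hashMap(int n) {
	if(n < 0 || n > MIN) return false;
	return seen[n] > 0;
}

int main() {
	readData();

	for(int i = 2500; i <= 4000; i++) {
		// j <= (i - 1) / 2 guarantees j < i - j, so the two numbers are distinct.
		for(int j = 1; j <= (i - 1) / 2; j++) {
			if(hashMap(j) && hashMap(i - j)) {
				pairCount++;
				break;
			}
		}
	}

	cout<<pairCount<<endl;
	return 0;
}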


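Approach for Question 2: http://www.cnblogs.com/lienhua34/archive/2011/12/06/2381299.html

Using a max-heap together with a min-heap should be the most efficient approach.
Code for main.cpp. The "heap.h" I use is the one from here: http://blog.csdn.net/neostar2008/article/details/7769058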

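#include <iostream>
#include <vector>
#include <fstream>
#include "heap.h"
using namespace std;

int main() {
	ifstream fin("Median.txt");

	MaxHeap maxh;	// lower half of the numbers seen so far; its top is the median
	MinHeap minh;	// upper half of the numbers seen so far
	int medianSum = 0;
	int temp;

	while(fin>>temp) {
		// Check for an empty heap first, in case top() is not safe to call on it.
		if(maxh.heapSize() == 0 || temp < maxh.top()) maxh.insert(temp);
		else minh.insert(temp);

		// Rebalance so that maxh holds either as many elements as minh or one more.
		// Comparing sizes without subtraction stays correct even if heapSize() is unsigned.
		if(maxh.heapSize() > minh.heapSize() + 1) {
			minh.insert(maxh.extractMax());
		}
		if(minh.heapSize() > maxh.heapSize()) {
			maxh.insert(minh.extractMin());
		}

		medianSum += maxh.top();
	}

	cout<<medianSum % 10000<<endl;
	return 0;
}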




