Stanford Intro to Algorithms, Week 6: Bloom Filter, Hash Function, Search Tree

neostar2008

Category: Stanford Algo

Article link: https://blog.csdn.net/neostar2008/article/details/7782858

Bloom Filter explanation: http://blog.csdn.net/jiaomeng/article/details/1495500
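For readers who do not follow the link, here is a minimal sketch of the idea: a Bloom filter is a bit array plus several hash functions, it never reports a false negative, and it may report a false positive. The class name, the method names, the "#salt" suffix, and the use of std::hash with double hashing are all assumptions made for illustration; they are not taken from the linked article or from the lecture.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical minimal Bloom filter over strings (illustration only).
class BloomFilter {
public:
    // numBits must be greater than 0; numHashes is the number of probes per key.
    BloomFilter(std::size_t numBits, std::size_t numHashes)
        : bits_(numBits, false), numHashes_(numHashes) {}

    // Set every bit that the key hashes to.
    void add(const std::string& key) {
        for (std::size_t i = 0; i < numHashes_; ++i) {
            bits_[index(key, i)] = true;
        }
    }

    // false: the key was definitely never added.
    // true:  the key was probably added (false positives are possible).
    bool possiblyContains(const std::string& key) const {
        for (std::size_t i = 0; i < numHashes_; ++i) {
            if (!bits_[index(key, i)]) return false;
        }
        return true;
    }

private:
    // Assumption: derive the i-th probe from two base hashes (double hashing).
    std::size_t index(const std::string& key, std::size_t i) const {
        const std::size_t h1 = std::hash<std::string>{}(key);
        const std::size_t h2 = std::hash<std::string>{}(key + "#salt") | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t numHashes_;
};
```

After `BloomFilter bf(1024, 3); bf.add("apple");`, the call `bf.possiblyContains("apple")` is guaranteed to return true, while a query for a key that was never added usually, but not always, returns false.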

-----------------------------------------------------------------------------------------------------------------------

Programming Question - 6

Question 1

Download the text file here. (Right click and save link as).

The goal of this problem is to implement a variant of the 2-SUM algorithm (covered in the Week 5 lecture on hash table applications).

The file contains 500,000 positive integers (there might be some repetitions!). This is your array of integers, with the $i$th row of the file specifying the $i$th entry of the array.

Your task is to compute the number of target values $t$ in the interval [2500,4000] (inclusive) such that there are distinct numbers $x,y$ in the input file that satisfy $x+y=t$ . (NOTE: ensuring distinctness requires a one-line addition to the algorithm from lecture.)

Write your numeric answer (an integer between 0 and 1501) in the space provided.

As an optional exercise, you might try implementing your own hash table for this question.
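The hash-table version of 2-SUM described above can be sketched as follows. This assumes std::unordered_set as the hash table (not a hand-written one), and the function name countTwoSumTargets is hypothetical. The check `y != x` is the one-line addition that enforces distinctness.

```cpp
#include <unordered_set>
#include <vector>

// Count the targets t in [lo, hi] for which two distinct values x != y
// from `values` satisfy x + y == t.
int countTwoSumTargets(const std::vector<long long>& values,
                       long long lo, long long hi) {
    // Putting the values in a set removes repetitions and gives O(1)
    // expected-time lookups.
    std::unordered_set<long long> seen(values.begin(), values.end());

    int found = 0;
    for (long long t = lo; t <= hi; ++t) {
        for (long long x : seen) {
            const long long y = t - x;
            // y != x enforces distinctness of the two numbers.
            if (y != x && seen.count(y) > 0) {
                ++found;
                break;  // one witness pair per target is enough
            }
        }
    }
    return found;
}
```

For the values 1, 2, 3, 3 and the interval [4, 6], the targets 4 (1+3) and 5 (2+3) count, while 6 does not, because 3+3 uses the same number twice. In the worst case this does one lookup per (target, value) pair, so it is a straightforward sketch rather than a tuned solution.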

Question 2

Download the text file here.

The goal of this problem is to implement the "Median Maintenance" algorithm (covered in the Week 5 lecture on heap applications). The text file contains a list of the integers from 1 to 10000 in unsorted order; you should treat this as a stream of numbers, arriving one by one. Letting $x_i$ denote the $i$th number of the file, the $k$th median $m_k$ is defined as the median of the numbers $x_1,\dots,x_k$. (So, if $k$ is odd, then $m_k$ is the $((k+1)/2)$th smallest number among $x_1,\dots,x_k$; if $k$ is even, then $m_k$ is the $(k/2)$th smallest number among $x_1,\dots,x_k$.)

In the box below you should type the sum of these 10000 medians, modulo 10000 (i.e., only the last 4 digits). That is, you should compute $(m_1+m_2+m_3+\dots+m_{10000}) \bmod 10000$.

As an optional exercise, you might compare the performance achieved by heap-based and search-tree-based implementations of the algorithm.
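For the search-tree side of that comparison, one possible sketch uses std::multiset (typically a red-black tree) and keeps an iterator on the current median. The function name medianSumWithTree is hypothetical, and this is only one way to do it, not the approach from the lecture.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Sum of the running medians m_1 + ... + m_n, where m_k is the
// ((k+1)/2)th smallest of the first k numbers for odd k and the
// (k/2)th smallest for even k.
long long medianSumWithTree(const std::vector<int>& stream) {
    std::multiset<int> tree;
    std::multiset<int>::iterator mid = tree.end();
    long long sum = 0;

    for (int x : stream) {
        const std::size_t k = tree.size();  // size before inserting x
        tree.insert(x);                     // existing iterators stay valid

        if (k == 0) {
            mid = tree.begin();
        } else if (x < *mid) {
            // x landed before the median; with odd k the median moves left.
            if (k % 2 == 1) --mid;
        } else {
            // x landed after the median (equal keys are inserted after
            // existing ones); with even k the median moves right.
            if (k % 2 == 0) ++mid;
        }
        sum += *mid;
    }
    return sum;
}
```

For the stream 5, 1, 3, 2, 4 the running medians are 5, 1, 3, 2, 3, so the function returns 14. Each step costs one O(log k) insertion plus at most one iterator move.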

-----------------------------------------------------------------------------------------------------------------------
Code for Question 1

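#include <iostream>
#include <fstream>

using namespace std;

// Largest target value t. The inputs are positive, so only values below
// MAX_TARGET can take part in a pair that sums to at most MAX_TARGET.
const int MAX_TARGET = 4000;

// Direct-address table: seen[v] > 0 means the value v occurs in the input.
// Named "seen" and "targetCount" rather than "hash" and "count", which can
// clash with std::hash and std::count under "using namespace std".
int seen[MAX_TARGET + 1] = { 0 };
int targetCount = 0;

void readData() {
	ifstream fin("HashInt.txt");
	int temp = 0;
	while(fin >> temp) {
		// Ignore values that are out of range for the table.
		if(temp > 0 && temp < MAX_TARGET) seen[temp]++;
	}
}

// Returns true if n occurs in the input.
bool contains(int n) {
	if(n < 1 || n > MAX_TARGET) return false;
	return seen[n] > 0;
}

int main() {
	readData();

	for(int t = 2500; t <= MAX_TARGET; t++) {
		// j only runs up to (t - 1) / 2, so j < t - j and the two numbers are distinct.
		for(int j = 1; j <= (t - 1) / 2; j++) {
			if(contains(j) && contains(t - j)) {
				targetCount++;
				break;	// one pair is enough for this target
			}
		}
	}

	cout << targetCount << endl;
	return 0;
}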

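Approach for Question 2: http://www.cnblogs.com/lienhua34/archive/2011/12/06/2381299.html

Using a max-heap together with a min-heap should be the most efficient approach: each new number costs only O(log k) heap work.
Code for main.cpp; the "heap.h" I use is the one from here: http://blog.csdn.net/neostar2008/article/details/7769058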

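#include <iostream>
#include <vector>
#include <fstream>
#include "heap.h"
using namespace std;

int main() {
	ifstream fin("Median.txt");

	MaxHeap maxh;	// lower half of the numbers seen so far; its top is the median
	MinHeap minh;	// upper half of the numbers seen so far
	int medianSum = 0;
	int temp;

	while(fin>>temp) {
		// Guard the first insertion: maxh is still empty, so do not rely on top() yet.
		if(maxh.heapSize() == 0 || temp < maxh.top()) maxh.insert(temp);
		else minh.insert(temp);

		// Rebalance so that maxh holds as many elements as minh, or one more.
		// Written without subtraction so it stays correct if heapSize() is unsigned.
		if(maxh.heapSize() > minh.heapSize() + 1) {
			minh.insert(maxh.extractMax());
		}
		if(minh.heapSize() > maxh.heapSize()) {
			maxh.insert(minh.extractMin());
		}

		// maxh.top() is the ((k+1)/2)th smallest for odd k and the (k/2)th for even k.
		medianSum += maxh.top();
	}

	cout<<medianSum % 10000<<endl;
	return 0;
}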
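
For readers who do not have the heap.h linked above, the same two-heap idea can be sketched with std::priority_queue from the standard library. This is a stand-in written under that assumption, not the heap.h implementation, and the function name medianSumTwoHeaps is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Sum of the running medians of `stream`, using a max-heap for the lower
// half and a min-heap for the upper half.
long long medianSumTwoHeaps(const std::vector<int>& stream) {
    std::priority_queue<int> low;  // max-heap: lower half, top is the median
    std::priority_queue<int, std::vector<int>, std::greater<int>> high;  // min-heap

    long long sum = 0;
    for (int x : stream) {
        if (low.empty() || x < low.top()) low.push(x);
        else high.push(x);

        // Keep low.size() equal to high.size() or exactly one larger.
        if (low.size() > high.size() + 1) {
            high.push(low.top());
            low.pop();
        } else if (high.size() > low.size()) {
            low.push(high.top());
            high.pop();
        }

        sum += low.top();  // lower median of the numbers seen so far
    }
    return sum;
}
```

For the stream 5, 1, 3, 2, 4 the running medians are 5, 1, 3, 2, 3, so the function returns 14. To answer the question, the result would then be reduced modulo 10000.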
