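The basic search algorithms are linear search, binary search, and hashing.
- Linear search: visits the elements one by one from the beginning. It is slow, O(n) per query, but works on any sequence.
- Binary search: requires a sorted array. Each step checks the middle element, so the search range is halved after every comparison.
- Hashing: the storage position of each element is determined by a hash function. Feeding an element's key into the function gives its position directly.
Linear search
One thing to note in linear search is the use of a sentinel. Introducing one improves efficiency by a constant factor, because the loop no longer needs a bounds check on every iteration. It is similar to a linked list, where a head (dummy) node simplifies the operations.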
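To make the constant-factor point concrete, here is a minimal sketch that contrasts a plain linear search with a sentinel version. The class name `SentinelDemo` and both method names are made up for illustration, and the sentinel version assumes the array has one spare slot at index n that is not part of the data.

```java
public class SentinelDemo {
    // Plain version: every iteration performs two checks (bounds and value).
    static boolean plainSearch(int[] a, int n, int key) {
        for (int i = 0; i < n; i++) {
            if (a[i] == key) {
                return true;
            }
        }
        return false;
    }

    // Sentinel version: a[n] is a spare slot that receives the key, so the
    // loop is guaranteed to stop and only needs the value check.
    static boolean sentinelSearch(int[] a, int n, int key) {
        a[n] = key;
        int i = 0;
        while (a[i] != key) {
            i++;
        }
        return i != n; // stopping at the sentinel means the key is absent
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 2, 0}; // three data elements plus one spare slot
        System.out.println(plainSearch(a, 3, 2));    // true
        System.out.println(sentinelSearch(a, 3, 2)); // true
        System.out.println(sentinelSearch(a, 3, 5)); // false
    }
}
```

Both versions are still O(n); the sentinel only removes one comparison per iteration.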
Problem link: http://judge.u-aizu.ac.jp/onlinejudge/description.jsp?id=ALDS1_4_A
Search I
You are given a sequence of n integers S and a sequence of different q integers T. Write a program which outputs C, the number of integers in T which are also in the set S.
Input
In the first line n is given. In the second line, n integers are given. In the third line q is given. Then, in the fourth line, q integers are given.
Output
Print C in a line.
Constraints
- n ≤ 10000
- q ≤ 500
- 0 ≤ an element in S ≤ \(10^9\)
- 0 ≤ an element in T ≤ \(10^9\)
Sample Input 1
5
1 2 3 4 5
3
3 4 1
Sample Output 1
3
Sample Input 2
3
3 1 2
1
5
Sample Output 2
0
Sample Input 3
5
1 1 2 2 3
2
1 2
Sample Output 3
2
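In short, the integers in T are distinct while S may contain duplicates, and we must count the integers that appear in both T and S. Sample 3 shows that a value repeated in S is still counted only once. So we iterate over T, and for each number in T that also occurs in S we add 1 to the result.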
Reference code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class LinearSearch {
    public static boolean search(int[] num, int key) {
        // Place the key in the spare last slot as a sentinel
        int len = num.length;
        num[len - 1] = key;
        int i = 0;
        while (num[i] != key) {
            i++;
        }
        // Stopping at the sentinel means the key is not in the data
        return i != len - 1;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(br.readLine());
        // One extra slot at the end is reserved for the sentinel
        int[] num = new int[n + 1];
        String[] strings = br.readLine().split("\\s");
        for (int i = 0; i < strings.length; i++) {
            num[i] = Integer.parseInt(strings[i]);
        }
        int q = Integer.parseInt(br.readLine());
        String[] strs = br.readLine().split("\\s");
        int sum = 0;
        for (int i = 0; i < q; i++) {
            if (search(num, Integer.parseInt(strs[i]))) {
                sum++;
            }
        }
        System.out.println(sum);
    }
}
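Binary search
Binary search requires a sorted array; without that precondition it cannot be applied. When writing binary search, pay attention to what left and right mean in the code and to the loop condition.
Problem link: http://judge.u-aizu.ac.jp/onlinejudge/description.jsp?id=ALDS1_4_B
At first glance this problem looks the same as the previous one, but the sequences are longer, so the linear search above would exceed the time limit. The problem also adds a condition: the elements of S are sorted in ascending order, which is exactly what binary search needs.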
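As a rough check of why linear search is too slow here, assume the limits are n ≤ 100,000 and q ≤ 50,000 (an assumption for this estimate; the problem page linked above has the authoritative values). The worst-case comparison counts are then roughly:
\[q \cdot n = 5 \times 10^4 \cdot 10^5 = 5 \times 10^9 \quad \text{(linear search)}\]
\[q \cdot \lceil \log_2 n \rceil = 5 \times 10^4 \cdot 17 = 8.5 \times 10^5 \quad \text{(binary search)}\]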
Reference code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Main {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(br.readLine());
        int[] num = new int[n];
        String[] strings = br.readLine().split("\\s");
        for (int i = 0; i < strings.length; i++) {
            num[i] = Integer.parseInt(strings[i]);
        }
        int q = Integer.parseInt(br.readLine());
        String[] strs = br.readLine().split("\\s");
        int sum = 0;
        for (int i = 0; i < q; i++) {
            if (binarySearch(num, Integer.parseInt(strs[i]))) {
                sum++;
            }
        }
        System.out.println(sum);
    }

    private static boolean binarySearch(int[] num, int key) {
        // left is the first element of the search range;
        // right is one past the last element
        int left = 0;
        int right = num.length;
        // With a half-open range the loop condition is <
        while (left < right) {
            int mid = (left + right) / 2;
            if (key == num[mid]) {
                return true;
            } else if (key < num[mid]) {
                right = mid;
            } else {
                left = mid + 1;
            }
        }
        return false;
    }
}
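The meaning of left and right determines the loop condition. The reference code uses a half-open range [left, right); for comparison, here is a sketch of the other common convention, a closed range [left, right]. The class name `ClosedIntervalSearch` is illustrative and not from the original.

```java
public class ClosedIntervalSearch {
    // left and right both point at elements still inside the search range.
    static boolean binarySearch(int[] num, int key) {
        int left = 0;
        int right = num.length - 1;
        // The range is empty only once left passes right, so the condition is <=.
        while (left <= right) {
            int mid = left + (right - left) / 2; // avoids overflow of left + right
            if (num[mid] == key) {
                return true;
            } else if (key < num[mid]) {
                right = mid - 1; // mid is already ruled out
            } else {
                left = mid + 1;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5};
        System.out.println(binarySearch(sorted, 4)); // true
        System.out.println(binarySearch(sorted, 6)); // false
    }
}
```

Mixing the two conventions, for example `right = num.length - 1` together with `left < right`, is a typical source of off-by-one bugs.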
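Hashing
Hashing is a search technique that derives a storage position from the value of the key and stores the key at that position. It relies on a hash table, a data structure made up of an array and a hash function. The two main issues for a hash table are the design of the hash function and the resolution of collisions; they are not covered in detail here.
Problem link: http://judge.u-aizu.ac.jp/onlinejudge/description.jsp?id=ALDS1_4_C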
Search III
Your task is to write a program of a simple dictionary which implements the following instructions:
- insert str: insert a string str into the dictionary
- find str: if the dictionary contains str, then print 'yes', otherwise print 'no'
Input
In the first line n, the number of instructions is given. In the following n lines, n instructions are given in the above mentioned format.
Output
Print yes or no for each find instruction in a line.
Constraints
A string consists of 'A', 'C', 'G', or 'T'
1 ≤ length of a string ≤ 12
n ≤ 1000000
Sample Input 1
5
insert A
insert T
insert C
find G
find A
Sample Output 1
no
yes
Sample Input 2
13
insert AAA
insert AAC
insert AGA
insert AGG
insert TTT
find AAA
find CCC
find CCC
insert CCC
find CCC
insert T
find TTT
find T
Sample Output 2
yes
no
no
yes
yes
yes
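Here we use open addressing with double hashing. With table size m, the basic hash function and the position of the i-th probe (i = 0, 1, 2, ...) are:
\[h(k) = k \bmod m\]
\[H(k) = h(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod m\]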
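As a small illustration of the probe sequence, the sketch below lists the slots visited for one key in a table of size m = 7. The choices h_1(k) = k mod m and h_2(k) = 1 + (k mod (m - 1)) follow the reference code; the class name `ProbeDemo`, the tiny table size, and the key 10 are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ProbeDemo {
    static int h1(int k, int m) {
        return k % m;
    }

    static int h2(int k, int m) {
        return 1 + (k % (m - 1));
    }

    // Returns the first `count` slots of the probe sequence for key k.
    static List<Integer> probes(int k, int m, int count) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add((h1(k, m) + i * h2(k, m)) % m);
        }
        return result;
    }

    public static void main(String[] args) {
        // h1(10) = 3 and h2(10) = 5, so the step between probes is 5
        System.out.println(probes(10, 7, 7)); // [3, 1, 6, 4, 2, 0, 5]
    }
}
```

Because m is prime and h_2(k) always lies between 1 and m - 1, the sequence visits every slot once before it repeats, so an empty slot is found whenever one exists.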
Reference code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Dictionary {
    private static final int LEN = 1046527;
    private static String[] HASHTABLE = new String[LEN];

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(br.readLine());
        for (int i = 0; i < n; i++) {
            String string = br.readLine();
            if (string.charAt(0) == 'i') {
                // Skip the "insert " prefix
                insert(string.substring(7));
            } else {
                // Skip the "find " prefix
                if (find(string.substring(5))) {
                    System.out.println("yes");
                } else {
                    System.out.println("no");
                }
            }
        }
    }

    private static boolean find(String string) {
        int key = getKey(string);
        for (int i = 0; ; i++) {
            // long arithmetic keeps i * hash2(key) from overflowing int
            int h = (int) ((hash1(key) + (long) i * hash2(key)) % LEN);
            if (HASHTABLE[h] == null) {
                return false;
            } else if (HASHTABLE[h].equals(string)) {
                return true;
            }
        }
    }

    private static void insert(String string) {
        // Convert the string to a number, then feed it to the hash functions to get a slot.
        // If the slot already holds the same string, do nothing; if it is empty, store the
        // string there; otherwise move on to the next probe.
        int key = getKey(string);
        for (int i = 0; ; i++) {
            // long arithmetic keeps i * hash2(key) from overflowing int
            int h = (int) ((hash1(key) + (long) i * hash2(key)) % LEN);
            if (HASHTABLE[h] == null) {
                HASHTABLE[h] = string;
                break;
            } else if (HASHTABLE[h].equals(string)) {
                break;
            }
        }
    }

    // Convert the string to a numeric key
    private static int getKey(String string) {
        int sum = 0;
        int p = 1;
        for (int i = 0; i < string.length(); i++) {
            sum += p * getChar(string.charAt(i));
            p *= 5;
        }
        return sum;
    }

    private static int getChar(char c) {
        switch (c) {
            case 'A':
                return 1;
            case 'C':
                return 2;
            case 'G':
                return 3;
            case 'T':
                return 4;
            default:
                return 0;
        }
    }

    // The two hash functions
    private static int hash1(int key) {
        return key % LEN;
    }

    private static int hash2(int key) {
        return 1 + (key % (LEN - 1));
    }
}
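The getKey function treats the string as a base-5 number with digits 1 to 4, lowest digit first. Because no character maps to 0, strings of different lengths such as "A" and "AA" get different keys, and the largest key (twelve 'T' characters) still fits in an int. A minimal sketch that reproduces the encoding for a few inputs (the class name `KeyDemo` and the method name `digit` are illustrative):

```java
public class KeyDemo {
    static int digit(char c) {
        switch (c) {
            case 'A': return 1;
            case 'C': return 2;
            case 'G': return 3;
            case 'T': return 4;
            default:  return 0;
        }
    }

    // Same scheme as getKey: sum of digit * 5^position, first character lowest.
    static int getKey(String s) {
        int sum = 0;
        int p = 1;
        for (int i = 0; i < s.length(); i++) {
            sum += p * digit(s.charAt(i));
            p *= 5;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(getKey("A"));            // 1
        System.out.println(getKey("AA"));           // 1 + 1*5 = 6
        System.out.println(getKey("AC"));           // 1 + 2*5 = 11
        System.out.println(getKey("TTTTTTTTTTTT")); // 5^12 - 1 = 244140624
    }
}
```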
Reference: 《挑战程序设计竞赛-算法和数据结构》