Data Structures: Sets and Maps
I. Set
Set
void add(E): duplicate elements are not added. Applications: counting distinct customers, counting vocabulary size
void remove(E)
boolean contains(E)
int getSize()
boolean isEmpty()
1. Defining the set interface
/**
 * Set interface
 * @param <E> element type
 */
public interface Set<E> {
    void add(E e);
    void remove(E e);
    boolean contains(E e);
    int getSize();
    boolean isEmpty();
}
2. A set based on a binary search tree
public class BSTSet<E extends Comparable<E>> implements Set<E> {
    private BST<E> bst;

    public BSTSet() {
        bst = new BST<>();
    }

    @Override
    public void add(E e) {
        // The BST ignores an element that is already present
        bst.add(e);
    }

    @Override
    public void remove(E e) {
        bst.remove(e);
    }

    @Override
    public boolean contains(E e) {
        return bst.contains(e);
    }

    @Override
    public int getSize() {
        return bst.size();
    }

    @Override
    public boolean isEmpty() {
        return bst.isEmpty();
    }
}
Each method delegates to the corresponding method of the BST class defined in the previous chapter.
Test class
public static void main(String[] args) {
    System.out.println("Pride and Prejudice");
    ArrayList<String> words1 = new ArrayList<>();
    if (FileOperation.readFile("pride-and-prejudice.txt", words1)) {
        System.out.println("Total words: " + words1.size());
        BSTSet<String> set1 = new BSTSet<>();
        for (String word : words1)
            set1.add(word);
        System.out.println("Total different words: " + set1.getSize());
    }
}
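The test classes call a FileOperation.readFile helper that this article never shows. The following is only a minimal sketch of what such a helper might look like, assuming it splits the file into lowercase words made of ASCII letters and returns false when the file cannot be read. The class body below is a hypothetical stand-in, not the author's implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;

// Hypothetical stand-in for the FileOperation helper used by the test classes.
public class FileOperation {

    // Reads the file, splits it into lowercase words made of ASCII letters,
    // and appends them to words. Returns false if the file cannot be read.
    public static boolean readFile(String filename, ArrayList<String> words) {
        if (filename == null || words == null) {
            return false;
        }
        String text;
        try {
            text = new String(Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            return false;
        }
        // Any run of non-letter characters separates two words.
        for (String token : text.split("[^a-zA-Z]+")) {
            if (!token.isEmpty()) {
                words.add(token.toLowerCase());
            }
        }
        return true;
    }
}
```

Because every word is lowercased, "Pride" and "pride" count as the same word, which is consistent with the tests looking up the lowercase keys "pride" and "prejudice".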
Both BST and LinkedList are dynamic data structures.
BST:
class Node {
    E e;
    Node left;
    Node right;
}
LinkedList:
class Node {
    E e;
    Node next;
}
3. A set based on a linked list
import java.util.ArrayList;

public class LinkedListSet<E> implements Set<E> {
    // LinkedList is the custom linked list from an earlier chapter,
    // not java.util.LinkedList (which has no removeElement or getSize)
    private LinkedList<E> list;

    public LinkedListSet() {
        list = new LinkedList<>();
    }

    @Override
    public void add(E e) {
        // Scan the list first so that duplicates are never added
        if (!list.contains(e)) {
            list.addFirst(e);
        }
    }

    @Override
    public void remove(E e) {
        list.removeElement(e);
    }

    @Override
    public boolean contains(E e) {
        return list.contains(e);
    }

    @Override
    public int getSize() {
        return list.getSize();
    }

    @Override
    public boolean isEmpty() {
        return list.isEmpty();
    }
}
Test class
public static void main(String[] args) {
    System.out.println("Pride and Prejudice");
    ArrayList<String> words1 = new ArrayList<>();
    FileOperation.readFile("pride-and-prejudice.txt", words1);
    System.out.println("Total words:" + words1.size());
    LinkedListSet<String> set1 = new LinkedListSet<>();
    for (String word : words1) {
        set1.add(word);
    }
    System.out.println("Total different words:" + set1.getSize());
}
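4. Comparing the performance of the two set classes
The following code times the BST-based set against the linked-list-based set: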
import java.util.ArrayList;

public class SetMain {
    // Adds every word of the file to the given set and returns the elapsed time in seconds
    private static double testSet(Set<String> set, String filename) {
        long startTime = System.nanoTime();
        System.out.println(filename);
        ArrayList<String> words = new ArrayList<>();
        if (FileOperation.readFile(filename, words)) {
            System.out.println("Total words:" + words.size());
            for (String word : words) {
                set.add(word);
            }
            System.out.println("Total different words:" + set.getSize());
        }
        long endTime = System.nanoTime();
        return (endTime - startTime) / 1000000000.0;
    }

    public static void main(String[] args) {
        String filename = "pride-and-prejudice.txt";
        BSTSet<String> bstSet = new BSTSet<>();
        double time1 = testSet(bstSet, filename);
        System.out.println("BST Set:" + time1 + "s");
        System.out.println();
        LinkedListSet<String> linkedListSet = new LinkedListSet<>();
        double time2 = testSet(linkedListSet, filename);
        System.out.println("LinkedListSet:" + time2 + "s");
    }
}
Results:
pride-and-prejudice.txt
Total words:125901
Total different words:6530
BST Set:0.4018152s

pride-and-prejudice.txt
Total words:125901
Total different words:6530
LinkedListSet:4.8493949s
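5. Time complexity of sets
Operation | LinkedListSet | BSTSet | BSTSet (average) | BSTSet (worst) |
---|---|---|---|---|
Add (add) | O(n) | O(h) | O(log n) | O(n) |
Contains (contains) | O(n) | O(h) | O(log n) | O(n) |
Remove (remove) | O(n) | O(h) | O(log n) | O(n) |
Here n is the number of elements and h is the height of the tree. To relate h to n, count how many nodes a tree with h levels can hold: a full binary tree with h levels holds 2^h - 1 nodes, so h is about log n when the tree is balanced, while a tree that degenerates into a linked list has h = n.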
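The node count for h levels can be worked out level by level. The derivation below assumes a full binary tree whose levels are numbered from 0 to h - 1, so that level i holds 2^i nodes.

```latex
\begin{aligned}
n &= 2^0 + 2^1 + 2^2 + \cdots + 2^{h-1} \\
  &= \frac{1 \cdot (1 - 2^{h})}{1 - 2} \\
  &= 2^{h} - 1 \\
\Rightarrow\quad h &= \log_2 (n + 1) = O(\log n)
\end{aligned}
```

The base of the logarithm only changes a constant factor, which is why the bound is written simply as O(log n).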
The gap between log n and n
 | log n | n | Gap |
---|---|---|---|
n = 16 | 4 | 16 | 4x |
n = 1024 | 10 | 1024 | about 100x |
n = 1 million | 20 | 1 million | 50,000x |
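The worst case in the complexity table occurs when the tree degenerates into a linked list, for example when keys arrive already sorted. The sketch below illustrates this with a tiny stand-alone BST of int keys. BstHeightDemo and its methods are hypothetical names for this illustration only and are not the BST class from the previous chapter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class BstHeightDemo {

    private static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Inserts key into the tree rooted at root (duplicates are ignored)
    // and returns the root. Iterative, so a degenerate tree cannot
    // overflow the call stack during insertion.
    private static Node insert(Node root, int key) {
        if (root == null) {
            return new Node(key);
        }
        Node cur = root;
        while (cur.key != key) {
            if (key < cur.key) {
                if (cur.left == null) {
                    cur.left = new Node(key);
                }
                cur = cur.left;
            } else {
                if (cur.right == null) {
                    cur.right = new Node(key);
                }
                cur = cur.right;
            }
        }
        return root;
    }

    // Number of levels in the tree rooted at node (0 for an empty tree).
    private static int height(Node node) {
        if (node == null) {
            return 0;
        }
        return 1 + Math.max(height(node.left), height(node.right));
    }

    // Builds a BST by inserting the keys in the given order and returns its height.
    public static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int key : keys) {
            root = insert(root, key);
        }
        return height(root);
    }

    public static void main(String[] args) {
        int n = 1023;
        List<Integer> sorted = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            sorted.add(i);
        }
        List<Integer> shuffled = new ArrayList<>(sorted);
        Collections.shuffle(shuffled, new Random(42));

        // Sorted input: every node has only a right child, so h = n = 1023.
        System.out.println("sorted insert height:   " + heightAfterInserting(sorted));
        // Shuffled input: the height stays far below n (at least 10 for 1023 nodes).
        System.out.println("shuffled insert height: " + heightAfterInserting(shuffled));
    }
}
```

Self-balancing trees exist precisely to rule out the sorted-input case. A plain BST only delivers O(log n) on average.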
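Ordered and unordered sets
- In an ordered set the elements are kept in order → implemented with a search tree
- In an unordered set the elements have no order → implemented with a hash table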
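The Java standard library offers both flavours, which makes the difference easy to see. The sketch below uses java.util.TreeSet (a search-tree-based set) and java.util.HashSet (a hash-table-based set) rather than the Set interface defined in this article. OrderedVsUnorderedSet and distinctSorted are names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

public class OrderedVsUnorderedSet {

    // Returns the distinct words in sorted order by passing them through a TreeSet.
    public static List<String> distinctSorted(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pride", "and", "prejudice", "and", "pride");

        // Ordered set: iteration follows the natural ordering of the elements.
        TreeSet<String> ordered = new TreeSet<>(words);
        System.out.println(ordered);          // [and, prejudice, pride]
        System.out.println(ordered.first());  // and

        // Unordered set: same elements, but the iteration order is unspecified,
        // so only order-independent queries are printed here.
        HashSet<String> unordered = new HashSet<>(words);
        System.out.println(unordered.size());            // 3
        System.out.println(unordered.contains("pride")); // true
    }
}
```

An ordered set pays O(log n) per operation in exchange for order-based queries such as first(), while a hash-based set gives up ordering for expected constant-time operations.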
Multiset
- The elements of a multiset may be repeated
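One common way to allow repeated elements is to store each distinct element once together with an occurrence count. The sketch below takes that approach on top of java.util.TreeMap. MultiSetSketch and its count method are hypothetical names for this illustration and are not part of the article's code.

```java
import java.util.TreeMap;

public class MultiSetSketch<E extends Comparable<E>> {

    // Maps each distinct element to how many times it has been added.
    private final TreeMap<E, Integer> counts = new TreeMap<>();
    // Total number of elements, counting repetitions.
    private int size = 0;

    public void add(E e) {
        counts.merge(e, 1, Integer::sum);
        size++;
    }

    // Removes one occurrence of e; returns false if e is not present.
    public boolean remove(E e) {
        Integer c = counts.get(e);
        if (c == null) {
            return false;
        }
        if (c == 1) {
            counts.remove(e);
        } else {
            counts.put(e, c - 1);
        }
        size--;
        return true;
    }

    public int count(E e) {
        return counts.getOrDefault(e, 0);
    }

    public boolean contains(E e) {
        return counts.containsKey(e);
    }

    public int getSize() {
        return size;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public static void main(String[] args) {
        MultiSetSketch<String> bag = new MultiSetSketch<>();
        bag.add("pride");
        bag.add("pride");
        bag.add("and");
        System.out.println(bag.count("pride")); // 2
        System.out.println(bag.getSize());      // 3
        bag.remove("pride");
        System.out.println(bag.count("pride")); // 1
    }
}
```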
II. Map
- A data structure that stores (key, value) pairs
- Values are looked up by their key
1. Defining the map interface
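public interface Map<K, V> {
    // Adds the pair (key, value); the implementations below override this two-argument form
    void add(K key, V value);
    V remove(K key);
    boolean contains(K key);
    V get(K key);
    void set(K key, V value);
    int getSize();
    boolean isEmpty();
}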
2. Node class for the linked-list-based map
private class Node {
    public K key;
    public V value;
    public Node next;

    public Node(K key, V value, Node next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public Node(K key) {
        this(key, null, null);
    }

    public Node() {
        this(null, null, null);
    }

    @Override
    public String toString() {
        return key.toString() + ":" + value.toString();
    }
}
Linked-list-based map implementation
public class LinkedListMap<K, V> implements Map<K, V> {
    // The Node class above goes here as an inner class

    private Node dummyHead;
    private int size;

    public LinkedListMap() {
        dummyHead = new Node();
        size = 0;
    }

    @Override
    public int getSize() {
        return size;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }

    // Returns the node holding key, or null if the key is absent
    private Node getNode(K key) {
        Node cur = dummyHead.next;
        while (cur != null) {
            if (cur.key.equals(key)) {
                return cur;
            }
            cur = cur.next;
        }
        return null;
    }

    @Override
    public boolean contains(K key) {
        return getNode(key) != null;
    }

    @Override
    public V get(K key) {
        Node node = getNode(key);
        return node == null ? null : node.value;
    }

    @Override
    public void add(K key, V value) {
        Node node = getNode(key);
        if (node == null) {
            // New key: insert right after the dummy head
            dummyHead.next = new Node(key, value, dummyHead.next);
            size++;
        } else {
            // Existing key: overwrite its value
            node.value = value;
        }
    }

    @Override
    public void set(K key, V newValue) {
        Node node = getNode(key);
        if (node == null) {
            throw new IllegalArgumentException(key + " doesn't exist!");
        }
        node.value = newValue;
    }

    @Override
    public V remove(K key) {
        // Find the node just before the one to delete
        Node prev = dummyHead;
        while (prev.next != null) {
            if (prev.next.key.equals(key)) {
                break;
            }
            prev = prev.next;
        }
        if (prev.next != null) {
            Node delNode = prev.next;
            prev.next = delNode.next;
            delNode.next = null;
            size--;
            return delNode.value;
        }
        return null;
    }
}
Test class
public static void main(String[] args) {
    System.out.println("Pride and Prejudice");
    ArrayList<String> words = new ArrayList<>();
    if (FileOperation.readFile("pride-and-prejudice.txt", words)) {
        System.out.println("Total words:" + words.size());
        LinkedListMap<String, Integer> map = new LinkedListMap<>();
        for (String word : words) {
            if (map.contains(word)) {
                map.set(word, map.get(word) + 1);
            } else {
                map.add(word, 1);
            }
        }
        System.out.println("Total different words :" + map.getSize());
        // getNode is private, so this line only compiles when main lives inside
        // LinkedListMap; it prints the node through toString() as key:value
        System.out.println("Frequency of PRIDE:" + map.getNode("pride"));
        System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
    }
}
Test output:
Total words:125901
Total different words :6530
Frequency of PRIDE:pride:53
Frequency of PREJUDICE: 11
3. Node class for the BST-based map
private class Node {
    public K key;
    public V value;
    public Node left, right;

    public Node(K key, V value) {
        this.key = key;
        this.value = value;
        left = null;
        right = null;
    }
}
BST-based map implementation
public class BSTMap<K extends Comparable<K>, V> implements Map<K, V> {
    // The Node class above goes here as an inner class

    private Node root;
    private int size;

    public BSTMap() {
        root = null;
        size = 0;
    }

    @Override
    public int getSize() {
        return size;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }

    // Adds a new (key, value) pair to the BST
    @Override
    public void add(K key, V value) {
        root = add(root, key, value);
    }

    // Recursively inserts (key, value) into the BST rooted at node
    // and returns the root of the tree after the insertion
    private Node add(Node node, K key, V value) {
        if (node == null) {
            size++;
            return new Node(key, value);
        }
        if (key.compareTo(node.key) < 0) {
            node.left = add(node.left, key, value);
        } else if (key.compareTo(node.key) > 0) {
            node.right = add(node.right, key, value);
        } else {
            // The key already exists: overwrite its value
            node.value = value;
        }
        return node;
    }
    // Returns the node holding key in the BST rooted at node, or null if the key is absent
    private Node getNode(Node node, K key) {
        if (node == null) {
            return null;
        }
        if (key.compareTo(node.key) == 0) {
            return node;
        } else if (key.compareTo(node.key) < 0) {
            return getNode(node.left, key);
        } else {
            return getNode(node.right, key);
        }
    }

    @Override
    public boolean contains(K key) {
        return getNode(root, key) != null;
    }

    @Override
    public V get(K key) {
        Node node = getNode(root, key);
        return node == null ? null : node.value;
    }

    @Override
    public void set(K key, V newValue) {
        Node node = getNode(root, key);
        if (node == null) {
            throw new IllegalArgumentException(key + " doesn't exist");
        }
        node.value = newValue;
    }

    // Returns the node with the smallest key in the BST rooted at node
    private Node minimum(Node node) {
        if (node.left == null) {
            return node;
        }
        return minimum(node.left);
    }

    // Removes the node with the smallest key from the BST rooted at node
    // and returns the root of the resulting tree
    private Node removeMin(Node node) {
        if (node.left == null) {
            Node rightNode = node.right;
            node.right = null;
            size--;
            return rightNode;
        }
        node.left = removeMin(node.left);
        return node;
    }
    // Removes the node with the given key from the BST
    // and returns its value, or null if the key is absent
    @Override
    public V remove(K key) {
        Node node = getNode(root, key);
        if (node != null) {
            root = remove(root, key);
            return node.value;
        }
        return null;
    }

    private Node remove(Node node, K key) {
        if (node == null) {
            return null;
        }
        if (key.compareTo(node.key) < 0) {
            node.left = remove(node.left, key);
            return node;
        } else if (key.compareTo(node.key) > 0) {
            node.right = remove(node.right, key);
            return node;
        } else {
            // Case 1: the node to delete has no left subtree
            if (node.left == null) {
                Node rightNode = node.right;
                node.right = null;
                size--;
                return rightNode;
            }
            // Case 2: the node to delete has no right subtree
            if (node.right == null) {
                Node leftNode = node.left;
                node.left = null;
                size--;
                return leftNode;
            }
            // Case 3: both subtrees exist. Replace the node with its successor,
            // the minimum of the right subtree; removeMin already decrements size
            Node successor = minimum(node.right);
            successor.right = removeMin(node.right);
            successor.left = node.left;
            node.left = node.right = null;
            return successor;
        }
    }
}
Test class
public static void main(String[] args) {
    System.out.println("Pride and Prejudice");
    ArrayList<String> words = new ArrayList<>();
    if (FileOperation.readFile("pride-and-prejudice.txt", words)) {
        System.out.println("Total words:" + words.size());
        // Use the BST-based map here; the linked-list version was tested above
        BSTMap<String, Integer> map = new BSTMap<>();
        for (String word : words) {
            if (map.contains(word)) {
                map.set(word, map.get(word) + 1);
            } else {
                map.add(word, 1);
            }
        }
        System.out.println("Total different words :" + map.getSize());
        System.out.println("Frequency of PRIDE:" + map.get("pride"));
        System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
    }
}
Test output:
Pride and Prejudice
Total words:125901
Total different words :6530
Frequency of PRIDE:53
Frequency of PREJUDICE: 11
4. Time complexity of maps
Operation | LinkedListMap | BSTMap | BSTMap (average) | BSTMap (worst) |
---|---|---|---|---|
Add (add) | O(n) | O(h) | O(log n) | O(n) |
Remove (remove) | O(n) | O(h) | O(log n) | O(n) |
Update (set) | O(n) | O(h) | O(log n) | O(n) |
Look up (get) | O(n) | O(h) | O(log n) | O(n) |
Look up (contains) | O(n) | O(h) | O(log n) | O(n) |
The comparison code is as follows:
import java.util.ArrayList;

public class MapMain {
    // Counts word frequencies with the given map and returns the elapsed time in seconds
    private static double testMap(Map<String, Integer> map, String filename) {
        long startTime = System.nanoTime();
        System.out.println("Pride and prejudice:");
        ArrayList<String> words = new ArrayList<>();
        // Read the file passed in rather than a hard-coded name
        if (FileOperation.readFile(filename, words)) {
            System.out.println("Total words:" + words.size());
            for (String word : words) {
                if (map.contains(word)) {
                    map.set(word, map.get(word) + 1);
                } else {
                    map.add(word, 1);
                }
            }
            System.out.println("Total different words :" + map.getSize());
            System.out.println("Frequency of PRIDE:" + map.get("pride"));
            System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
        }
        long endTime = System.nanoTime();
        return (endTime - startTime) / 1000000000.0;
    }

    // Run the comparison
    public static void main(String[] args) {
        String filename = "pride-and-prejudice.txt";
        BSTMap<String, Integer> bstMap = new BSTMap<>();
        double time1 = testMap(bstMap, filename);
        System.out.println("BST Map:" + time1 + "s");
        System.out.println();
        LinkedListMap<String, Integer> linkedListMap = new LinkedListMap<>();
        double time2 = testMap(linkedListMap, filename);
        System.out.println("LinkedListMap:" + time2 + "s");
    }
}
Comparison results:
Pride and prejudice:
Total words:125901
Total different words :6530
Frequency of PRIDE:53
Frequency of PREJUDICE: 11
BST Map:0.1658282s
Pride and prejudice:
Total words:125901
Total different words :6530
Frequency of PRIDE:53
Frequency of PREJUDICE: 11
LinkedListMap:16.6475049s
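Ordered and unordered maps
- In an ordered map the keys are kept in order → implemented with a search tree
- In an unordered map the keys have no order → implemented with a hash table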
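Key order is what makes queries such as "smallest key" or "all keys before X" possible. The sketch below shows this with java.util.TreeMap and java.util.HashMap from the standard library, not the Map interface defined in this article. OrderedVsUnorderedMap and countWords are names made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.TreeMap;

public class OrderedVsUnorderedMap {

    // Counts word frequencies in a TreeMap, which keeps its keys sorted.
    public static TreeMap<String, Integer> countWords(List<String> words) {
        TreeMap<String, Integer> freq = new TreeMap<>();
        for (String word : words) {
            freq.merge(word, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pride", "and", "prejudice", "and", "pride", "and");

        TreeMap<String, Integer> ordered = countWords(words);
        System.out.println(ordered);                  // {and=3, prejudice=1, pride=2}
        System.out.println(ordered.firstKey());       // and
        // Range query: every entry whose key is strictly less than "pride".
        System.out.println(ordered.headMap("pride")); // {and=3, prejudice=1}

        // A HashMap holds the same pairs but offers no order-based queries.
        HashMap<String, Integer> unordered = new HashMap<>(ordered);
        System.out.println(unordered.get("pride"));    // 2
        System.out.println(unordered.equals(ordered)); // true
    }
}
```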
Multimap
- In a multimap, keys may be repeated
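A repeated key can be modelled by letting each key own a list of values. The sketch below is one possible approach built on java.util.TreeMap. MultiMapSketch and its getAll method are hypothetical names for this illustration and are not part of the article's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

public class MultiMapSketch<K extends Comparable<K>, V> {

    // Each key maps to every value that has been added under it, in insertion order.
    private final TreeMap<K, List<V>> data = new TreeMap<>();
    // Total number of (key, value) pairs.
    private int size = 0;

    // Adds the pair (key, value) without replacing earlier values of the same key.
    public void add(K key, V value) {
        data.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        size++;
    }

    // Returns a read-only view of all values stored under key (empty if there are none).
    public List<V> getAll(K key) {
        List<V> values = data.get(key);
        if (values == null) {
            return Collections.<V>emptyList();
        }
        return Collections.unmodifiableList(values);
    }

    public boolean contains(K key) {
        return data.containsKey(key);
    }

    public int getSize() {
        return size;
    }

    public static void main(String[] args) {
        MultiMapSketch<String, Integer> positions = new MultiMapSketch<>();
        // Record the positions at which each word occurs.
        positions.add("pride", 0);
        positions.add("and", 1);
        positions.add("pride", 4);
        System.out.println(positions.getAll("pride"));   // [0, 4]
        System.out.println(positions.getAll("missing")); // []
        System.out.println(positions.getSize());         // 3
    }
}
```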