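Basics
- Arrays offer fast lookup by index but slow insertion and deletion. They occupy one contiguous block of memory, which often has to be allocated larger than the data currently needs.
- Linked lists offer slow lookup but fast insertion and deletion. Their nodes are scattered through memory, and only as many nodes are allocated as there are elements.
- HashMap combines these two data structures to strike a balance between them.
- HashSet is backed by a HashMap.
- HashSet does not allow duplicate elements: adding an element that is already present leaves the set unchanged and add returns false (the existing element is kept, not overwritten).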
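The duplicate-handling behavior can be checked with a small sketch. The class name DuplicateAddDemo and the method run are made up for illustration; only java.util.HashSet and its documented add, size and iterator methods are used.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class: shows what happens when an equal element is added twice.
public class DuplicateAddDemo {

    /** Returns "firstAdd,secondAdd,size,keptOriginal" as a compact string. */
    static String run() {
        Set<String> set = new HashSet<>();

        // Two distinct String objects that are equal according to equals().
        String original = new String("key");
        String duplicate = new String("key");

        boolean first = set.add(original);    // element was absent
        boolean second = set.add(duplicate);  // element already present

        // The set still holds the instance that was added first.
        boolean keptOriginal = set.iterator().next() == original;

        return first + "," + second + "," + set.size() + "," + keptOriginal;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "true,false,1,true"
    }
}
```

The second add returns false and the first instance stays in the set, so an existing element is not replaced by an equal newcomer.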
HashSet:
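HashSet implements the Set interface and does not allow duplicate elements. For two distinct objects with the same content to be treated as duplicates, their class must override both hashCode and equals. HashSet is backed by a HashMap. It permits a null element; because null hashes to 0 it always lands in bucket 0, so in practice it usually shows up first during iteration, although HashSet makes no ordering guarantee.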
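A minimal sketch of why hashCode and equals matter, and of the null element. RawPoint and Point are hypothetical classes invented for this example; they are not part of the JDK.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EqualsHashCodeDemo {

    // Hypothetical class that inherits identity-based equals/hashCode from Object.
    static final class RawPoint {
        final int x, y;
        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical class that overrides both methods consistently.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    /** Two RawPoints with the same fields are never equal, so both are stored. */
    static int rawSize() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2));
        return set.size();
    }

    /** Two Points with the same fields are equal, so only one is stored. */
    static int valueSize() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2));
        return set.size();
    }

    /** null is accepted once; a second add(null) is rejected as a duplicate. */
    static boolean acceptsSingleNull() {
        Set<String> set = new HashSet<>();
        boolean added = set.add(null);
        boolean addedAgain = set.add(null);
        return added && !addedAgain && set.contains(null);
    }

    public static void main(String[] args) {
        System.out.println(rawSize());           // 2
        System.out.println(valueSize());         // 1
        System.out.println(acceptsSingleNull()); // true
    }
}
```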
HashMap:
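- HashMap implements the Map interface and permits null keys and null values. Hash-based addressing can produce collisions, so the underlying structure is an array whose slots hold linked lists (in JDK 8, a long list within one slot is converted to a red-black tree). This resolves collisions and balances lookup speed against insertion and deletion speed.
- Each element of the array is usually called a bucket (the JDK source calls it a bin), not a segment.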
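To make "hash addressing" and "collision" concrete, here is a rough sketch of how a key is mapped to a bucket. The spread step mirrors the hash() helper in the JDK 8 HashMap source and the mask mirrors its (n - 1) & hash indexing; the class and method names (BucketIndexDemo, spread, bucketIndex) are made up for this example, and the real class does much more than this.

```java
public class BucketIndexDemo {

    // Mixes the high 16 bits of hashCode into the low 16 bits; null maps to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // tableLength is assumed to be a power of two, so the mask acts as a cheap modulo.
    static int bucketIndex(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        int n = 16; // default table length

        // "Aa" and "BB" have the same hashCode (2112), so they collide in one bucket.
        System.out.println(bucketIndex("Aa", n) == bucketIndex("BB", n)); // true

        // null always goes to bucket 0.
        System.out.println(bucketIndex(null, n)); // 0

        // For a power-of-two length, masking equals a non-negative modulo.
        int h = spread("example");
        System.out.println((h & (n - 1)) == Math.floorMod(h, n)); // true
    }
}
```

Keys that land in the same bucket are what the linked list (or tree) inside that bucket has to hold.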
HashSet source (JDK 8, abridged)
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, java.io.Serializable
{
static final long serialVersionUID = -5024744406713321676L;
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
// Every entry in the backing HashMap uses this same object as its value
private static final Object PRESENT = new Object();
/**
* Note: no-arg constructor; the backing HashMap gets initial capacity 16 and load factor 0.75.
* Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
* default initial capacity (16) and load factor (0.75).
*/
public HashSet() {
map = new HashMap<>();
}
// ... other constructors (with arguments) omitted
/**
* Note: iterator() simply iterates over the backing map's keySet().
* Returns an iterator over the elements in this set. The elements
* are returned in no particular order.
*
* @return an Iterator over the elements in this set
* @see ConcurrentModificationException
*/
public Iterator<E> iterator() {
return map.keySet().iterator();
}
/**
* Note: add() delegates to map.put(), with the element as the key and PRESENT as the value.
* Adds the specified element to this set if it is not already present.
* More formally, adds the specified element <tt>e</tt> to this set if
* this set contains no element <tt>e2</tt> such that
* <tt>(e==null ? e2==null : e.equals(e2))</tt>.
* If this set already contains the element, the call leaves the set
* unchanged and returns <tt>false</tt>.
*
* @param e element to be added to this set
* @return <tt>true</tt> if this set did not already contain the specified
* element
*/
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
/**
* Note: remove() works the same way, delegating to map.remove().
* Removes the specified element from this set if it is present.
* More formally, removes an element <tt>e</tt> such that
* <tt>(o==null ? e==null : o.equals(e))</tt>,
* if this set contains such an element. Returns <tt>true</tt> if
* this set contained the element (or equivalently, if this set
* changed as a result of the call). (This set will not contain the
* element once the call returns.)
*
* @param o object to be removed from this set, if present
* @return <tt>true</tt> if the set contained the specified element
*/
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
// ... remaining code omitted
}
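- The source makes it plain that HashSet is implemented entirely on top of HashMap. From this we can guess that LinkedHashSet is backed by LinkedHashMap and TreeSet by TreeMap; we will verify that later.
- The add method shows why objects stored in a HashSet must override both hashCode and equals for duplicates to be detected: map.put finds the bucket from the key's hashCode and then compares keys with equals.
- Because every operation delegates to HashMap, which accepts a null key, HashSet accepts a null element.
- So studying the HashSet source really means studying the HashMap source.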
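The delegation pattern described in these points can be reduced to a few lines. MiniSet is a hypothetical class written for this note, not the JDK implementation; it only assumes the documented behavior of HashMap.put (returns the previous value, or null if the key was absent) and HashMap.remove (returns the removed value, or null).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stripped-down set that stores its elements as keys of a HashMap.
public class MiniSet<E> {

    // Shared placeholder value, playing the same role as PRESENT in HashSet.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    /** True only if the element was absent, i.e. put() found no previous value. */
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    /** True only if the element was present, i.e. remove() returned PRESENT. */
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MiniSet<String> set = new MiniSet<>();
        System.out.println(set.add("x"));   // true
        System.out.println(set.add("x"));   // false, duplicate detected by the map
        System.out.println(set.add(null));  // true, the map accepts a null key
        System.out.println(set.size());     // 2
    }
}
```

Duplicate detection and null support both come for free from the map, which is why the interesting logic lives in HashMap rather than in the set wrapper.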
HashMap source (JDK 8, abridged)
public class HashMap<K,V> extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable {
/**
* code1: the default initial capacity must be a power of two
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* Note: this default is used when the constructor does not specify a load factor.
* Th