This post walks through the data structures behind several commonly used collection classes (ArrayList, LinkedList, HashMap).
1.ArrayList
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, Serializable
{
private static final long serialVersionUID = 8683452581122892189L;
private transient Object[] elementData;
private int size;
private static final int MAX_ARRAY_SIZE = 2147483639;
public ArrayList(int paramInt)
{
if (paramInt < 0) {
throw new IllegalArgumentException("Illegal Capacity: " + paramInt);
}
this.elementData = new Object[paramInt];
}
public ArrayList()
{
this(10);
}
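As the code shows, ArrayList is a wrapper around an Object[] array (transient means the field is not part of the object's default serialization). In the version shown here, the no-argument constructor creates a backing array of length 10.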
public boolean add(E paramE)
{
ensureCapacityInternal(this.size + 1);
this.elementData[(this.size++)] = paramE;
return true;
}
Before each element is added, ArrayList checks whether the backing array still has room for it:
private void ensureCapacityInternal(int paramInt) {
this.modCount += 1;
if (paramInt - this.elementData.length > 0)
grow(paramInt);
}
private void grow(int paramInt)
{
int i = this.elementData.length;
int j = i + (i >> 1);
if (j - paramInt < 0)
j = paramInt;
if (j - 2147483639 > 0) {
j = hugeCapacity(paramInt);
}
this.elementData = Arrays.copyOf(this.elementData, j);
}
private static int hugeCapacity(int paramInt) {
if (paramInt < 0)
throw new OutOfMemoryError();
return paramInt > 2147483639 ? 2147483647 : 2147483639;
}
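As the code above shows, when the backing array is too small ArrayList grows it. The new capacity is the old capacity plus half of it (i + (i >> 1)), so roughly 1.5 times the old length rather than half of it, and Arrays.copyOf copies the existing elements into the new array.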
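To make the growth rule concrete, here is a minimal sketch that replays the arithmetic of grow() outside the JDK. The class name ArrayListGrowthSketch and the helper capacitiesWhileAdding are made up for illustration; only the formula comes from the source quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper, not a JDK class: replays the capacity arithmetic of grow().
class ArrayListGrowthSketch {
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // 2147483639

    // New capacity = old + old / 2, raised to minCapacity if that is still too small.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            newCapacity = hugeCapacity(minCapacity);
        }
        return newCapacity;
    }

    static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) {
            throw new OutOfMemoryError(); // int overflow
        }
        return minCapacity > MAX_ARRAY_SIZE ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
    }

    // Capacities visited while appending `count` elements one at a time.
    static List<Integer> capacitiesWhileAdding(int initialCapacity, int count) {
        List<Integer> seen = new ArrayList<>();
        int capacity = initialCapacity;
        seen.add(capacity);
        for (int size = 0; size < count; size++) {
            if (size + 1 - capacity > 0) {
                capacity = nextCapacity(capacity, size + 1);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesWhileAdding(10, 50));
    }
}
```

Starting from the default length of 10, appending 50 elements walks through the capacities 10, 15, 22, 33, 49, 73. Every step copies the whole array, which is why passing an initial capacity to the constructor helps when the final size is known in advance.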
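I have seen people convert an array to an ArrayList just to call contains(Object o) on it (a rant: this is redundant). contains is no shortcut: it runs a for loop over the backing array and compares each element with equals (or with == when searching for null). Hash values are not involved, so the lookup is O(n).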
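The following sketch illustrates that the search relies on equals and not on hash codes. The Id type is hypothetical and deliberately overrides equals without hashCode (something to avoid in real code) so that two equal objects have unrelated hash codes; the indexOf helper is a hand-written loop in the spirit of what ArrayList does, not the JDK method itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContainsSketch {
    // Hypothetical type: equals() compares values, hashCode() is intentionally left as Object's.
    static final class Id {
        final int value;

        Id(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Id && ((Id) other).value == value;
        }
    }

    // Plain linear scan over the array: == for null, equals() otherwise.
    static int indexOf(Object[] elements, Object target) {
        for (int i = 0; i < elements.length; i++) {
            if (target == null ? elements[i] == null : target.equals(elements[i])) {
                return i;
            }
        }
        return -1;
    }

    // The detour criticised in the text: copy into an ArrayList, then call contains().
    static boolean containsViaArrayList(Object[] elements, Object target) {
        List<Object> list = new ArrayList<>(Arrays.asList(elements));
        return list.contains(target);
    }

    public static void main(String[] args) {
        Object[] ids = { new Id(1), new Id(2), null };
        System.out.println(indexOf(ids, new Id(2)));
        System.out.println(containsViaArrayList(ids, new Id(2)));
    }
}
```

Both paths find new Id(2) even though its hash code differs from the stored instance's, and both cost one pass over the elements. The ArrayList version additionally pays for the copy.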
2.LinkedList
Both LinkedList and ArrayList implement the List interface, but their underlying data structures are completely different.
public class LinkedList<E> extends AbstractSequentialList<E>
implements List<E>, Deque<E>, Cloneable, Serializable
{
transient int size = 0;
transient Node<E> first;
transient Node<E> last;
private static final long serialVersionUID = 876323262645176354L;
public LinkedList()
{
}
public LinkedList(Collection<? extends E> paramCollection)
{
this();
addAll(paramCollection);
}
The no-argument constructor does nothing. All we see are two uninitialized fields of type Node, which may not say much yet, so let's look at the add method:
public boolean add(E paramE)
{
linkLast(paramE);
return true;
}
void linkLast(E paramE)
{
Node localNode1 = this.last;
Node localNode2 = new Node(localNode1, paramE, null);
this.last = localNode2;
if (localNode1 == null)
this.first = localNode2;
else
localNode1.next = localNode2;
this.size += 1;
this.modCount += 1;
}
Node(Node<E> paramNode1, E paramE, Node<E> paramNode2) {
this.item = paramE;
this.next = paramNode2;
this.prev = paramNode1;
}
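This makes it clear that LinkedList is a doubly linked list: every node holds a reference to the node before it and to the node after it. add appends a new node after the current last node and points the old last node's next reference at it. If the list was empty, the new node also becomes first.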
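The linking logic can be reproduced in a few lines. This is a simplified sketch, not the JDK class: DoublyLinkedSketch, forward and backward are names invented here, and only linkLast follows the source quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for LinkedList that supports only appending and traversal.
class DoublyLinkedSketch<E> {
    private static final class Node<E> {
        E item;
        Node<E> prev;
        Node<E> next;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;

    // Same steps as linkLast: create the node, update last, then fix first or the old tail.
    void linkLast(E item) {
        Node<E> oldLast = last;
        Node<E> node = new Node<>(oldLast, item, null);
        last = node;
        if (oldLast == null) {
            first = node;        // list was empty
        } else {
            oldLast.next = node; // hook the new node behind the old tail
        }
        size++;
    }

    int size() {
        return size;
    }

    // Walk the next references from the head.
    List<E> forward() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = first; n != null; n = n.next) {
            out.add(n.item);
        }
        return out;
    }

    // Walk the prev references from the tail.
    List<E> backward() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = last; n != null; n = n.prev) {
            out.add(n.item);
        }
        return out;
    }

    public static void main(String[] args) {
        DoublyLinkedSketch<String> list = new DoublyLinkedSketch<>();
        list.linkLast("a");
        list.linkLast("b");
        list.linkLast("c");
        System.out.println(list.forward() + " " + list.backward());
    }
}
```

Appending is O(1) because last is always at hand. Reaching an element by position, on the other hand, means following the references one node at a time.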
3.HashMap
public class HashMap<K, V> extends AbstractMap<K, V>
implements Map<K, V>, Cloneable, Serializable
{
static final int DEFAULT_INITIAL_CAPACITY = 16;
static final int MAXIMUM_CAPACITY = 1073741824;
static final float DEFAULT_LOAD_FACTOR = 0.75F;
transient Entry<K, V>[] table;
final transient int hashSeed = Hashing.randomHashSeed(this);
private transient Set<Map.Entry<K, V>> entrySet = null;
public HashMap(int paramInt, float paramFloat)
{
if (paramInt < 0) {
throw new IllegalArgumentException("Illegal initial capacity: " + paramInt);
}
if (paramInt > 1073741824)
paramInt = 1073741824;
if ((paramFloat <= 0.0F) || (Float.isNaN(paramFloat))) {
throw new IllegalArgumentException("Illegal load factor: " + paramFloat);
}
int i = 1;
while (i < paramInt) {
i <<= 1;
}
this.loadFactor = paramFloat;
this.threshold = (int)Math.min(i * paramFloat, 1.073742E+009F);
this.table = new Entry[i];
this.useAltHashing = ((VM.isBooted()) && (i >= Holder.ALTERNATIVE_HASHING_THRESHOLD));
init();
}
public HashMap()
{
this(16, 0.75F);
}
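HashMap also stores its data in an array, here an array of Entry objects. The no-argument constructor creates an array of length 16, and any other requested capacity is rounded up to the next power of two. threshold is the smaller of capacity * loadFactor (75% of the array length by default, so 12 for a length of 16) and MAXIMUM_CAPACITY + 1 (which the decompiler prints as 1.073742E+009F). What threshold is used for becomes clear as we read on.
Adding an element to HashMap: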
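The sizing arithmetic in the constructor can be tried out in isolation. The sketch below copies only the rounding loop and the threshold formula; HashMapSizingSketch and its method names are invented for illustration.

```java
// Illustrative helper, not a JDK class: the capacity and threshold arithmetic of the constructor.
class HashMapSizingSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30; // 1073741824

    // Smallest power of two that is >= requested, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested > MAXIMUM_CAPACITY) {
            requested = MAXIMUM_CAPACITY;
        }
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Number of entries at which the table becomes a candidate for resizing.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(17);
        System.out.println(capacity + " " + threshold(capacity, 0.75f));
    }
}
```

A request for 17 buckets therefore yields a table of 32 with a threshold of 24. Keeping the length a power of two is what allows indexFor to use a bit mask instead of a modulo.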
public V put(K paramK, V paramV)
{
if (paramK == null)
return putForNullKey(paramV);
int i = hash(paramK);
int j = indexFor(i, this.table.length);
for (Entry localEntry = this.table[j]; localEntry != null; localEntry = localEntry.next)
{
Object localObject1;
if ((localEntry.hash == i) && (((localObject1 = localEntry.key) == paramK) || (paramK.equals(localObject1)))) {
Object localObject2 = localEntry.value;
localEntry.value = paramV;
localEntry.recordAccess(this);
return localObject2;
}
}
this.modCount += 1;
addEntry(i, paramK, paramV, j);
return null;
}
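put checks the key first: if the key already exists, the old value is overwritten and returned. Put another way, HashMap maintains a hash table internally. Its hash function together with indexFor determines the bucket for an entry, and if that bucket already holds an entry with an equal key, the value is replaced. A null key is handled separately by putForNullKey.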
static int indexFor(int paramInt1, int paramInt2)
{
return paramInt1 & paramInt2 - 1;
}
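Why does a bit mask work as an index function? For a power-of-two length, hash & (length - 1) keeps exactly the low bits of the hash, which is the same as a non-negative modulo. The sketch below checks this and also shows what would go wrong with a length that is not a power of two; IndexForSketch and reachableBuckets are illustrative names.

```java
import java.util.Set;
import java.util.TreeSet;

class IndexForSketch {
    // Same expression as HashMap.indexFor, with explicit parentheses.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Bucket indexes that the hashes 0 .. 4 * length - 1 can land in.
    static Set<Integer> reachableBuckets(int length) {
        Set<Integer> buckets = new TreeSet<>();
        for (int hash = 0; hash < 4 * length; hash++) {
            buckets.add(indexFor(hash, length));
        }
        return buckets;
    }

    public static void main(String[] args) {
        System.out.println(reachableBuckets(16)); // every bucket is reachable
        System.out.println(reachableBuckets(10)); // the mask 9 (binary 1001) leaves gaps
    }
}
```

With a length of 16 every bucket from 0 to 15 is reachable, while a length of 10 would only ever use buckets 0, 1, 8 and 9. This is the reason the constructor rounds the capacity up to a power of two.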
addEntry (called when no matching key was found, so a new entry is added):
void addEntry(int paramInt1, K paramK, V paramV, int paramInt2)
{
if ((this.size >= this.threshold) && (null != this.table[paramInt2])) {
resize(2 * this.table.length);
paramInt1 = null != paramK ? hash(paramK) : 0;
paramInt2 = indexFor(paramInt1, this.table.length);
}
createEntry(paramInt1, paramK, paramV, paramInt2);
}
void resize(int paramInt)
{
Entry[] arrayOfEntry1 = this.table;
int i = arrayOfEntry1.length;
if (i == 1073741824) {
this.threshold = 2147483647;
return;
}
Entry[] arrayOfEntry2 = new Entry[paramInt];
boolean bool1 = this.useAltHashing;
this.useAltHashing |= ((VM.isBooted()) && (paramInt >= Holder.ALTERNATIVE_HASHING_THRESHOLD));
boolean bool2 = bool1 ^ this.useAltHashing;
transfer(arrayOfEntry2, bool2);
this.table = arrayOfEntry2;
this.threshold = (int)Math.min(paramInt * this.loadFactor, 1.073742E+009F);
}
void createEntry(int paramInt1, K paramK, V paramV, int paramInt2)
{
Entry localEntry = this.table[paramInt2];
this.table[paramInt2] = new Entry(paramInt1, paramK, paramV, localEntry);
this.size += 1;
}
Entry(int paramInt, K paramK, V paramV, Entry<K, V> paramEntry)
{
this.value = paramV;
this.next = paramEntry;
this.key = paramK;
this.hash = paramInt;
}
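Adding an entry: addEntry checks whether the number of stored entries (size, not the array length) has reached threshold and whether the target bucket is already occupied. Only when both are true is the table resized. The new Entry carries a next reference pointing at the entry that previously headed that bucket, so entries falling into the same bucket form a singly linked list with the newest one at the head.
Resizing: the table is doubled in length and the existing entries are transferred to the new array, with each entry's bucket index recomputed against the new length.
With that understood, how HashMap retrieves the entry for a given key is straightforward: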
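A small sketch of what doubling does to bucket indexes, under the assumption that the stored hash itself stays the same across the resize. Because the new mask has exactly one more bit than the old one, an entry either stays at its old index or moves up by the old length. ResizeSketch and its method names are illustrative.

```java
class ResizeSketch {
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // The condition used by addEntry: threshold reached AND the target bucket is not empty.
    static boolean shouldResize(int size, int threshold, boolean bucketOccupied) {
        return size >= threshold && bucketOccupied;
    }

    // After doubling, an entry sits at its old index or at old index + old length.
    static boolean staysOrMovesByOldLength(int hash, int oldLength) {
        int oldIndex = indexFor(hash, oldLength);
        int newIndex = indexFor(hash, oldLength * 2);
        return newIndex == oldIndex || newIndex == oldIndex + oldLength;
    }

    public static void main(String[] args) {
        // 5 and 21 share bucket 5 in a table of 16 but are separated in a table of 32.
        System.out.println(indexFor(5, 16) + " " + indexFor(21, 16));
        System.out.println(indexFor(5, 32) + " " + indexFor(21, 32));
    }
}
```

Doubling therefore tends to split each chain in two, which shortens the lists that put and get have to walk.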
final Entry<K, V> getEntry(Object paramObject)
{
int i = paramObject == null ? 0 : hash(paramObject);
for (Entry localEntry = this.table[indexFor(i, this.table.length)];
localEntry != null;
localEntry = localEntry.next)
{
Object localObject;
if ((localEntry.hash == i) && (((localObject = localEntry.key) == paramObject) || ((paramObject != null) && (paramObject.equals(localObject)))))
{
return localEntry;
}
}
return null;
}
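Lookup: the hash locates the bucket in the array, then the keys in that bucket's chain are compared one by one, following Entry.next until a match is found or the chain ends.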
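Putting put and get together, a stripped-down chained hash table might look like the sketch below. It is not the JDK implementation: ChainedMapSketch is an invented name, the table never resizes, and key.hashCode() is used directly where HashMap applies its own hash() function on top.

```java
// Fixed-size chained hash table; a simplification for illustration only.
class ChainedMapSketch<K, V> {
    private static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    ChainedMapSketch(int powerOfTwoLength) {
        if (powerOfTwoLength <= 0 || (powerOfTwoLength & (powerOfTwoLength - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        table = (Entry<K, V>[]) new Entry[powerOfTwoLength];
    }

    // Assumption: no extra bit mixing; a null key hashes to 0.
    private static int hash(Object key) {
        return key == null ? 0 : key.hashCode();
    }

    // Locate the bucket, then walk the chain comparing hash and key.
    private Entry<K, V> getEntry(Object key) {
        int h = hash(key);
        for (Entry<K, V> e = table[h & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
                return e;
            }
        }
        return null;
    }

    V get(Object key) {
        Entry<K, V> e = getEntry(key);
        return e == null ? null : e.value;
    }

    // Replace the value of an existing key, otherwise insert at the head of the chain.
    V put(K key, V value) {
        Entry<K, V> existing = getEntry(key);
        if (existing != null) {
            V old = existing.value;
            existing.value = value;
            return old;
        }
        int h = hash(key);
        int index = h & (table.length - 1);
        table[index] = new Entry<>(h, key, value, table[index]);
        size++;
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedMapSketch<Integer, String> map = new ChainedMapSketch<>(16);
        map.put(1, "a");
        map.put(17, "b"); // same bucket as 1 in a table of 16
        System.out.println(map.get(1) + " " + map.get(17));
    }
}
```

The keys 1 and 17 collide in a table of 16, yet both remain retrievable because the chain is searched by key equality and not by bucket alone.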
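Below we test the add, remove, and lookup performance of ArrayList, LinkedList, and HashMap in a single thread (pics or it didn't happen: the run simulated 10 million records, and the timings will vary with the performance of the machine).
The results speak for themselves.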
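For readers who want to repeat such a measurement, here is a rough timing sketch. It is not the test behind the results mentioned above: the class CollectionTimingSketch, its methods, and the element count of one million (chosen to keep heap usage modest) are all assumptions, and single-shot System.nanoTime measurements are distorted by JIT warm-up and garbage collection, so the numbers are only a coarse indication.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class CollectionTimingSketch {
    // Wall-clock time of a single run, in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Append 0 .. n-1 to the end of the list.
    static long timeAppend(List<Integer> list, int n) {
        return timeNanos(() -> {
            for (int i = 0; i < n; i++) {
                list.add(i);
            }
        });
    }

    // Insert the keys 0 .. n-1 into the map.
    static long timePut(Map<Integer, Integer> map, int n) {
        return timeNanos(() -> {
            for (int i = 0; i < n; i++) {
                map.put(i, i);
            }
        });
    }

    // Linear search in a list versus hashed lookup in a map.
    static long timeContains(List<Integer> list, Integer target) {
        return timeNanos(() -> list.contains(target));
    }

    static long timeGet(Map<Integer, Integer> map, Integer key) {
        return timeNanos(() -> map.get(key));
    }

    public static void main(String[] args) {
        int n = 1_000_000; // assumption: smaller than the 10 million used in the text
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        Map<Integer, Integer> hashMap = new HashMap<>();

        System.out.println("ArrayList add:       " + timeAppend(arrayList, n) / 1_000_000 + " ms");
        System.out.println("LinkedList add:      " + timeAppend(linkedList, n) / 1_000_000 + " ms");
        System.out.println("HashMap put:         " + timePut(hashMap, n) / 1_000_000 + " ms");
        System.out.println("ArrayList contains:  " + timeContains(arrayList, n - 1) / 1_000 + " us");
        System.out.println("LinkedList contains: " + timeContains(linkedList, n - 1) / 1_000 + " us");
        System.out.println("HashMap get:         " + timeGet(hashMap, n - 1) / 1_000 + " us");
    }
}
```

For anything beyond a quick comparison, a benchmark harness such as JMH gives more trustworthy numbers.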