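Introduction to List
The commonly used List is a general-purpose collection class that often replaces arrays: it grows dynamically, so you do not have to size the array by hand when writing code.
The key points of the List<T> implementation are its management of a dynamic array, its automatic growth mechanism, and its optimized memory operations.
List Source Code Analysis
1. Basic structure
The source shows that List is a class wrapping a dynamic array: elements are stored in an internal array, and when no capacity is specified the initial capacity is 0.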
public class List<T> : IList<T>, System.Collections.IList, IReadOnlyList<T>
{
    private const int _defaultCapacity = 4;
    private T[] _items;
    [ContractPublicPropertyName("Count")]
    private int _size;
    private int _version;
    [NonSerialized]
    private Object _syncRoot;
    static readonly T[] _emptyArray = new T[0];
    // Other constructors and methods...
}
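2. Constructors
- Default constructor: initializes the list with a shared empty array
- Constructor with a capacity: allocates an empty array of the given size
- Constructor that copies from a collection: copies the elements of the given collection into a new array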
// Constructs a List. The list is initially empty and has a capacity
// of zero. Upon adding the first element to the list the capacity is
// increased to _defaultCapacity (4), and then doubled as required.
public List() {
    _items = _emptyArray;
}
// Constructs a List with a given initial capacity. The list is
// initially empty, but will have room for the given number of elements
// before any reallocations are required.
//
public List(int capacity) {
    if (capacity < 0)
    {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
    Contract.EndContractBlock();
    if (capacity == 0)
        _items = _emptyArray;
    else
        _items = new T[capacity];
}
// Constructs a List, copying the contents of the given collection. The
// size and capacity of the new list will both be equal to the size of the
// given collection.
//
public List(IEnumerable<T> collection) {
    if (collection == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
    Contract.EndContractBlock();
    ICollection<T> c = collection as ICollection<T>;
    if (c != null) {
        int count = c.Count;
        if (count == 0)
        {
            _items = _emptyArray;
        }
        else {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }
    else {
        _size = 0;
        _items = _emptyArray;
        // This enumerable could be empty. Let Add allocate a new array, if needed.
        // Note it will also go to _defaultCapacity first, not 1, then 2, etc.
        using (IEnumerator<T> en = collection.GetEnumerator()) {
            while (en.MoveNext()) {
                Add(en.Current);
            }
        }
    }
}
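As a rough illustration of the three constructors, the sketch below builds one list per code path and prints its Count and Capacity. The names ConstructorDemo, Describe, and Lazy are hypothetical helpers made up for this example; only List<T> itself comes from the source above.

```csharp
using System;
using System.Collections.Generic;

public static class ConstructorDemo
{
    // Returns { Count, Capacity } for the given list.
    public static int[] Describe(List<int> list)
    {
        return new[] { list.Count, list.Capacity };
    }

    // A lazy sequence that does not implement ICollection<T>, so the
    // copying constructor falls back to calling Add for each element.
    public static IEnumerable<int> Lazy(int n)
    {
        for (int i = 0; i < n; i++) yield return i;
    }

    public static void Main()
    {
        var empty = new List<int>();                    // shared empty array
        var sized = new List<int>(10);                  // T[10] allocated up front
        var copied = new List<int>(new[] { 1, 2, 3 });  // ICollection<T> path
        var streamed = new List<int>(Lazy(3));          // enumerator path, grows via Add

        Console.WriteLine(string.Join(",", Describe(empty)));    // 0,0
        Console.WriteLine(string.Join(",", Describe(sized)));    // 0,10
        Console.WriteLine(string.Join(",", Describe(copied)));   // 3,3
        Console.WriteLine(string.Join(",", Describe(streamed))); // 3,4
    }
}
```

The last line shows the effect of the comment in the source: a sequence without a known count goes straight to _defaultCapacity (4) on the first Add instead of growing 1, 2, 3.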
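3. Indexer and properties
- The indexer this[int index] gets or sets the element at a given position
- The Count property returns the current number of elements
- The Capacity property gets or sets the size of the internal array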
// Sets or Gets the element at the given index.
//
public T this[int index] {
    get {
        // Following trick can reduce the range check by one
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        return _items[index];
    }
    set {
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        _items[index] = value;
        _version++;
    }
}
// Gets and sets the capacity of this list. The capacity is the size of
// the internal array used to hold items. When set, the internal
// array of the list is reallocated to the given capacity.
//
public int Capacity {
    get {
        Contract.Ensures(Contract.Result<int>() >= 0);
        return _items.Length;
    }
    set {
        if (value < _size) {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
        }
        Contract.EndContractBlock();
        if (value != _items.Length) {
            if (value > 0) {
                T[] newItems = new T[value];
                if (_size > 0) {
                    Array.Copy(_items, 0, newItems, 0, _size);
                }
                _items = newItems;
            }
            else {
                _items = _emptyArray;
            }
        }
    }
}
// Read-only property describing how many elements are in the List.
public int Count {
    get {
        Contract.Ensures(Contract.Result<int>() >= 0);
        return _size;
    }
}
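The "(uint) index >= (uint)_size" comparison in the indexer is worth a closer look. The sketch below isolates the trick in a hypothetical helper named InRange (not part of List<T>) to show why a single unsigned comparison covers both the negative and the too-large case.

```csharp
using System;

public static class RangeCheckDemo
{
    // One unsigned comparison replaces "index >= 0 && index < size":
    // a negative int reinterpreted as uint becomes a value above
    // int.MaxValue, so it can never be smaller than a valid size.
    public static bool InRange(int index, int size)
    {
        return unchecked((uint)index < (uint)size);
    }

    public static void Main()
    {
        Console.WriteLine(InRange(0, 3));   // True
        Console.WriteLine(InRange(2, 3));   // True
        Console.WriteLine(InRange(3, 3));   // False
        Console.WriteLine(InRange(-1, 3));  // False
    }
}
```

The unchecked keyword is added here only so the cast behaves the same if the project is compiled with overflow checking enabled.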
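4. Dynamic growth mechanism
- When the number of elements in the List reaches the capacity of the current array, the array is grown automatically
- The growth strategy is to double the current capacity, which reduces the number of memory allocations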
// Adds the given object to the end of this list. The size of the list is
// increased by one. If required, the capacity of the list is doubled
// before adding the new element.
//
public void Add(T item) {
    if (_size == _items.Length) EnsureCapacity(_size + 1);
    _items[_size++] = item;
    _version++;
}
// Ensures that the capacity of this list is at least the given minimum
// value. If the current capacity of the list is less than min, the
// capacity is increased to twice the current capacity or to min,
// whichever is larger.
private void EnsureCapacity(int min) {
    if (_items.Length < min) {
        int newCapacity = _items.Length == 0 ? _defaultCapacity : _items.Length * 2;
        // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
        // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
        if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
        if (newCapacity < min) newCapacity = min;
        Capacity = newCapacity;
    }
}
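In the Add function above, each call adds one element. Add first checks whether the capacity is sufficient and, if not, calls EnsureCapacity to grow it. The code shows that each time the capacity runs out, the array capacity doubles, starting from _defaultCapacity, which is 4. The growth path is therefore 4, 8, 16, 32, 64, 128, 256, 512, 1024, and so on.
Because List uses an array as its underlying data structure, looking up an element by index is fast. Growing, however, requires allocating a new array and copying the existing data into it; every array that is replaced becomes garbage, which adds work for the garbage collector (GC).
Doubling the capacity keeps the number of reallocations low (logarithmic in the number of elements), but a long run of Add calls on an unsized list still reallocates repeatedly and adds GC pressure. It can also waste memory when the element count falls just past a boundary: growing a list to 1025 elements through Add leaves it with a capacity of 2048.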
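The growth path can be observed directly. The following sketch adds elements one at a time and records every value the Capacity property passes through; GrowthDemo and CapacityTrace are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GrowthDemo
{
    // Adds `count` items one by one and records each new capacity.
    public static List<int> CapacityTrace(int count)
    {
        var list = new List<int>();
        var trace = new List<int>();
        int last = list.Capacity; // 0 for a list created without a capacity
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != last)
            {
                last = list.Capacity;
                trace.Add(last);
            }
        }
        return trace;
    }

    public static void Main()
    {
        // With the EnsureCapacity logic quoted above this prints:
        // 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048
        Console.WriteLine(string.Join(", ", CapacityTrace(1025)));
    }
}
```

Ten reallocations are enough for 1025 elements, but the last one leaves 1023 slots unused.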
5. Common methods
- Remove: removes the first matching element from the list. The source shows that it first calls IndexOf to find the element's index and then calls RemoveAt to delete that position. RemoveAt mainly calls Array.Copy to overwrite within the array, shifting every element after the removed position forward by one.
// Removes the first occurrence of the given item. The size of the list is
// decreased by one if the item was found.
public bool Remove(T item) {
    int index = IndexOf(item);
    if (index >= 0) {
        RemoveAt(index);
        return true;
    }
    return false;
}
// Removes the element at the given index. The size of the list is
// decreased by one.
//
public void RemoveAt(int index) {
    if ((uint)index >= (uint)_size) {
        ThrowHelper.ThrowArgumentOutOfRangeException();
    }
    Contract.EndContractBlock();
    _size--;
    if (index < _size) {
        Array.Copy(_items, index + 1, _items, index, _size - index);
    }
    _items[_size] = default(T);
    _version++;
}
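To make the shifting visible, the sketch below replays the RemoveAt steps on a plain array and then shows that List<T>.Remove deletes only the first match. RemoveDemo and its static RemoveAt helper are hypothetical; they mirror the quoted logic rather than reuse it.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveDemo
{
    // Mirrors the quoted RemoveAt on a raw array: shift the tail left by
    // one, clear the freed slot, and return the new logical size.
    public static int RemoveAt<T>(T[] items, int size, int index)
    {
        if ((uint)index >= (uint)size) throw new ArgumentOutOfRangeException(nameof(index));
        size--;
        if (index < size)
        {
            Array.Copy(items, index + 1, items, index, size - index);
        }
        items[size] = default(T); // drop the stale copy so the GC can reclaim references
        return size;
    }

    public static void Main()
    {
        var items = new[] { 10, 20, 30, 40 };
        int size = RemoveAt(items, 4, 1);
        Console.WriteLine(size);                     // 3
        Console.WriteLine(string.Join(",", items));  // 10,30,40,0

        var list = new List<int> { 1, 2, 1 };
        list.Remove(1);                              // only the first 1 is removed
        Console.WriteLine(string.Join(",", list));   // 2,1
    }
}
```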
- Contains: determines whether the list contains the given element, using a linear search that compares each stored element with the target.
// Contains returns true if the specified element is in the List.
// It does a linear, O(n) search. Equality is determined by calling
// item.Equals().
//
public bool Contains(T item) {
    if ((Object) item == null) {
        for (int i = 0; i < _size; i++)
            if ((Object) _items[i] == null)
                return true;
        return false;
    }
    else {
        EqualityComparer<T> c = EqualityComparer<T>.Default;
        for (int i = 0; i < _size; i++) {
            if (c.Equals(_items[i], item)) return true;
        }
        return false;
    }
}
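Because Contains compares through EqualityComparer<T>.Default, the result depends on how the element type defines equality. The sketch below contrasts a type that implements IEquatable<T> with one that does not; Point and Plain are hypothetical types invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Value-style equality: two points with the same coordinates are equal.
public sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public bool Equals(Point other) { return other != null && X == other.X && Y == other.Y; }
    public override bool Equals(object obj) { return Equals(obj as Point); }
    public override int GetHashCode() { return X * 31 + Y; }
}

// No equality override: falls back to reference equality.
public sealed class Plain
{
    public int X;
}

public static class ContainsDemo
{
    public static bool FindsEqualPoint()
    {
        var points = new List<Point> { new Point(1, 2) };
        return points.Contains(new Point(1, 2));
    }

    public static bool FindsEqualPlain()
    {
        var plains = new List<Plain> { new Plain { X = 1 } };
        return plains.Contains(new Plain { X = 1 });
    }

    public static void Main()
    {
        Console.WriteLine(FindsEqualPoint()); // True
        Console.WriteLine(FindsEqualPlain()); // False
    }
}
```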
- Clear: empties the list. Clear does not release the array; it resets the used elements to their default values and sets _size to 0, so Count becomes 0 while Capacity stays unchanged.
// Clears the contents of List.
public void Clear() {
    if (_size > 0) {
        Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
        _size = 0;
    }
    _version++;
}
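A small sketch, assuming nothing beyond the public List<T> API, shows that Clear leaves the backing array in place and that setting Capacity to 0 (the setter quoted earlier) is one way to release it. ClearDemo and FillThenClear are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ClearDemo
{
    // Fills a presized list, clears it, optionally releases the array,
    // and returns { Count, Capacity }.
    public static int[] FillThenClear(int n, bool release)
    {
        var list = new List<int>(n);
        for (int i = 0; i < n; i++) list.Add(i);
        list.Clear();
        if (release) list.Capacity = 0; // swaps in the shared empty array
        return new[] { list.Count, list.Capacity };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FillThenClear(100, false))); // 0,100
        Console.WriteLine(string.Join(",", FillThenClear(100, true)));  // 0,0
    }
}
```

Keeping the array after Clear is convenient when the list is about to be refilled, since no new allocation is needed.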
- ToArray: copies the elements of the list into a new array.
// ToArray returns a new array containing the contents of the List.
// This requires copying the List, which is an O(n) operation.
public T[] ToArray() {
    Contract.Ensures(Contract.Result<T[]>() != null);
    Contract.Ensures(Contract.Result<T[]>().Length == Count);
    T[] array = new T[_size];
    Array.Copy(_items, 0, array, 0, _size);
    return array;
}
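Since ToArray allocates a fresh T[_size], the result is sized to Count rather than Capacity and is independent of the list afterwards. The sketch below illustrates both points; ToArrayDemo and Snapshot are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ToArrayDemo
{
    // Copies the list, then mutates the list to show the copy is detached.
    public static int[] Snapshot()
    {
        var list = new List<int>(16) { 1, 2, 3 }; // Capacity 16, Count 3
        int[] copy = list.ToArray();              // Length follows Count, not Capacity
        list[0] = 99;                             // does not affect the copy
        list.Add(4);
        return copy;
    }

    public static void Main()
    {
        int[] copy = Snapshot();
        Console.WriteLine(copy.Length);             // 3
        Console.WriteLine(string.Join(",", copy));  // 1,2,3
    }
}
```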
- Insert: inserts an element at the given position. Like Add, it first checks the capacity and grows the array if needed. The source shows that Insert calls Array.Copy to move the elements from the insertion index onward back by one position.
// Inserts an element into this list at a given index. The size of the list
// is increased by one. If required, the capacity of the list is doubled
// before inserting the new element.
//
public void Insert(int index, T item) {
    // Note that insertions at the end are legal.
    if ((uint) index > (uint)_size) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_ListInsert);
    }
    Contract.EndContractBlock();
    if (_size == _items.Length) EnsureCapacity(_size + 1);
    if (index < _size) {
        Array.Copy(_items, index, _items, index + 1, _size - index);
    }
    _items[index] = item;
    _size++;
    _version++;
}
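The range check in Insert uses ">" rather than ">=", so inserting at index Count is allowed and behaves like Add. A brief sketch of the three cases follows; InsertDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class InsertDemo
{
    public static List<int> Build()
    {
        var list = new List<int> { 1, 2, 4 };
        list.Insert(2, 3);          // shifts 4 back by one: 1,2,3,4
        list.Insert(list.Count, 5); // inserting at the end is legal: 1,2,3,4,5
        return list;
    }

    // Returns true if inserting past the end is rejected.
    public static bool RejectsPastEnd()
    {
        var list = new List<int> { 1 };
        try
        {
            list.Insert(list.Count + 1, 0);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Build())); // 1,2,3,4,5
        Console.WriteLine(RejectsPastEnd());           // True
    }
}
```

Inserting near the front moves almost every element, so building a list with repeated Insert(0, x) calls costs quadratic time overall.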
- Sort: calls Array.Sort. The DepthLimitedQuickSort routine shown after Sort is a quicksort that switches to Heapsort once depthLimit reaches 0; .NET Framework 4.5 and later use an introspective sort that combines quicksort, heapsort, and insertion sort. The sort is unstable, so equal elements may not keep their original order.
public void Sort(int index, int count, IComparer<T> comparer) {
    if (index < 0) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
    if (count < 0) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.count, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
    if (_size - index < count)
        ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_InvalidOffLen);
    Contract.EndContractBlock();
    Array.Sort<T>(_items, index, count, comparer);
    _version++;
}
internal static void DepthLimitedQuickSort(T[] keys, int left, int right, IComparer<T> comparer, int depthLimit) {
    do {
        if (depthLimit == 0) {
            Heapsort(keys, left, right, comparer);
            return;
        }
        int i = left;
        int j = right;
        // pre-sort the low, middle (pivot), and high values in place.
        // this improves performance in the face of already sorted data, or
        // data that is made up of multiple sorted runs appended together.
        int middle = i + ((j - i) >> 1);
        SwapIfGreater(keys, comparer, i, middle); // swap the low with the mid point
        SwapIfGreater(keys, comparer, i, j); // swap the low with the high
        SwapIfGreater(keys, comparer, middle, j); // swap the middle with the high
        T x = keys[middle];
        do {
            while (comparer.Compare(keys[i], x) < 0) i++;
            while (comparer.Compare(x, keys[j]) < 0) j--;
            Contract.Assert(i >= left && j <= right, "(i>=left && j<=right) Sort failed - Is your IComparer bogus?");
            if (i > j) break;
            if (i < j) {
                T key = keys[i];
                keys[i] = keys[j];
                keys[j] = key;
            }
            i++;
            j--;
        } while (i <= j);
        // The next iteration of the while loop is to "recursively" sort the larger half of the array and the
        // following calls recursively sort the smaller half. So we subtract one from depthLimit here so
        // both sorts see the new value.
        depthLimit--;
        if (j - left <= right - i) {
            if (left < j) DepthLimitedQuickSort(keys, left, j, comparer, depthLimit);
            left = i;
        }
        else {
            if (i < right) DepthLimitedQuickSort(keys, i, right, comparer, depthLimit);
            right = j;
        }
    } while (left < right);
}
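The Sort overload quoted above takes a start index, a count, and a comparer, so it can sort just a slice of the list. The sketch below exercises that overload from the caller's side; SortDemo and its method names are hypothetical, and Comparer<int>.Create is assumed to be available (.NET Framework 4.5 or later).

```csharp
using System;
using System.Collections.Generic;

public static class SortDemo
{
    // Sorts only the three elements starting at index 1.
    public static List<int> SortMiddle()
    {
        var list = new List<int> { 5, 3, 1, 4, 2 };
        list.Sort(1, 3, Comparer<int>.Default); // 3,1,4 becomes 1,3,4
        return list;
    }

    // Sorts the whole list in descending order with a custom comparer.
    public static List<int> SortDescending()
    {
        var list = new List<int> { 5, 3, 1, 4, 2 };
        var descending = Comparer<int>.Create((a, b) => b.CompareTo(a));
        list.Sort(0, list.Count, descending);
        return list;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SortMiddle()));     // 5,1,3,4,2
        Console.WriteLine(string.Join(",", SortDescending())); // 5,4,3,2,1
    }
}
```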
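Summary
List is general-purpose, but not every operation is cheap. Indexed access is O(1) and Add is amortized O(1), while many other methods (Contains, IndexOf, Remove, Insert) run in linear time. As elements keep being added, the list allocates a new array several times and discards the old one, which creates collection pressure when the GC eventually runs. If the maximum number of elements is known in advance, pass it to the constructor as the capacity; the list then never has to discard its array and allocate a larger one for lack of space.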
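The effect of presizing can be measured by counting how often Capacity changes while elements are added. This is a minimal sketch; PresizeDemo and CountReallocations are hypothetical names, and the counts assume the doubling strategy shown in EnsureCapacity.

```csharp
using System;
using System.Collections.Generic;

public static class PresizeDemo
{
    // Counts how many times the backing array is replaced while
    // adding `adds` elements to a list created with `initialCapacity`.
    public static int CountReallocations(int initialCapacity, int adds)
    {
        var list = new List<int>(initialCapacity);
        int reallocations = 0;
        int last = list.Capacity;
        for (int i = 0; i < adds; i++)
        {
            list.Add(i);
            if (list.Capacity != last)
            {
                reallocations++;
                last = list.Capacity;
            }
        }
        return reallocations;
    }

    public static void Main()
    {
        Console.WriteLine(CountReallocations(0, 1025));    // 10
        Console.WriteLine(CountReallocations(1025, 1025)); // 0
    }
}
```

With the capacity supplied up front, the list allocates exactly one array of 1025 elements instead of ten arrays ending at 2048.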