C#中List和Dictionary的底层实现

前言

        记录模拟面试中出现的知识点。

一、List

        List是C#中一个常用的可伸缩数组,为此我们不用手动去分配List数组的大小。从源码中可以知道,List 继承于IList,IReadOnlyList,IList是提供了主要的接口,IReadOnlyList提供了迭代接口。我们可以推测到List内部是用数组实现的,并且没有指定初始容量时,初始容量为0。 

    // Implements a variable-size List that uses an array of objects to store the
    // elements. A List has a capacity, which is the allocated length
    // of the internal array. As elements are added to a List, the capacity
    // of the List is automatically increased as required by reallocating the
    // internal array.
    // 
    [DebuggerTypeProxy(typeof(Mscorlib_CollectionDebugView<>))]
    [DebuggerDisplay("Count = {Count}")]
    [Serializable]
    public class List<T> : IList<T>, System.Collections.IList, IReadOnlyList<T>
    {
        private const int _defaultCapacity = 4;
 
        private T[] _items;
        [ContractPublicPropertyName("Count")]
        private int _size;
        private int _version;
        [NonSerialized]
        private Object _syncRoot;
        
        static readonly T[]  _emptyArray = new T[0];        
            
        // Constructs a List. The list is initially empty and has a capacity
        // of zero. Upon adding the first element to the list the capacity is
        // increased to 16, and then increased in multiples of two as required.
        public List() {
            _items = _emptyArray;
        }
    
        // Constructs a List with a given initial capacity. The list is
        // initially empty, but will have room for the given number of elements
        // before any reallocations are required.
        // 
        public List(int capacity) {
            if (capacity < 0) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
            Contract.EndContractBlock();
 
            if (capacity == 0)
                _items = _emptyArray;
            else
                _items = new T[capacity];
        }

        Add接口:

        Add函数中,每增加一个元素,Add接口会先检查容量够不够,如果不否则用EnsureCapacity 来增加容量。

        在 EnsureCapacity 中,每次容量不够的时候,整个数组的容量都会扩充一倍。_defaultCapacity 是容量的默认值为4,随后扩充容量为8、16、32、64...所以list每次申请的内存数量都是之前item元素数量两倍。然后将当前所有的item元素系但添加的元素复制到新的内存。

        List用数组作为底层数据结构,好处是索引方式提取元素快,但是扩容的时候很糟糕,每次new数组都会造成内存垃圾,这给GC带来了很多负担。

        虽然说这里按照2指数扩容的方式可以为GC减轻负担,但是数组连续被替换掉还是会为GC造成不小的负担,尤其是List中频繁进行增删操作时。

        另外,如果数量不得到还会浪费内存空间。当元素数量超过512时,如果现在为520个元素,List会扩容到1024个元素,剩下的504个空间就造成浪费。

​        // Adds the given object to the end of this list. The size of the list is
        // increased by one. If required, the capacity of the list is doubled
        // before adding the new element.
        //
        public void Add(T item) {
            if (_size == _items.Length) EnsureCapacity(_size + 1);
            _items[_size++] = item;
            _version++;
        }

        // Ensures that the capacity of this list is at least the given minimum
        // value. If the currect capacity of the list is less than min, the
        // capacity is increased to twice the current capacity or to min,
        // whichever is larger.
        private void EnsureCapacity(int min) {
            if (_items.Length < min) {
                int newCapacity = _items.Length == 0? _defaultCapacity : _items.Length * 2;
                // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow.
                // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
                if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
                if (newCapacity < min) newCapacity = min;
                Capacity = newCapacity;
            }
        }

​

 Remove接口:

        Remove接口中调用了IndexOf 和 RemoveAt,IndexOf函数是位了找到元素的索引位置,用 RemoveAt 可以删除指定位置的元素。实现元素删除就是用Array.Copy对原数组进行覆盖。IndexOf 启用的是 Array.IndexOf 接口来查找元素的索引位置,这个接口本身内部实现是就是按索引顺序从0到n对每个位置的比较,复杂度为O(n)。

        // Removes the element at the given index. The size of the list is
        // decreased by one.
        // 
        public bool Remove(T item) {
            int index = IndexOf(item);
            if (index >= 0) {
                RemoveAt(index);
                return true;
            }
 
            return false;
        }

        // Returns the index of the first occurrence of a given value in a range of
        // this list. The list is searched forwards from beginning to end.
        // The elements of the list are compared to the given value using the
        // Object.Equals method.
        // 
        // This method uses the Array.IndexOf method to perform the
        // search.
        // 
        public int IndexOf(T item) {
            Contract.Ensures(Contract.Result<int>() >= -1);
            Contract.Ensures(Contract.Result<int>() < Count);
            return Array.IndexOf(_items, item, 0, _size);
        }

        // Removes the element at the given index. The size of the list is
        // decreased by one.
        // 
        public void RemoveAt(int index) {
            if ((uint)index >= (uint)_size) {
                ThrowHelper.ThrowArgumentOutOfRangeException();
            }
            Contract.EndContractBlock();
            _size--;
            if (index < _size) {
                Array.Copy(_items, index + 1, _items, index, _size - index);
            }
            _items[_size] = default(T);
            _version++;
        }

Insert接口:

        Insert接口先检查容量是否足够,不够则进行扩容。从源码中获悉,Insert插入元素时,使用的用拷贝数组的形式,将数组里的指定元素后面的元素向后移动一个位置。

        // Inserts an element into this list at a given index. The size of the list
        // is increased by one. If required, the capacity of the list is doubled
        // before inserting the new element.
        // 
        public void Insert(int index, T item) {
            // Note that insertions at the end are legal.
            if ((uint) index > (uint)_size) {
                ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_ListInsert);
            }
            Contract.EndContractBlock();
            if (_size == _items.Length) EnsureCapacity(_size + 1);
            if (index < _size) {
                Array.Copy(_items, index, _items, index + 1, _size - index);
            }
            _items[index] = item;
            _size++;            
            _version++;
        }

        到这里我们知道了Add,Insert,IndexOf,Remove接口都是没有做过任何形式的优化,都使用的是顺序迭代的方式,如果过于频繁使用的话,会导致效率降低,也会造成不少内存的冗余,使得GC时承担了更多的压力。

        至于其他相关接口AddRange,RemoveRange的原理和Add与Remove一样,区别只是多了几个元素,把单个元素变成了以容器为单位的形式进行操作。都是先检查容量是否合适,不合适则扩容,或者当Remove时先得到索引位置再进行整体的覆盖掉后面的的元素,容器本身大小不会变化,只是做了重复覆盖的操作。

[]的实现

        []的实现,直接使用了数组的索引方式获取元素。

// Sets or Gets the element at the given index.
// 
public T this[int index] {
    get {
        // Following trick can reduce the range check by one
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        return _items[index]; 
    }

    set {
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        _items[index] = value;
        _version++;
    }
}

Clear清除接口

        Clear接口不是直接把该数组删除,而是将数组中的元素清0,并且把_size设置为0,虚拟地表明当前容量为0。

// Clears the contents of List.
public void Clear() {
    if (_size > 0)
    {
        Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
        _size = 0;
    }
    _version++;
}

Contains接口:用于确认某元素是否存在于List中

        Contains接口使用线性查找方法比较元素,对数组进行迭代,比较每一个元素与要查找的元素实例是否一致,一致则返回true,全部查找完还没找到则返回false。

// Contains returns true if the specified element is in the List.
// It does a linear, O(n) search.  Equality is determined by calling
// item.Equals().
//
public bool Contains(T item) {
    if ((Object) item == null) {
        for(int i=0; i<_size; i++)
            if ((Object) _items[i] == null)
                return true;
        return false;
    }
    else {
        EqualityComparer<T> c = EqualityComparer<T>.Default;
        for(int i=0; i<_size; i++) {
            if (c.Equals(_items[i], item)) return true;
        }
        return false;
    }
}

ToArray转化数组接口:

        ToArray接口中,重新new了一个指定大小的数组,再将本身数组上的内容考别到新数组上,再返回出来。

// ToArray returns a new Object array containing the contents of the List.
// This requires copying the List, which is an O(n) operation.
public T[] ToArray() {
    Contract.Ensures(Contract.Result<T[]>() != null);
    Contract.Ensures(Contract.Result<T[]>().Length == Count);

    T[] array = new T[_size];
    Array.Copy(_items, 0, array, 0, _size);
    return array;
}

Find查找接口:

        Find接口使用的同样是线性查找,对每个元素都进行了比较,复杂度为O(n)。

public T Find(Predicate<T> match) {
    if( match == null) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    Contract.EndContractBlock();

    for(int i = 0 ; i < _size; i++) {
        if(match(_items[i])) {
            return _items[i];
        }
    }
    return default(T);
}

Enumerator 枚举迭代:

        每次获取迭代器时,Enumerator 每次都是被new出来,如果大量使用迭代器的话,比如foreach就会造成大量的垃圾对象。所以尽量不要用foreach,因为List的foreach会增加有新的Enumerator实例,最后由GC垃圾回收掉。

// Returns an enumerator for this list with the given
// permission for removal of elements. If modifications made to the list 
// while an enumeration is in progress, the MoveNext and 
// GetObject methods of the enumerator will throw an exception.
//
public Enumerator GetEnumerator() {
    return new Enumerator(this);
}

/// <internalonly/>
IEnumerator<T> IEnumerable<T>.GetEnumerator() {
    return new Enumerator(this);
}

System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
    return new Enumerator(this);
}

[Serializable]
public struct Enumerator : IEnumerator<T>, System.Collections.IEnumerator
{
    private List<T> list;
    private int index;
    private int version;
    private T current;

    internal Enumerator(List<T> list) {
        this.list = list;
        index = 0;
        version = list._version;
        current = default(T);
    }

    public void Dispose() {
    }

    public bool MoveNext() {

        List<T> localList = list;

        if (version == localList._version && ((uint)index < (uint)localList._size)) 
        {                                                     
            current = localList._items[index];                    
            index++;
            return true;
        }
        return MoveNextRare();
    }

    private bool MoveNextRare()
    {                
        if (version != list._version) {
            ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
        }

        index = list._size + 1;
        current = default(T);
        return false;                
    }

    public T Current {
        get {
            return current;
        }
    }

    Object System.Collections.IEnumerator.Current {
        get {
            if( index == 0 || index == list._size + 1) {
                 ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumOpCantHappen);
            }
            return Current;
        }
    }

    void System.Collections.IEnumerator.Reset() {
        if (version != list._version) {
            ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion);
        }
        
        index = 0;
        current = default(T);
    }

}

Sort排序接口:

        它使用了 Array.Sort接口进行排序。

// Sorts the elements in a section of this list. The sort compares the
// elements to each other using the given IComparer interface. If
// comparer is null, the elements are compared to each other using
// the IComparable interface, which in that case must be implemented by all
// elements of the list.
// 
// This method uses the Array.Sort method to sort the elements.
// 
public void Sort(int index, int count, IComparer<T> comparer) {
    if (index < 0) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
    
    if (count < 0) {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.count, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
        
    if (_size - index < count)
        ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_InvalidOffLen);
    Contract.EndContractBlock();

    Array.Sort<T>(_items, index, count, comparer);
    _version++;
}

Array.Sort 使用的是快速排序方式进行排序,从而我们明白了 List 的 Sort 排序的效率为O(nlogn)。

internal static void DepthLimitedQuickSort(T[] keys, int left, int right, IComparer<T> comparer, int depthLimit)
{
    do
    {
        if (depthLimit == 0)
        {
            Heapsort(keys, left, right, comparer);
            return;
        }

        int i = left;
        int j = right;

        // pre-sort the low, middle (pivot), and high values in place.
        // this improves performance in the face of already sorted data, or 
        // data that is made up of multiple sorted runs appended together.
        int middle = i + ((j - i) >> 1);
        SwapIfGreater(keys, comparer, i, middle);  // swap the low with the mid point
        SwapIfGreater(keys, comparer, i, j);   // swap the low with the high
        SwapIfGreater(keys, comparer, middle, j); // swap the middle with the high

        T x = keys[middle];
        do
        {
            while (comparer.Compare(keys[i], x) < 0) i++;
            while (comparer.Compare(x, keys[j]) < 0) j--;
            Contract.Assert(i >= left && j <= right, "(i>=left && j<=right)  Sort failed - Is your IComparer bogus?");
            if (i > j) break;
            if (i < j)
            {
                T key = keys[i];
                keys[i] = keys[j];
                keys[j] = key;
            }
            i++;
            j--;
        } while (i <= j);

        // The next iteration of the while loop is to "recursively" sort the larger half of the array and the
        // following calls recrusively sort the smaller half.  So we subtrack one from depthLimit here so
        // both sorts see the new value.
        depthLimit--;

        if (j - left <= right - i)
        {
            if (left < j) DepthLimitedQuickSort(keys, left, j, comparer, depthLimit);
            left = i;
        }
        else
        {
            if (i < right) DepthLimitedQuickSort(keys, i, right, comparer, depthLimit);
            right = j;
        }
    } while (left < right);
}

        我们可以发现List的效率并不高,只是通用性很强,大部分算法都是用的是线性复杂度的算法,这种线性算法遇到规模比较大的计算量级时就会导致CPU的大量损耗。我们可以自己写工具类来优化List的线性算法。

        List的内存分配方式也极为不合理,当List里的元素不断增加时,会多次重新new数组,导致原来的数组被抛弃,最后当GC被调用时造成回收的压力。

        我们可以提前告知 List 对象最多会有多少元素在里面,这样的话 List 就不会因为空间不够而抛弃原有的数组,去重新申请数组了。

        我们也可以看出,代码是线程不安全的,它并没有对多线程下做任何锁或其他同步操作。并发情况下,无法判断 _size++ 的执行顺序,因此当我们在多线程间使用 List 时加上安全机制。

        最后List 并不是高效的组件,真实情况是,他比数组的效率还要差的多,他只是个兼容性比较强的组件而已,好用,但效率差。

二、Dictionary

        Dictionary字典数据结构,是以关键字Key和值Value进行一一映射的。Key的类型并没有做任何的限制,可以是整数,也可以是的字符串,甚至可以是实例对象。我们可以通过Hash函数来建立Key和Value的映射关系。

        当发生哈希冲突时,可以用开放定址法、拉链法、双哈希法等。Dictionary使用的解决冲突方法是拉链法,又称链地址法。

Dictionary的实现:

        Dictionary 主要继承了 IDictionary 接口,和 ISerializable 接口。IDictionary 和 ISerializable 在使用过程中,其主要的接口为:Add,Remove,ContainsKey,Clear,TryGetValue,Keys, Values,以及[]数组符号形式作为返回值的接口。也包括了常用库 Collection 中的接口,Count,Contains等。

        从 Dictionary 的定义变量中可以看出,Dictionary 是以数组为底层数据结构的类。当我们实例化 new Dictionary() 后,内部的数组是0个数组的状态。与 List 组件一样,Dictionary 也是需要扩容的,会随着元素数量的增加而不断扩容。

public class Dictionary<TKey,TValue>: IDictionary<TKey,TValue>, IDictionary, IReadOnlyDictionary<TKey, TValue>, ISerializable, IDeserializationCallback 
{
    
    private struct Entry {
        public int hashCode;    // Lower 31 bits of hash code, -1 if unused
        public int next;        // Index of next entry, -1 if last
        public TKey key;           // Key of entry
        public TValue value;         // Value of entry
    }

    private int[] buckets;
    private Entry[] entries;
    private int count;
    private int version;
    private int freeList;
    private int freeCount;
    private IEqualityComparer<TKey> comparer;
    private KeyCollection keys;
    private ValueCollection values;
    private Object _syncRoot;
}

Add接口:

        this.buckets主要用来进行Hash碰撞,this.entries用来存储字典的内容,并标识下一个元素的位置。

public void Add(TKey key, TValue value)
{
    Insert(key, value, true);
}

private void Initialize(int capacity)
{
    int size = HashHelpers.GetPrime(capacity);
    buckets = new int[size];
    for (int i = 0; i < buckets.Length; i++) buckets[i] = -1;
    entries = new Entry[size];
    freeList = -1;
}

private void Insert(TKey key, TValue value, bool add)
{
    if( key == null ) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }

    if (buckets == null) Initialize(0);
    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
    int targetBucket = hashCode % buckets.Length;

#if FEATURE_RANDOMIZED_STRING_HASHING
    int collisionCount = 0;
#endif

    for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
        if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
            if (add) { 
                ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
            }
            entries[i].value = value;
            version++;
            return;
        } 

#if FEATURE_RANDOMIZED_STRING_HASHING
        collisionCount++;
#endif
    }
    int index;
    if (freeCount > 0) {
        index = freeList;
        freeList = entries[index].next;
        freeCount--;
    }
    else {
        if (count == entries.Length)
        {
            Resize();
            targetBucket = hashCode % buckets.Length;
        }
        index = count;
        count++;
    }

    entries[index].hashCode = hashCode;
    entries[index].next = buckets[targetBucket];
    entries[index].key = key;
    entries[index].value = value;
    buckets[targetBucket] = index;
    version++;

#if FEATURE_RANDOMIZED_STRING_HASHING

#if FEATURE_CORECLR
    // In case we hit the collision threshold we'll need to switch to the comparer which is using randomized string hashing
    // in this case will be EqualityComparer<string>.Default.
    // Note, randomized string hashing is turned on by default on coreclr so EqualityComparer<string>.Default will 
    // be using randomized string hashing

    if (collisionCount > HashHelpers.HashCollisionThreshold && comparer == NonRandomizedStringEqualityComparer.Default) 
    {
        comparer = (IEqualityComparer<TKey>) EqualityComparer<string>.Default;
        Resize(entries.Length, true);
    }
#else
    if(collisionCount > HashHelpers.HashCollisionThreshold && HashHelpers.IsWellKnownEqualityComparer(comparer)) 
    {
        comparer = (IEqualityComparer<TKey>) HashHelpers.GetRandomizedEqualityComparer(comparer);
        Resize(entries.Length, true);
    }
#endif // FEATURE_CORECLR

#endif

}

        首先在加入数据前需要对数据结构进行构造:

if (buckets == null) Initialize(0);

        在 Dictionary 构建时如果没有指定任何数量 buckets 就有可能是空的,所以需要对buckets进行初始化,Initialize(0),说明构建的数量级最少。如果不是0,而是其他自然数,则size设置为:

int size = HashHelpers.GetPrime(capacity);

         有专门的方法来计算到底该使用多大的数组,查出源码 HashHelpers 中,primes数值是这样定义的:

        其中 GetPrime 会返回一个需要的 size 最小的数值,从 GetPrime 函数的代码中,可以知道这个 size 是由数组 primes 里的值与当前需要的数量大小有关当需要的数量小于 primes 某个单元格的数字时返回该数字,而 ExpandPrime 则更加简单粗暴,直接返回原来size的2倍作为扩展数量。

        从Prime的定义看的出,首次定义size为3,每次扩大2倍,也就是,3->7->17->37->…. 底层数据结构的大小是按照这个数值顺序来扩展的。

public static readonly int[] primes = {
        3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919,
        1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591,
        17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437,
        187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263,
        1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369};

public static int GetPrime(int min) 
{
    if (min < 0)
        throw new ArgumentException(Environment.GetResourceString("Arg_HTCapacityOverflow"));
    Contract.EndContractBlock();

    for (int i = 0; i < primes.Length; i++) 
    {
        int prime = primes[i];
        if (prime >= min) return prime;
    }

    //outside of our predefined table. 
    //compute the hard way. 
    for (int i = (min | 1); i < Int32.MaxValue;i+=2) 
    {
        if (IsPrime(i) && ((i - 1) % Hashtable.HashPrime != 0))
            return i;
    }
    return min;
}

// Returns size of hashtable to grow to.
public static int ExpandPrime(int oldSize)
{
    int newSize = 2 * oldSize;

    // Allow the hashtables to grow to maximum possible size (~2G elements) before encoutering capacity overflow.
    // Note that this check works even when _items.Length overflowed thanks to the (uint) cast
    if ((uint)newSize > MaxPrimeArrayLength && MaxPrimeArrayLength > oldSize)
    {
        Contract.Assert( MaxPrimeArrayLength == GetPrime(MaxPrimeArrayLength), "Invalid MaxPrimeArrayLength");
        return MaxPrimeArrayLength;
    }

    return GetPrime(newSize);
}

        初始化后,对关键字Key做Hash哈希操作从而获得地址索引:

int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
int targetBucket = hashCode % buckets.Length;

        当调用函数获得Hash哈希值后,还需要对哈希地址做余操作,以确定地址落在 Dictionary 数组长度范围内不会溢出。

        紧接着对指定数组单元格内的链表元素做遍历操作,找出空出来的位置将值填入。

         这一步就是进行拉链法的推链动作,当获得Hash值的数组索引后,我们知道了该将数据存放在哪个数组位置上,如果该位置已经有元素被推入,则需要将其推入到链表的尾部。从for循环开始,检查是否到达链表的末尾,最后将数据放入尾部,并结束函数。

for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
    if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
        if (add) { 
            ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
        }
        entries[i].value = value;
        version++;
        return;
    } 

#if FEATURE_RANDOMIZED_STRING_HASHING
    collisionCount++;
#endif
}

         如果数组大小不够,即当被用来记录剩余单元格数量的变量 freeCount 等于0时,则进行扩容,扩容后的大小就是我们前面提到的调用 ExpandPrime 后的数量,即通常情况下为原来的2倍,再根据这个空间大小数字调用 GetPrime 来得到真正的新数组的大小。

int index;
if (freeCount > 0) {
    index = freeList;
    freeList = entries[index].next;
    freeCount--;
}
else {
    if (count == entries.Length)
    {
        Resize();
        targetBucket = hashCode % buckets.Length;
    }
    index = count;
    count++;
}

entries[index].hashCode = hashCode;
entries[index].next = buckets[targetBucket];
entries[index].key = key;
entries[index].value = value;
buckets[targetBucket] = index;

 Remove接口:

        先要查找到Key元素所在位置,所以再次将Key值做哈希操作也是在所难免的。然后类似沿着拉链法的模式寻找与关键字匹配的元素。

        用哈希函数 comparer.GetHashCode 再除余后得到范围内的地址索引,再做余操作确定地址落在数组范围内,从哈希索引地址开始,查找冲突的元素的Key是否与需要移除的Key值相同,相同则进行移除操作并退出。

        Remove 的移除操作并没有对内存进行删减,而只是将其单元格置空,这是位了减少了内存的频繁操作

public bool Remove(TKey key)
{
    if(key == null) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }

    if (buckets != null) {
        int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
        int bucket = hashCode % buckets.Length;
        int last = -1;
        for (int i = buckets[bucket]; i >= 0; last = i, i = entries[i].next) {
            if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
                if (last < 0) {
                    buckets[bucket] = entries[i].next;
                }
                else {
                    entries[last].next = entries[i].next;
                }
                entries[i].hashCode = -1;
                entries[i].next = freeList;
                entries[i].key = default(TKey);
                entries[i].value = default(TValue);
                freeList = i;
                freeCount++;
                version++;
                return true;
            }
        }
    }
    return false;
}

ContainsKey 检测是否包含关键字的接口:

        ContainsKey 是一个查找Key位置的过程。它调用了 FindEntry 函数,FindEntry 查找Key值位置的方法跟我们前面提到的相同。从用Key值得到的哈希值地址开始查找,查看所有冲突链表中,是否有与Key值相同的值,找到即刻返回该索引地址。

public bool ContainsKey(TKey key)
{
    return FindEntry(key) >= 0;
}

private int FindEntry(TKey key)
{
    if( key == null) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }

    if (buckets != null) {
        int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
        for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) {
            if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
        }
    }
    return -1;
}

 TryGetValue 尝试获取值的接口:

        调用FindEntry的接口,来获取Key对应的Value值。

public bool TryGetValue(TKey key, out TValue value)
{
    int i = FindEntry(key);
    if (i >= 0) {
        value = entries[i].value;
        return true;
    }
    value = default(TValue);
    return false;
}

 []操作符的重定义:

        获取元素时也同样使用 FindEntry 函数,而 Set 设置元素时使用哈希拉链冲突解决方案。

public TValue this[TKey key] {
    get {
        int i = FindEntry(key);
        if (i >= 0) return entries[i].value;
        ThrowHelper.ThrowKeyNotFoundException();
        return default(TValue);
    }
    set {
        Insert(key, value, false);
    }
}

Dictionary 同List一样并不是线程安全的组件

    ** Hashtable has multiple reader/single writer (MR/SW) thread safety built into 
    ** certain methods and properties, whereas Dictionary doesn't. If you're 
    ** converting framework code that formerly used Hashtable to Dictionary, it's
    ** important to consider whether callers may have taken a dependence on MR/SW
    ** thread safety. If a reader writer lock is available, then that may be used
    ** with a Dictionary to get the same thread safety guarantee.

         Hashtable在多线程读写中是线程安全的,而 Dictionary 不是。如果要在多个线程中共享Dictionaray的读写操作,就要自己写lock以保证线程安全。

        Dictionary由数组构成,并且由哈希函数完成地址构建,由拉链法解决冲突。

        Dictionary和List一样最好在实例化对象时,哈希冲突与数组大小有很大关系,数组越大哈希冲突率就越小,所以new时尽量确定大致数量会更加高效。用数值方式做Key比用类实例方式作为Key值更加高效率。

        从内存操作上看,大小以3->7->17->37->….的速度,每次增加2倍多的顺序进行,删除时,并不缩减内存。

        如果想在多线程中,共享 Dictionary 则需要进行我们自己进行lock操作。

        在C#中所有类都继承自Object类,Dictionary使用Object的GetHashCode来获取类实例的哈希值,而GetHashCode是用算法将内存地址转化为哈希值的过程,因此我们可以认为它只是一个算法,并没有做任何值做缓存,每次调用它都会计算一次哈希值,这是比较隐形的性能损耗。如果频繁使用GetHashCode作为关键字来提取Value,那么我们应该关注下GetHashCode的算力损耗,是否可以用唯一ID的方式来代替GetHashCode算法。

文章参考:

《Unity3D高级编程之进阶主程》第一章,C#要点技术(一) - List 底层源码剖析 - 技术人生 - 编程技术 - JESSE人生 (luzexi.com)

《Unity3D高级编程之进阶主程》第一章,C#要点技术(一) - List 底层源码剖析 - 技术人生 - 编程技术 - JESSE人生 (luzexi.com)

  • 24
    点赞
  • 18
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值