NSDictionary: Internal Structure and Implementation

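This article takes a detailed look at the internal structure of NSDictionary, including how it stores its data, how instances are created, how keys are looked up, and which optimizations it applies. By analyzing the Objective-C runtime source code and disassembly, it shows how NSDictionary uses a hash table and linked lists to store and access data efficiently. It also discusses dynamic adjustment of the storage size, the key copying strategy, and the storage layout of key-value pairs. Finally, it touches on potential performance problems and design decisions, and stresses the importance of implementing isEqual and hash correctly.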

First, let's get familiar with a few concepts: hash tables, time complexity, and linked lists.

The Class

Plenty of Foundation classes are class clusters and NSDictionary is no exception. For quite a long time NSDictionary used CFDictionary as its default implementation; however, starting with iOS 6.0 things have changed:

(lldb) po [[NSDictionary new] class]
__NSDictionaryI

Similarly to __NSArrayM, __NSDictionaryI rests within the CoreFoundation framework, in spite of being publicly presented as a part of Foundation. Running the library through class-dump generates the following ivar layout:

@interface __NSDictionaryI : NSDictionary
{
    NSUInteger _used:58;
    NSUInteger _szidx:6;
}
@end

It’s surprisingly short. There doesn’t seem to be any pointer to either keys or objects storage. As we will soon see, __NSDictionaryI literally keeps its storage to itself.
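To make the 58 + 6 bit split more concrete, here is a small C sketch of two bit-fields sharing one 64-bit word. The names `DictHeader` and `make_header` are hypothetical. Bit-field layout is implementation-defined in C, so the sketch only assumes a compiler such as Clang or GCC that accepts 64-bit bit-fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two ivars reported by class-dump:
   a 58-bit element count and a 6-bit index into a table of sizes. */
typedef struct {
    uint64_t used  : 58;
    uint64_t szidx : 6;
} DictHeader;

/* Builds a header; the asserts check that each value fits its field. */
static DictHeader make_header(uint64_t used, unsigned szidx)
{
    assert(used < (UINT64_C(1) << 58));
    assert(szidx < 64);
    DictHeader h = { .used = used, .szidx = szidx };
    return h;
}
```

With 6 bits the size index can take 64 distinct values, and on the assumed compilers both fields together occupy a single 8-byte word.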

The Storage

Instance Creation

To understand where __NSDictionaryI keeps its contents, let’s take a quick tour through the instance creation process. There is just one class method that’s responsible for spawning new instances of __NSDictionaryI. According to class-dump, the method has the following signature:

+ (id)__new:(const id *)arg1:(const id *)arg2:(unsigned long long)arg3:(_Bool)arg4:(_Bool)arg5;

It takes five arguments, of which only the first one is named. Seriously, if you were to use it in a @selector statement it would have the form @selector(__new:::::). The first three arguments are easily inferred by setting a breakpoint on this method and peeking into the contents of the x2, x3 and x4 registers, which contain the array of keys, the array of objects and the number of keys (objects) respectively. Notice that the keys and objects arrays are swapped in comparison to the public-facing API, which takes the form of:

+ (instancetype)dictionaryWithObjects:(const id [])objects forKeys:(const id <NSCopying> [])keys count:(NSUInteger)cnt;

It doesn’t matter whether an argument is defined as const id * or const id [] since arrays decay into pointers when passed as function arguments.
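The decay rule can be checked with a short C sketch. The function `sum_values` is a hypothetical example and `int` stands in for `id`. The prototype uses array syntax while the definition uses pointer syntax, and the compiler treats both as the same declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Prototype written with array syntax... */
long sum_values(const int values[], size_t count);

/* ...and the definition written with pointer syntax. The pair compiles
   because both spell the same parameter type: const int *. */
long sum_values(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

If the two spellings named different types, the compiler would reject the definition as conflicting with the earlier prototype.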

With three arguments covered we’re left with the two unidentified boolean parameters. I’ve done some assembly digging with the following results: the fourth argument governs whether the keys should be copied, and the last one decides whether the arguments should not be retained. We can now rewrite the method with named parameters:

+ (id)__new:(const id *)keys :(const id *)objects :(unsigned long long)count :(_Bool)copyKeys :(_Bool)dontRetain;

Unfortunately, we don’t have explicit access to this private method, so by using the regular means of allocation the last two arguments are always set to YES and NO respectively. It is nonetheless interesting that __NSDictionaryI is capable of more sophisticated control over its keys and objects.

Indexed ivars

Skimming through the disassembly of + __new::::: reveals that both malloc and calloc are nowhere to be found. Instead, the method calls into __CFAllocateObject2 passing the __NSDictionaryI class as first argument and requested storage size as a second. Stepping down into the sea of ARM64 shows that the first thing __CFAllocateObject2 does is call into class_createInstance with the exact same arguments.

Fortunately, at this point we have access to the source code of Objective-C runtime which makes further investigation much easier.

The class_createInstance(Class cls, size_t extraBytes) function merely calls into _class_createInstanceFromZone, passing nil as the zone, and this is the final step of object allocation. While the function itself has many additional checks for various circumstances, its gist can be covered with just three lines:

id _class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone)
{
    ...
    size_t size = cls->alignedInstanceSize() + extraBytes;
    ...
    id obj = (id)calloc(1, size);
    ...
    return obj;
}

The extraBytes argument couldn’t have been more descriptive. It’s literally the number of extra bytes that inflate the default instance size. As an added bonus, notice that it’s the calloc call that ensures all the ivars are zeroed out when the object gets allocated.
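The same idea can be tried outside the runtime. The C sketch below is only an illustration: `FakeInstance`, `create_instance` and `all_zero` are hypothetical names, and the fixed struct stands in for whatever `alignedInstanceSize()` would cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed part of an object; stands in for the regular ivars. */
typedef struct {
    void *isa;
    unsigned long long bits;
} FakeInstance;

/* One zeroed allocation sized as the fixed part plus extraBytes.
   Returns NULL on overflow or allocation failure. */
static void *create_instance(size_t extraBytes)
{
    if (extraBytes > SIZE_MAX - sizeof(FakeInstance)) {
        return NULL;
    }
    size_t size = sizeof(FakeInstance) + extraBytes;
    return calloc(1, size);
}

/* Returns 1 if the first n bytes at p are all zero, 0 otherwise. */
static int all_zero(const void *p, size_t n)
{
    assert(p != NULL);
    const unsigned char *bytes = p;
    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling `create_instance(32)` and then `all_zero` on the result over `sizeof(FakeInstance) + 32` bytes confirms that both the fixed part and the extra bytes start out zeroed. The caller releases the block with a single `free`.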

The indexed ivars section is nothing more than an additional space that sits at the end of regular ivars:

[Figure: Allocating objects]

Allocating space on its own doesn’t sound very thrilling so the runtime publishes an accessor:

void *object_getIndexedIvars(id obj)

There is no magic whatsoever in this function; it just returns a pointer to the beginning of the indexed ivars section:

[Figure: Indexed ivars section]
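In plain C terms, such an accessor amounts to pointer arithmetic past the fixed part of the object. The sketch below assumes a hypothetical `FixedPart` struct in place of the real instance layout, and `indexed_ivars` is an illustrative name, not the runtime function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed part of an object (the regular ivars). */
typedef struct {
    void *isa;
    unsigned long long bits;
} FixedPart;

/* Returns a pointer to the first byte after the fixed part,
   which is where the extra bytes of the same allocation begin. */
static void *indexed_ivars(void *obj)
{
    assert(obj != NULL);
    return (char *)obj + sizeof(FixedPart);
}
```

For an object allocated as `calloc(1, sizeof(FixedPart) + 4 * sizeof(void *))`, the returned pointer can be used as an array of four pointer-sized slots that live in the same memory block as the object itself.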

There are a few cool things about indexed ivars. First of all, each instance can have a different amount of extra bytes dedicated to it. This is exactly the feature __NSDictionaryI makes use of.

Secondly, they provide faster access to the storage. It all comes down to being cache-friendly. Generally speaking, jumping to random memory locations (by dereferencing a pointer) can be expensive. Since the object has just been accessed (somebody has called a method on it), it’s very likely that its indexed ivars have landed in cache. By keeping everything that’s needed very close, the object can provide as good performance as possible.

Finally, indexed ivars can be used as a crude defensive measure to make an object’s internals invisible to utilities like class-dump. This is a very basic protection, since a dedicated attacker can simply look for object_getIndexedIvars calls in the disassembly or randomly probe the instance past its regular ivars section to see what’s going on.

While powerful, indexed ivars come with two caveats. First of all, class_createInstance can’t be used under ARC, so you’ll have to compile some parts of your class with the -fno-objc-arc flag to make it shine. Secondly, the runtime doesn’t keep the indexed ivar size information anywhere. Even though dealloc will clean everything up (as it calls free internally), you should keep the storage size somewhere, assuming you use a variable number of extra bytes.

Looking for Key and Fetching Object

Analyzing Assembly

Although at this point we could poke the __NSDictionaryI instances to figure out how they work, the ultimate truth lies within the assembly. Instead of going through the entire wall of ARM64, we will discuss the equivalent Objective-C code.

The class itself implements very few methods, but I claim the most important one is objectForKey:, which is what we’re going to discuss in more detail. Since I did the assembly analysis anyway, you can read it on a separate page. It’s dense, but a thorough pass should convince you that the following code is more or less correct.

The C Code

Unfortunately, I don’t have access to Apple’s code base, so the reverse-engineered code below is not identical to the original implementation. On the other hand, it seems to be working well and I’ve yet to find an edge case that behaves differently in comparison to the genuine method.

The following code is written from the perspective of __NSDictionaryI class:

- (id)objectForKey:(id)aKey
{
    NSUInteger sizeIndex = _szidx;
    NSUInteger size = __NSDictionarySizes[sizeIndex];

    id *storage = (id *)object_getIndexedIvars(self);

    NSUInteger fetchIndex = [aKey hash] % size;
    ...
}
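The listing above stops right after the starting index is computed. As a rough illustration of how a lookup could proceed from such an index, the C sketch below assumes open addressing with linear probing over a flat array of interleaved key/value slots. That scheme, the string keys, the fixed capacity, and the names `Table`, `hash_string`, `table_put` and `table_get` are all assumptions made for illustration; the listing itself does not confirm them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 7  /* hypothetical capacity */

/* Interleaved storage: slot i keeps its key at [2*i], its value at [2*i + 1]. */
typedef struct {
    const char *storage[2 * TABLE_SIZE];
} Table;

/* Simple djb2-style string hash, standing in for the key's hash method. */
static size_t hash_string(const char *s)
{
    size_t h = 5381;
    while (*s) {
        h = h * 33 + (unsigned char)*s++;
    }
    return h;
}

/* Inserts or overwrites a pair; returns 1 on success, 0 if the table is full. */
static int table_put(Table *t, const char *key, const char *value)
{
    assert(t != NULL && key != NULL && value != NULL);
    size_t index = hash_string(key) % TABLE_SIZE;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        const char *k = t->storage[2 * index];
        if (k == NULL || strcmp(k, key) == 0) {
            t->storage[2 * index] = key;
            t->storage[2 * index + 1] = value;
            return 1;
        }
        index = (index + 1) % TABLE_SIZE;  /* probe the next slot, wrapping */
    }
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
static const char *table_get(const Table *t, const char *key)
{
    assert(t != NULL && key != NULL);
    size_t index = hash_string(key) % TABLE_SIZE;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        const char *k = t->storage[2 * index];
        if (k == NULL) {
            return NULL;  /* an empty slot ends the probe sequence */
        }
        if (k == key || strcmp(k, key) == 0) {
            return t->storage[2 * index + 1];
        }
        index = (index + 1) % TABLE_SIZE;
    }
    return NULL;
}
```

In this sketch the equality test (`strcmp`) and the hash function must agree: two keys that compare equal have to hash to the same value, otherwise the probe starts in the wrong place and the lookup misses.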
