Issue #2 Concurrent Programming - Thread-Safe Class Design
This article focuses on practical tips, design patterns, and anti-patterns for designing thread-safe classes and using Grand Central Dispatch (GCD).
Thread Safety
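Apple’s Frameworks
First, let’s look at Apple’s frameworks. Unless declared otherwise, most classes are not thread-safe by default. For some this is expected; for others it is quite interesting. Accessing UIKit/AppKit from a background thread is a common mistake that even very experienced iOS/Mac developers make. It is easy to set properties such as image from a background thread, because their content is usually fetched by a network request running in the background anyway. Apple’s code is performance-optimized and will not warn you if you change these properties from different threads.
In the image case, a common symptom is that your change shows up with a delay. But if two threads set the image at the same time, your app is likely to crash, because the currently set image could be released twice. Since this is timing-dependent, the crash usually happens when your customers use the app, not while you are developing it.
There is no particularly effective tool for finding these errors, but a few tricks do the job well. The UIKit Main Thread Guard is a small piece of code that patches every call to UIView’s setNeedsLayout and setNeedsDisplay and checks that it is executed on the main thread before forwarding it. Since these two methods are called by many UIKit setters (including image), this catches many thread-related mistakes. Although the trick does not use private API, we don’t recommend shipping it in production apps. It is great during development, though.
Apple made a conscious decision not to make UIKit thread-safe. Making it thread-safe would not buy better performance; it would in fact make many things slower. And because UIKit is tied to the main thread, it is actually easier to write concurrent programs that use UIKit.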
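Why Isn’t UIKit Thread Safe?
Ensuring thread safety for a framework as large as UIKit would be a major undertaking and would come at great cost. Changing properties from nonatomic to atomic would be only a tiny part of the changes required. Usually you want to change several properties at once and only then see the result. For that, Apple would have to expose methods much like Core Data’s performBlock: and performBlockAndWait: to synchronize changes. And if you consider that most calls to UIKit classes are about configuration, making them thread-safe is even less worthwhile.
However, even calls that are not about configuration shared internal state and therefore were not thread-safe. If you wrote apps back in the dark ages of iOS 3.2 and earlier, you surely experienced random crashes when using NSString’s drawInRect:withFont: while preparing images in the background. Thankfully, with iOS 4 Apple made most drawing methods, and classes such as UIColor and UIFont, usable on background threads.
Unfortunately, Apple’s documentation says little about thread safety. It recommends accessing UIKit on the main thread only, and even for the drawing methods it does not explicitly guarantee thread safety, so it is worth reading the iOS Release Notes as well.
For the most part, UIKit classes should be used only from the application’s main thread. This is particularly true for classes derived from UIResponder and for those that manipulate your application’s user interface in any way.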
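One way to keep UI calls on the main thread is to route them through a small helper. The sketch below is an illustration, not part of UIKit: PSTPerformOnMainThread is a hypothetical name, and it uses only Foundation and GCD so it can be tried outside an app.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs the block immediately when already on the main
// thread, otherwise enqueues it on the main queue.
static void PSTPerformOnMainThread(dispatch_block_t block) {
    if ([NSThread isMainThread]) {
        block();
    } else {
        dispatch_async(dispatch_get_main_queue(), block);
    }
}
```

A background completion handler would then wrap its UI update in PSTPerformOnMainThread(^{ ... }) instead of touching the view directly.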
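The Deallocation Problem
Another danger when using UIKit objects in the background is “The Deallocation Problem.” Apple outlines the issue in TN2109 and presents several solutions. The problem is that UI objects should be deallocated on the main thread, because some of them may change the view hierarchy in dealloc. As we know, such calls into UIKit need to happen on the main thread. Since it is common for a secondary thread, operation, or block to retain the caller, this is very easy to get wrong and quite hard to find and fix. It was also a long-standing bug in AFNetworking, simply because few people know about the issue and it shows up as rare, hard-to-reproduce crashes. Consistently using __weak and not accessing ivars in asynchronous blocks/operations helps.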
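The __weak advice can be sketched as follows. PSTLoader and its result property are hypothetical stand-ins for a UI object; the point is that the background block never retains self, and a strong reference is taken only on the main queue.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class standing in for an object that must die on the main thread.
@interface PSTLoader : NSObject
@property (atomic, copy) NSString *result;
- (void)loadInBackground;
@end

@implementation PSTLoader
- (void)loadInBackground {
    __weak PSTLoader *weakSelf = self;
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        // Stand-in for real background work. No ivar is touched here,
        // so this block does not retain self.
        NSString *value = @"loaded";
        dispatch_async(dispatch_get_main_queue(), ^{
            // The strong reference exists only on the main thread, so if it
            // turns out to be the last one, dealloc also runs on the main thread.
            PSTLoader *strongSelf = weakSelf;
            if (strongSelf == nil) return;
            strongSelf.result = value;
        });
    });
}
@end
```

If the object has already been released by the time the work finishes, the update is simply skipped.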
Collection Classes
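Apple has a good overview document for both iOS and Mac that lists the thread safety of the most common Foundation classes. In general, immutable classes such as NSArray are thread-safe, while their mutable variants such as NSMutableArray are not. In fact, it is fine to use them from different threads as long as access is serialized within a queue. Remember that a method may return a mutable variant of a collection even if it declares an immutable return type. It is good practice to write return [array copy] to ensure the returned object really is immutable.
Unlike Java, Foundation does not offer thread-safe collection classes out of the box. This is reasonable, because in most cases you want to lock at a higher level anyway to avoid too many locking operations. A notable exception is caches, where a mutable dictionary might hold immutable data. Here Apple added NSCache to iOS, which not only locks access but also purges its contents in low-memory situations.
That said, there can be valid cases in your application where a thread-safe, mutable dictionary is handy. Thanks to the class cluster approach, writing one is not hard.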
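As a minimal sketch of serializing access within a queue, the wrapper below guards an NSMutableDictionary with one serial queue and hands out immutable copies. PSTSafeDictionary and the queue label are hypothetical names, and this is a plain wrapper rather than a subclass built on the class cluster.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper: every access to _storage goes through one serial queue.
@interface PSTSafeDictionary : NSObject
- (id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
- (NSDictionary *)snapshot;
@end

@implementation PSTSafeDictionary {
    NSMutableDictionary *_storage;
    dispatch_queue_t _queue;
}

- (instancetype)init {
    if ((self = [super init])) {
        _storage = [NSMutableDictionary dictionary];
        _queue = dispatch_queue_create("com.example.safeDictionary", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (id)objectForKey:(id<NSCopying>)key {
    __block id result = nil;
    dispatch_sync(_queue, ^{ result = [self->_storage objectForKey:key]; });
    return result;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key {
    // Writes are asynchronous; a later read on the same queue still sees them.
    dispatch_async(_queue, ^{ [self->_storage setObject:object forKey:key]; });
}

- (NSDictionary *)snapshot {
    __block NSDictionary *copy = nil;
    // Return an immutable copy instead of exposing the mutable storage.
    dispatch_sync(_queue, ^{ copy = [self->_storage copy]; });
    return copy;
}
@end
```

Callers must not invoke the reading methods from a block already running on the wrapper's own queue, since dispatch_sync onto the current serial queue deadlocks.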
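Atomic Properties
Ever wondered how Apple handles atomic setting/getting of properties? By now you have probably heard about spinlocks, semaphores, locks, and @synchronized. So what is Apple using? Apple has published the relevant source code, so let’s take a look. A nonatomic property setter might look like this: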
- (void)setUserName:(NSString *)userName {
    if (userName != _userName) {
        [userName retain];
        [_userName release];
        _userName = userName;
    }
}
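This is the variant with manual retain/release; the ARC-generated code looks similar. It is obvious what goes wrong when setUserName: is called concurrently: _userName can end up being released twice, which leads to extremely hard-to-find bugs.
For any property that is not implemented manually, the compiler generates a call to objc_setProperty_non_gc(id self, SEL _cmd, ptrdiff_t offset, id newValue, BOOL atomic, signed char shouldCopy). In our example, the call looks like this: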
objc_setProperty_non_gc(self, _cmd,
(ptrdiff_t)(&_userName) - (ptrdiff_t)(self), userName, NO, NO);
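The ptrdiff_t may look odd, but in the end it is simple pointer arithmetic, since an Objective-C class is just another C struct.
objc_setProperty calls down to the following function: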
static inline void reallySetProperty(id self, SEL _cmd, id newValue,
    ptrdiff_t offset, bool atomic, bool copy, bool mutableCopy)
{
    id oldValue;
    id *slot = (id*) ((char*)self + offset);
    if (copy) {
        newValue = [newValue copyWithZone:NULL];
    } else if (mutableCopy) {
        newValue = [newValue mutableCopyWithZone:NULL];
    } else {
        if (*slot == newValue) return;
        newValue = objc_retain(newValue);
    }
    if (!atomic) {
        oldValue = *slot;
        *slot = newValue;
    } else {
        spin_lock_t *slotlock = &PropertyLocks[GOODHASH(slot)];
        _spin_lock(slotlock);
        oldValue = *slot;
        *slot = newValue;
        _spin_unlock(slotlock);
    }
    objc_release(oldValue);
}
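Aside from the rather odd naming, this function is fairly straightforward and uses one of the 128 spinlocks available in PropertyLocks. This is a pragmatic and fast approach; in the worst case, a setter has to wait for an unrelated setter to finish because of a hash collision.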
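The idea of picking one lock out of a fixed table by hashing the slot address can be sketched on its own. The version below is an illustration under assumptions, not the runtime's code: it uses os_unfair_lock (macOS 10.12 / iOS 10 or later) instead of a spinlock, the hash is made up, and all PST names are hypothetical. It assumes ARC.

```objc
#import <Foundation/Foundation.h>
#import <os/lock.h>

enum { PSTLockCount = 128 };

// Zero-initialized static storage is equivalent to OS_UNFAIR_LOCK_INIT.
static os_unfair_lock PSTLocks[PSTLockCount];

// Illustrative hash only; the runtime's GOODHASH may differ.
static os_unfair_lock *PSTLockForAddress(uintptr_t address) {
    return &PSTLocks[(address >> 4) % PSTLockCount];
}

static void PSTStripedSet(__strong id *slot, id newValue) {
    os_unfair_lock *lock = PSTLockForAddress((uintptr_t)slot);
    os_unfair_lock_lock(lock);
    // Hold the old value until the end of the function, so a possible
    // dealloc runs after the lock has been released.
    __attribute__((objc_precise_lifetime)) id oldValue = *slot;
    *slot = newValue;
    os_unfair_lock_unlock(lock);
}

static id PSTStripedGet(__strong id *slot) {
    os_unfair_lock *lock = PSTLockForAddress((uintptr_t)slot);
    os_unfair_lock_lock(lock);
    id value = *slot; // retained while the lock is held
    os_unfair_lock_unlock(lock);
    return value;
}
```

Two unrelated slots that hash to the same index share a lock, which is the collision case described above.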
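Although these functions are not declared in any public header, they can be called manually. I’m not saying this is a good idea, but it can be quite useful when you want an atomic property and also want to implement its setter yourself.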
// Manually declare runtime methods.
extern void objc_setProperty(id self, SEL _cmd, ptrdiff_t offset, id newValue, BOOL atomic, BOOL shouldCopy);
extern id objc_getProperty(id self, SEL _cmd, ptrdiff_t offset, BOOL atomic);
#define PSTAtomicRetainedSet(dest, src) objc_setProperty(self, _cmd, (ptrdiff_t)(&dest) - (ptrdiff_t)(self), src, YES, NO)
#define PSTAtomicAutoreleasedGet(src) objc_getProperty(self, _cmd, (ptrdiff_t)(&src) - (ptrdiff_t)(self), YES)
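What about @synchronized?
You may wonder why Apple doesn’t lock properties with @synchronized(self), a runtime feature that already exists. @synchronized uses up to three lock/unlock sequences, partly because it also adds exception handling. That is slower than the spinlock approach. Since setting a property is usually very fast, a spinlock is a good fit. @synchronized(self) is a good choice when you need to ensure that an exception can be thrown without the code deadlocking.
Your Own Classes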
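Atomic properties alone are not enough to make your classes thread-safe. They only protect the setter against race conditions; they do nothing for your application logic. Consider the following snippet:
if (self.contents) {
    CFAttributedStringRef stringRef = CFAttributedStringCreate(NULL, (__bridge CFStringRef)self.contents, NULL);
    // draw string
}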
This can crash with EXC_BAD_ACCESS if contents is set to nil after the check. A simple fix is to capture the value in a local variable:
NSString *contents = self.contents;
if (contents) {
    CFAttributedStringRef stringRef = CFAttributedStringCreate(NULL, (__bridge CFStringRef)contents, NULL);
    // draw string
}
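This solves the problem here, but in most cases it is not that simple. Suppose we also have a textColor property and change both textColor and contents on one thread. A render thread could then pick up the new contents together with the old textColor, giving a strange combination. This is one reason why Core Data binds model objects to a single thread or queue.
There is no one-size-fits-all solution to this problem. Immutable models are one solution, but they come with their own problems. Another is to confine changes to existing objects to the main thread or one specific queue, and to make copies before using them on worker threads. The simple solution is to use @synchronized.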
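To make the contents/textColor case concrete, here is one possible sketch that combines an immutable value object with @synchronized, so a reader always sees a matching pair. PSTLabelState and PSTLabelModel are hypothetical names, and the color is a plain string so the example needs only Foundation.

```objc
#import <Foundation/Foundation.h>

// Immutable pair of values that belong together.
@interface PSTLabelState : NSObject
@property (nonatomic, copy, readonly) NSString *contents;
@property (nonatomic, copy, readonly) NSString *textColor; // color name, stand-in for a color object
- (instancetype)initWithContents:(NSString *)contents textColor:(NSString *)textColor;
@end

@implementation PSTLabelState
- (instancetype)initWithContents:(NSString *)contents textColor:(NSString *)textColor {
    if ((self = [super init])) {
        _contents = [contents copy];
        _textColor = [textColor copy];
    }
    return self;
}
@end

@interface PSTLabelModel : NSObject
- (void)setContents:(NSString *)contents textColor:(NSString *)textColor;
- (PSTLabelState *)state;
@end

@implementation PSTLabelModel {
    PSTLabelState *_state;
}

- (void)setContents:(NSString *)contents textColor:(NSString *)textColor {
    // Build the new pair outside the lock, then swap it in as one unit.
    PSTLabelState *newState = [[PSTLabelState alloc] initWithContents:contents textColor:textColor];
    @synchronized(self) {
        _state = newState;
    }
}

- (PSTLabelState *)state {
    @synchronized(self) {
        return _state;
    }
}
@end
```

A render thread would read model.state once and draw from that snapshot, instead of reading the two properties separately.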
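Practical Thread-Safe Design
Before trying to make something thread-safe, think hard about whether it is necessary. Make sure it is not premature optimization. If it is anything like a configuration class, there is no point in worrying about thread safety. A better approach is to throw in some asserts to ensure it is used correctly:
void PSPDFAssertIfNotMainThread(void) {
    // NSCAssert rather than NSAssert: a C function has no self or _cmd.
    NSCAssert(NSThread.isMainThread,
              @"Error: Method needs to be called on the main thread. %@",
              [NSThread callStackSymbols]);
}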
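Using a concurrent dispatch_queue as a read/write lock maximizes performance, and you should try to lock only the areas that really need it. Once you start using multiple queues to lock different parts, things get tricky very quickly.
Sometimes you can also rewrite your code so that no special locks are required. Consider this multicast delegate (in many cases NSNotification would do, but there are valid uses for multicast delegates):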
// header
@property (nonatomic, strong) NSMutableSet *delegates;
// in init
_delegateQueue = dispatch_queue_create("com.PSPDFKit.cacheDelegateQueue", DISPATCH_QUEUE_CONCURRENT);
- (void)addDelegate:(id<PSPDFCacheDelegate>)delegate {
    // Writers use a barrier so they run exclusively.
    dispatch_barrier_async(_delegateQueue, ^{
        [self.delegates addObject:delegate];
    });
}
- (void)removeAllDelegates {
    dispatch_barrier_async(_delegateQueue, ^{
        [self.delegates removeAllObjects];
    });
}
- (void)callDelegateForX {
    // Readers may run concurrently with each other.
    dispatch_sync(_delegateQueue, ^{
        [self.delegates enumerateObjectsUsingBlock:^(id<PSPDFCacheDelegate> delegate, BOOL *stop) {
            // Call delegate
        }];
    });
}
Unless addDelegate: or removeAllDelegates is called thousands of times per second, a simpler and cleaner approach is the following:
// header
@property (atomic, copy) NSSet *delegates;
- (void)addDelegate:(id<PSPDFCacheDelegate>)delegate {
    // Lock the read-modify-write so concurrent adds don't lose a delegate.
    @synchronized(self) {
        self.delegates = [self.delegates setByAddingObject:delegate];
    }
}
- (void)removeAllDelegates {
    self.delegates = nil;
}
- (void)callDelegateForX {
    [self.delegates enumerateObjectsUsingBlock:^(id<PSPDFCacheDelegate> delegate, BOOL *stop) {
        // Call delegate
    }];
}
Granted, this example is a bit contrived and one could simply confine changes to the main thread. But for many data structures, it might be worth it to create immutable copies in the modifier methods, so that the general application logic doesn’t have to deal with excessive locking. Notice how we still have to apply locking in addDelegate:, since otherwise delegate objects might get lost if called from different threads concurrently.
Pitfalls of GCD
For most of your locking needs, GCD is perfect. It’s simple, it’s fast, and its block-based API makes it much harder to accidentally do imbalanced locks. However, there are quite a few pitfalls, some of which we are going to explore here.
Using GCD as a Recursive Lock
GCD is a queuing system that serializes access to shared resources. This can be used for locking, but it’s quite different from @synchronized. GCD queues are not reentrant; that would break the queue characteristics. Many people have tried to work around this with dispatch_get_current_queue(), which is a bad idea, and Apple had its reasons for deprecating this function in iOS 6.
// This is a bad idea.
inline void pst_dispatch_sync_reentrant(dispatch_queue_t queue,
                                        dispatch_block_t block)
{
    dispatch_get_current_queue() == queue ? block()
                                          : dispatch_sync(queue, block);
}
Testing for the current queue might work for simple solutions, but it fails as soon as your code gets more complex, and you might have multiple queues locked at the same time. Once you are there, you almost certainly will get a deadlock. Sure, one could use dispatch_get_specific(), which will traverse the whole queue hierarchy to test for specific queues. For that you would have to write custom queue constructors that apply this metadata. Don’t go that way. There are use cases where an NSRecursiveLock is the better solution.
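For comparison, here is a minimal sketch of the NSRecursiveLock alternative. PSTCounter is a hypothetical class; the point is that incrementTwice can call increment while already holding the lock, which would deadlock with dispatch_sync on a serial queue.

```objc
#import <Foundation/Foundation.h>

@interface PSTCounter : NSObject
@property (readonly) NSInteger value;
- (void)increment;
- (void)incrementTwice;
@end

@implementation PSTCounter {
    NSRecursiveLock *_lock;
    NSInteger _value;
}

- (instancetype)init {
    if ((self = [super init])) {
        _lock = [NSRecursiveLock new];
    }
    return self;
}

- (void)increment {
    [_lock lock];
    _value++;
    [_lock unlock];
}

- (void)incrementTwice {
    [_lock lock];
    // Re-entering the lock on the same thread is allowed.
    [self increment];
    [self increment];
    [_lock unlock];
}

- (NSInteger)value {
    [_lock lock];
    NSInteger current = _value;
    [_lock unlock];
    return current;
}
@end
```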
Fixing Timing Issues with dispatch_async
Having some timing issues in UIKit? Most of the time, this will be the perfect “fix”:
dispatch_async(dispatch_get_main_queue(), ^{
    // Some UIKit call that had timing issues but works fine
    // in the next runloop.
    [self updatePopoverSize];
});
Don’t do this, trust me. This will haunt you later as your app gets larger. It’s super hard to debug and soon things will fall apart when you need to dispatch more and more because of “timing issues.” Look through your code and find the proper place for the call (e.g. viewWillAppear instead of viewDidLoad). I still have some of those hacks in my codebase, but most of them are properly documented and an issue is filed.
Remember that this isn’t really GCD-specific, but it’s a common anti-pattern and just very easy to do with GCD. You can apply the same wisdom for performSelector:afterDelay:, where the delay is 0.f for the next runloop.
Mixing dispatch_sync and dispatch_async in Performance Critical Code
That one took me a while to figure out. In PSPDFKit there is a caching class that uses an LRU list to track image access. When you scroll through the pages, this is called a lot. The initial implementation used dispatch_sync for availability access, and dispatch_async to update the LRU position. This resulted in a frame rate far from the goal of 60 FPS.
When other code running in your app is blocking GCD’s threads, it might take a while until the dispatch manager finds a thread to perform the dispatch_async code; until then, your sync call will be blocked. Even when, as in this example, the order of execution for the async case isn’t important, there’s no easy way to tell that to GCD. Read/write locks won’t help you there, since the async process most definitely needs to perform a barrier write and all your readers will be locked during that. Lesson: dispatch_async can be expensive if it’s misused. Be careful when using it for locking.
Using dispatch_async to Dispatch Memory-Intensive Operations
We already talked a lot about NSOperations, and that it’s usually a good idea to use the more high-level API. This is especially true if you deal with blocks of work that do memory-intensive operations.
In an old version of PSPDFKit, I used a GCD queue to dispatch writing cached JPG images to disk. When the retina iPad came out, this started causing trouble. The resolution doubled, and it took much longer to encode the image data than it took to render it. Consequently, operations piled up in the queue, and when the system was busy it could crash from memory exhaustion.
There’s no way to see how many operations are queued (unless you manually add code to track this), and there’s also no built-in way to cancel operations in case of a low-memory notification. Switching to NSOperations made the code a lot more debuggable and allowed all this without writing manual management code.
Of course there are some caveats; for example, you can’t set a target queue on your NSOperationQueue (like DISPATCH_QUEUE_PRIORITY_BACKGROUND for throttled I/O). But that’s a small price for debuggability, and it also prevents you from running into problems like priority inversion. I even recommend against the nice NSBlockOperation API and suggest real subclasses of NSOperation, including an implementation of description. It’s more work, but later on, having a way to print all running/pending operations is insanely useful.
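A real NSOperation subclass with a description might look roughly like the sketch below. PSTWriteOperation and PSTMakeWriteQueue are hypothetical names, not PSPDFKit code, and the write itself is a plain NSData file write standing in for image encoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical operation that writes one blob of data to disk.
@interface PSTWriteOperation : NSOperation
- (instancetype)initWithData:(NSData *)data path:(NSString *)path;
@end

@implementation PSTWriteOperation {
    NSData *_data;
    NSString *_path;
}

- (instancetype)initWithData:(NSData *)data path:(NSString *)path {
    if ((self = [super init])) {
        _data = [data copy];
        _path = [path copy];
    }
    return self;
}

- (void)main {
    // Skip the work if the operation was cancelled while it was queued.
    if (self.isCancelled) return;
    [_data writeToFile:_path atomically:YES];
}

- (NSString *)description {
    return [NSString stringWithFormat:@"<%@: %p path=%@ bytes=%lu cancelled=%d finished=%d>",
            NSStringFromClass([self class]), self, _path,
            (unsigned long)_data.length, self.isCancelled, self.isFinished];
}
@end

// At most one write runs at a time; pending writes wait in the queue.
static NSOperationQueue *PSTMakeWriteQueue(void) {
    NSOperationQueue *queue = [NSOperationQueue new];
    queue.maxConcurrentOperationCount = 1;
    return queue;
}
```

On a low-memory notification, calling cancelAllOperations on the queue marks the pending writes as cancelled, and main then returns without doing the work.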
http://www.objc.io/issue-2/thread-safe-class-design.html