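An Analysis of the Singleton Templates in Chromium
Preface
Since C++11, the compiler guarantees that a function-local static variable is initialized exactly once, so its initialization is inherently thread safe. A local static is also initialized only when execution first reaches its declaration, so it gives a lazily initialized singleton for free.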
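The guarantee described above can be sketched with a function-local static. This is a minimal illustration, not code from Chromium; the class name Config, its counter, and RunMeyersSingletonDemo are hypothetical names chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical example class; it counts how many times it is constructed.
class Config {
 public:
  static Config& Instance() {
    // Since C++11, a function-local static is initialized exactly once, in a
    // thread-safe way, the first time control passes through this line.
    static Config instance;
    return instance;
  }
  static int ConstructionCount() { return construction_count_.load(); }

 private:
  Config() { construction_count_.fetch_add(1); }
  Config(const Config&) = delete;
  Config& operator=(const Config&) = delete;

  static std::atomic<int> construction_count_;
};

std::atomic<int> Config::construction_count_{0};

// Calls Instance() from several threads and returns the construction count.
int RunMeyersSingletonDemo() {
  std::vector<std::thread> threads;
  for (int i = 0; i < 8; ++i)
    threads.emplace_back([] { Config::Instance(); });
  for (auto& t : threads) t.join();
  return Config::ConstructionCount();
}
```

Nothing is constructed until the first call to Instance(), and concurrent first calls still produce a single object.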
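Chromium ships two related templates: StaticSingleton (in Blink's WTF library) and LazyInstance (in base). LazyInstance was designed to define a thread-safe, lazily initialized singleton; StaticSingleton provides static-local singletons for both single-threaded and thread-safe use. Now that local statics are thread safe, LazyInstance is effectively obsolete, and StaticSingleton is only a thin wrapper.
Even so, the implementations of both templates are worth studying, LazyInstance in particular, because they offer several useful ideas.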
StaticSingleton
Usage
// src/third_party/blink/renderer/platform/wtf/hash_table.cc
static Mutex& hashTableStatsMutex() {
  DEFINE_THREAD_SAFE_STATIC_LOCAL(Mutex, mutex, ());
  return mutex;
}

HashTableStats& HashTableStats::instance() {
  DEFINE_THREAD_SAFE_STATIC_LOCAL(HashTableStats, stats, ());
  return stats;
}
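First, the usage: the DEFINE_THREAD_SAFE_STATIC_LOCAL macro creates a thread-safe singleton object. Its three arguments are the type, the variable name, and the parenthesized constructor arguments. The implementation is analyzed below.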
#define DEFINE_THREAD_SAFE_STATIC_LOCAL(Type, Name, Arguments) \
  DEFINE_STATIC_LOCAL_IMPL(Type, Name, Arguments, true)

// The allow_cross_thread parameter no longer has any effect, because the
// compiler already guarantees thread-safe initialization.
#define DEFINE_STATIC_LOCAL_IMPL(Type, Name, Arguments, allow_cross_thread)    \
  static WTF::StaticSingleton<Type> s_##Name(                                  \
      [&]() { return new WTF::StaticSingleton<Type>::WrapperType Arguments; }, \
      [&](void* leaked_ptr) {                                                  \
        new (leaked_ptr) WTF::StaticSingleton<Type>::WrapperType Arguments;    \
      });                                                                      \
  Type& Name = s_##Name.Get(allow_cross_thread)
template <typename Type>
class StaticSingleton final {
  template <typename HeapNew, typename PlacementNew>
  StaticSingleton(const HeapNew& heap_new, const PlacementNew& placement_new)
      : instance_(heap_new, placement_new) {
    LEAK_SANITIZER_REGISTER_STATIC_LOCAL(WrapperType, instance_.Get());
  }

  Type& Get(bool allow_cross_thread_use) {
    return Wrapper<Type>::Unwrap(instance_.Get());
  }

  InstanceStorage<WrapperType> instance_;
  ...
};

template <typename T, bool is_small = sizeof(T) <= 32>
class InstanceStorage {
 public:
  template <typename HeapNew, typename PlacementNew>
  InstanceStorage(const HeapNew& heap_new, const PlacementNew&)
      : pointer_(heap_new()) {}
  T* Get() { return pointer_; }

 private:
  T* pointer_;
};

template <typename T>
class InstanceStorage<T, true> {
 public:
  template <typename HeapNew, typename PlacementNew>
  InstanceStorage(const HeapNew&, const PlacementNew& placement_new) {
    placement_new(&object_);
  }
  T* Get() { return reinterpret_cast<T*>(object_); }

 private:
  alignas(T) char object_[sizeof(T)];
};
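As shown above, the macro defines a function-local static variable, static WTF::StaticSingleton<Type> s_##Name, and relies on the compiler to initialize it in a thread-safe way.
The optimization lies in where the object lives. When the wrapped type is larger than 32 bytes, the object is allocated on the heap with new. When it is 32 bytes or smaller, it is constructed with placement new in a buffer embedded in the static object itself (static storage, not the stack).
The only drawback of a function-local static is that it occupies static memory. For large singletons, this template keeps only a pointer in static storage and places the object itself on the heap, which reduces the static memory footprint.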
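The size-based choice between heap storage and in-place storage can be sketched as follows. This is a simplified illustration under stated assumptions, not the WTF code: Storage, Small, Large, and PointsInside are hypothetical names, and the constructor lambdas of the real template are replaced by default construction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical simplified storage. Primary template: large types live on the
// heap, and the storage object itself only holds a pointer.
template <typename T, bool kIsSmall = (sizeof(T) <= 32)>
class Storage {
 public:
  Storage() : pointer_(new T()) {}  // intentionally leaked, like a singleton
  T* Get() { return pointer_; }

 private:
  T* pointer_;
};

// Specialization for small types: the object is constructed in place, inside
// a suitably aligned buffer that is part of the storage object.
template <typename T>
class Storage<T, true> {
 public:
  Storage() { new (object_) T(); }
  T* Get() { return reinterpret_cast<T*>(object_); }

 private:
  alignas(T) unsigned char object_[sizeof(T)];
};

struct Small { int value = 7; };                        // well under 32 bytes
struct Large { char bytes[256] = {}; int value = 9; };  // well over 32 bytes

// Returns true if |p| points into the memory occupied by |storage|.
template <typename S>
bool PointsInside(const S& storage, const void* p) {
  const auto begin = reinterpret_cast<std::uintptr_t>(&storage);
  const auto addr = reinterpret_cast<std::uintptr_t>(p);
  return addr >= begin && addr < begin + sizeof(S);
}
```

With a static Storage<Large>, only one pointer's worth of static memory is used, while a static Storage<Small> holds the object directly.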
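LazyInstance: a lazily initialized singleton
Although LazyInstance is deprecated, its use of atomics and spin-waiting is still instructive. For readability, much of the template code has been removed here.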
template <
    typename Type,
    typename Traits =
        internal::ErrorMustSelectLazyOrDestructorAtExitForLazyInstance<Type>>
class LazyInstance {
 public:
  // Do not define a destructor, as doing so makes LazyInstance a
  // non-POD-struct. We don't want that because then a static initializer will
  // be created to register the (empty) destructor with atexit() under MSVC, for
  // example. We handle destruction of the contained Type class explicitly via
  // the OnExit member function, where needed.
  // ~LazyInstance() {}

  // Convenience typedef to avoid having to repeat Type for leaky lazy
  // instances.
  typedef LazyInstance<Type, internal::LeakyLazyInstanceTraits<Type>> Leaky;
  typedef LazyInstance<Type, internal::DestructorAtExitLazyInstanceTraits<Type>>
      DestructorAtExit;

  Type& Get() {
    return *Pointer();
  }

  Type* Pointer() {
    return subtle::GetOrCreateLazyPointer(
        &private_instance_, &Traits::New, private_buf_,
        Traits::kRegisterOnExit ? OnExit : nullptr, this);
  }

  // Returns true if the lazy instance has been created. Unlike Get() and
  // Pointer(), calling IsCreated() will not instantiate the object of Type.
  bool IsCreated() {
    // Return true (i.e. "created") if |private_instance_| is either being
    // created right now (i.e. |private_instance_| has value of
    // internal::kLazyInstanceStateCreating) or was already created (i.e.
    // |private_instance_| has any other non-zero value).
    return 0 != subtle::NoBarrier_Load(&private_instance_);
  }

  subtle::AtomicWord private_instance_;

  // Preallocated space for the Type instance.
  alignas(Type) char private_buf_[sizeof(Type)];

 private:
  Type* instance() {
    return reinterpret_cast<Type*>(subtle::NoBarrier_Load(&private_instance_));
  }

  static void OnExit(void* lazy_instance) {
    LazyInstance<Type, Traits>* me =
        reinterpret_cast<LazyInstance<Type, Traits>*>(lazy_instance);
    Traits::Delete(me->instance());
    subtle::NoBarrier_Store(&me->private_instance_, 0);
  }
};
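LazyInstance holds two data members. private_buf_ reserves static memory for the singleton, and the object is later constructed directly in that buffer. The atomic variable private_instance_ encodes the state of the singleton: 0 means the object has not been created, 1 means a thread is currently creating it, and any other value is the address of the object, that is, the address of private_buf_. Since no real object can live at address 0x00 or 0x01, the encoding is unambiguous.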
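The three-way encoding of the state word can be sketched as a small classifier. The names LazyState, kStateCreating, kCreatedMask, and Classify are hypothetical and only mirror the description above; the mask test is the same idea as the "created mask" check in GetOrCreateLazyPointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names that mirror the three states described in the text.
enum class LazyState { kEmpty, kCreating, kCreated };

constexpr std::uintptr_t kStateCreating = 1;
// Every bit except the lowest one. Both 0 and 1 become 0 under this mask,
// while any real address keeps at least one bit set.
constexpr std::uintptr_t kCreatedMask = ~kStateCreating;

constexpr LazyState Classify(std::uintptr_t state) {
  return (state & kCreatedMask) ? LazyState::kCreated
         : (state == kStateCreating) ? LazyState::kCreating
                                     : LazyState::kEmpty;
}
```

A single mask test is therefore enough to distinguish "already constructed" from the two sentinel values.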
The key question, then, is how GetOrCreateLazyPointer synchronizes threads.
template <typename Type>
Type* GetOrCreateLazyPointer(subtle::AtomicWord* state,
                             Type* (*creator_func)(void*),
                             void* creator_arg,
                             void (*destructor)(void*),
                             void* destructor_arg) {
  DCHECK(state);
  DCHECK(creator_func);

  // If any bit in the created mask is true, the instance has already been
  // fully constructed.
  constexpr subtle::AtomicWord kLazyInstanceCreatedMask =
      ~internal::kLazyInstanceStateCreating;

  // We will hopefully have fast access when the instance is already created.
  // Since a thread sees |state| == 0 or kLazyInstanceStateCreating at most
  // once, the load is taken out of NeedsLazyInstance() as a fast-path. The load
  // has acquire memory ordering as a thread which sees |state| > creating needs
  // to acquire visibility over the associated data. Pairing Release_Store is in
  // CompleteLazyInstance().
  subtle::AtomicWord instance = subtle::Acquire_Load(state);
  if (!(instance & kLazyInstanceCreatedMask)) {
    if (internal::NeedsLazyInstance(state)) {
      // This thread won the race and is now responsible for creating the
      // instance and storing it back into |state|.
      instance =
          reinterpret_cast<subtle::AtomicWord>((*creator_func)(creator_arg));
      internal::CompleteLazyInstance(state, instance, destructor,
                                     destructor_arg);
    } else {
      // This thread lost the race but now has visibility over the constructed
      // instance (NeedsLazyInstance() doesn't return until the constructing
      // thread releases the instance via CompleteLazyInstance()).
      instance = subtle::Acquire_Load(state);
      DCHECK(instance & kLazyInstanceCreatedMask);
    }
  }
  return reinterpret_cast<Type*>(instance);
}
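First, the atomic Acquire_Load reads state and checks whether it already holds a pointer. If so, instance is returned directly; otherwise the singleton has to be constructed.
internal::NeedsLazyInstance performs the synchronization: it returns true if the calling thread must construct the object and false otherwise. When two threads enter the function at the same time, exactly one of them may construct the object and the other must wait until construction finishes. A false return therefore has to guarantee that the singleton already exists, which is why the function contains a wait.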
bool NeedsLazyInstance(subtle::AtomicWord* state) {
  // Try to create the instance, if we're the first, will go from 0 to
  // kLazyInstanceStateCreating, otherwise we've already been beaten here.
  // The memory access has no memory ordering as state 0 and
  // kLazyInstanceStateCreating have no associated data (memory barriers are
  // all about ordering of memory accesses to *associated* data).
  if (subtle::NoBarrier_CompareAndSwap(state, 0, kLazyInstanceStateCreating) ==
      0) {
    // Caller must create instance
    return true;
  }

  // It's either in the process of being created, or already created. Spin.
  // The load has acquire memory ordering as a thread which sees
  // state_ == STATE_CREATED needs to acquire visibility over
  // the associated data (buf_). Pairing Release_Store is in
  // CompleteLazyInstance().
  if (subtle::Acquire_Load(state) == kLazyInstanceStateCreating) {
    const base::TimeTicks start = base::TimeTicks::Now();
    do {
      const base::TimeDelta elapsed = base::TimeTicks::Now() - start;
      // Spin with YieldCurrentThread for at most one ms - this ensures maximum
      // responsiveness. After that spin with Sleep(1ms) so that we don't burn
      // excessive CPU time - this also avoids infinite loops due to priority
      // inversions (https://crbug.com/797129).
      if (elapsed < TimeDelta::FromMilliseconds(1))
        PlatformThread::YieldCurrentThread();
      else
        PlatformThread::Sleep(TimeDelta::FromMilliseconds(1));
    } while (subtle::Acquire_Load(state) == kLazyInstanceStateCreating);
  }
  // Someone else created the instance.
  return false;
}
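NeedsLazyInstance first calls the atomic NoBarrier_CompareAndSwap (implemented on top of compare_exchange_strong). If state equals 0, it sets state to 1 and returns 0; otherwise it leaves state unchanged and returns its current value.
So once one thread has set state to 1 and returned true, every other thread can only observe a nonzero value, which guarantees that the object is constructed only once. The losing threads then spin on the atomic state: during the first millisecond they poll again after each YieldCurrentThread call, and after that they sleep 1 ms between polls. As soon as state is no longer 1, the object has been constructed and the loop exits.
(On POSIX, YieldCurrentThread calls sched_yield, which voluntarily gives up the CPU but leaves the thread runnable, whereas Sleep blocks the thread for at least the requested duration.)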
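The whole scheme (fast-path acquire load, relaxed compare-and-swap to pick a winner, release store of the pointer, yield-then-sleep waiting for the losers) can be sketched with standard C++ atomics. This is a minimal sketch and not the Chromium implementation: SimpleLazy, Counter, RunLazyDemo, and kCreating are hypothetical names, std::atomic stands in for base's subtle:: atomic operations, and destruction at exit is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <new>
#include <thread>
#include <vector>

constexpr std::uintptr_t kCreating = 1;  // sentinel: construction in progress

// Hypothetical simplified lazy holder for a default-constructible T.
template <typename T>
class SimpleLazy {
 public:
  T* Pointer() {
    // Fast path: the acquire load pairs with the release store below.
    std::uintptr_t value = state_.load(std::memory_order_acquire);
    if (value & ~kCreating) return reinterpret_cast<T*>(value);

    std::uintptr_t expected = 0;
    if (state_.compare_exchange_strong(expected, kCreating,
                                       std::memory_order_relaxed)) {
      // This thread won the race: construct in the preallocated buffer and
      // publish the address.
      T* instance = new (buffer_) T();
      state_.store(reinterpret_cast<std::uintptr_t>(instance),
                   std::memory_order_release);
      return instance;
    }

    // This thread lost the race: yield for about 1 ms, then sleep per poll.
    const auto start = std::chrono::steady_clock::now();
    while ((value = state_.load(std::memory_order_acquire)) == kCreating) {
      if (std::chrono::steady_clock::now() - start <
          std::chrono::milliseconds(1)) {
        std::this_thread::yield();
      } else {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
    }
    return reinterpret_cast<T*>(value);
  }

 private:
  std::atomic<std::uintptr_t> state_{0};
  alignas(T) unsigned char buffer_[sizeof(T)];
};

// Hypothetical payload that counts its constructions.
struct Counter {
  Counter() { constructions.fetch_add(1); }
  static std::atomic<int> constructions;
};
std::atomic<int> Counter::constructions{0};

// Returns true if every thread observed the same, non-null instance.
bool RunLazyDemo() {
  static SimpleLazy<Counter> lazy;
  std::vector<Counter*> seen(8, nullptr);
  std::vector<std::thread> threads;
  for (int i = 0; i < 8; ++i)
    threads.emplace_back([&seen, i] { seen[i] = lazy.Pointer(); });
  for (auto& t : threads) t.join();
  for (Counter* p : seen)
    if (p == nullptr || p != seen[0]) return false;
  return true;
}
```

The compare-and-swap can be relaxed because the values 0 and 1 carry no associated data; only the final pointer needs the release/acquire pair so that readers see a fully constructed object.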
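Afterword
LazyInstance thus builds a spin-wait out of a single atomic variable to synchronize threads. This lightweight scheme could be reused elsewhere.
One possibility is a thread-safe weak pointer, since Chromium's WeakPtr implementation is not thread safe.
For example, when the weak pointer owner on the main thread invalidates the object, every WeakPtr to it held by worker threads would become invalid safely.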