想让不同线程间的变量互不影响，你会怎么做？

最新推荐文章于 2022-07-06 21:12:20 发布

郭霖_

最新推荐文章于 2022-07-06 21:12:20 发布

阅读量1.3k

点赞数

原文链接：https://www.jianshu.com/p/e0eb1f6a151a

版权

/ 今日科技快讯 /

昨天《第一行代码第3版》开启首发预售，3000本签名版分别供给了京东、当当、常青藤、悦悦四家店铺。其中京东的1800本已全部售完，当当、常青藤、悦悦还剩一定库存。为了防止签名版订单出错，各个商家在签名版售完之后会将这本书短暂下架，等预售订单配送完成后再重新开启。也就是说，昨天只要下单成功的朋友们，拿到手的一定会是签名版。

祝大家有个愉快的假期，清明期间也尽量减少外出，特殊时期安全为重。

/ 作者简介 /

本篇文章来自小乌贼007的投稿，分享了他从需求的角度去分析ThreadLocal源码，相信会对大家有所帮助！同时也感谢作者贡献的精彩文章。

小乌贼007的博客地址：

https://www.jianshu.com/u/57c41cd8549e

/ 怎么理解ThreadLocal /

第一次接触ThreadLocal这个概念是在《java编程思想》这本书上，当时看的云里雾里，直到昨天还在云里雾里。

没有接触过ThreadLocal的同学可以看一看下面这段代码，初步认识一下ThreadLocal。

public class Looper {
    private static final ThreadLocal<Looper> sThreadLocal = new ThreadLocal<>();

    public static void prepare(String name) {
        sThreadLocal.set(new Looper(name));
    }

    public static Looper myLooper() {
        return sThreadLocal.get();
    }

    public final String name;

    private Looper(String name) {
        this.name = name;
    }

    @NotNull
    @Override
    public String toString() {
        return name == null ? "" : name;
    }

    public static void testLooper() {
        Thread mainThread = new Thread(new Runnable() {
            @Override
            public void run() {
                Looper.prepare("mainThread");

                System.out.println("1 " + Looper.myLooper());

                Thread workThread = new Thread(new Runnable() {
                    @Override
                    public void run() {
                        {
                            //我明明初始化了looper，为啥取出来是空的？
                            System.out.println("2 " + Looper.myLooper());
                            Looper.prepare("workThread");
                            System.out.println("3 " + Looper.myLooper());
                        }
                    }
                });
                workThread.start();
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println("4 " + Looper.myLooper());
                //程序输出如下:
                //1 mainThread
                //2 null
                //3 workThread
                //4 mainThread
            }
        });
        mainThread.start();
    }
}

从输出可以看出ThreadLocal这个变量在不同的线程中有不同的值，怎么想怎么神奇有没有？网上搜了一些对ThreadLocal的解释，大多如下：

ThreadLocal为解决多线程程序的并发问题提供了一种新的思路。

当使用ThreadLocal维护变量时，ThreadLocal为每个使用该变量的线程提供独立的变量副本，所以每一个线程都可以独立地改变自己的副本，而不会影响其它线程所对应的副本。

副本？并发？这两个词放在一起，搞的我头疼，一直放在一起，一直头疼。

困惑1：我对并发问题的理解是多个线程操作用一个对象，需要保证对这个对象的有序访问（谨慎操作临界区）。不同线程操作不同的对象，我不知道他们为什么说这解决了线程安全问题。
困惑2：我们为什么需要某个类在不同线程有着不同的副本？

直到有一天事情出现了转机，我第n次（因为太笨，得看多次）看了HandlerThread的实现原理，下面用一个简略的模型来理解HandlerThread。HandlerThread可以理解为实现了一套线程通信机制的线程。

HandlerThread里面有个Looper（上面的class Looper就是节选了该Looper的部分代码），可以称之为消息泵，这个消息泵不断地泵出消息队列里面的消息，然后这些消息被解释（运行）。还有个东西叫Handler，这个东西可以往消息队列里面丢消息。所以两个HandlerThread之间可以通过Handler来完成线程通信。

这个消息泵（Looper）就是个ThreadLocal，为什么要是ThreadLocal？因为不同的线程应该拥有自己独立的消息泵。这似乎解释了困惑2，有些场景下我们需要某个类在不同的线程有着不同的副本，比如Looper。

但是我还是觉得这种解释怪怪的，又说不上来哪里怪。

“因为不同的线程应该拥有自己独立的消息泵。” ，我试着重新理解这句话，每个线程有自己独立的消息泵，在面向对象的世界里，这个消息泵不就应该是线程的成员变量吗？

ThreadLocal新的理解：ThreadLocal是用来在Thread类中添加一个逻辑上的成员变量。

这样理解的一大好处是直接解释了ThreadLocal的应用场景，我的线程需要一个Looper，但是Thread类里面又没有定义Looper这个成员变量，怎么办？用ThreadLocal吧，它可以在逻辑上帮我们往Thread类中添加成员变量，从需求的角度来理解ThreadLocal，更容易掌握它的应用场景。
这样理解的另一大好处是暗示其实现原理，试着思考一下，不改变类代码的前提下，逻辑上往一个类里面添加一个成员变量该怎么实现？试着在Person类中添加Gender属性。

class Person {
    @NonNull public final String name;

    Person(@NonNull String name) {
        this.name = name;
    }
}

继承？继承不行，继承改变了子类，并未改变父类。维护一个map，map的key为 Person对象+属性，map的value为Person对象的这个属性的值。貌似方案可行，尝试着写了一下，下面贴一下我的简单实现。

public class PersonLocal<T> {
    private static final Map<PersonLocalKey, Object> sValues = new HashMap<>();

    public void set(@NonNull Person person, @NonNull T attribute) {
        sValues.put(new PersonLocalKey(person, attribute.getClass()), attribute);
    }

    public T get(@NonNull Person person, @NonNull Class<T> attributeType) {
        Object object = sValues.get(new PersonLocalKey(person, attributeType));
        if (object != null) {
            @SuppressWarnings("unchecked")
            T ret = ((T) object);
            return ret;
        }
        return null;
    }

    public void remove(@NonNull Person person, @NonNull Class<T> attributeType) {
        sValues.remove(new PersonLocalKey(person, attributeType));
    }

    public int size(){
        return sValues.size();
    }

    public static void testPersonLocal() {
        PersonLocal<Gender> genderPersonLocal = new PersonLocal<>();

        Person bob = new Person("bob");
        genderPersonLocal.set(bob, Gender.MALE);//逻辑上 相当于bob.setGender(Gender.MALE)。
        Gender bobGender = genderPersonLocal.get(bob, Gender.class);//逻辑上 相当于Gender bobGender=bob.getGender()。
        System.out.println("bob is " + bobGender);

        Person mei = new Person("mei");
        genderPersonLocal.set(mei, Gender.FEMALE);
        System.out.println("mei is " + genderPersonLocal.get(mei, Gender.class));

        genderPersonLocal.remove(bob, Gender.class);
        genderPersonLocal.remove(mei, Gender.class);

        assert genderPersonLocal.size() == 0;

    }

}

enum Gender {
    MALE,
    FEMALE,
}

class PersonLocalKey {
    @NonNull
    private final Person person;
    @NonNull
    private final Class<?> attribute;

    PersonLocalKey(@NonNull Person person,
                   @NonNull Class<?> attribute) {
        this.person = person;
        this.attribute = attribute;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = result * 31 + System.identityHashCode(person);
        result = result * 31 + System.identityHashCode(attribute);
        return result;
    }

    @Override
    public boolean equals(@Nullable Object obj) {
        if (obj == this) {
            return true;
        }
        if (obj instanceof PersonLocalKey) {
            PersonLocalKey other = ((PersonLocalKey) obj);
            return other.person == this.person && other.attribute == this.attribute;
        }
        return super.equals(obj);
    }
}

PersonLocal勉强算是实现了功能，但是有个不好的地方就是存在内存泄漏的风险，当Person不需要时需要手动地去释放掉Gender属性，也就是执行PersonLocal的remove方法，否则会泄漏Person、Gender和ThreadLocalKey。

ThreadLocal本质上也是在操作map，这个map位于Thread中，即：

/* ThreadLocal values pertaining to this thread. This map is maintained
  * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

相信看懂了PersonLocal，接下来再去理解ThreadLocal的实现，应该会轻松一些。下面来看一下ThreadLocal的set方法：

public void set(T value) {
        //获取当前线程
        Thread t = Thread.currentThread();
         //获取线程的map
        ThreadLocalMap map = getMap(t);
        //ThreadLocal对象为key，将值放进map。
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
}

ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
}

再来看一下ThreadLocal的get方法：

public T get() {
    Thread t = Thread.currentThread();
    //获取线程的map。
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        //从map中获取值。
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
 }

最后看一下remove方法：

public void remove() {
     //获取map
     ThreadLocalMap m = getMap(Thread.currentThread());
     //调用map的remove方法，删除元素
     if (m != null)
         m.remove(this);
}

ThreadLocal明显比PersonLocal的接口要简洁得多，一个原因是map的位置，若Person中也有个map，那么试着来改进下PersonLocal？

class PersonV2 {
    @NonNull
    public final String name;

    PersonV2(@NonNull String name) {
        this.name = name;
    }
    Map<PersonLocalV2, Object> extraAttributes = new HashMap<>();
}

public class PersonLocalV2<T> {

    public void set(@NonNull PersonV2 person, @NonNull T attribute) {
        person.extraAttributes.put(this, attribute);
    }

    public T get(@NonNull PersonV2 person) {
        Object object = person.extraAttributes.get(this);
        if (object != null) {
            @SuppressWarnings("unchecked")
            T ret = ((T) object);
            return ret;
        }
        return null;
    }

    public void remove(@NonNull PersonV2 person) {
        person.extraAttributes.remove(this);
    }

    public static void testPersonLocal() {
        PersonLocalV2<Gender> genderPersonLocal = new PersonLocalV2<>();

        PersonV2 bob = new PersonV2("bob");
        genderPersonLocal.set(bob, Gender.MALE);//逻辑上 相当于bob.setGender(Gender.MALE)。
        Gender bobGender = genderPersonLocal.get(bob);//逻辑上 相当于Gender bobGender=bob.getGender()。
        System.out.println("bob is " + bobGender);

        PersonV2 mei = new PersonV2("mei");
        genderPersonLocal.set(mei, Gender.FEMALE);
        System.out.println("mei is " + genderPersonLocal.get(mei));

        genderPersonLocal.remove(bob);
        genderPersonLocal.remove(mei);

        assert bob.extraAttributes.size() == 0;
        assert mei.extraAttributes.size() == 0;

    }
}

代码少了很多，但是PersonLocalV2还是没有ThreadLocal简洁，原因是PersonLocalV2必须指定Person，不然它无法知道该操作哪个对象。ThreadLocal就不一样，ThreadLocal可以利用Thread.currentThread()来得到自己操作的Thread对象。

理解更新：

ThreadLocal通过操作Thread中的一个map，来实现在Therad中添加一个逻辑上的成员变量。

现在我们知道了ThreadLocal的原理（操作map）和应用场景（Thread需要一个成员变量，如消息泵），但是在敲代码的时候还是有点儿懵，至少我在敲Looper.class的时候还是一阵迷糊。有没有发现ThreadLocal一般都是static 的，如

private static final ThreadLocal<Looper> sThreadLocal = new ThreadLocal<>();

因为sThreadLocal是一个key，我只需要一个key去各个thread中的map上找值就可以了。所以这个key只需要一个就好了，这一个key代表需要添加一个逻辑成员变量，若需要多个，则需要实例化多个key。

还有一个需要讨论的点是内存泄漏。（内存泄漏我的理解：当我们不再需要一个对象，这个对象却无法被gc，这时就发生了内存泄漏，这个对象永远无法gc，或者这个对象无法立即gc过一段时间之后才gc（延迟gc），这里都算作内存泄漏）。

当我们不再需要这个逻辑上的成员变量时，需要调用ThreadLocal的remove方法，否则会造成内存泄漏。

当thread被释放、没有强引用指向thread时，此时map会被释放、value也会被释放，不会发生内存泄漏。
thread没被释放、有强引用指向thread时，在各个线程调用了threadlocal.remove，不会发生内存泄漏。
thread没被释放、有强引用指向thread时，若没办法在各个线程调用threadlocal.remove，可以将sThreadLocal 置空，有机会延迟gc，会缓解部分内存泄漏。具体原因需要看下ThreadLocalMap的源码，ThreadLocalMap持有了ThreadLocal的弱引用，当有机会调用threadlocal的get或set方法时，map有机会清理掉部分对无用value的引用，此时是延迟gc。

从另一个角度看待这个问题，一般情况我们不会在对象没被释放的情况下去主动释放它的成员变量。所以当Thread没被释放时，我们也没必要去释放它的逻辑成员变量。

如果是这么思考问题的话，可以认为没有发生内存泄漏，即ThreadLocal实现的逻辑变量的生命周期和thread保持一致，我们不用调用thread.remove方法。

两种看法的不同点在于怎么定义有用。

这段比较绕，暂时没有好的叙述方式，看不懂请略过。

下面介绍ThreadLocalMap的源码实现，有几个有意思的点（垃圾清理机制，斐波拉契哈希），不感兴趣的可以到此为止。

/ ThreadLocalMap的实现 /

ThreadLocalMap是用数组来存放数据的，数组的初始长度为16，装载因子为2/3，超出装载因子数组长度会翻倍，所以数组的长度一定是2的n次幂。

private static final int INITIAL_CAPACITY = 16;

/**
   * The table, resized as necessary.
   * table.length MUST always be a power of two.
   * 用来存储的数据结构为数组，表长度必须是2的n次幂
   */
private Entry[] table;
private void setThreshold(int len) {
    threshold = len * 2 / 3;
}
//数组的元素Entry和Map.Entry不同，它本身是个弱引用，指向了ThreadLocal对象，
//它的另一个成员变量value，就是Thread逻辑上的成员变量。
static class Entry extends WeakReference<ThreadLocal<?>> {
    Object value;
    Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
     }
}

为了有个整体的认识，绘制了一张ThreadLocal的整体数据结构模型，如下图所示：

下面看一下ThreadLocalMap的getEntry方法：

private Entry getEntry(ThreadLocal<?> key) {
    // 根据hash值计算数据的存储位置（斐波拉契哈希，亮点1，稍后介绍）
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    //直接命中，取出值
    if (e != null && e.get() == key)
        return e;
    //未直接命中，往后继续查找(线性探测法)。
    else
        return getEntryAfterMiss(key, i, e);
    }

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
     Entry[] tab = table;
     int len = tab.length;
     //一直往后探测，找到则返回退出。直到找到一个为空的entry，说明map中没有这个key对应的值。
     while (e != null) {
         ThreadLocal<?> k = e.get();
         //找到退出
         if (k == key)
             return e;
         //key为空，说明弱引用指向的对象被gc（图1中的6号引用断开、
         //ThreadLocal对象被gc），
         //这是一个Stale Slot（概念后面会介绍），此时做一些清理防止内存泄漏，
         //同时提高未来查找的命中率，稍后再来看这个方法（亮点2，垃圾清理）
         if (k == null)
             expungeStaleEntry(i);
         //未找到，继续往后探测
         else
             i = nextIndex(i, len);
             e = tab[i];
      }
      //一直探测到一个为空的entry，说明map中没有这个key对应的值。
      return null;
}

//下一个
private static int nextIndex(int i, int len) {
    return ((i + 1 < len) ? i + 1 : 0);
}

//上一个
private static int prevIndex(int i, int len) {
    return ((i - 1 >= 0) ? i - 1 : len - 1);
}

从nextIndex，prevIndex两个方法中可以看出内部存储结构在逻辑上是个环形结构。从getEntry和getEntryAfterMiss两个方法可以看出，ThreadLocalMap解决冲突的方式是开放地址法、冲突探测的方式是线性探测法。还有个方法expungeStaleEntry没被介绍，请留意一下，后面会补充。下面来解释这一行代码int i = key.threadLocalHashCode & (table.length - 1);

/ 什么是斐波拉契哈希 /

getEntry方法中有这么一行代码，根据hash值计算存储位置。threadlocal（key）的hash值没有使用threadlocal.hash()，而是使用threadLocal.threadLocalHashCode 。

int i = key.threadLocalHashCode & (table.length - 1);

追进去看一下。

private final int threadLocalHashCode = nextHashCode();
private static int nextHashCode() {
    //hashCode总是0x61c88647的倍数。
    return nextHashCode.getAndAdd(HASH_INCREMENT);
}
private static final int HASH_INCREMENT = 0x61c88647;
private static AtomicInteger nextHashCode = new AtomicInteger();

从上面代码可以看出threadLocal的threadLocalHashCode总是0x61c88647的倍数。0x61c88647这个数字是干嘛的？magic number，赶紧去google一下。

0x61c88647的十进制数值为1640531527。

(1L << 32) * (1 - (Math.sqrt(5.0)-1)/2))=1640531527

原来0x61c88647是2的32次幂的黄金分割点。这个黄金分割点有个特点，它相邻的倍数对2的n次幂取余的结果总是相对分散，如

1640531527 x 1 % 16= 7
1640531527 x 2 % 16=14
1640531527 x 3 % 16=5
1640531527 x 4 % 16=12
...

|14-7|=7，|5-14|=9，|12-5|=7...所以前后两次生成的hashCode对16取余的结果总是相差约16的一半儿。

将对2的n次幂取可以用位操作进行改写，提高计算速度：x % (2^n) = x & (2^n -1)。当数组的长度为16时：

key.threadLocalHashCode & (table.length - 1) = 1640531527 x N % 16

也就是说先后生成的两个ThreadLocal在数组中的存储位置总是相距数组长度的一半儿（没有碰撞的情况下）。这样的好处显而易见，减少碰撞且容易解决冲突，当碰撞发生时，用开放地址法进行线性探测时，后面的位置（冲突位置右边的位置）大概率是空着的。

下面继续来看ThreadLocalMap的set方法：

private void set(ThreadLocal<?> key, Object value) {
    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.

    Entry[] tab = table;
    int len = tab.length;
    //根据hash值计算物理坐标
    int i = key.threadLocalHashCode & (len-1);

    //从物理坐标开始线性探测
    for (Entry e = tab[i];
                 e != null;
                 e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
        //发现历史记录，直接改值后退出
        if (k == key) {
            e.value = value;
            return;
        }
        //发现Stale Entry，将State Entry替换为新的Full Entry，并垃圾清理后退出，后面详细介绍。
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }
    //未发现历史记录，未发现Stale Entry，新建一个Entry，放入数组。
    tab[i] = new Entry(key, value);
    int sz = ++size;
    //进行垃圾清理，垃圾清理后若超出装载因子则数组长度翻倍重新hash。
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
    rehash();//全量垃圾清理，数组翻倍，后面会介绍。
}

不难看出get和set方法不是很难理解，主要是开发地址和线性探测。LocalThreadMap主要麻烦的地方在于垃圾清理。接下来介绍expungeStaleEntry、replaceStaleEntry和cleanSomeSlots三个方法，看一下垃圾清理机制。

算法中有些概念需要用到，先介绍下必要概念：

Slot：散列表中的一个位置（节点），既数组的item。
Full Slot和Full Entry：Full Entry表示Entry对象里的弱引用不为空。Full Slot指向Full Entry。图2中的1和3为Full Slot，1和3指向的Entry为Full Entry。
Stale Slot和Stale Entry：Stale Entry表示Entry对象里的弱引用为空（图1中的6号引用被断开，ThreadLocal对象被垃圾回收，此时不再需要Stale Entry）。Stale Slot指向Stale Entry。垃圾清理也就是释放Stale Entry（Entry中的value也跟着被释放掉）。释放之后Stale Slot也就变成了Null Slot。图2中的4和6为Stale Slot，需要被清理。
Null Slot：散列表的Null Slot位置为null，可用于放置新的Full Entry。图2中的2和9为Null Slot。
Run：散列表中任意两个Null Slot之间的一段，不包括两端。图2中的14、15、0、1组成的一段为一个Run。垃圾清理就是将Run中的Stale Slot变成Null Slot，同时将Full Slot左移（发生冲突时，开发地址法往右进行探测，所以当左边的Stale Slot无用时，右边的Full Slot有左移的可能，这样可以提高后续hash的命中率）。

下面来看expungeStaleEntry方法，该方法清理Stale Slot到Stale Slot后第一个Null Slot组成的这一段，清理后Stale Slot变成Null Slot，Full Slot左移，并返回尾巴后第一个Null Slot的坐标。（如图1中的6 7 8组成的这一段儿，清理后方法返回9）

private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    //将Stale Slot变成Null Slot，size--
    tab[staleSlot].value = null;
    tab[staleSlot] = null;
    size--;

    // Rehash until we encounter null
    Entry e;
    int i;
    for (i = nextIndex(staleSlot, len);
                 (e = tab[i]) != null;
                 i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        //Stale Slot 变成Null Slot.
        if (k == null) {
            e.value = null;
            tab[i] = null;
            size--;
        } else {
            int h = k.threadLocalHashCode & (len - 1);
            //直接命中的位置和现有位置是否一致，若不一致则尝试左移（重新hash，线性探测）。
            if (h != i) {
                tab[i] = null;
                //左移过程中的线性探测
                while (tab[h] != null)
                    h = nextIndex(h, len);
                tab[h] = e;
            }
        }
    }
    //返回Run尾巴后面第一个Null Slot的位置。
    return i;
}

下面来看replaceStaleEntry方法，该方法将一个Full Slot（参数key value组成）放入一个存在Stale Slot的Run中，并清理Run及Run后面的一段儿。（如图1，若stale slot=6，则run的范围为3 4 5 6 7 8。会将新Full Entry放入位置6，并清理run及run后面一段儿）

private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                                       int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    Entry e;

    // Back up to check for prior stale entry in current run.
    // We clean out whole runs at a time to avoid continual
    // incremental rehashing due to garbage collector freeing
    // up refs in bunches (i.e., whenever the collector runs).
    int slotToExpunge = staleSlot;
    //找到Run的起始位置。
    for (int i = prevIndex(staleSlot, len);
                 (e = tab[i]) != null;
                 i = prevIndex(i, len))
        if (e.get() == null)
            slotToExpunge = i;

        // Find either the key or trailing null slot of run, whichever
        // occurs first
        for (int i = nextIndex(staleSlot, len);
                 (e = tab[i]) != null;
                 i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();

            // If we find key, then we need to swap it
            // with the stale entry to maintain hash table order.
            // The newly stale slot, or any other stale slot
            // encountered above it, can then be sent to expungeStaleEntry
            // to remove or rehash all of the other entries in run.
            if (k == key) {
                e.value = value;
                tab[i] = tab[staleSlot];
                //注意这里的交换，staleSlot位置存了新值。
                tab[staleSlot] = e;

                // Start expunge at preceding stale entry if it exists
                //若run的头就是staleSlot，将头后移，因为staleSlot会存新值，staleSlot不需要清理。
                if (slotToExpunge == staleSlot)
                    slotToExpunge = i;
                //清理Run后将Run后面的一段也进行清理，然后退出
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                return;
            }

            // If we didn't find stale entry on backward scan, the
            // first stale entry seen while scanning for key is the
            // first still present in the run.
            //若staleSlot就是run的头部，因为staleSlot会存新值，所以staleSlot不需要清理，将run的头部后移到staleSlot后的第一个Stale Slot。
            if (k == null && slotToExpunge == staleSlot)
                slotToExpunge = i;
        }

        // If key not found, put new entry in stale slot
        //如果没找到key，则将staleSlot位置放入新值。
        tab[staleSlot].value = null;
        tab[staleSlot] = new Entry(key, value);

        // If there are any other stale entries in run, expunge them
        //如果整个run不止staleSlot一个Stale Slot，则将Run清理。否则不需要清理，因为staleSlot存入了新值。
        if (slotToExpunge != staleSlot)
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
}

下面来看cleanSomeSlots方法，该方法从位置i+1开始清理log2(n)次，使用对数扫描是在垃圾清理效果和垃圾清理时间上做个平衡。

  /** 
     * Heuristically scan some cells looking for stale entries.
     * This is invoked when either a new element is added, or
     * another stale one has been expunged. It performs a
     * logarithmic number of scans, as a balance between no
     * scanning (fast but retains garbage) and a number of scans
     * proportional to number of elements, that would find all
     * garbage but would cause some insertions to take O(n) time.
     * 我的理解：
     * 启发式清理stale entris，当插入一个新entry或者调用
     * expungeStaleEntry方法清理entries时会调用此方法。
     * 本方法使用了对数扫描，以权衡"不扫描，留存垃圾"和
     * "全量扫描，工作量线性增长"这两种方式。后一种方式虽然可以找到
     * 所有的垃圾，但是会当时插入动作花费O(n)的时间复杂度
     * @param i a position known NOT to hold a stale entry. The
     * scan starts at the element after i.
     *散列表在入参i位置上肯定不是stale slot，所以从i后面扫描
     *
     * @param n scan control: {@code log2(n)} cells are scanned,
     * unless a stale entry is found, in which case
     * {@code log2(table.length)-1} additional cells are scanned.
     * When called from insertions, this parameter is the number
     * of elements, but when from replaceStaleEntry, it is the
     * table length. (Note: all this could be changed to be either
     * more or less aggressive by weighting n instead of just
     * using straight log n. But this version is simple, fast, and
     * seems to work well.)
     *
     * @return true if any stale entries have been removed.
     */
private boolean cleanSomeSlots(int i, int n) {
    boolean removed = false;
     Entry[] tab = table;
     int len = tab.length;
     do {
         i = nextIndex(i, len);
         Entry e = tab[i];
         if (e != null && e.get() == null) {
             //发现有Stale Slot，将n赋值为表长度
             n = len;
             removed = true;
             //下一次扫描的位置为expungeStaleEntry的返回值
             i = expungeStaleEntry(i);
         }
      } while ( (n >>>= 1) != 0);
            //扫描log2(n)次。
    return removed;
}

垃圾清理相关方法到这里就就讲完了。（文章快完了，好累）

set方法执行过程中若发现装载因子溢出，会调用rehash方法，来看下这个方法。

private void rehash() {
    //全量垃圾清理
    expungeStaleEntries();
    // Use lower threshold for doubling to avoid hysteresis
    //装载因子溢出，数组翻倍。
    if (size >= threshold - threshold / 4)
        resize();
    }

    /**
      * Expunge all stale entries in the table.
      */
    private void expungeStaleEntries() {
        Entry[] tab = table;
        int len = tab.length;
        //全表遍历Stale Slot，发现则清理
        for (int j = 0; j < len; j++) {
            Entry e = tab[j];
            if (e != null && e.get() == null)
                expungeStaleEntry(j);
        }
    }  

    /**
     * Double the capacity of the table.
     */
    private void resize() {
        Entry[] oldTab = table;
        int oldLen = oldTab.length;
        //新表长度是旧表长度的两倍
        int newLen = oldLen * 2;
        Entry[] newTab = new Entry[newLen];
        int count = 0;
        //遍历旧表，将旧表中的Full Entry，放入新表。
        for (int j = 0; j < oldLen; ++j) {
            Entry e = oldTab[j];
            if (e != null) {
                ThreadLocal<?> k = e.get();
                if (k == null) {
                    e.value = null; // Help the GC
                } else {
                    int h = k.threadLocalHashCode & (newLen - 1);
                    //冲突，线性探测
                    while (newTab[h] != null)
                        h = nextIndex(h, newLen);
                        newTab[h] = e;
                        count++;
                }
            }
        }

        setThreshold(newLen);
        size = count;
        table = newTab;
}

差点儿忘记，ThreadLocalMap还有个remove方法没讲，ThreadLocal的remove方法，会调用ThreadLocalMap的remove方法，将值从map中移除。

/**
  * Remove the entry for key.
  */
private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    //线性探测
    for (Entry e = tab[i];
    e != null;
    e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
            //断开弱引用
            e.clear();
            //清理。
            expungeStaleEntry(i);
            return;
        }
    }
}

/ 总结 /

ThreadLocal通过操作Thread中的一个map，来实现在Therad中添加一个逻辑上的成员变量。

ThreadLocalMap使用了斐波拉契哈希；存储方式是数组，超出装载因子数组翻倍；解决冲突的方式是开放地址法；冲突探测的方式是线性探测；麻烦的地方在于垃圾清理。get方法在线性探测的过程中会做一点儿清理工作（释放Stale Slot，Full Slot左移），清理后内存得到释放，提高未来hash命中率；set方法主要思想是用新的Full Entry替换一段Run中的Stale Entry，并清理Run及Run后面的一段儿（对数扫描log2(n)，平衡垃圾回收和清理时间）。