How BitSet Works

Latest recommended article published 2024-07-19 14:20:13

yanph123

Categories: java, data structures. Tags: java, data structures

Article link: https://blog.csdn.net/weixin_39815304/article/details/107228890


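Principle

Java's BitSet stores its data in an array of long values (64 bits each). Whether a given bit is 1 or 0 records whether the number at that index is present in the set. How is this implemented? Understanding two methods, set and get, is enough to understand how BitSet works.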

set

First, the source code:

public void set(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    int wordIndex = wordIndex(bitIndex);
    expandTo(wordIndex);

    words[wordIndex] |= (1L << bitIndex); // Restores invariants

    checkInvariants();
}

Skipping the check for a negative index, start reading at the call to wordIndex(bitIndex).

wordIndex

/**
 * Given a bit index, return word index containing it.
 */
private static int wordIndex(int bitIndex) {
    return bitIndex >> ADDRESS_BITS_PER_WORD;
}

private final static int ADDRESS_BITS_PER_WORD = 6;

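ADDRESS_BITS_PER_WORD is 6 here, so the first question is: why 6 and not some other value?

The answer is simple. Recall the point made at the start: BitSet uses each bit of the values in a long array to record whether the corresponding index is present.

A long in Java has 64 bits, so one long can record the presence of 64 indices. To find which long in the array (held in the instance variable words) the argument bitIndex belongs to, it is enough to compute bitIndex / 64. The code uses >> in place of the division: for the non-negative indices allowed here the two give the same result, and a shift is typically at least as cheap as a division. Since 64 is 2 to the 6th power, ADDRESS_BITS_PER_WORD is 6.

The wordIndex function therefore tells us which long in the words array holds the bit for bitIndex.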
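The mapping from bit index to word index can be checked with a small sketch. WordIndexDemo is a hypothetical class written for this article; its wordIndex method copies the helper quoted above, and the sample indices are arbitrary.

```java
// Hypothetical demo class; wordIndex mirrors the JDK helper quoted above.
class WordIndexDemo {
    static final int ADDRESS_BITS_PER_WORD = 6; // 2^6 = 64 bits per long

    static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

    public static void main(String[] args) {
        int[] samples = {0, 63, 64, 127, 128, 1000};
        for (int bitIndex : samples) {
            // For non-negative indices the shift and the division agree.
            System.out.println(bitIndex + " -> word " + wordIndex(bitIndex)
                    + " (bitIndex / 64 = " + (bitIndex / 64) + ")");
        }
    }
}
```

Indices 0 through 63 land in word 0, 64 through 127 in word 1, and so on.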

expandTo

private void expandTo(int wordIndex) {
    int wordsRequired = wordIndex+1;
    if (wordsInUse < wordsRequired) {
        ensureCapacity(wordsRequired);
        wordsInUse = wordsRequired;
    }
}

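As shown above, BitSet stores its data in a long array (words). The expandTo method checks whether the words currently in use are enough to hold the computed wordIndex (in short, whether the bit fits). The instance variable wordsInUse records how many words are in logical use, which can be smaller than words.length. If wordsInUse is less than wordIndex + 1, the bit does not fit yet: ensureCapacity(wordsRequired) replaces words with a larger array when the current one is too short, and wordsInUse is then raised to wordsRequired.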
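The growth step can be sketched as follows. This is a simplified stand-in, not the JDK source: GrowthDemo and its static ensureCapacity are hypothetical names, and the "double or take the requirement, whichever is larger" policy is an assumption modelled on what OpenJDK's BitSet does, which may differ between versions.

```java
import java.util.Arrays;

// Hypothetical sketch of array growth; not the JDK implementation itself.
class GrowthDemo {
    // Returns an array with room for wordsRequired words,
    // reusing the input when it is already long enough.
    static long[] ensureCapacity(long[] words, int wordsRequired) {
        if (words.length >= wordsRequired) {
            return words;
        }
        // Assumed policy: the larger of doubled length and required length.
        int request = Math.max(2 * words.length, wordsRequired);
        return Arrays.copyOf(words, request);
    }

    public static void main(String[] args) {
        long[] words = new long[1];
        words[0] = 0b101L;
        words = ensureCapacity(words, 2);
        System.out.println(words.length); // 2
        words = ensureCapacity(words, 16);
        System.out.println(words.length); // 16
        System.out.println(words[0]);     // 5, existing bits survive the copy
    }
}
```

Doubling keeps the number of reallocations low when bits are set in increasing order, at the cost of some unused space.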

Restores invariants

words[wordIndex] |= (1L << bitIndex); // Restores invariants

This line is the essence of BitSet. Before explaining it, look at the output of the following code:

System.out.println(Integer.toBinaryString(1<<0));
System.out.println(Integer.toBinaryString(1<<1));
System.out.println(Integer.toBinaryString(1<<2));
System.out.println(Integer.toBinaryString(1<<3));
System.out.println(Integer.toBinaryString(1<<4));
System.out.println(Integer.toBinaryString(1<<5));
System.out.println(Integer.toBinaryString(1<<6));
System.out.println(Integer.toBinaryString(1<<7));

The output is:

1
10
100
1000
10000
100000
1000000
10000000

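In every line of output above, the single 1 sits at position 1, 2, 3, 4, 5, 6, 7, 8 counting from the right, that is, at bit index 0 through 7 (bit indices, like Java array indices, start at 0). BitSet uses exactly this technique to set the bit corresponding to bitIndex to 1, marking that number as present. This explains (1L << bitIndex). Note two things. Because words holds long values, the shift must use 1L rather than the int literal 1. And for a long operand Java uses only the low 6 bits of the shift distance, so for a non-negative bitIndex, 1L << bitIndex equals 1L << (bitIndex % 64), which is the bit's position inside its own word.

Once (1L << bitIndex) is clear, all that remains is to merge the computed mask into the existing value with |: words[wordIndex] |= (1L << bitIndex);.

The remaining checkInvariants call only asserts that the internal state is consistent and needs no further explanation.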
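The behaviour of the shift and the merge can be verified with a short sketch. ShiftMaskDemo and bitMask are hypothetical names used only for this illustration; the shift rules themselves come from the Java language.

```java
// Hypothetical demo of how the mask is built and merged into a word.
class ShiftMaskDemo {
    // For a long operand, Java uses only the low 6 bits of the shift distance.
    static long bitMask(int bitIndex) {
        return 1L << bitIndex;
    }

    public static void main(String[] args) {
        System.out.println(bitMask(70) == bitMask(6)); // true: 70 % 64 == 6
        System.out.println(bitMask(64));               // 1

        // With the int literal 1 only the low 5 bits of the distance count,
        // so 1 << 40 is really 1 << 8.
        System.out.println(1 << 40);                   // 256

        long word = 0L;
        word |= bitMask(1);
        word |= bitMask(3);
        word |= bitMask(3); // setting the same bit twice changes nothing
        System.out.println(Long.toBinaryString(word)); // 1010
    }
}
```

The int case shows why 1L matters: with a plain 1, bit indices 32 through 63 inside a word would wrap around and set the wrong bit.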

get

With set understood, get is easy to follow: it simply checks whether the bit corresponding to the given bitIndex is 1. First, the code:

public boolean get(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    checkInvariants();

    int wordIndex = wordIndex(bitIndex);
    return (wordIndex < wordsInUse)
        && ((words[wordIndex] & (1L << bitIndex)) != 0);
}

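The wordIndex computation was covered in the set method above, so it is not repeated here. The check wordIndex < wordsInUse returns false early for any index beyond the words in use. The key expression in this method is words[wordIndex] & (1L << bitIndex). As already explained, (1L << bitIndex) produces a value in which only the bit for bitIndex is 1 and every other bit is 0. It is then combined with words[wordIndex] using &: if that bit of words[wordIndex] is 0 the result is 0, otherwise the result is non-zero. Comparing the result with 0 therefore tells whether bitIndex is present.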
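Putting set and get together, the whole idea fits in a few lines. TinyBitSet below is a hypothetical, stripped-down sketch written for this article under the assumptions already discussed (64-bit words, doubling growth); it omits wordsInUse, checkInvariants, and everything else the real java.util.BitSet provides.

```java
import java.util.Arrays;

// Hypothetical minimal bit set; a sketch of the idea, not the JDK class.
class TinyBitSet {
    private long[] words = new long[1];

    void set(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        int wordIndex = bitIndex >> 6; // bitIndex / 64
        if (wordIndex >= words.length) {
            // Grow to the larger of doubled length and required length.
            words = Arrays.copyOf(words, Math.max(2 * words.length, wordIndex + 1));
        }
        words[wordIndex] |= (1L << bitIndex); // turn the bit on
    }

    boolean get(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        int wordIndex = bitIndex >> 6;
        // Indices beyond the array were never set, so they read as false.
        return wordIndex < words.length
            && (words[wordIndex] & (1L << bitIndex)) != 0;
    }

    public static void main(String[] args) {
        TinyBitSet bits = new TinyBitSet();
        bits.set(3);
        bits.set(130);
        System.out.println(bits.get(3));    // true
        System.out.println(bits.get(130));  // true
        System.out.println(bits.get(4));    // false
        System.out.println(bits.get(5000)); // false
    }
}
```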

API reference

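Modifier and Type	Method and Description
`void`	`and(BitSet set)` Performs a logical AND of this target bit set with the argument bit set.
`void`	`andNot(BitSet set)` Clears all of the bits in this `BitSet` whose corresponding bit is set in the specified `BitSet`.
`int`	`cardinality()` Returns the number of bits set to `true` in this `BitSet`.
`void`	`clear()` Sets all of the bits in this `BitSet` to `false`.
`void`	`clear(int bitIndex)` Sets the bit specified by the index to `false`.
`void`	`clear(int fromIndex, int toIndex)` Sets the bits from the specified `fromIndex` (inclusive) to the specified `toIndex` (exclusive) to `false`.
`Object`	`clone()` Cloning this `BitSet` produces a new `BitSet` that is equal to it.
`boolean`	`equals(Object obj)` Compares this object against the specified object.
`void`	`flip(int bitIndex)` Sets the bit at the specified index to the complement of its current value.
`void`	`flip(int fromIndex, int toIndex)` Sets each bit from the specified `fromIndex` (inclusive) to the specified `toIndex` (exclusive) to the complement of its current value.
`boolean`	`get(int bitIndex)` Returns the value of the bit with the specified index.
`BitSet`	`get(int fromIndex, int toIndex)` Returns a new `BitSet` composed of bits from this `BitSet` from `fromIndex` (inclusive) to `toIndex` (exclusive).
`int`	`hashCode()` Returns the hash code value for this bit set.
`boolean`	`intersects(BitSet set)` Returns true if the specified `BitSet` has any bits set to `true` that are also set to `true` in this `BitSet`.
`boolean`	`isEmpty()` Returns true if this `BitSet` contains no bits that are set to `true`.
`int`	`length()` Returns the "logical size" of this `BitSet`: the index of the highest set bit in the `BitSet` plus one.
`int`	`nextClearBit(int fromIndex)` Returns the index of the first bit that is set to `false` that occurs on or after the specified starting index.
`int`	`nextSetBit(int fromIndex)` Returns the index of the first bit that is set to `true` that occurs on or after the specified starting index.
`void`	`or(BitSet set)` Performs a logical OR of this bit set with the bit set argument.
`int`	`previousClearBit(int fromIndex)` Returns the index of the nearest bit that is set to `false` that occurs on or before the specified starting index.
`int`	`previousSetBit(int fromIndex)` Returns the index of the nearest bit that is set to `true` that occurs on or before the specified starting index.
`void`	`set(int bitIndex)` Sets the bit at the specified index to `true`.
`void`	`set(int bitIndex, boolean value)` Sets the bit at the specified index to the specified value.
`void`	`set(int fromIndex, int toIndex)` Sets the bits from the specified `fromIndex` (inclusive) to the specified `toIndex` (exclusive) to `true`.
`void`	`set(int fromIndex, int toIndex, boolean value)` Sets the bits from the specified `fromIndex` (inclusive) to the specified `toIndex` (exclusive) to the specified value.
`int`	`size()` Returns the number of bits of space actually in use by this `BitSet` to represent bit values.
`IntStream`	`stream()` Returns a stream of indices for which this `BitSet` contains a bit in the set state.
`byte[]`	`toByteArray()` Returns a new byte array containing all the bits in this bit set.
`long[]`	`toLongArray()` Returns a new long array containing all the bits in this bit set.
`String`	`toString()` Returns a string representation of this bit set.
`static BitSet`	`valueOf(byte[] bytes)` Returns a new bit set containing all the bits in the given byte array.
`static BitSet`	`valueOf(ByteBuffer bb)` Returns a new bit set containing all the bits in the given byte buffer between its position and limit.
`static BitSet`	`valueOf(long[] longs)` Returns a new bit set containing all the bits in the given long array.
`static BitSet`	`valueOf(LongBuffer lb)` Returns a new bit set containing all the bits in the given long buffer between its position and limit.
`void`	`xor(BitSet set)` Performs a logical XOR of this bit set with the bit set argument.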
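A few of the methods in the table can be tried out with the real java.util.BitSet. The class name BitSetApiDemo and the helper of(int...) are hypothetical and exist only for this example; every BitSet call used is part of the standard API listed above.

```java
import java.util.BitSet;

// Hypothetical demo class exercising a few java.util.BitSet methods.
class BitSetApiDemo {
    // Helper (not part of the JDK): builds a BitSet with the given indices set.
    static BitSet of(int... indices) {
        BitSet bits = new BitSet();
        for (int index : indices) {
            bits.set(index);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet a = of(1, 5, 64);
        System.out.println(a);                 // {1, 5, 64}
        System.out.println(a.cardinality());   // 3
        System.out.println(a.length());        // 65
        System.out.println(a.nextSetBit(2));   // 5
        System.out.println(a.nextClearBit(1)); // 2

        BitSet b = of(5, 7);

        // and/or modify the receiver, so work on clones to keep a intact.
        BitSet both = (BitSet) a.clone();
        both.and(b);
        System.out.println(both);              // {5}

        BitSet either = (BitSet) a.clone();
        either.or(b);
        System.out.println(either);            // {1, 5, 7, 64}
    }
}
```

Note that length() reports the highest set index plus one, while cardinality() counts the set bits, so the two differ whenever the set has gaps.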