1. How BitSet stores its bits
private long[] words;
2. Main constructors and factory methods of BitSet
public BitSet(long[] longs){
words=Arrays.copyOf(longs,longs.length);
}
public static BitSet valueOf(byte[] bytes){
return BitSet.valueOf(ByteBuffer.wrap(bytes));
}
public static BitSet valueOf(ByteBuffer bb) {
bb = bb.slice().order(ByteOrder.LITTLE_ENDIAN);
int n;
// Skip trailing zero bytes (the high-order bytes in little-endian order)
for (n = bb.remaining(); n > 0 && bb.get(n - 1) == 0; n--)
;
long[] words = new long[(n + 7) / 8];
// Move the buffer limit so that only the significant bytes are read
bb.limit(n);
int i = 0;
while (bb.remaining() >= 8)
words[i++] = bb.getLong();
// If fewer than 8 bytes remain, shift each byte into place within the last word
for (int remaining = bb.remaining(), j = 0; j < remaining; j++)
words[i] |= (bb.get() & 0xffL) << (8 * j);
return new BitSet(words);
}
for (int remaining = bb.remaining(), j = 0; j < remaining; j++)
words[i] |= (bb.get() & 0xffL) << (8 * j);
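These two statements deserve a closer look, because they show how a byte is turned into a long. bb.get() & 0xffL widens the byte to a long and keeps only its low 8 bits. This is not equivalent to a plain cast: (long) b sign-extends, so a byte such as 0x80 would become 0xffffffffffffff80 and corrupt the higher bytes once it is OR-ed into the word. I benchmarked both forms and the plain cast was slightly faster, but only by about 1 to 3 ms, and since the cast gives a wrong result for negative bytes, the mask is there for correctness rather than readability. The converted value is then shifted left by 8 * j and OR-ed into the word. I also tried replacing 8 * j with j << 3, and my measurements got slower rather than faster. I do not have a firm explanation for this. The two forms should normally compile to the same shift, so the difference is more likely measurement noise than something specific to my CPU (a Pentium G4560).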
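The difference between masking and casting is easy to check in isolation. The following is a minimal sketch; the class name ByteToLongDemo and the helper packLittleEndian are made up for illustration and are not part of the JDK. The helper mirrors the tail loop of valueOf for an array of at most 8 bytes.

```java
final class ByteToLongDemo {
    // Hypothetical helper: packs up to 8 bytes into a long,
    // least significant byte first, like the tail loop of valueOf.
    static long packLittleEndian(byte[] bytes) {
        long word = 0L;
        for (int j = 0; j < bytes.length && j < 8; j++) {
            // Mask first so that negative bytes do not sign-extend.
            word |= (bytes[j] & 0xffL) << (8 * j);
        }
        return word;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x80;      // -128 as a signed byte
        long masked = b & 0xffL;   // only the low 8 bits survive
        long casted = (long) b;    // sign-extended to 64 bits

        System.out.println(Long.toHexString(masked)); // prints 80
        System.out.println(Long.toHexString(casted)); // prints ffffffffffffff80
        System.out.println(Long.toHexString(
                packLittleEndian(new byte[] {(byte) 0x80, 0x01}))); // prints 180
    }
}
```

With a plain cast in place of the mask, the second byte in the last example would be overwritten by the sign bits of the first.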
3. Bit operations
// Index of the word in the long array that holds the given bit
private static int wordIndex(int bitIndex) {
return bitIndex >> ADDRESS_BITS_PER_WORD; // the constant is 6, since 2^6 = 64 bits per long
}
public void set(int fromIndex, int toIndex) {
checkRange(fromIndex, toIndex);
if (fromIndex == toIndex)
return;
// Increase capacity if necessary
int startWordIndex = wordIndex(fromIndex);
int endWordIndex = wordIndex(toIndex - 1);
expandTo(endWordIndex);
//WORD_MASK=0xffffffffffffffffL
long firstWordMask = WORD_MASK << fromIndex;
long lastWordMask = WORD_MASK >>> -toIndex;
if (startWordIndex == endWordIndex) {
// fromIndex and toIndex fall within the same long
words[startWordIndex] |= (firstWordMask & lastWordMask);
} else {
// fromIndex and toIndex span more than one long
// Handle the first long
words[startWordIndex] |= firstWordMask;
// Handle intermediate words, if any
for (int i = startWordIndex+1; i < endWordIndex; i++)
words[i] = WORD_MASK;
// Handle last word (restores invariants)
words[endWordIndex] |= lastWordMask;
}
checkInvariants();
}
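This method sets every bit in the half-open range [fromIndex, toIndex) to 1. Because toIndex is exclusive, the last word is located with wordIndex(toIndex - 1). Within a single long the range can cover anything from 1 bit to all 64 bits: set(0, 64), for example, fills the entire first word.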
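The range semantics can be observed directly on java.util.BitSet. This is a small sketch; the class name SetRangeDemo and the helper firstWordAfterSet are hypothetical names chosen for this example.

```java
import java.util.BitSet;

final class SetRangeDemo {
    // Hypothetical helper: sets [from, to) on an empty BitSet
    // and returns the first backing word (0 if the set is empty).
    static long firstWordAfterSet(int from, int to) {
        BitSet bits = new BitSet();
        bits.set(from, to);
        long[] words = bits.toLongArray();
        return words.length == 0 ? 0L : words[0];
    }

    public static void main(String[] args) {
        // A word-aligned range fills all 64 bits of the first word.
        System.out.println(Long.toHexString(firstWordAfterSet(0, 64))); // prints ffffffffffffffff
        // Bits 3, 4, 5 and 6 are set; bit 7 is excluded.
        System.out.println(Long.toHexString(firstWordAfterSet(3, 7)));  // prints 78
    }
}
```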
long lastWordMask = WORD_MASK >>> -toIndex;
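This statement is unusual because the right-hand operand of >>> is negative. For a long, Java uses only the low 6 bits of the shift count, so >>> -i is the same as >>> (64 - i) when i is between 1 and 63. Applied to 0xffffffffffffffffL, the result has exactly i consecutive 1 bits counted from the lowest (rightmost) bit. When i is 0 or a multiple of 64, the effective shift count is 0 and the value is unchanged. For example, 0xffffffffffffffffL >>> -4 is 0x000...00FL.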
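A short sketch makes the behaviour of the negative shift count visible. The class name NegativeShiftDemo and the helper lowBits are hypothetical; only the shift expression itself comes from the JDK source.

```java
final class NegativeShiftDemo {
    // Hypothetical helper: the same expression BitSet uses for lastWordMask.
    static long lowBits(int i) {
        return 0xffffffffffffffffL >>> -i;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(lowBits(4)));  // prints f
        System.out.println(Long.toHexString(lowBits(64))); // prints ffffffffffffffff
        // -4 and 60 have the same low 6 bits, so the shifts are identical.
        System.out.println(lowBits(4) == (0xffffffffffffffffL >>> 60)); // prints true
    }
}
```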
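To put the two masks together: long firstWordMask = WORD_MASK << fromIndex yields a mask of the form 1111...000, with the 1 bits gathered at the high (left) end and the low (fromIndex mod 64) bits cleared, while long lastWordMask = WORD_MASK >>> -toIndex yields a mask of the form 000...1111, with the 1 bits gathered at the low (right) end. OR-ing the corresponding long with these masks, or with their intersection when the range lies inside one word, sets the bits to 1 (the original post shows this in a figure).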
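For a range that lies inside one word, the intersection of the two masks can be computed on its own. This is a sketch under that assumption (fromIndex < toIndex, both mapping to the same word); RangeMaskDemo and rangeMask are hypothetical names.

```java
final class RangeMaskDemo {
    static final long WORD_MASK = 0xffffffffffffffffL;

    // Hypothetical helper: mask with 1 bits exactly at [fromIndex, toIndex),
    // assuming both ends fall within the same 64-bit word.
    static long rangeMask(int fromIndex, int toIndex) {
        long firstWordMask = WORD_MASK << fromIndex;  // 1s from fromIndex upward
        long lastWordMask = WORD_MASK >>> -toIndex;   // 1s below toIndex
        return firstWordMask & lastWordMask;
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(rangeMask(3, 7))); // prints 1111000
    }
}
```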
public void clear(int fromIndex, int toIndex) {
checkRange(fromIndex, toIndex);
if (fromIndex == toIndex)
return;
int startWordIndex = wordIndex(fromIndex);
if (startWordIndex >= wordsInUse)
return;
int endWordIndex = wordIndex(toIndex - 1);
if (endWordIndex >= wordsInUse) {
toIndex = length();
endWordIndex = wordsInUse - 1;
}
long firstWordMask = WORD_MASK << fromIndex;
long lastWordMask = WORD_MASK >>> -toIndex;
if (startWordIndex == endWordIndex) {
// Case 1: One word
words[startWordIndex] &= ~(firstWordMask & lastWordMask);
} else {
// Case 2: Multiple words
// Handle first word
words[startWordIndex] &= ~firstWordMask;
// Handle intermediate words, if any
for (int i = startWordIndex+1; i < endWordIndex; i++)
words[i] = 0;
// Handle last word
words[endWordIndex] &= ~lastWordMask;
}
recalculateWordsInUse();
checkInvariants();
}
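This method clears a range of bits to 0. It works on the same principle as set, so I will not walk through it again. I experimented with the following variations of clear: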
public void clearJDK0(int from, int to) {
if (from >= to) {
// Empty range: nothing to clear
return;
}
int fromWordIndex = wordIndex(from);
// to is exclusive, so the last affected word is the one holding bit to - 1
int toWordIndex = wordIndex(to - 1);
// Invert the masks once up front to reduce the number of negations
long startWord = ~(MASK << from);
// When to is a multiple of 64, MASK >>> -to is all ones, so toWord is 0
long toWord = ~(MASK >>> -to);
// System.out.println("from:" + ByteUtil.toBinaryString(startWord));
if (fromWordIndex == toWordIndex) {
words[fromWordIndex] &= startWord | toWord;
} else {
words[fromWordIndex] &= startWord;
for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
// Intermediate words are cleared outright
words[i] = 0x0L;
}
words[toWordIndex] &= toWord;
}
}
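public void clear(int from, int to) {
if (from >= to) {
return;
}
int fromWordIndex = wordIndex(from);
// to is exclusive, so the last affected word is the one holding bit to - 1
int toWordIndex = wordIndex(to - 1);
// Variation: build the "keep" masks directly instead of negating.
// A shift by a multiple of 64 is a no-op, so those cases need an explicit 0.
long startWord = (from & 63) == 0 ? 0x0L : MASK >>> -from;
long toWord = (to & 63) == 0 ? 0x0L : MASK << to;
// System.out.println("from:" + ByteUtil.toBinaryString(startWord));
if (fromWordIndex == toWordIndex) {
words[fromWordIndex] &= startWord | toWord;
} else {
words[fromWordIndex] &= startWord;
for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
// Intermediate words are cleared outright
words[i] = 0x0L;
}
words[toWordIndex] &= toWord;
}
}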
A simplified version of the JDK operation:
public void clearJDK1(int from, int to) {
if (from >= to) {
return;
}
int fromWordIndex = wordIndex(from);
// to is exclusive, so the last affected word is the one holding bit to - 1
int toWordIndex = wordIndex(to - 1);
long startWord = MASK << from;
long toWord = MASK >>> -to;
// System.out.println("from:" + ByteUtil.toBinaryString(startWord));
if (fromWordIndex == toWordIndex) {
words[fromWordIndex] &= ~(startWord & toWord);
} else {
words[fromWordIndex] &= ~startWord;
for (int i = fromWordIndex + 1; i < toWordIndex; ++i) {
// Intermediate words are cleared outright
words[i] = 0x0L;
}
words[toWordIndex] &= ~toWord;
}
}
In my tests these variants differed by only 1 to 2 ms, so there is little to gain from optimizing here.
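Before comparing timings, it helps to confirm that a variant clears exactly the same bits as java.util.BitSet, especially for ranges that start or end on a 64-bit boundary. The following is one possible check, not the author's test code: the class ClearCheck, its static clear working on a bare long[], and matchesJdk are all hypothetical names, and the array is assumed to be large enough for the range.

```java
import java.util.Arrays;
import java.util.BitSet;

final class ClearCheck {
    static final long MASK = 0xffffffffffffffffL;

    // Range clear over [from, to) using the same masking scheme as the JDK.
    // Assumes 0 <= from <= to <= words.length * 64.
    static void clear(long[] words, int from, int to) {
        if (from >= to) return;
        int fromWordIndex = from >> 6;
        int toWordIndex = (to - 1) >> 6;
        long startWord = MASK << from;
        long toWord = MASK >>> -to;
        if (fromWordIndex == toWordIndex) {
            words[fromWordIndex] &= ~(startWord & toWord);
        } else {
            words[fromWordIndex] &= ~startWord;
            for (int i = fromWordIndex + 1; i < toWordIndex; i++) words[i] = 0L;
            words[toWordIndex] &= ~toWord;
        }
    }

    // True if the sketch and java.util.BitSet agree for this range
    // when starting from four fully set words.
    static boolean matchesJdk(int from, int to) {
        long[] words = new long[4];
        Arrays.fill(words, MASK);
        clear(words, from, to);

        BitSet expected = BitSet.valueOf(new long[] {MASK, MASK, MASK, MASK});
        expected.clear(from, to);
        return BitSet.valueOf(words).equals(expected);
    }

    public static void main(String[] args) {
        // Exhaustively compare every range within 256 bits.
        for (int from = 0; from <= 256; from++) {
            for (int to = from; to <= 256; to++) {
                if (!matchesJdk(from, to)) {
                    throw new AssertionError("mismatch at " + from + ".." + to);
                }
            }
        }
        System.out.println("all ranges match");
    }
}
```

The same harness can be pointed at any of the variants above by swapping the body of clear.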
public boolean get(int bitIndex) {
if (bitIndex < 0)
throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
checkInvariants();
int wordIndex = wordIndex(bitIndex);
return (wordIndex < wordsInUse)
&& ((words[wordIndex] & (1L << bitIndex)) != 0);
}
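This method returns the value of the bit at a given index. The only thing to understand is how a single bit is read: 1L << bitIndex uses just the low 6 bits of bitIndex, so it selects the correct position inside words[wordIndex].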
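The same lookup can be sketched on a bare long[] without the rest of the class. GetBitDemo and its static get are hypothetical names; the array length stands in for wordsInUse, and bitIndex is assumed to be non-negative.

```java
final class GetBitDemo {
    // Hypothetical helper: reads one bit from a long[] the way BitSet.get does.
    // Assumes bitIndex >= 0.
    static boolean get(long[] words, int bitIndex) {
        int wordIndex = bitIndex >> 6;  // which long holds the bit
        // 1L << bitIndex only uses bitIndex mod 64 as the shift count.
        return wordIndex < words.length
                && (words[wordIndex] & (1L << bitIndex)) != 0;
    }

    public static void main(String[] args) {
        long[] words = {0L, 0b100L};            // only bit 66 is set
        System.out.println(get(words, 66));     // prints true
        System.out.println(get(words, 2));      // prints false
        System.out.println(get(words, 200));    // prints false (beyond the array)
    }
}
```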
The code for my BitSet simulation is on GitHub: https://github.com/delin10/frame
The project layout is shown below.