In the article Soft Deletes (softDeletes), Part 2, we noted that Lucene 7.5.0 uses the following two containers to store soft-delete information and DocValues update information:
- Map<String,LinkedHashMap<Term,NumericDocValuesUpdate>> numericUpdates: DocValuesUpdatesNode
- Map<String,LinkedHashMap<Term,BinaryDocValuesUpdate>> binaryUpdates: DocValuesUpdatesNode
From Lucene 7.7.0 onward, a single container is used instead to optimize storage:
- final Map<String, FieldUpdatesBuffer> fieldUpdates = new HashMap<>();
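To make the 7.5.0 layout concrete, here is a minimal sketch in plain Java. It does not use Lucene classes: `Update` and `OldStyleBuffer` are hypothetical stand-ins, the term is modeled as a `String`, and the remove-then-put handling of a repeated term is a choice made for this sketch. It shows only the shape of the container: an outer map keyed by field, an inner `LinkedHashMap` keyed by update term, and one object allocated per buffered update.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a per-update object such as NumericDocValuesUpdate.
class Update {
    final String term;   // the update term, modeled as a plain String here
    final long value;    // the new doc-values value
    final int docUpTo;   // apply only to documents with a docID below this bound

    Update(String term, long value, int docUpTo) {
        this.term = term;
        this.value = value;
        this.docUpTo = docUpTo;
    }
}

// Hypothetical stand-in for the 7.5.0-style container: field -> (term -> update).
class OldStyleBuffer {
    final Map<String, LinkedHashMap<String, Update>> numericUpdates = new HashMap<>();

    void add(String field, Update update) {
        LinkedHashMap<String, Update> perField =
            numericUpdates.computeIfAbsent(field, f -> new LinkedHashMap<>());
        // Remove first so that a repeated term moves to the end of the insertion order.
        perField.remove(update.term);
        perField.put(update.term, update);
    }

    int size(String field) {
        LinkedHashMap<String, Update> perField = numericUpdates.get(field);
        return perField == null ? 0 : perField.size();
    }
}
```

In this layout every update costs at least one `Update` object plus one map entry, and updates on the same term are de-duplicated by the inner map.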
Why FieldUpdatesBuffer is used for storage
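Before describing how the two storage approaches differ, let us use the comment in the source code to explain why the finalized delete nodes are now stored in FieldUpdatesBuffer: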
This class efficiently buffers numeric and binary field updates and stores terms, values and metadata in a memory efficient way without creating large amounts of objects. Update terms are stored without de-duplicating the update term.
In general we try to optimize for several use-cases. For instance we try to use constant space for update terms field since the common case always updates on the same field. Also for docUpTo we try to optimize for the case when updates should be applied to all docs ie. docUpTo=Integer.MAX_VALUE. In other cases each update will likely have a different docUpTo.
Along the same lines this impl optimizes the case when all updates have a value. Lastly, if all updates share the same value for a numeric field we only store the value once.
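Two of the optimizations named in the comment can be sketched in a few lines of Java. The class below, `TinyUpdatesBuffer`, is hypothetical and is not Lucene's FieldUpdatesBuffer: it handles numeric updates on a single field, keeps terms in a plain list without de-duplication, and makes no attempt to reproduce how Lucene packs term bytes. What it does illustrate is storing docUpTo and the value only once for as long as every buffered update shares them, and expanding to one slot per update only when a different one arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified buffer for numeric updates on ONE field.
class TinyUpdatesBuffer {
    private final List<String> terms = new ArrayList<>(); // arrival order, no de-duplication
    private int[] docsUpTo;  // length 1 while every update shares one docUpTo
    private long[] values;   // length 1 while every update shares one value
    private int count = 0;

    void add(String term, long value, int docUpTo) {
        terms.add(term);
        if (count == 0) {
            docsUpTo = new int[] {docUpTo};
            values = new long[] {value};
        } else {
            docsUpTo = append(docsUpTo, count, docUpTo);
            values = append(values, count, value);
        }
        count++;
    }

    private static int[] append(int[] arr, int count, int v) {
        if (arr.length == 1 && arr[0] == v) {
            return arr; // still shared: constant space
        }
        int[] grown;
        if (arr.length == 1) {
            grown = new int[count + 1];
            Arrays.fill(grown, arr[0]); // materialize the shared value for earlier updates
        } else {
            grown = Arrays.copyOf(arr, count + 1); // grows by one slot, for brevity
        }
        grown[count] = v;
        return grown;
    }

    private static long[] append(long[] arr, int count, long v) {
        if (arr.length == 1 && arr[0] == v) {
            return arr;
        }
        long[] grown;
        if (arr.length == 1) {
            grown = new long[count + 1];
            Arrays.fill(grown, arr[0]);
        } else {
            grown = Arrays.copyOf(arr, count + 1);
        }
        grown[count] = v;
        return grown;
    }

    String term(int i) { return terms.get(i); }
    int docUpTo(int i) { return docsUpTo.length == 1 ? docsUpTo[0] : docsUpTo[i]; }
    long value(int i) { return values.length == 1 ? values[0] : values[i]; }
    int size() { return count; }
    int storedDocUpTos() { return docsUpTo == null ? 0 : docsUpTo.length; }
    int storedValues() { return values == null ? 0 : values.length; }
}
```

With this sketch, any number of updates that all use docUpTo=Integer.MAX_VALUE and the same value occupy a single `int` and a single `long`, regardless of how many terms were buffered.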
See the full article here:
https://www.amazingkoala.com.cn/Lucene/Index/2020/0629/151.html