Roaring Bitmaps
When you can’t decide if your data is dense or sparse
Filter Caching
• A filter either matches or does not match a document
• Due to immutable segments, we have an opportunity to cache frequent filters
Some points to keep in mind
Approach #1: Sorted List
Approach #2: Bitmaps
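To make the trade-off between the two approaches concrete, here is a minimal sketch of both representations. It assumes 32-bit document IDs and one bit per document in the segment; the function names are illustrative and not taken from any library.

```python
import bisect

def sorted_list_contains(doc_ids, doc_id):
    """Membership test by binary search over a sorted list of doc IDs."""
    i = bisect.bisect_left(doc_ids, doc_id)
    return i < len(doc_ids) and doc_ids[i] == doc_id

def build_bitmap(doc_ids, max_doc):
    """One bit per document in the segment, set when the filter matches."""
    bits = bytearray((max_doc + 7) // 8)
    for d in doc_ids:
        bits[d >> 3] |= 1 << (d & 7)
    return bits

def bitmap_contains(bits, doc_id):
    """Constant-time membership test against the bitmap."""
    return bool(bits[doc_id >> 3] & (1 << (doc_id & 7)))

def sorted_list_bytes(num_matches):
    # Assumption: each doc ID is stored as a 32-bit integer.
    return 4 * num_matches

def bitmap_bytes(max_doc):
    # Cost depends only on segment size, not on how many docs match.
    return (max_doc + 7) // 8

matches = [3, 17, 42, 99_999]
bits = build_bitmap(matches, max_doc=100_000)
```

Under these assumptions the sorted list is smaller only while fewer than roughly 1 in 32 documents match; beyond that the bitmap wins, which is the dense-versus-sparse dilemma in the subtitle.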
Alternative #3: Various Compressed Bitmaps
Alternative #3: Various RLE Compressed Bitmaps
Overview so far
Roaring Bitmaps
Partition into 2^16 chunks
Store containers in vector
Vector index == 16 most-significant bits
Fewer than 4096 values?
More than 61440 values?
Why 4096 cutoff?
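The cutoff can be checked with a little arithmetic. The sketch below assumes the usual Roaring layout, in which the upper 16 bits of a 32-bit value select the container and the lower 16 bits are stored inside it. Treating the 61440 threshold as "store the missing values instead" is an assumption here (61440 = 65536 - 4096), and the container names are illustrative.

```python
CHUNK_SIZE = 1 << 16              # 65536 possible values per chunk
ARRAY_CUTOFF = 4096               # cutoff named on the slide
BITMAP_BYTES = CHUNK_SIZE // 8    # 8192 bytes, whatever the cardinality

def split(value):
    """Split a 32-bit value into (container key, value within container)."""
    return value >> 16, value & 0xFFFF

def array_bytes(cardinality):
    """A sorted array stores one 16-bit short per value."""
    return 2 * cardinality

def choose_container(cardinality):
    """Pick the cheapest representation for one chunk."""
    if cardinality < ARRAY_CUTOFF:
        return "sorted array of shorts"
    if cardinality > CHUNK_SIZE - ARRAY_CUTOFF:   # more than 61440 values
        # Assumption: a nearly full chunk stores the values that are absent.
        return "inverted sorted array"
    return "bitmap"
```

At exactly 4096 values the array costs 2 * 4096 = 8192 bytes, the same as the fixed-size bitmap, so below the cutoff the array is always the smaller of the two.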
Memory Footprint
Simulated Annealing
Quickly finding “good enough” parameters
Moving averages
Variously weighted averages
Configurable parameters
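As a rough illustration of what "variously weighted" can mean, the sketch below shows three common weightings over a window of values. The function names and the `alpha` parameter are assumptions chosen for this example, not the names used by any particular implementation.

```python
def simple_avg(window):
    """Every value in the window has the same weight."""
    return sum(window) / len(window)

def linear_weighted_avg(window):
    """Weight grows linearly with recency: oldest gets 1, newest gets len(window)."""
    weights = range(1, len(window) + 1)
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)

def ewma(window, alpha):
    """Exponentially weighted average; alpha in (0, 1] controls how fast old values decay."""
    smoothed = window[0]
    for x in window[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed
```

The window length and `alpha` are exactly the kind of configurable parameters that have to be tuned per data set.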
Turns out, tuning parameters is hard
Black-box optimization
Because sometimes you just need a hammer
Simulated Annealing Process
“Random neighbor”
Mutate one of the parameters, leave the rest constant
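A minimal sketch of such a neighbor function follows. It assumes the parameters live in a dictionary and are each bounded to [0, 1]; the `step` size and the parameter names in the usage line are hypothetical.

```python
import random

def random_neighbor(params, step=0.1, rng=random):
    """Return a copy of params with exactly one randomly chosen entry perturbed."""
    neighbor = dict(params)                     # leave the input untouched
    name = rng.choice(sorted(neighbor))         # pick one parameter to mutate
    value = neighbor[name] + rng.uniform(-step, step)
    neighbor[name] = min(1.0, max(0.0, value))  # assumption: parameters lie in [0, 1]
    return neighbor

start = {"alpha": 0.5, "beta": 0.5, "gamma": 0.5}
candidate = random_neighbor(start, rng=random.Random(0))
```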
Simulated Annealing
Local Minima
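The reason annealing can climb out of a local minimum is that it sometimes accepts a worse candidate, with a probability that shrinks as the temperature drops. The loop below is a generic sketch of that idea, not the Elasticsearch implementation; the geometric cooling schedule, the default values, and the `bumpy` cost function are all assumptions made for illustration.

```python
import math
import random

def anneal(cost, start, neighbor, t_start=1.0, t_end=1e-3, cooling=0.95, rng=None):
    """Minimize cost() by simulated annealing; returns (best_state, best_cost)."""
    rng = rng or random.Random()
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling   # geometric cooling: fewer uphill moves over time
    return best, best_cost

def bumpy(x):
    """Hypothetical cost function with several local minima."""
    return x * x + 10 * math.sin(3 * x)

def step(x, rng):
    """Neighbor for the 1-D example: nudge x by a random amount."""
    return x + rng.uniform(-1.0, 1.0)

best_x, best_cost = anneal(bumpy, start=4.0, neighbor=step, rng=random.Random(42))
```

Early on, while the temperature is high, uphill moves are accepted fairly often; near the end the loop behaves almost like plain hill climbing.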
Simulated Annealing in Elasticsearch
T-Digest Percentiles
Calculating T-digest Percentiles
Inserting Values into T-digest
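The toy class below sketches the general shape of both steps: values are folded into the nearest centroid as long as that centroid stays under a size limit that is tight near the tails and loose near the median, and a percentile is read off by walking the cumulative counts. It is a heavily simplified illustration under stated assumptions (the `4 * N * q * (1 - q) / compression` limit, no re-compression pass, no interpolation), not the algorithm as implemented in any real library; `TinyDigest` is a made-up name.

```python
import bisect
import random

class TinyDigest:
    """Toy t-digest: centroids kept sorted by mean, each a [mean, count] pair."""

    def __init__(self, compression=100):
        self.compression = compression
        self.centroids = []
        self.total = 0

    def add(self, x):
        self.total += 1
        if not self.centroids:
            self.centroids.append([x, 1])
            return
        means = [c[0] for c in self.centroids]
        i = bisect.bisect_left(means, x)
        # The nearest centroid is one of the two neighbors of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.centroids)]
        j = min(candidates, key=lambda k: abs(self.centroids[k][0] - x))
        mean, count = self.centroids[j]
        # Approximate quantile at the middle of centroid j.
        before = sum(c[1] for c in self.centroids[:j])
        q = (before + count / 2) / self.total
        limit = 4 * self.total * q * (1 - q) / self.compression
        if count + 1 <= limit:
            # Fold x into the centroid and update its running mean.
            self.centroids[j] = [mean + (x - mean) / (count + 1), count + 1]
        else:
            # Centroid is full for its position: start a new singleton.
            self.centroids.insert(i, [x, 1])

    def quantile(self, q):
        """Crude estimate: mean of the centroid that contains the target rank."""
        target = q * self.total
        seen = 0
        for mean, count in self.centroids:
            if seen + count >= target:
                return mean
            seen += count
        return self.centroids[-1][0]

values = list(range(2000))
random.Random(7).shuffle(values)
digest = TinyDigest(compression=20)
for v in values:
    digest.add(v)
median_estimate = digest.quantile(0.5)
```

Because centroids near q = 0 and q = 1 are kept small, the extreme percentiles stay accurate while the middle of the distribution is summarized more coarsely.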
T-Digest Practical Notes