A Brief Look at Time Series Database Internals: How to Handle Hundreds of Millions of Writes on a Single Machine

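This article examines the challenges time series databases face, such as the data storage problems in Prometheus and InfluxDB, including the strengths and weaknesses of LSM Tree and BoltDB. It then presents solutions: adopting the Time Structured Merge Tree, a variant of the LSM Tree; using a WAL to optimize write performance; and saving storage space through data retention policies and downsampling.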
Version Date Notes
1.0 2021.10.19 Initial publication

0. Background

The title comes from InfluxDB's own account of the background behind its storage engine:

The workload of time series data is quite different from normal database workloads. There are a number of factors that conspire to make it very difficult to get it to scale and perform well:
- Billions of individual data points
- High write throughput
- High read throughput
- Large deletes to free up disk space
- Mostly an insert/append workload, very few updates

The first and most obvious problem is one of scale. In DevOps, for instance, you can collect hundreds of millions or billions of unique data points every day.

To prove out the numbers, let’s say we have 200 VMs or servers running, with each server collecting an average of 100 measurements every 10 seconds. Given there are 86,400 seconds in a day, a single measurement will generate 8,640 points in a day, per server
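
To make the scale in the quoted example concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `points_per_day` and its parameters are hypothetical and do not come from InfluxDB, and the sketch assumes every server samples every measurement at the same fixed interval.

```python
SECONDS_PER_DAY = 86_400


def points_per_day(servers: int, measurements_per_server: int, interval_seconds: int) -> int:
    """Total data points written per day, assuming a fixed sampling interval."""
    # One measurement sampled every `interval_seconds` yields this many points per day.
    samples_per_measurement = SECONDS_PER_DAY // interval_seconds
    return servers * measurements_per_server * samples_per_measurement


if __name__ == "__main__":
    # Figures from the quoted example: 200 servers, 100 measurements each, every 10 seconds.
    total = points_per_day(servers=200, measurements_per_server=100, interval_seconds=10)
    print(f"{total:,} points per day")
    print(f"{total / SECONDS_PER_DAY:,.0f} points per second on average")
```

With the quoted figures, each measurement contributes 8,640 points per day per server, so the fleet as a whole writes 200 × 100 × 8,640 = 172,800,000 points per day, or about 2,000 points per second sustained.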