Efficient SSD Caching by Avoiding Unnecessary Writes using Machine Learning

dc199706

Article link: https://blog.csdn.net/dc199706/article/details/103335154

keywords
authors and other information
abstract
introduction
background and motivations
- Tencent photo caching
application of machine learning to cache
- classifying algorithm
- feature extraction
design and implementation of intelligent caching
some details
one-time-access criteria
evaluation
personal study notes

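The paper uses machine learning to predict which images will be accessed only once and keeps them out of the cache, which improves efficiency and extends SSD lifetime.
In social network workloads, image files have poor temporal locality: the image being accessed now will, with high probability, never be accessed again. The frequent cache writes this causes wear out the hardware and hurt efficiency, so the paper proposes a way to predict the images that will be accessed only once and to prevent them from entering the cache.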

keywords

cache, decision tree, machine learning, one-time-access file, SSD

authors and other information

Hua Wang†, Xinbo Yi†, Ping Huang†‡, Bin Cheng§, and Ke Zhou†.
ICPP 2018, August 13–16, 2018, Eugene, OR, USA
© 2018 Association for Computing Machinery.
ACM ISBN 978-1-4503-6510-9/18/08.
https://doi.org/10.1145/3225058.3225126

abstract

Used as a cache, an SSD has a small capacity, so its write density is much higher than that of HDDs. Under social network workloads, quite a few writes on SSD are unnecessary; e.g., Tencent's photo caching shows that about 61% of total photos are accessed just once, yet they are still swapped in and out of the cache. The paper therefore sets out to predict such writes accurately. Unlike the state-of-the-art history-based predictions, the prediction here is non-history-oriented, which makes good accuracy challenging to achieve. Even so, compared with LRU, the hit rate improves by 24%, cache writes fall by 70%, and average access latency drops by 5%.

introduction

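In big-data environments, flash-based SSDs are deployed as a cache layer on top of HDD storage systems. Research on SSD buffers focuses mainly on two goals: lifetime extension and performance improvement.
A cache exists to hold the most frequently accessed data, so its write traffic is far heavier than the backend's and its lifetime is used up quickly. (The paper notes that many effective lifetime-extension methods already exist; this approach is complementary to them.)
The motivation is the observed "one-time access" phenomenon for images in Tencent QQ, and the goal is to keep such images out of the cache. The classifier is a decision tree.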
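
To make the idea of keeping one-time-access images out of the cache concrete, here is a minimal sketch of an LRU cache with an admission check. It is not the paper's implementation: the class `AdmissionLRU`, the `replay` helper, the `predict_one_time` callable, and the toy trace are all hypothetical names made up for illustration. In particular, the predictor below is an oracle that looks at the whole trace, standing in for the paper's decision tree, which has to decide without access history.

```python
from collections import Counter, OrderedDict


class AdmissionLRU:
    """LRU cache that asks a predictor before admitting a missed key."""

    def __init__(self, capacity, predict_one_time):
        self.capacity = capacity
        # Hypothetical callable key -> bool; True means "will not be reused".
        self.predict_one_time = predict_one_time
        self.store = OrderedDict()
        self.hits = 0
        self.writes = 0  # objects written into the cache (the SSD, in the paper)

    def access(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # refresh recency on a hit
            self.hits += 1
            return True
        # Miss: skip the cache write when the key is predicted one-time-access.
        if not self.predict_one_time(key):
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)  # evict the least recently used key
            self.store[key] = True
            self.writes += 1
        return False


def replay(trace, capacity, predict_one_time):
    """Run a trace through the cache and return (hits, writes)."""
    cache = AdmissionLRU(capacity, predict_one_time)
    for key in trace:
        cache.access(key)
    return cache.hits, cache.writes


# Toy trace (assumption): "a" and "b" are reused, x1..x4 are requested once each.
trace = ["a", "x1", "b", "x2", "a", "x3", "b", "x4", "a", "b"]
counts = Counter(trace)

plain = replay(trace, 2, lambda key: False)                 # admit everything
filtered = replay(trace, 2, lambda key: counts[key] == 1)   # oracle stand-in
print("plain LRU    (hits, writes):", plain)
print("filtered LRU (hits, writes):", filtered)
```

On this toy trace with capacity 2, plain LRU gets 0 hits with 10 cache writes, while the filtered cache gets 4 hits with 2 writes: the one-time keys no longer evict the reused ones. Real gains depend on how accurate the predictor is, which is the hard part the paper addresses.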