Introduction of Wear Leveling

By Gibson Ming-Dar

 Introduction of Wear Leveling - Embedded Computing Design

There are three types of Wear Leveling: Dynamic, Static and Global.

NAND Flash has a limited number of Program/Erase cycles, referred to as "P/E cycles." When a block's P/E cycles reach the maximum value, the block becomes unusable and must be replaced by a spare block. Once the spare blocks are used up, the NAND Flash device can no longer be used. Therefore, if only certain blocks are written and erased, the P/E cycles of those blocks are consumed rapidly, the spare blocks are exhausted quickly, and the NAND Flash fails early.

There are three types of wear leveling: dynamic, static, and global. Dynamic wear leveling handles only free space and ensures that writes go to the free blocks with the lowest erase counts. Static wear leveling considers the entire Flash die, including both blank areas and blocks that have already been written to. It moves data out of the blocks with the lowest erase counts into other blocks, so that the less-worn blocks are freed for future use. Finally, there is global wear leveling; its biggest difference from static wear leveling is that its scope extends to the entire device, while static wear leveling works only on a single Flash die. This ensures that writes go to the least-frequently-written blocks anywhere in the device.

NAND Flash controller vendors must take this characteristic of NAND into account during development. Wear leveling is therefore designed to spread use evenly across the blocks of the NAND Flash so that the P/E cycles of all blocks increase at roughly the same rate. This allows all blocks to be used fully before the product reaches the end of its life, extending the life of the NAND Flash. Wear leveling thus improves the reliability and durability of NAND Flash products.
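As a rough illustration of the dynamic policy described above, the following Python sketch simply directs each write to the free block with the lowest erase count. The Block class and pool layout are invented for illustration; a real controller keeps this bookkeeping in firmware data structures.

```python
# Minimal sketch of dynamic wear leveling: writes are always directed to the
# free block with the lowest erase count. Names and structure are illustrative.

class Block:
    def __init__(self, block_id, erase_count=0):
        self.block_id = block_id
        self.erase_count = erase_count

def pick_free_block(free_pool):
    """Return the free block with the smallest erase count."""
    return min(free_pool, key=lambda b: b.erase_count)

free_pool = [Block(0, 12), Block(1, 3), Block(2, 7)]
target = pick_free_block(free_pool)
print(f"next write goes to block {target.block_id} (erase count {target.erase_count})")
```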

2011 Rejuvenator: A Static Wear Leveling Algorithm for NAND Flash Memory with Minimized Overhead


In this paper, our focus is on the wear out problem. A wear leveling algorithm aims to even out the wearing of different blocks of the flash memory. A block is said to be worn out when it has been erased the maximum possible number of times. In this paper we define the lifetime of flash memory as the number of updates that can be executed before the first block is worn out. This is also called the first failure time [9]. The primary goal of any wear leveling algorithm is to increase the lifetime of flash memory by preventing any single block from reaching the 100K erasure cycle limit (we are assuming SLC flash). Our goal is to design an efficient wear leveling algorithm for flash memory.


even out: to make even, balanced, or equal

wear out: to become worn down or unusable through use

The data that is updated more frequently is defined as hot data, while the data that is relatively unchanged is defined as cold data. Optimizing the placement of hot and cold data in the flash memory assumes utmost importance given the limited number of erase cycles of a flash block. If hot data is being written repeatedly to certain blocks, then those blocks may wear out much faster than the blocks that store cold data. The existing approaches to wear leveling fall into two broad categories.


  1. Dynamic wear leveling: These algorithms achieve wear leveling by repeatedly reusing blocks with lower erase counts. However, these algorithms do not attempt to move cold data, which may remain forever in a few blocks. The blocks that store cold data wear out very slowly relative to other blocks, resulting in a high degree of unevenness in the distribution of wear across blocks. (In short: always use the block with the smallest erase count from the free pool.)
  2. Static wear leveling: In contrast to dynamic wear leveling algorithms, static wear leveling algorithms attempt to move cold data to more worn blocks, thereby spreading the wear more evenly. However, moving cold data around without any update requests incurs overhead (it increases write amplification and hurts performance). (In short: consider both the free pool and the data-block pool; find the blocks holding cold data in the data-block pool and migrate their valid data into the block with the largest erase count found in the free pool. A minimal sketch follows this list.)
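A minimal sketch of the static step described in item 2, assuming blocks are represented as plain dictionaries with an erase count and a data payload (all names are hypothetical; a real FTL folds this into its garbage-collection path):

```python
# Illustrative static wear-leveling step: migrate cold data out of the
# least-worn data block into the most-worn free block, then return the
# least-worn block to the free pool so it can absorb future (hot) writes.

def static_wear_level(data_pool, free_pool):
    cold_block = min(data_pool, key=lambda b: b["erase_count"])   # least-worn, holds cold data
    target = max(free_pool, key=lambda b: b["erase_count"])       # most-worn free block
    target["data"] = cold_block["data"]                           # migrate the valid data
    cold_block["data"] = None                                     # block can now be erased/reused
    data_pool.remove(cold_block); free_pool.append(cold_block)
    free_pool.remove(target); data_pool.append(target)
    return cold_block["id"], target["id"]

data_pool = [{"id": 0, "erase_count": 2, "data": "cold"}, {"id": 1, "erase_count": 9, "data": "hot"}]
free_pool = [{"id": 2, "erase_count": 11, "data": None}, {"id": 3, "erase_count": 5, "data": None}]
print(static_wear_level(data_pool, free_pool))   # -> (0, 2)
```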

Rejuvenator clusters the blocks into different groups based on their current erase counts. Rejuvenator places hot data in blocks in lower numbered clusters and cold data in blocks in the higher numbered clusters. The range of the clusters is restricted within a threshold value. This threshold value is adapted according to the erase counts of the blocks.
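The clustering idea can be sketched roughly as follows; the dictionary layout and the clamping of erase counts into [0, tau-1] are simplifications for illustration, not Rejuvenator's actual bookkeeping.

```python
# Sketch of grouping blocks into clusters by erase count so that hot data can
# target low-numbered clusters and cold data high-numbered ones.
from collections import defaultdict

def cluster_by_erase_count(erase_counts, min_ec, tau):
    """erase_counts: {block_id: erase_count}; tau: allowed spread of the window."""
    clusters = defaultdict(list)
    for block_id, ec in erase_counts.items():
        clusters[min(ec - min_ec, tau - 1)].append(block_id)   # clamp into [0, tau-1]
    return clusters

erase_counts = {0: 3, 1: 3, 2: 5, 3: 9}
print(dict(cluster_by_erase_count(erase_counts, min_ec=3, tau=5)))
# {0: [0, 1], 2: [2], 4: [3]}: cluster 0 is a hot-data candidate, cluster 4 a cold-data one
```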

The identification of hot and cold data is an integral part of Rejuvenator. We use a simple window based scheme with counters to determine which logical addresses are hot. The size of the window is fixed and it covers the logical addresses that were accessed in the recent past. At any point in time the logical addresses that have the highest counter values inside the window are considered hot. The hot data identification algorithm can be replaced by any sophisticated schemes that are available already [24], [25]. However in this paper we stick to the simple scheme.
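One possible reading of this window-based scheme is sketched below; the window size and the "top N counters are hot" cutoff are assumed parameters, not values from the paper.

```python
# Sketch of a window-based hot-data detector: keep a fixed-size window of the
# most recently accessed logical addresses plus a counter per address, and
# treat the addresses with the highest counts inside the window as hot.
from collections import Counter, deque

class HotDataWindow:
    def __init__(self, window_size=1024, top_n=32):
        self.window = deque(maxlen=window_size)   # recent logical addresses
        self.counts = Counter()
        self.top_n = top_n

    def access(self, lba):
        if len(self.window) == self.window.maxlen:
            evicted = self.window[0]              # about to slide out of the window
            self.counts[evicted] -= 1
            if self.counts[evicted] == 0:
                del self.counts[evicted]
        self.window.append(lba)
        self.counts[lba] += 1

    def is_hot(self, lba):
        hottest = {a for a, _ in self.counts.most_common(self.top_n)}
        return lba in hottest

w = HotDataWindow(window_size=8, top_n=2)
for lba in [5, 5, 5, 9, 9, 1, 2, 3]:
    w.access(lba)
print(w.is_hot(5), w.is_hot(3))   # True False
```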


FTL那些事(3)之GC/WL (ssdfans.com)

The main purpose of WL (wear leveling) is to prevent certain physical blocks of the NAND from being programmed and erased so frequently that their data retention degrades, which leads to large numbers of bit flips and eventually ECC errors or bad blocks; the data then becomes corrupted and unusable, which would be catastrophic. WL addresses exactly this problem: it keeps the erase counts of all blocks at roughly the same level so that their lifetimes advance evenly, protecting data on the principle that the device is only as good as its most worn block. There are two types of WL: dynamic WL and static WL. Simply put, dynamic WL picks the block with the fewest erases each time a block is needed, while static WL moves long-unmodified cold data out of lightly erased blocks into heavily erased blocks so that the lightly erased blocks can be reused. Dynamic WL generally happens at write-request time, while static WL runs during idle periods when a periodic check finds its trigger condition satisfied, and it levels wear across all blocks. Some other literature also describes global WL, which is mainly used in products with multiple NAND chips: if one chip wears faster, the other chips are used to balance the wear, and the fast-wearing chip is temporarily locked to avoid further frequent operations corrupting its data, until all chips reach an even level of wear.

From experience, the design considerations for GC are: (1) how to select candidate (victim) blocks, which usually takes into account the hot/cold data factors from the previous section, the WL factors described above, and the number of valid pages contained in each block; (2) when GC should happen: as active GC run by a background process during idle time, as passive GC triggered when a write request finds that the NAND flash has too little directly writable space, or by splitting the work of a passive GC into many fine-grained sub-tasks distributed across future writes so that no single write request takes too long; (3) how to choose the target block, which should also take WL into account, so that GC and WL can be executed as one whole: active GC pairs naturally with static WL and passive GC with dynamic WL. GC efficiency is computed as: GC efficiency = number of empty pages produced by GC / (number of pages migrated by GC × (write time + read time) + number of blocks erased by GC × erase time). GC efficiency therefore depends on the amount of valid data and on the block erase time; as SLC, MLC, TLC and 3D NAND have evolved, block sizes keep growing, and GC latency grows with them.
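To make the efficiency formula above concrete, here is a small worked example; the per-page read/write times and the block erase time are invented numbers, not vendor specifications.

```python
# Worked example of the GC-efficiency formula quoted above, with made-up
# timings (microseconds) purely for illustration.

def gc_efficiency(pages_freed, pages_migrated, blocks_erased,
                  t_read_us=50, t_write_us=600, t_erase_us=3000):
    cost = pages_migrated * (t_write_us + t_read_us) + blocks_erased * t_erase_us
    return pages_freed / cost   # freed pages per microsecond of GC work

# A victim block with 192 invalid pages and 64 valid pages left to migrate:
print(gc_efficiency(pages_freed=192, pages_migrated=64, blocks_erased=1))
```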

Any discussion of GC/WL has to touch on WA (Write Amplification) and OP (Over-Provisioning). WA is defined as the amount of data actually written to the flash divided by the amount of data written by the host. In other words, more data is actually written than the host supplied, because alongside the host data the device also writes the mapping data described earlier, FTL management data, data migrated by GC/WL, and so on; WA is thus an important parameter affecting performance. OP is defined as space reserved by the FTL that is not available to the host; this space can be used by GC/WL, for storing FTL management data, for bad block management, and so on. The benefit of OP is that the larger the OP, the smaller the WA and the less data GC has to migrate, which helps performance; but this is a trade-off, since users also want to use as much NAND capacity as possible. Other techniques that help GC/WL performance include using Trim, Secure Erase, separating hot and cold data, and using a cache to turn random writes into sequential ones. Blocks with severe bit flips also deserve extra attention and may need to be evacuated; this process is called Scrub. Readers who want to dig deeper can look up more material; for discussion, contact the original author Li Daxia (mailto:lishizelibin@163.com) or follow the WeChat account 大虾谈 (DaXiaTalking).
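A tiny illustration of the WA definition above, with arbitrary byte counts for the host payload and the extra mapping/GC/WL traffic:

```python
# Everything the controller writes besides the host payload (mapping updates,
# FTL metadata, GC/WL migrations, ...) inflates the ratio. Numbers are arbitrary.

def write_amplification(host_bytes, mapping_bytes, gc_migrated_bytes, wl_migrated_bytes):
    flash_bytes = host_bytes + mapping_bytes + gc_migrated_bytes + wl_migrated_bytes
    return flash_bytes / host_bytes

print(write_amplification(host_bytes=100 * 2**20,
                          mapping_bytes=2 * 2**20,
                          gc_migrated_bytes=30 * 2**20,
                          wl_migrated_bytes=8 * 2**20))   # -> 1.4
```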

Methods that have been studied include: (1) predicting future workload characteristics from past ones, using features such as block age, utilization, erase count and invalidation period to manage multiple candidate blocks; (2) using the hot/cold/warm data identification methods described earlier, which greatly improves GC efficiency; (3) using a free-page replenishment mechanism to make GC arrive later; a similar idea is LazyRTGC, which postpones GC as long as possible and is introduced in detail below; (4) performing partial GC, with GFTL and RFTL (a distributed partial GC policy) as representatives, which run partial GC periodically to uncover usable blocks. There are other methods as well; the following sections describe representative GC/WL techniques. Before that, three classic GC merge concepts deserve an introduction: full merge, partial merge and switch merge. As the figure in the original post shows (not reproduced here), switch merge is clearly the most efficient of the three.

Let us first look at a WL method similar to Link-GC; many mature static WL methods resemble it. Blocks are divided into different clusters according to a window bounded by the global maximum and minimum EC (erase count): hot data goes into clusters with low EC values and cold data into clusters with high EC values. Most WL algorithms revolve around this hot/cold data split, and WL migrations are usually carried out through GC operations. As the figure in the original post shows (not reproduced here), the sliding window consists of t lists in total, and the goal is to keep every block's EC inside this window. In more advanced implementations the value of t is chosen adaptively, but the principle is that t decreases as the average EC increases, either linearly (for example, by 10% at a time) or non-linearly (decreasing slowly while EC is small and faster as EC grows). The value m reflects how many blocks are used to hold hot data; it is initialized to 50% of t and then increased or decreased according to the workload. As the figure indicates, if erase counts are accumulating at the min_ec + t - 1 end, m is decreased; if they are accumulating at the min_ec end, m is increased. Adapting m reduces future data-migration work. On top of this, physical blocks can also be mapped to virtual blocks, and the access pattern of the virtual blocks can be used to choose a suitable physical block to map to.
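As a hedged sketch of the adaptive-window idea, the snippet below shrinks t by 10% each time the mean erase count advances by another fixed step; the step size and the floor value are assumptions made for the example, not figures from the original post.

```python
# Illustrative adaptation of the wear-leveling window size t: shrink t by 10%
# each time the mean erase count crosses another fixed step, never dropping
# below a floor.

def adapt_window(t, mean_ec, prev_mean_ec, step=100, floor=4):
    if mean_ec // step > prev_mean_ec // step:   # mean EC crossed another step boundary
        t = max(floor, int(t * 0.9))
    return t

t = 64
for mean_ec in range(50, 1001, 50):
    t = adapt_window(t, mean_ec, mean_ec - 50)
print("window size after a mean of 1000 erases:", t)   # shrinks from 64 toward the floor
```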

Most static WL algorithms today are similar to the approach above; some use different mechanisms, such as hashing to even out the distribution, but the concepts are much the same. The first figure in the original post (not reproduced here) plots the amount of user data written (x axis) against the mean EC (y axis): after static WL has run for a while, the mean EC stays around a certain level, and how quickly the mean rises depends on the window size, with a small window on the left of the figure and a large window on the right. The second figure plots user data written (x axis) against wear count (y axis); lines of different colors correspond to different WL window sizes, some rising fast early and others rising fast late. In practice we prefer the curve to flatten out in the later stage (that is, at higher P/E cycles); the two rightmost lines correspond to the largest windows, while the curve that flattens first corresponds to the smallest window. Choosing a suitable window size by weighing both the mean-EC and wear-count trends is therefore quite important for a static WL algorithm.


II. BACKGROUND AND RELATED WORK

As mentioned above, the existing wear leveling algorithms fall into two broad categories - static and dynamic. Dynamic wear leveling algorithms are used due to their simplicity of management. Blocks with lower erase counts are used to store hot data. L.P. Chang et al. [10] propose the use of an adaptive striping architecture for flash memory with multiple banks. Their wear leveling scheme allocates hot data to the banks that have the lowest erase counts. However, as mentioned earlier, cold data remains in a few blocks and becomes stale. This contributes to a higher variance in the erase counts of the blocks. We do not discuss dynamic wear leveling algorithms further, since they obviously do a very poor job of leveling the wear.

TrueFFS [11] wear leveling mechanism maps a virtual erase unit to a chain of physical erase units. When there are no free physical units left in the free pool, folding occurs where the mapping of each virtual erase unit is changed from a chain of physical units to one physical unit. The valid data in the chain is copied to a single physical unit and the remaining physical units in the chain are freed. This guarantees a uniform distribution of erase counts for blocks storing dynamic data. Static wear leveling is done on a periodic basis and virtual units are folded in a round robin fashion. This mechanism is not adaptive and still has a high variance in erase counts depending on the frequency in which the static wear leveling is done. An alternative to the periodic static data migration is to swap the data in the most worn block and the least worn block [12]. JFFS [13] and STMicroelectronics [14] use very similar techniques for wear leveling.
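A minimal sketch of the folding operation described for TrueFFS, assuming each virtual erase unit maps to a chain of physical units represented as dictionaries of pages (the data layout is invented for illustration):

```python
# Hypothetical illustration of TrueFFS-style folding: the valid pages spread
# across a chain of physical erase units are copied into a single unit, the
# virtual unit is remapped to it, and the rest of the chain is freed.

def fold(virtual_unit, chain, free_units):
    """chain: list of dicts {'pages': {page_no: data_or_None}}; returns the new mapping."""
    target = free_units.pop()
    target["pages"] = {}
    for unit in chain:
        for page_no, data in unit["pages"].items():
            if data is not None:                  # copy only valid pages
                target["pages"][page_no] = data
        unit["pages"] = {}
        free_units.append(unit)                   # old unit goes back to the free pool
    return {virtual_unit: target}                 # virtual unit now maps to one physical unit

chain = [{"pages": {0: "a", 1: None}}, {"pages": {2: "b"}}]
free_units = [{"pages": {}}]
print(fold("VU0", chain, free_units))   # VU0 now maps to a unit holding pages {0: 'a', 2: 'b'}
```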

Chang et al. [9] propose a static wear leveling algorithm in which a Bit Erase Table (BET) is maintained as an array of bits, where each bit corresponds to 2^k contiguous blocks. Whenever a block is erased, the corresponding bit is set. Static wear leveling is invoked when the ratio of the total erase count of all blocks to the total number of bits set in the BET rises above a threshold. This algorithm may still cause more cold data migrations than necessary, depending on the number of blocks covered by each set of 2^k contiguous blocks. The choice of the value of k heavily influences the performance of the algorithm. If the value of k is small, the BET is very large. However, if the value of k is larger, the expensive work of moving cold data is done more often than necessary.
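A rough sketch of the BET mechanism as described above; the reset policy after static wear leveling runs, and the choices of k and threshold, are left out or assumed here.

```python
# Sketch of a Bit Erase Table: one bit per group of 2**k contiguous blocks,
# set whenever any block in the group is erased; static wear leveling is
# triggered when total erases / number of set bits exceeds a threshold.

class BitEraseTable:
    def __init__(self, num_blocks, k, threshold):
        self.k = k
        self.bits = [0] * ((num_blocks + (1 << k) - 1) >> k)   # one flag per 2**k blocks
        self.total_erases = 0
        self.threshold = threshold

    def on_erase(self, block_id):
        self.bits[block_id >> self.k] = 1    # mark the group containing this block
        self.total_erases += 1

    def should_run_static_wl(self):
        set_bits = sum(self.bits)
        return set_bits > 0 and self.total_erases / set_bits > self.threshold

bet = BitEraseTable(num_blocks=1024, k=3, threshold=16)
for _ in range(200):
    bet.on_erase(42)                         # hammering one block sets only one bit
print(bet.should_run_static_wl())            # True: 200 erases / 1 set bit > 16
```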

The cleaning efficiency of a block is high if it has lesser number of valid pages. Agrawal et al. [15] propose a wear leveling algorithm which tries to balance the tradeoff between cleaning efficiency and the efficiency of wear-leveling. The recycling of hot blocks is not completely stopped. Instead the probability of restricting the recycling of a block is progressively increased as the erase count of the block is nearing the maximum erase count limit. Blocks with larger erase counts are recycled with lesser probability. Thereby the wear leveling efficiency and cleaning efficiency are optimized. Static wear leveling is performed by storing cold data in the more worn blocks and making the least worn blocks available for new updates. The cold data migration adds 4.7% to the average I/O operational latency.
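The probabilistic throttling can be sketched as below; the linear ramp from 0 to 1 is an assumption made for illustration, not the exact function used by Agrawal et al. [15].

```python
# The closer a block's erase count gets to the limit, the more likely we are
# to skip recycling it on this round.
import random

def allow_recycle(erase_count, max_erase_count=100_000):
    """Probability of restricting recycling grows as the block nears the limit."""
    p_restrict = erase_count / max_erase_count
    return random.random() > p_restrict
```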

The dual pool algorithm proposed by L.P. Chang [16] maintains two pools of blocks - hot and cold. The blocks are initially assigned to the hot and cold pools randomly. Then, as updates are done, the pool associations become stable: blocks that store hot data become associated with the hot pool and blocks that store cold data with the cold pool. If some block in the hot pool is erased beyond a certain threshold, its contents are swapped with those of the least worn block in the cold pool. The algorithm takes a long time for the pool associations of blocks to become stable, and there can be many data migrations before the blocks are correctly associated with the appropriate pools. Also, the dual pool algorithm does not explicitly consider cleaning efficiency. This can result in an increased number of valid pages being copied from one block to another.
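A simplified sketch of the dual-pool swap rule; the dictionary-based pools and the single threshold check are illustrative, whereas the real algorithm maintains several sorted queues per pool.

```python
# When the most-worn hot-pool block exceeds the least-worn cold-pool block by
# more than a threshold, swap their contents and their pool membership.

def maybe_swap(hot_pool, cold_pool, threshold):
    hot_max = max(hot_pool, key=lambda b: b["erase_count"])
    cold_min = min(cold_pool, key=lambda b: b["erase_count"])
    if hot_max["erase_count"] - cold_min["erase_count"] > threshold:
        hot_max["data"], cold_min["data"] = cold_min["data"], hot_max["data"]
        hot_pool.remove(hot_max); cold_pool.append(hot_max)
        cold_pool.remove(cold_min); hot_pool.append(cold_min)
        return True
    return False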

Besides wear leveling, other mechanisms like garbage collection and mapping of logical to physical blocks also affect the performance and lifetime of the flash memory. Many works have been proposed for efficient garbage collection in flash memory [17], [18], [19]. The mapping of logical to physical memory can be at a fine granularity at the page level or at a coarse granularity at the block level. The mapping tables are generally maintained in the RAM. The page level mapping technique consumes enormous memory since it contains mapping information about every page. Lee et al. [20] propose the use of a hybrid mapping scheme to get the performance benefits of page level mapping and space efficiency of block level mapping. Lee et al. [21] and Kang et al. [22] also propose similar hybrid mapping schemes that utilize both page and block level mapping. All the hybrid mapping schemes use a set of log blocks to capture the updates and then write them to the corresponding data blocks. The log blocks are page mapped while data blocks are block mapped. Gupta et al. propose a demand based page level mapping scheme called DFTL [23]. DFTL caches a portion of the page mapping table in RAM and the rest of the page mapping table is stored in the flash memory itself. This reduces the memory requirements for the page mapping table. 

REFERENCES

[1] M. Sanvido, F. Chu, A. Kulkarni, and R. Selinger, “NAND Flash Memory and Its Role in Storage Architectures,” in Proceedings of the IEEE, vol. 96. IEEE, 2008, pp. 1864–1874.

[2] E. Gal and S. Toledo, “Algorithms and data structures for flash memories,” ACM Comput. Surv., vol. 37, no. 2, pp. 138–163, 2005.

[3] S. Hong and D. Shin, “NAND Flash-Based Disk Cache Using SLC/MLC Combined Flash Memory,” in Proceedings of the 2010 International Workshop on Storage Network Architecture and Parallel I/Os, ser. SNAPI ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 21–30.

[4] T. Kgil, D. Roberts, and T. Mudge, “Improving nand flash based disk caches,” in Proceedings of the 35th Annual International Symposium on Computer Architecture, ser. ISCA ’08, 2008, pp. 327–338.

[5] X.-Y. Hu, E. Eleftheriou, R. Haas, I. Iliadis, and R. Pletka, “Write amplification analysis in flash-based solid state drives,” in Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference. New York, NY, USA: ACM, 2009, pp. 10:1–10:9.

[6] “FusionIO ioDrive specification sheet,” http://www.fusionio.com/PDFs/Fusion%20Specsheet.pdf.

[7] “Intel X25-E SATA solid state drive,” http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-datasheet.pdf.

[8] W. K. Josephson, L. A. Bongo, D. Flynn, and K. Li, “DFS: A File System for Virtualized Flash Storage,” in FAST, 2010, pp. 85–100.

[9] Y.-H. Chang, J.-W. Hsieh, and T.-W. Kuo, “Endurance enhancement of flash-memory storage systems: an efficient static wear leveling design,” in DAC ’07: Proceedings of the 44th annual Design Automation Conference. New York, NY, USA: ACM, 2007, pp. 212–217.

[10] L.-P. Chang and T.-W. Kuo, “An Adaptive Striping Architecture for Flash Memory Storage Systems of Embedded Systems,” in RTAS ’02: Proceedings of the Eighth IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS’02). Washington, DC, USA: IEEE Computer Society, 2002.

[11] D. Shmidt, “Technical Note: TrueFFS wear leveling mechanism,” Technical Report, Msystems, 2002.

[12] D. Jung, Y.-H. Chae, H. Jo, J.-S. Kim, and J. Lee, “A group-based wear-leveling algorithm for large-capacity flash memory storage systems,” in Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, ser. CASES ’07. New York, NY, USA: ACM, 2007, pp. 160–164.

[13] D. Woodhouse, “JFFS: The Journalling Flash File System,” Proceedings of Ottawa Linux Symposium, 2001.

[14] “Wear Leveling in Single Level Cell NAND Flash Memories,” STMicroelectronics Application Note (AN1822), 2006.

[15] N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse, and R. Panigrahy, “Design tradeoffs for SSD performance,” in ATC’08: USENIX 2008 Annual Technical Conference on Annual Technical Conference. Berkeley, CA, USA: USENIX Association, 2008, pp. 57–70.

[16] L.-P. Chang, “On efficient wear leveling for large-scale flash-memory storage systems,” in SAC ’07: Proceedings of the 2007 ACM symposium on Applied computing. New York, NY, USA: ACM, 2007, pp. 1126–1130.

[17] O. Kwon and K. Koh, “Swap-Aware Garbage Collection for NAND Flash Memory Based Embedded Systems,” in CIT ’07: Proceedings of the 7th IEEE International Conference on Computer and Information Technology. Washington, DC, USA: IEEE Computer Society, 2007, pp. 787–792.

[18] L.-P. Chang, T.-W. Kuo, and S.-W. Lo, “Real-time garbage collection for flash-memory storage systems of real-time embedded systems,” ACM Trans. Embed. Comput. Syst., vol. 3, no. 4, 2004.

[19] Y. Du, M. Cai, and J. Dong, “Adaptive Garbage Collection Mechanism for N-log Block Flash Memory Storage Systems,” in ICAT ’06: Proceedings of the 16th International Conference on Artificial Reality and Telexistence–Workshops. Washington, DC, USA: IEEE Computer Society, 2006.

[20] S.-W. Lee, D.-J. Park, T.-S. Chung, D.-H. Lee, S. Park, and H.-J. Song, “A log buffer-based flash translation layer using fully-associative sector translation,” ACM Trans. Embed. Comput. Syst., vol. 6, no. 3, 2007.

[21] S. Lee, D. Shin, Y.-J. Kim, and J. Kim, “LAST: locality-aware sector translation for NAND flash memory-based storage systems,” SIGOPS Oper. Syst. Rev., vol. 42, no. 6, pp. 36–42, 2008.

[22] J.-U. Kang, H. Jo, J.-S. Kim, and J. Lee, “A superblock-based flash translation layer for NAND flash memory,” in EMSOFT ’06: Proceedings of the 6th ACM & IEEE International conference on Embedded software. New York, NY, USA: ACM, 2006, pp. 161–170.

[23] A. Gupta, Y. Kim, and B. Urgaonkar, “DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings,” in Proceeding of the 14th international conference on Architectural support for programming languages and operating systems, ser. ASPLOS ’09. New York, NY, USA: ACM, 2009.

[24] J.-W. Hsieh, T.-W. Kuo, and L.-P. Chang, “Efficient identification of hot data for flash memory storage systems,” Trans. Storage, vol. 2, pp. 22–40, February 2006.

[25] M.-L. Chiang, P. C. H. Lee, and R.-C. Chang, “Using data clustering to improve cleaning performance for flash memory,” Softw. Pract. Exper., vol. 29, no. 3, pp. 267–290, 1999.

[26] S.-W. Lee, D.-J. Park, T.-S. Chung, D.-H. Lee, S. Park, and H.-J. Song, “A log buffer-based flash translation layer using fully-associative sector translation,” ACM Trans. Embed. Comput. Syst., vol. 6, July 2007.

[27] “University of Massachusetts Amherst Storage Traces,” http://traces.cs.umass.edu/index.php/Storage/Storage.

[28] S. Kavalanekar, B. L. Worthington, Q. Zhang, and V. Sharda, “Characterization of storage workload traces from production windows servers,” in IISWC, 2008, pp. 119–128.

[29] “HP Labs - Tools and Traces,” http://tesla.hpl.hp.com/public_software/.


 

Understanding and Exploiting the Full Potential of SSD Address Remapping (uta.edu)

Garbage Collection Techniques for Flash-Resident Page-Mapping FTLs  1504.01666.pdf (arxiv.org)

3.1 Abundant RAM

Assuming RAM is plentiful enough to store the entire page-mapping, let us survey a few techniques for performing victim-selection and live-page-identification.

3.1.1 Victim Selection

Greedy: A simple method for victim selection is to maintain a mapping from block id to the number of valid pages in each block. To select a GC victim, we scan this mapping and choose the block with the least number of live pages. To maintain this mapping, we must know the physical address of the before-image of every update. We use the address of the before image to decrement the counter for the block in which the before-image resides.

LRU: A different technique for victim-selection is least recently used (LRU), which selects the block that was erased the longest time ago. The rationale is that this block is likely to contain the least number of live pages. Typically, this requires maintaining a queue of blocks. A block is inserted into the queue when it is written, and a victim is selected by popping the queue. The issue with the LRU scheme is that it may involve more migrations than the greedy scheme since we have no guarantee we have actually chosen the block with the least number of live pages.

Window-greedy: A compromise between the LRU and greedy policies is window-greedy. It implements a block queue like the LRU policy and applies the greedy policy only to the front X blocks in the queue. This allows avoiding the potentially CPU-expensive scan of the greedy algorithm, and increases the chance of finding a block with few live pages relative to the LRU policy. Note that some methods also choose a victim based on age [8]. Such methods essentially integrate the wear-levelling and garbage-collection schemes. In this work we just concentrate on garbage-collection. Some works also separate pages into groups based on update frequency and perform garbage collection independently within each group [18]. This is also outside of the current scope.
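The three victim-selection policies above can be sketched side by side as follows; valid_pages and write_queue are assumed bookkeeping structures (block id to live-page count, and a FIFO of written blocks), not structures defined in the paper.

```python
# Sketches of greedy, LRU and window-greedy victim selection.
from collections import deque

def pick_greedy(valid_pages):
    """valid_pages: {block_id: live page count}; scan for the global minimum."""
    return min(valid_pages, key=valid_pages.get)

def pick_lru(write_queue):
    """write_queue: deque of block ids in write order; take the oldest."""
    return write_queue.popleft()

def pick_window_greedy(write_queue, valid_pages, window=8):
    """Apply the greedy rule only to the front `window` blocks of the queue."""
    candidates = list(write_queue)[:window]
    victim = min(candidates, key=lambda b: valid_pages[b])
    write_queue.remove(victim)
    return victim

valid_pages = {10: 5, 11: 0, 12: 3}
print(pick_greedy(valid_pages))                               # -> 11
print(pick_window_greedy(deque([10, 11, 12]), valid_pages))   # -> 11 (within the front window)
```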

3.1.2 Live Page Identification

Once a victim has been chosen, we need to check which pages in it are still valid. Three techniques are possible.

page-mapping scan: We can scan the entire page-mapping to find all live pages that are on the target block. However, this scan may become a CPU bottleneck.

page-validity-bitmap (PVB): A less CPU-intensive alternative is to use a page-validity bitmap, which tracks which pages in the SSD are valid and which are invalid. Pages are clustered based on which block they are on. To maintain this map, we must know the physical location of the before-image of each write so that we can clear the corresponding bit in the bitmap. Note that if we use the greedy policy for victim selection, the PVB can be used to keep track of the number of live pages in a block by taking the Hamming weight of the bits associated with a given block.
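A possible shape of such a bitmap is sketched below; the byte-per-page representation is a simplification (a real implementation would pack bits), and the method names are hypothetical.

```python
# Page-validity bitmap: one flag per physical page, grouped by block. The
# Hamming weight of a block's slice gives its live-page count for greedy GC.

class PageValidityBitmap:
    def __init__(self, num_blocks, pages_per_block):
        self.ppb = pages_per_block
        self.valid = bytearray(num_blocks * pages_per_block)   # 1 = live, 0 = invalid/free

    def on_write(self, new_ppa, before_image_ppa=None):
        if before_image_ppa is not None:
            self.valid[before_image_ppa] = 0      # the before-image is now stale
        self.valid[new_ppa] = 1

    def live_pages(self, block_id):
        start = block_id * self.ppb
        return sum(self.valid[start:start + self.ppb])   # Hamming weight of the block's flags

pvb = PageValidityBitmap(num_blocks=4, pages_per_block=8)
pvb.on_write(0)                      # first write of a logical page
pvb.on_write(9, before_image_ppa=0)  # update: new copy on block 1, old copy invalidated
print(pvb.live_pages(0), pvb.live_pages(1))   # 0 1
```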

flash-reverse-mapping: Yet another alternative is to store a flash-reverse-mapping in the out-of-band area of each block. This mapping indicates which logical pages are written in each of the physical pages on the block. It is updated when the block is written. In order to identify which pages are valid, we read this map before starting a GC operation. We look up each logical address in the page-mapping table and check whether the physical address still corresponds to the block we are targeting. If so, the page is valid. Note that the above three techniques assume that the page-mapping is in RAM. In the next section, we examine the problems that arise when most of the page-mapping is stored in flash.
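A sketch of this check, assuming the per-block reverse map and the forward page-mapping are available as plain dictionaries keyed by (block, page offset) and logical page number respectively:

```python
# A physical page is live only if the forward page-mapping still points back
# at the exact (block, page offset) recorded in the block's reverse map.

def live_pages_via_reverse_map(block_id, reverse_map, page_mapping, pages_per_block):
    live = []
    for page_off in range(pages_per_block):
        lpn = reverse_map.get((block_id, page_off))            # logical page recorded for this slot
        if lpn is not None and page_mapping.get(lpn) == (block_id, page_off):
            live.append((lpn, page_off))
    return live

reverse_map = {(7, 0): 100, (7, 1): 101}       # block 7 once held logical pages 100 and 101
page_mapping = {100: (7, 0), 101: (9, 3)}      # 101 has since been rewritten elsewhere
print(live_pages_via_reverse_map(7, reverse_map, page_mapping, pages_per_block=2))
# [(100, 0)]: only logical page 100 is still live on block 7
```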

Garbage Collection in Single-Level Cell NAND Flash Memory (micron.com)

Taking Garbage Collection Overheads Off the Critical Path in SSDs (hal.science)

Introduction to NAND flash memory (snu.ac.kr)

Observation and Optimization on Garbage Collection of Flash Memories: The View in Performance Cliff (nih.gov)


Remap-SSD: Safely and Efficiently Exploiting SSD Address Remapping to Eliminate Duplicate Writes   fast21-zhou.pdf (usenix.org)
