Thursday, March 10, 2016

Hard Power Off, Solid State Disks, and Flash Memory: The Story Continues

Two more articles have come to my attention on the reliability of flash-based mass storage devices, from two organizations that should have some experience with such devices under their collective belts: Facebook and Google.

From Carnegie Mellon University and Facebook, Inc.:

J. Meza, Q. Wu, S. Kumar, O. Mutlu, A Large-Scale Study of Flash Memory Failures in the Field, ACM SIGMETRICS '15, June 15-19, 2015, Portland, OR, USA

"Sparse data layout across an SSD's physical address space (e.g., non-contiguously allocated data) leads to high SSD failure rates; dense data layout (e.g., contiguous data) can also negatively impact reliability under certain conditions, likely due to adversarial access patterns."

From University of Toronto and Google, Inc.:

B. Schroeder, R. Lagisetty, A. Merchant, Flash Reliability in Production: The Expected and the Unexpected, USENIX FAST '16, February 22-25, 2016, Santa Clara, CA, USA

"In summary, we find that the flash drives in our study experience significantly lower replacement rates (within their rated lifetime) than hard disk drives. On the downside, they experience significantly higher rates of uncorrectable errors than hard disk drives."

Nearly five years after writing the first of the articles cited below regarding my own experience with incorporating SSDs and other flash-based mass storage in products, I continue to find challenges with their application. Not that I don't use them myself; semiconductor-based mass storage has replaced magnetic media in virtually all of the more recent systems that I deal with on a regular basis. But I find that the design practices of both hardware and software architects still haven't caught up with the implications of their use.


C. Overclock, Data Remanence and Solid State Drives, 2011-06-20

C. Overclock, The Death of Hard Power Off, 2012-07-17

C. Overclock, Hard Power Off Is Dead But Not Buried, 2013-03-04
