{"id":1928,"date":"2017-02-04T22:45:12","date_gmt":"2017-02-05T03:45:12","guid":{"rendered":"http:\/\/swildow.darktech.org\/wp\/?p=1928"},"modified":"2017-09-05T10:00:34","modified_gmt":"2017-09-05T15:00:34","slug":"why-raid-6-stops-working-in-2019","status":"publish","type":"post","link":"https:\/\/www.wildow.com\/blog\/?p=1928","title":{"rendered":"Why RAID 6 stops working in 2019"},"content":{"rendered":"<div class=\"topContent container\">\n<div class=\"row\">\n<div class=\"row\">\n<div class=\"col-12\">\n<header class=\"storyHeader article\">\n<h1>Why RAID 6 stops working in 2019<\/h1>\n<p class=\"summary\">Three years ago I warned that RAID 5 would stop working in 2009. Sure enough, no enterprise storage vendor now recommends RAID 5. Now it&#8217;s RAID 6, which protects against 2 drive failures. But in 2019 even RAID 6 won&#8217;t protect your data. Here&#8217;s why.<\/p>\n<div class=\"byline\">\n<div class=\"thumb\"><a href=\"http:\/\/www.zdnet.com\/meet-the-team\/\" rel=\"author\" data-vanity-rewritten=\"true\"><span class=\"img \"><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"http:\/\/zdnet2.cbsistatic.com\/hub\/i\/r\/2014\/07\/22\/5be13b92-1175-11e4-9732-00505685119a\/thumbnail\/40x40\/142c9f347214decc6a468ab3434eda6b\/robin-harris.jpg\" alt=\"Robin Harris\" width=\"40\" height=\"40\" \/><\/span><\/a><\/div>\n<p class=\"meta\">By <a href=\"http:\/\/www.zdnet.com\/meet-the-team\/\" rel=\"author\" data-omniture-track=\"moduleClick\" data-omniture-track-data=\"{&quot;moduleInfo&quot;: &quot;byline-author&quot;, &quot;pageType&quot;: &quot;article&quot;}\" data-vanity-rewritten=\"true\">Robin Harris<\/a> for <a href=\"http:\/\/www.zdnet.com\/blog\/storage\/\" data-omniture-track=\"moduleClick\" data-omniture-track-data=\"{&quot;moduleInfo&quot;: &quot;byline-blog&quot;, &quot;pageType&quot;: &quot;article&quot;}\">Storage Bits<\/a> | <time datetime=\"2010-02-22 06:50:50\">February 22, 2010 &#8212; 06:50 GMT (22:50 PST)<\/time> | Topic: <a 
href=\"http:\/\/www.zdnet.com\/topic\/storage\/\" data-omniture-track=\"moduleClick\" data-omniture-track-data=\"{&quot;moduleInfo&quot;: &quot;byline-topic&quot;, &quot;pageType&quot;: &quot;article&quot;}\">Storage<\/a><\/p>\n<\/div>\n<\/header>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"mantle_skin\" data-article-num=\"1\">\n<div class=\"contentWrapper \">\n<div class=\"container \">\n<div class=\"row\">\n<div class=\"row\">\n<div class=\"col-12\">\n<div class=\"row\">\n<div class=\"row\">\n<div class=\"col-12\">\n<div class=\"row\">\n<div class=\"col-8\">\n<article>\n<div class=\"storyBody\" data-component=\"lazyloadImages\" data-lazyload-images-options=\"{&quot;threshold&quot;:500}\">\n<p>Three years ago I warned that <a href=\"http:\/\/blogs.zdnet.com\/storage\/?p=162\" target=\"_blank\" rel=\"noopener\">RAID 5 would stop working in 2009<\/a>. Sure enough, no enterprise storage vendor now recommends RAID 5.<\/p>\n<p>They now recommend RAID 6, which protects against two drive failures. But in 2019 even RAID 6 won&#8217;t protect your data. Here&#8217;s why.<\/p>\n<p><strong>The power of power functions<\/strong> I said that even RAID 6 would have a limited lifetime.<\/p>\n<blockquote><p>. . . RAID 6 in a few years will give you no more protection than RAID 5 does today. This isn\u2019t RAID 6\u2019s fault. 
Instead it is due to the increasing capacity of disks and their steady URE rate.<\/p><\/blockquote>\n<p>Late last year Sun engineer, DTrace co-inventor, flash architect and ZFS developer Adam Leventhal did the heavy lifting to analyze the expected life of RAID 6 as a viable data protection strategy. He lays it out in the Association for Computing Machinery&#8217;s Queue magazine, in the article <a href=\"http:\/\/queue.acm.org\/detail.cfm?id=1670144\" target=\"_blank\" rel=\"noopener\">Triple-Parity RAID and Beyond<\/a>, which I draw from for much of this post.<\/p>\n<p>The good news: Mr. Leventhal found that, until 2019, RAID 6 will protect data as well as RAID 5 used to.<\/p>\n<p>The bad news: Mr. Leventhal assumed that drives are more reliable than they really are. The lead time may be shorter unless drive vendors get their game on. More good news: one of them already has &#8211; and I&#8217;ll tell you who that is.<\/p>\n<p><strong>The crux of the problem<\/strong> RAID arrays are groups of disks with special logic in the controller that stores the data with extra bits so the loss of 1 or 2 disks won&#8217;t destroy the information (I&#8217;m speaking of RAID levels 5 and 6, not 0, 1 or 10). The extra bits &#8211; <i>parity<\/i> &#8211; enable the lost data to be reconstructed by reading all the data off the remaining disks and writing to a replacement disk.<\/p>\n<p>The problem with RAID 5 is that disk drives have read errors. SATA drives are commonly specified with an unrecoverable read error (URE) rate of 1 in 10^14 bits read, which means that about once every 24 billion 512-byte sectors, the disk will not be able to read a sector.<\/p>\n<p>24 billion sectors is about 12 terabytes. When a drive fails in a 7 drive, 2 TB SATA disk RAID 5, you\u2019ll have 6 remaining 2 TB drives. As the RAID controller is reconstructing the data it is very likely it will see a URE. 
At that point the RAID reconstruction stops.<\/p>\n<p>Here&#8217;s the math: the chance of reading all 12 TB without hitting a URE is (1 &#8211; 1 \/(2.4 x 10^10)) ^ (2.3 x 10^10) = 0.3835<\/p>\n<p>You have a 62% chance of data loss due to an uncorrectable read error on a 7 drive RAID with one failed disk, assuming a 10^14 read error rate and ~23 billion sectors in 12 TB. Feeling lucky?<\/p>\n<p><strong>RAID 6<\/strong> RAID 6 tackles this problem by creating enough parity data to handle 2 failures. You can lose a disk <i>and<\/i> have a URE and <i>still<\/i> reconstruct your data.<\/p>\n<p>Some complain about the increased overhead of 2 parity disks. But doubling the size of a RAID 5 stripe gives you dual disk protection with the same capacity. Instead of a 7 drive RAID 5 stripe with 1 parity disk, build a 14 drive stripe with 2 parity disks: the same share of capacity goes to parity, and you get protection against 2 failures.<\/p>\n<p>Digital nirvana, eh? Not so fast, my friend.<\/p>\n<p><strong>Grit in the gears<\/strong> Mr. Leventhal points out that a confluence of factors is leading to a time when even dual parity will not suffice to protect enterprise data.<\/p>\n<p>Consider:<\/p>\n<ul>\n<li><strong>Long rebuild times.<\/strong> As disk capacity grows, so do rebuild times. 7200 RPM full drive writes average about 115 MB\/sec &#8211; they slow down as they fill up &#8211; which means about 5 hours minimum to rebuild a failed 2 TB drive. But most arrays can&#8217;t afford the overhead of a top speed rebuild, so rebuild times are usually 2-5x that.<\/li>\n<li><strong>More latent errors.<\/strong> Enterprise arrays employ background disk-scrubbing to find and correct disk errors before they bite. But as disk capacities increase, scrubbing takes longer. 
In a large array a disk might go for months between scrubs, meaning more errors on rebuild.<\/li>\n<li><strong>Disk failure correlation.<\/strong> RAID proponents assumed that disk failures are independent events, but long experience has shown this is not the case: 1 drive failure means another is much more likely.<\/li>\n<\/ul>\n<p>Simplifying: bigger drives = longer rebuilds + more latent errors -&gt; greater chance of RAID 6 failure.<\/p>\n<p>Mr. Leventhal graphs the outcome:<\/p>\n<p><img decoding=\"async\" class=\"\" src=\"http:\/\/zdnet4.cbsistatic.com\/hub\/i\/r\/2014\/10\/04\/c7d828f3-4b5c-11e4-b6a0-d4ae52e95e57\/resize\/370xauto\/58d3fa92453ffa00d2681ceb9a716cf9\/relativereliabilityr5vsr6.jpg\" alt=\"Courtesy of the ACM\" width=\"370\" height=\"auto\" \/><\/p>\n<p>By 2019 RAID 6 will be no more reliable than RAID 5 is today.<\/p>\n<p><strong>The Storage Bits take<\/strong> For enterprise users this conclusion is a Big Deal. While triple parity will solve the protection problem, there are significant trade-offs.<\/p>\n<p>21 drive stripes? Week-long rebuilds that mean arrays are always operating in a degraded rebuild mode? Wholesale move to 2.5&#8243; drives? Functional obsolescence of billions of dollars&#8217; worth of current arrays?<\/p>\n<p>Home users can relax though. Home RAID is a <a href=\"http:\/\/blogs.zdnet.com\/storage\/?p=116\" target=\"_blank\" rel=\"noopener\">bad idea<\/a>: you are much better off with frequent disk-to-disk backups and an online backup like <a href=\"http:\/\/www9.crashplan.com\/landing\/index.html\" target=\"_blank\" rel=\"noopener\">CrashPlan<\/a> or <a href=\"http:\/\/www.backblaze.com\/\" target=\"_blank\" rel=\"noopener\">Backblaze<\/a>.<\/p>\n<p>What is scarier is that Mr. Leventhal assumes disk drive error rates of 1 in 10^16. 
That is true of the small, fast and costly enterprise drives, but most SATA drives are 2 orders of magnitude worse: 1 in 10^14.<\/p>\n<p>With one exception: Western Digital&#8217;s Caviar Green, model WD20EADS, is <a href=\"http:\/\/www.wdc.com\/en\/products\/products.asp?DriveID=576\" target=\"_blank\" rel=\"noopener\">spec&#8217;d<\/a> at 10^15, unlike Seagate&#8217;s 2 TB <a href=\"http:\/\/www.seagate.com\/ww\/v\/index.jsp?name=st32000542as-bcuda-lp-sata-2tb-hd&amp;vgnextoid=1f70e5daa90b0210VgnVCM1000001a48090aRCRD&amp;locale=en-US#tTabContentSpecifications\" target=\"_blank\" rel=\"noopener\">ST32000542AS<\/a> or Hitachi&#8217;s <a href=\"http:\/\/www.hitachigst.com\/tech\/techlib.nsf\/techdocs\/6A7E7E6848832B7786257603007AAF5E\/%24file\/DS7K2000_DS_final.pdf\" target=\"_blank\" rel=\"noopener\">Deskstar 7K2000<\/a> (pdf).<\/p>\n<p><strong>Comments welcome, of course.<\/strong> Oddly enough I haven&#8217;t done any work for WD, Seagate or Hitachi, although WD&#8217;s indefatigable Heather Skinner is a pleasure to work with. I did work at Sun years ago and admire what they&#8217;ve been doing with ZFS, flash, DTrace and more.<\/p>\n<\/div>\n<\/article>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Why RAID 6 stops working in 2019 Three years ago I warned that RAID 5 would stop working in 2009. Sure enough, no enterprise storage vendor now recommends RAID 5. Now it&#8217;s RAID 6, which protects against 2 drive failures. 
&#8230; <a class=\"more-link\" href=\"https:\/\/www.wildow.com\/blog\/?p=1928\">Read More &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[46,1],"tags":[41],"class_list":["post-1928","post","type-post","status-publish","format-standard","hentry","category-raid","category-uncategorized","tag-raid"],"_links":{"self":[{"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1928","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1928"}],"version-history":[{"count":2,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1928\/revisions"}],"predecessor-version":[{"id":1968,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1928\/revisions\/1968"}],"wp:attachment":[{"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1928"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1928"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wildow.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1928"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}