Data Storage

Western Digital's SMR Disks Won't Work For ZFS, But They're Okay For Most NASes. (arstechnica.com)

An anonymous reader shares a report: Western Digital has been receiving a storm of bad press -- and even lawsuits -- concerning their attempt to sneak SMR disk technology into their "Red" line of NAS disks. To get a better handle on the situation, Ars Technica purchased a Western Digital 4TB Red EFAX model SMR drive and put it to the test ourselves. [...] Recently, the well-known tech enthusiast site ServeTheHome tested one of the SMR-based 4TB Red disks with ZFS and found it sorely lacking. The disk performed adequately -- if underwhelmingly -- in generic performance tests. But when ServeTheHome used it to replace a disk in a degraded RAIDz1 vdev, the operation took more than nine days to complete, whereas all competing NAS drives performed the same task in around sixteen hours.

[...] We want to be very clear: we agree with Seagate's Greg Belloni, who stated on the company's behalf that they "do not recommend SMR for NAS applications." At absolute best, SMR disks underperform significantly in comparison to CMR disks; at their worst, they can fall flat on their face so badly that they may be mistakenly detected as failed hardware. With that said, we can see why Western Digital believed, after what we assume was a considerable amount of laboratory testing, that their disks would be "OK" for typical NAS usage. Although obviously slower than their IronWolf competitors, they performed adequately both for conventional RAID rebuilds and for typical day-to-day NAS file-sharing workloads. We were genuinely impressed with how well the firmware adapted itself to most workloads -- this is a clear example of RFC 1925 2.(3) in action, but the thrust does appear sufficient to the purpose. Unfortunately, it would appear that Western Digital did not test ZFS, which a substantial minority of their customer base depends upon.

  • Seagates were fine (Score:3, Interesting)

    by DeHackEd ( 159723 ) on Monday June 08, 2020 @11:34AM (#60159602) Homepage

    Really? Because I have a ZFS pool made of Seagate Archive (8TB) drives that's over 4 years old now. A failed disk replacement took about 2 days, and I estimate that had the pool been full the ETA would have been about 3 days, 3.5 tops. (ZFS only rebuilds active data, not the whole disk. Empty pools "rebuild" nearly instantly).

    Now, I have tuned ZFS for these disks. ZFS writes to the pool in bulk once every 5 seconds by default, and I raised that to 15 so the disks have more time to work in relative peace in the background. That doesn't apply to a rebuild, which saves its progress every 15 seconds but keeps working during the rest of the time.
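
    A minimal sketch of that tuning knob, assuming OpenZFS on Linux: the transaction-group commit interval described above is the zfs_txg_timeout module parameter, which can be read and raised at runtime through sysfs (root required; add "options zfs zfs_txg_timeout=15" to /etc/modprobe.d/zfs.conf to make it persistent).

        # Sketch: raise the OpenZFS transaction-group commit interval from the
        # default 5 seconds to 15, as described above. Assumes the zfs kernel
        # module is loaded; run as root. Not persistent across reboots.
        from pathlib import Path

        param = Path("/sys/module/zfs/parameters/zfs_txg_timeout")
        print("current txg timeout:", param.read_text().strip(), "seconds")
        param.write_text("15\n")
        print("new txg timeout:    ", param.read_text().strip(), "seconds")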

    So... why are WD's SMR disks going to absolute hell when this other pool of mine does so well? Better SMR management firmware?

    • I'm not an expert on this s***, but I would have thought that writing every 15 seconds can mean 14 seconds of lost data if something goes really pear-shaped. I'm sure you have a UPS to minimise the chances of that, but Murphy is not necessarily your friend.

      • It is all a question of risk mitigation. If losing up to 15 seconds of data is devastating when you are already in a catastrophic situation, I would imagine much larger architectural issues need to be addressed.
        • There are many, many industries and applications in which 15 seconds of data loss could be massive.

          A RAID configuration is specifically intended to prevent a potentially catastrophic situation from becoming one.

          Poor performance recovering from one catastrophe renders those protections useless when a second one strikes.

      • If your data is that important then the application should be issuing some form of sync command when it feels appropriate, and ZFS will respect that. If you don't, then ZFS writes data to disk in big batches on its own schedule (5 seconds by default); I elected to raise it because I felt it would benefit the disks, and because downtime on the machine would be more inconvenient than the data potentially lost during that window.
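
        For applications that really cannot lose the last few seconds, the usual pattern is to force a durability point explicitly rather than rely on the txg interval. A small Python illustration (the file name is hypothetical); ZFS honours fsync() through its intent log regardless of the 5- or 15-second setting:

            # Write a record and ask the OS (and ZFS) to make it durable now,
            # instead of waiting for the next transaction-group commit.
            import os

            with open("journal.log", "ab") as f:   # hypothetical file name
                f.write(b"critical record\n")
                f.flush()                          # flush Python's buffer to the kernel
                os.fsync(f.fileno())               # durable on disk (via the ZIL on ZFS)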

      • With a ZFS pool, you can actually use a "separate intent log" or SLOG to mitigate the risk - put a smaller, fast SSD in the pool as a SLOG and synchronous writes land on it first, then get flushed to the magnetic pool with the periodic transaction group - it basically turns the whole NAS into a "hybrid" drive.
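
        A rough sketch of adding such a log device, assuming a hypothetical pool named "tank" and a placeholder device path (run as root, and check the current layout with zpool status first):

            # Attach a small SSD to an existing pool as a separate intent log (SLOG).
            import subprocess

            pool = "tank"                                  # hypothetical pool name
            slog = "/dev/disk/by-id/ata-EXAMPLE-SSD"       # hypothetical SSD device
            subprocess.run(["zpool", "status", pool], check=True)   # sanity check
            subprocess.run(["zpool", "add", pool, "log", slog], check=True)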

    • I have a (Raidz2) ZFS pool of (12) Seagate Archive (8TB) drives too, but I had to substantially increase the timeout value in Linux to prevent disks from dropping out of the RAID from time to time. It has been running for some years now, but I won't choose archive drives again in the future because of this issue.
  • by SuperKendall ( 25149 ) on Monday June 08, 2020 @11:35AM (#60159606)

    Yeah technically the SMR disks in a RAID work and deliver unto you files you seek.

    But nine days versus 16 hours is more than 13 times slower, and it happens during the most dangerous possible time for a RAID, when another failure will kill the whole thing.

    I would think that would be not just totally unacceptable for any business, but honestly almost worse for consumer use, where the backups are likely not to be as good or as well tested.

    I would say they are OK for maybe super cheap desktop use where you just need something to store an excess of files, or maybe offsite backup drives. But I just can't see how in any way you can really say they are OK for any RAID use at all.

    • It only takes nine days if you're lucky.
      If you're not lucky, the raid controller times out the disk as non-responsive and failed.

    • by thegarbz ( 1787294 ) on Monday June 08, 2020 @12:16PM (#60159846)

      Except that's false. SMR drives only need to re-write where blocks are altered. In a traditional RAID1/5 rebuild the drive is rebuilt sequentially from start to end. No one has so far demonstrated 9 days vs 16 hours on a traditional RAID1/5 rebuild. The 9 days vs 16 hours comes from ZFS, which resilvers block by block, completely non-sequentially.

      It does so because resilvering happens at the filesystem level, without appreciation for the hardware layout of the disk itself. The parity blocks and redundancy data are spread all over the drives, not in a regular pattern like in traditional RAID systems, and as such during resilvering there is a shitton of head thrashing and bouncing all over the disk. That is quite the opposite of repairing a RAID 1/5 mirror, which often results in the HDD busy light being on but no sound of a head moving at all; that is the *ideal* write condition for an SMR drive.

      TL;DR: 9 days has been demonstrated for RAIDZ, not RAID1 or RAID5. No one has so far posted evidence that these drives are unsuitable for other RAID systems.
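
      To make the head-thrashing point concrete, here is a toy model (my own simplification, not anything from the article or ServeTheHome): treat the disk as a set of shingled bands and assume every write that lands outside the band currently being rewritten forces a fresh read-modify-write of a whole band. A sequential rebuild touches each band exactly once; a randomly ordered resilver touches bands over and over.

          import random

          BANDS = 1000            # shingled bands on the toy disk
          BLOCKS_PER_BAND = 64    # blocks per band

          blocks = [(band, i) for band in range(BANDS) for i in range(BLOCKS_PER_BAND)]

          def band_rewrites(write_order):
              """Count band read-modify-write cycles, merging consecutive writes
              that stay inside the band the drive currently has open."""
              rewrites, open_band = 0, None
              for band, _ in write_order:
                  if band != open_band:
                      rewrites += 1
                      open_band = band
              return rewrites

          sequential = blocks                              # classic RAID1/5 rebuild
          shuffled = random.sample(blocks, len(blocks))    # ZFS-style non-sequential resilver

          print("sequential rebuild, band rewrites:", band_rewrites(sequential))
          print("random resilver, band rewrites:   ", band_rewrites(shuffled))
          # The shuffled order triggers roughly BLOCKS_PER_BAND times as many band
          # rewrites in this model -- write amplification the SMR drive has to
          # absorb somewhere.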

      • by Bongo ( 13261 )

        I recall a blog years ago saying that RAIDZ was just not worth it, due to how complex the rebuild is, so I have just used mirrors ever since.

        • by Wolfrider ( 856 )

          --ZFS Mirrors are good up to a point, and as long as disks are below a certain size. When you get to ~8TB sized disks and 6 disks or more in the pool, RAIDZ2 starts to make more sense. Do the odds, building out a huge pool of mirrors increases the chances of both disks in a "column" failing at once. Whereas with RAIDZ2, you can have 2 of *any* disks fail and still not lose data.

          --Also, the code has changed significantly. ZFS is now at 0.8.4 as of this writing, and scrub + resilver code has been refactored. Maybe give modern raidz2 a try with older/disposable disks and see how some failure scenarios play out with rebuild times.
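
          A rough sketch of the "do the odds" point, under a deliberately simple model of my own (not the parent's numbers): exactly two disks die at effectively the same time, chosen uniformly at random from an n-disk pool. Real failures are correlated, so treat this as illustration only.

              from math import comb

              def mirror_pool_loss(n_disks: int) -> float:
                  """n_disks arranged as striped 2-way mirrors: data is lost only
                  if both failures land in the same mirror pair."""
                  pairs = n_disks // 2
                  return pairs / comb(n_disks, 2)

              def raidz2_loss(n_disks: int) -> float:
                  """A single RAIDZ2 vdev survives any two simultaneous failures."""
                  return 0.0

              for n in (6, 8, 12):
                  print(f"{n} disks: mirrors lose data in {mirror_pool_loss(n):.1%} "
                        f"of double failures, RAIDZ2 in {raidz2_loss(n):.0%}")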

          • by Bongo ( 13261 )

            --ZFS Mirrors are good up to a point, and as long as disks are below a certain size. When you get to ~8TB sized disks and 6 disks or more in the pool, RAIDZ2 starts to make more sense. Do the odds, building out a huge pool of mirrors increases the chances of both disks in a "column" failing at once. Whereas with RAIDZ2, you can have 2 of *any* disks fail and still not lose data.

            --Also, the code has changed significantly. ZFS is now at 0.8.4 as of this writing, and scrub + resilver code has been refactored. Maybe give modern raidz2 a try with older/disposable disks and see how some failure scenarios play out with rebuild times.

            Thanks, true, and I've been mirroring x 3... for some stuff. :) but that is like RAIDZ2... so in fact the rebuild time / risk is less concerning. I'll have a go. The ability to take RAIDZ and mirroring up to higher levels so easily is one of the reasons I love ZFS so much, as well as the checksums.

    • by dfghjk ( 711126 )

      "Yeah technically the SMR disks in a RAID work and deliver unto you files you seek."

      LOL SuperKendall using ornate language so you might think he knows what he's talking about.

      RAID neither works with files nor delivers files that "you seek". But, hey, SuperKendall may now know what a file is.

    • You're misreading it: the RAIDZ1 vdev that took 9 days is the example of how the SMR drives fall on their face with ZFS. RAIDZ1 is a mode of ZFS.
    • I would make it simple and just say that SMR disks are unacceptable for RAID use, period.

  • They're not OK (Score:2, Interesting)

    by Anonymous Coward

    Even the Ars test showed a significant performance drop with these disks, just not as bad in some Linux-specific configurations as with ZFS.

    Further, the Ars test doesn't mention running each test more than once. They did one pass in several configs. It's not scientifically rigorous.

    The worst-case scenario is a rebuild. While the Ars test showed that Linux soft RAID works OK with EXT4 or whatever, it's not tested on a hardware RAID controller either. If we're saying that NAS drives are only for low-end consumer use, then

  • by Khopesh ( 112447 ) on Monday June 08, 2020 @11:41AM (#60159646) Homepage Journal

    There seems to be a dearth of online resources discussing these terms, and they were new to me, so here are some references for you.

    SMR is shingled magnetic recording [wikipedia.org], "a magnetic storage data recording technology used in hard disk drives (HDDs) to increase storage density and overall per-drive storage capacity."

    PMR is perpendicular magnetic recording [wikipedia.org], aka conventional magnetic recording (CMR), "a technology for data recording on magnetic media, particularly hard disks ... first commercially implemented in 2005."

    LMR is longitudinal magnetic recording [wikipedia.org] (broken link), which appears to be the older standard.

  • by Black Diamond ( 13751 ) on Monday June 08, 2020 @11:48AM (#60159678)

    They tested with one single SMR disk and declared that this is fine? What happens if you introduce additional drives into the array? Replace the array one by one with SMR disks and take a look at the performance; then we'll talk about valid conclusions.

    • by dfghjk ( 711126 )

      Why do you believe this is required for "valid conclusions"? What interactions do you believe would not be exposed otherwise? What makes you think SMR disks acting as source drives will matter?

    • Imagine if you read the article and it answered this question.

  • You run a ZFS pool you want enterprise grade drives. NAS drives are typically for archival and mirror sets. Nobody with a ZFS pool is buying drives as small as 4Tb.

    • by Kokuyo ( 549451 )

      Correct, I don't run 4TB disks... I run fourteen 2 TB disks.

    • by laffer1 ( 701823 )

      That's just not accurate. There are a lot of people using FreeNAS or similar products at home. While I don't have 4TB drives anymore, I did for many years and replaced some recently with 6TB WD Red drives. I've also used some IronWolf drives as well.

      Just because you won't do it doesn't mean it's true for everyone.

      The primary reason one buys NAS drives is because they're supposed to be RAID friendly. Not everyone has the budget for enterprise drives and even then they have very different drives within the

      • SMR drives will probably work perfectly fine with RAID 0, 1, 10. With RAID 5 or ZFS - not so much.

        • by laffer1 ( 701823 )

          Some people have reported problems with hardware RAID with these drives too, so be careful what you say. It works with Linux soft RAID.

          • Define "works". If they drop off the hardware raid configuration then you absolutely need to blame the RAID controller for not obeying the drive's TLER and other signaling.

            Also, I'm curious how Linux software RAID handles rebuilds. I don't run it myself, but with a hardware RAID controller the drive is rebuilt sequentially, which is a fine workload for an SMR drive. With ZFS it is not, as RAIDZ works at the filesystem level, not the block level. Any idea how MDRaid works?

    • Actually, I have several pools and none of them have drives of 4 Tb or larger. They're all 1 and 2 Tb for "production" (does home use count as "production"?), and sometimes smaller for older systems and test. Also, larger is pretty much a non-starter for 2.5" drives (though I do have 3.5" as well).

      ZFS is very reliable even on older drives, and a mirror or RAID-Z pool made up of a bunch of second-hand, dirt-cheap SAS drives (next to no second-hand market) on older generations HCA (idem) is a) [relatively] f

    • Re:Makes sense (Score:5, Informative)

      by darkain ( 749283 ) on Monday June 08, 2020 @12:18PM (#60159858) Homepage

      Not entirely true. ZFS is the default filesystem for FreeBSD. It is designed to be used at ALL scales. I'm using it on systems as small as a Raspberry Pi.

    • Re:Makes sense (Score:5, Insightful)

      by thegarbz ( 1787294 ) on Monday June 08, 2020 @12:25PM (#60159886)

      Nobody with a ZFS pool is buying drives as small as 4Tb

      What kind of nonsense is that? I'll wager the opposite. I bet you my kidney that 90% of ZFS pools out there have drives smaller than 4TB. The overwhelming majority of redundant and data-critical applications do not involve serving an entire AWS cloud to clients or providing a Netflix library to users. Hell, I'll go one stage further: I bet the overwhelming majority of ZFS pools out there are smaller than 4TB in total, even the ones made of more than just a couple of drives.

      As for what you want, that's absolute garbage. What you *want* for ZFS is any HDD that is not SMR. The entire premise of ZFS's design is a lack of trust in the underlying hardware. Not needing "enterprise grade" marketing garbage was precisely ZFS's use case, and there's zero benefit to enterprise marketing garbage over a normal drive (discounting, of course, the benefits of the SAS interface for expandability).

      The other thing you fail to realise is that enterprise clients don't just toss their arrays every couple of years to throw in larger drives. The vast majority of arrays are setup and then operated replacing failed disks with like disks or sometimes for financial reasons, replacing them with large drives but not actually using the extra space (pool growth causes performance problems on ZFS). Most arrays are run for many years and then decommissioned, and those years extend way beyond when it was practical to buy 4TB drives.

      • by Anonymous Coward
        The better drives are designed to tolerate vibration better. When you put multiple drives in the same enclosure, the vibrations from one transmit to the others and this causes early failure. The different grades of drives are rated for how many drives can be together.
      • Nobody with a ZFS pool is buying drives as small as 4Tb

        The other thing you fail to realise is that enterprise clients don't just toss their arrays every couple of years to throw in larger drives. The vast majority of arrays are setup and then operated replacing failed disks with like disks or sometimes for financial reasons, replacing them with large drives but not actually using the extra space (pool growth causes performance problems on ZFS). Most arrays are run for many years and then decommissioned, and those years extend way beyond when it was practical to buy 4TB drives.

        I was thinking the same thing. My NAS has 10x3TB drives in it...not because of trying to penny pinch, but because 3TB was the 'sweet spot' for capacity when I bought them three years ago. If I were building today I'd likely use 6TB drives, but I spent over $1,000 in drives when I bought them. The GP might possibly have a point about new arrays going into service *today*, but that's not going to be the case for lots of arrays that are still well within their service life.

    • I have 2 3TB drives in a mirror for my ZFS vault.

    • Nobody with a ZFS pool is buying drives as small as 4Tb.

      Better watch that word "nobody". Sun made their fortune selling big ZFS stacks of sub-terabyte drives to people with large, well-funded Legal Departments, and though disks just kept getting bigger there is still no magic demon waiting to get you if you're stacking 250GB or 500GB drives. It's nonsense to say otherwise. Enterprise drives do get better specs and lifetimes. At least that's what Corporate Marketing and eyeball-hungry web testers tell me. RAID was originally sold as "Redundant Array of Inexpensiv

    • You run a ZFS pool you want enterprise grade drives.

      No, you can run a ZFS pool with enterprise-grade drives. You can also run one with other drives; however, these particular drives are marketed as NAS drives and have historically been used in ZFS pools. No mention or warning was issued when SMR was introduced to this line of drives.

      Nobody with a ZFS pool is buying drives as small as 4Tb.

      I would imagine there are lots of these 4TB drives used in ZFS pools. I would think that they are being purchased as replacements for failed NAS drives as well. Not everyone can afford to replace all their NAS drives every few years.

  • by ReneR ( 1057034 ) on Monday June 08, 2020 @11:59AM (#60159746)
    ridiculous summary, ...
  • by OverlordQ ( 264228 ) on Monday June 08, 2020 @11:59AM (#60159754) Journal

    They gimped the rest of their drives to create the Reds with TLER, and now the Reds get further gimped so they're not even good at the very thing they were designed for.

    Stop apologizing for them.

  • Drobo (Score:4, Informative)

    by jolyonr ( 560227 ) on Monday June 08, 2020 @12:05PM (#60159790) Homepage

    Drobo systems flag SMR disks as failed within hours of installing them. Drobo specifically say their systems don't support SMR disks, so not knowing the recording method is a real pain.

    I've already had three Seagate drives returned to the supplier for a full refund because it was not disclosed they were SMR and they didn't function with the Drobo.

    I wouldn't use SMR in ANY raid arrangement. Just not worth it.

  • Mistakenly? (Score:4, Funny)

    by Ultra64 ( 318705 ) on Monday June 08, 2020 @12:10PM (#60159824)

    "they may be mistakenly detected as failed hardware. "

    Sounds pretty accurate to me.

  • by rickyslashdot ( 2870609 ) on Monday June 08, 2020 @12:18PM (#60159856)

    I'm retired, haven't worked in the industry since 2000 - disabled veteran with serious back and knee problems.
    OK, so much for the basic issue of qualifications - - - yet I remain an active home/hobbyist engineer in the computer, electronics, and bio-med fields, with continuous contact with leading engineers at half a dozen companies.
    The following back-of-the-envelope, off-the-top-of-my-head calculations have been examined and accepted by these engineers, and the issue has been incorporated as product-purchase holds on ALL WD devices for the duration.

        - - -

    1) ASSUME an 8-layer shingle - which I believe is the standard

    2) A randomized data position for updates will land halfway into the band on average, so figure 4 layers

    3) Data has to be READ and buffered before WRITING - 4 revolutions of the platter to READ,
        ASSUMING a single-track step fits within the interval of a single spin - 8.33 ms @ 7200 RPM
        (7200 RPM is the sweet spot for speed & longevity - anyway, not too relevant to the time PENALTY)

    4) Then you need a revolution to reset the heads for the WRITE - MAYBE TWO TWIRLS if it's a full 8-track rewrite

    5) 4 more spins for the WRITE set

    6) TOTAL = 9 or 10 spins, call it 9.5? - an 8-track change may need 10+ ms

    7) 17 hours for a Conventional-format rebuild x 9.5 (spin-ratio factor) == 6 days, 17.5 hours for SMR

    8) NOW add in the diddle factor for some use during the rebuild, and it's 8 to 9+ days ! ! !

    THIS IS EQUAL TO THE OBSERVED REBUILD TIME IN THE ARTICLE === 9+ days, AND it describes the processing that causes the excess time for Shingle-format rebuilds (a quick sketch of this arithmetic follows below).

    ALSO, remember that this assumes only a few contiguous sectors. IF there are too many, then you will get ANOTHER spin tacked onto the rebuild time factor.

    BTW - here's a short list of the SMR WD drives if you didn't get to it:
    3.5" WD Red 2TB, 3TB, 4TB, 6TB (SKUs: WD20EFAX, WD30EFAX, WD40EFAX, WD60EFAX)
    3.5" WD Blue 2TB, 6TB (SKUs: WD20EZAZ, WD60EZAZ)
    2.5" WD Blue 1TB, 2TB (SKUs: WD10SPZX, WD20SPZX)
    2.5" WD Black 1TB (SKU: WD10SPSX)

    cheers, y'all . . .
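
    A quick sketch of the arithmetic above in Python, using the commenter's own assumptions (8-layer shingle, 7200 RPM, roughly 9.5 revolutions per read-modify-write, 17-hour conventional rebuild); none of these inputs are measured values:

        RPM = 7200
        rev_ms = 60_000 / RPM               # one platter revolution: ~8.33 ms

        read_spins = 4                      # read half of an 8-track shingle band
        reposition_spins = 1.5              # reset the heads before rewriting (1-2 revs)
        write_spins = 4                     # rewrite the buffered tracks
        spins_per_rmw = read_spins + reposition_spins + write_spins   # ~9.5

        cmr_rebuild_hours = 17              # conventional-format rebuild time
        smr_rebuild_hours = cmr_rebuild_hours * spins_per_rmw

        days, hours = divmod(smr_rebuild_hours, 24)
        print(f"one SMR read-modify-write: ~{spins_per_rmw * rev_ms:.0f} ms")
        print(f"estimated SMR rebuild: {int(days)} days, {hours:.1f} hours")
        # -> about 6 days 17.5 hours; add the "diddle factor" for concurrent use
        #    and it lands in the 8-9+ day range observed for the SMR resilver.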

  • What is that? A substantial minority?
    Is it a minority that should be feared? lol
  • by testus ( 123588 ) on Monday June 08, 2020 @12:21PM (#60159876)

    Synology has just marked WD's SMR drives as incompatible. So there goes one big NAS manufacturer...

    WD60EFAX - 68SHWN0:
    "None of the models is compatible with this device. Due to performance characteristics of SMR drives, this model is only suitable for light workload environments. Performance degradation may occur under continuous write operations."

    https://www.synology.com/en-global/compatibility?search_by=category&category=hdds_no_ssd_trim&filter_brand=Western%20Digital&p=4

  • by gnasher719 ( 869701 ) on Monday June 08, 2020 @01:06PM (#60160048)
    The idea was that SMR drives can have twice the capacity of CMR drives at the same cost. So if you are informed about what you get, and you prefer a slow 4TB drive to a fast 2TB drive, that would be fine.

    But that didn't happen. They are producing drives with the same capacity, selling them at the same price, pocketing the savings, and not telling the customer about it, so the customer gets substandard drives without knowing it. There is nothing anywhere near acceptable about that.

    Having the drives fail completely in some use cases just makes it worse.
    • Why not just make CMR/SMR a customer-settable option on the drives? You could buy a 16TB SMR drive, but if you need the option you could reformat it to CMR and take a storage-capacity hit.
      • by hoggoth ( 414195 )

        Are you just trolling or do you not understand anything about the words you are saying? CMR and SMR are physically different ways of storing the data on the surface. Why doesn't my car have a setting to be an airplane?

      • by parker9 ( 60593 )

        SMR uses a narrower read head and wider write head than CMR, so if you did this, your CMR would have a higher bit error rate. Or to keep the same bit error rate, you would have to reduce the linear density (not just track density), so your hit on capacity would be significant. The firmware would also get twice as big, so Si cost would go up for the controller.

        The idea of SMR was to find a way to increase areal density without new recording technology (e.g. HAMR). If you treat it as a WORM (write once, read many), it w

      • For the same reason you can't flip a switch on your dashboard and turn your gasoline car into a diesel car. There are physical differences in the hardware and in the operation of the drive.

    • by ngc5194 ( 847747 ) on Monday June 08, 2020 @02:46PM (#60160528)

      I agree. I have no problem with the use of SMR technology; what I have a problem with are hard disk manufacturers who are switching their drives to it *without* *labeling* them as such. I just want to be able to look at the spec sheet or the packaging and tell which technology a given disk uses. *This* is the reason I have a real problem with what Western Digital has done in this case.

  • I had 5 WD60EFAX drives waiting to be installed into a ZFS NAS. Hearing about the SMR and ZFS issue, I decided to call WD support to see if they would replace them with PMR drives. After a bit of back and forth, they agreed to replace them. They didn't have any 6TB drives available so they offered either 8TB that were not in stock or 10TB that were available. Guess which one I asked for? Anyways I was pleased with the outcome. Hope this helps others who have WD SMR drives under warranty.
  • .. is that WD sees two NAS markets
    1) Real Enterprise users who would never (except for niches like cold storage) want SMR NAS drives - or NAS drives as small as 4TB (generally at least 8TB) - who expect to pay a premium for Enterprise-grade NAS drives.
    2) Home and SMB users who are using plug and play appliances that either have drives provided by NAS device vendor, or drives purchased as a quick add-in to what is essentially a consumer product. For these users, who have no idea what SMR is, TB/$ is biggest met

  • SMR could actually be ideal for ZFS because ZFS is based on a copy-on-write principle and does not write in place. However, this would require exposing the low-level properties of the disk and making ZFS aware of them.

  • It's all very well to try to pick benchmarks that are good or bad but to settle this they have to provide details for the SMR. When will a track overwrite? How far will it overwrite? What other mechanisms do they have?

    I'm getting the impression that part of the problem is that they're not just exposing the raw SMR disk and letting software or the FS decide, but instead have their own complicated system for arranging and trying to optimise data, including things such as a CMR write buffer.

    It's all very well t
