Data Storage Hardware

Ask Slashdot: Are There Storage Devices With Hardware Compression Built In?

Slashdot reader dryriver writes: Using a compressed disk drive or hard drive has been possible for decades now. But when you do this in software or the operating system, the CPU does the compressing and decompressing. Are there any hard drives or SSDs that can work compressed using their own built in hardware for this?

I'm not talking about realtime video compression using a hardware CODEC chip -- this does exist and is used -- but rather a storage medium that compresses every possible type of file using its own compression and decompression realtime hardware without a significant speed hit.

Leave your best thoughts and suggestions in the comments. Are there storage devices with hardware compression built in?
This discussion has been archived. No new comments can be posted.

  • by fahrbot-bot ( 874524 ) on Saturday November 09, 2019 @06:50PM (#59398422)

    Are There Storage Devices With Hardware Compression Built In?

    Here are a few [homedepot.com], though some are a bit pricey.

  • Glad we cleared that up.
  • by Fly Swatter ( 30498 ) on Saturday November 09, 2019 @06:57PM (#59398440) Homepage
    SandForce SSD controllers compressed data on the fly to save on NAND writes and make less look like more. They were bought out or something; see the SandForce [wikipedia.org] wiki article. If you Google them you will see many complaints about reliability and slowness. I unwittingly got stuck with one in a laptop - it's not exactly fast.
    • by phantomfive ( 622387 ) on Saturday November 09, 2019 @08:03PM (#59398584) Journal
      In my (admittedly limited) experience, hard-drives that try to do smart things in hardware tend to be slower than if you did the same in software.
      • I had $1,000 enterprise RAID cards, so of course I used the hardware RAID. Until I actually compared the performance to Linux RAID. The RAID provided by Linux gave better throughput (using the $1,000 cards as dumb controllers), with CPU usage very close to 0.

        • Btw, the other advantage I discovered is that when a hardware RAID device wore out, I had to replace it with the same series in order to read the data. With software RAID (or compression), I can use any hardware, and any Linux of that age or newer.

        • by kriston ( 7886 )

          Pretty sure your settings were wrong and your cache battery was dead.

          Hardware RAID accelerators always beat software RAID unless you've done something wrong.

    • Here's an old Anandtech article talking about the Sandforce controller.

      https://www.anandtech.com/show... [anandtech.com]

    • by AmiMoJo ( 196126 )

      The problem is that these days much of the data tends to be non-compressible.

      Most media is compressed already. A lot of stuff is encrypted which increases entropy to the point where it can't be compressed.

      There is just no justification for implementing a complex compression scheme. For all that extra work and energy consumption (i.e. heat) you get some marginal space saving that may let you reduce the amount of reserved space on the SSD a little, but you can't rely on it because the customer might be using

  • Seagate Nytro (Score:4, Insightful)

    by Distan ( 122159 ) on Saturday November 09, 2019 @06:59PM (#59398446)

    Some Seagate Nytro drives have hardware compression built in.

    The old SandForce controllers did hardware compression. LSI acquired SandForce. Avago acquired LSI. Avago spun off the SandForce division to Seagate. The Seagate Nytro drives contain the legacy of the SandForce technology.

  • Tape drives (Score:4, Informative)

    by Tapewolf ( 1639955 ) on Saturday November 09, 2019 @07:07PM (#59398458)

    Tape drives do this. The quoted capacity of e.g. an LTO cart is generally twice the physical capacity, assuming that the drive controller will be able to achieve a 2:1 compression ratio. Which it generally doesn't. You tend to get faster throughput if you disable hardware compression.

    • I don't see how you can get better throughput by disabling tape drive compression; it's not like the drive is going to spin the tape any faster. The drive has variable speeds -- not THAT variable, but I've been out of the game a while.

      With compression you get higher throughput because the drive is moving the tape at a constant speed but you can send 1-3x more data over the wire. Your compression ratio directly affects your throughput, so 1.1x compression is a 10% increase in throughput.
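
      To put rough numbers on that ratio arithmetic, here is a minimal Python sketch; the 300 MB/s native speed is just an assumed example figure, not any particular drive's spec:

        # Effective host-side tape throughput as a function of compression ratio.
        # NATIVE_MB_PER_S is an assumed example figure, not a real drive spec.
        NATIVE_MB_PER_S = 300

        def effective_throughput(compression_ratio: float) -> float:
            # The tape moves at a constant native speed, but each byte written to
            # tape represents compression_ratio bytes sent by the host.
            return NATIVE_MB_PER_S * compression_ratio

        for ratio in (1.0, 1.1, 1.5, 2.0):
            print(f"{ratio:.1f}:1 compression -> ~{effective_throughput(ratio):.0f} MB/s at the host")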

      You can get better comp

      • by madbrain ( 11432 )

        This all depends on the size of the buffer in the drive, and the ratio of compression. If you don't do any software compression, and are sending highly compressible data, say, a large file made of all 0s, then you might fill the drive's buffer and/or bottleneck at the drive's I/O interface, and the tape drive is going to stop streaming while waiting for more data from the host.

        I think what the parent was trying to say is, better to enable compression in the software, and send already compressed data to the

      • I don't see how you can get better throughput by disabling tape drive compression, it's not like the drive is going to spin the tape any faster.

        Frankly, I don't know why it was doing that either, but that's what I was seeing. Big increases by switching off hardware compression. As you likely know, trying to compress data that's already compressed will tend to result in the output becoming bigger instead of smaller, so it's possible it was filling up the buffer more quickly and stalling.

        • OK, ignore that - I just checked my scripts and it looks like I'm misremembering things. I do in fact have the compression turned on. (And yes, 256K blocks).

          • by kriston ( 7886 )

            Compression for tape drives should always be turned on. Even if lots of data doesn't compress, the other data that does compress makes up for it 50-100x when backing up to a tape drive.

            Source: operated and owned many different types of tape drives both personally and professionally back when tape drives still mattered. All kinds of QIC formats, Travan, Sony Data8 and DLT, Sony AIT (based on DAT), the LTO variants, then to present-day (non-tape) nearline low-spindle-RPM hard drive storage when many companie

      • I don't see how you can get better throughput by disabling tape drive compression,

        Simple: attempting to compress [that which has already been compressed] results in larger file sizes.

    • Depends on the backup program. If you have something that can do deduplication, the hardware compression may not matter as much. However, for a lot of things, the hardware compression is good enough, especially if one is using a USB LTO-8 drive with their laptop and LTFS for drag and drop.

  • by mikemraz ( 6372120 ) on Saturday November 09, 2019 @07:10PM (#59398466)
    This functionality exists in storage arrays, but there's one core problem that stops it from being used in individual drives - the predictability of size. Your computer/OS is designed to believe that a drive has a fixed size (e.g., 500GB). With hardware-level compression, that number becomes variable - store all video files, and you might fit 510GB. Fill it with text files, and it's 2TB. But your OS doesn't understand that concept, so the drive has to present a fixed size to the OS. If it presents as 1TB, what happens when you write 511GB of video files and there's no space left on the drive to store them? Or when you write 1TB of text files and the drive is only half full?
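
    A toy sketch of that mismatch, with made-up numbers (500GB of physical NAND presented to the OS as a 1TB drive):

      # Hypothetical compressing drive: fixed advertised size, variable physical usage.
      PHYSICAL_GB = 500      # what the NAND can actually hold
      ADVERTISED_GB = 1000   # what the drive reports to the OS

      def simulate(writes):
          """writes: list of (logical_gb, compression_ratio) pairs."""
          logical = physical = 0.0
          for logical_gb, ratio in writes:
              logical += logical_gb
              physical += logical_gb / ratio
              if physical > PHYSICAL_GB:
                  return (f"physically full after {logical:.0f}GB written, "
                          f"yet the OS still sees {ADVERTISED_GB - logical:.0f}GB 'free'")
              if logical > ADVERTISED_GB:
                  return (f"logically full while {PHYSICAL_GB - physical:.0f}GB "
                          f"of NAND sits unused")
          return "everything fit"

      print(simulate([(100, 1.02)] * 6))   # already-compressed video: barely shrinks
      print(simulate([(100, 4.0)] * 11))   # plain text: roughly 4:1
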
    • by Jamu ( 852752 )
      Yeah. SSDs could use it to minimise writes though. e.g. a "500GB" drive with 500GB (or more) actual storage could compress data to improve endurance.
      • by Mal-2 ( 675116 )

        If that is the intent, you want to do the compression before sending the data down a pipe that, compared to the bandwidth to RAM, is rather slow. In the time you need to push the data out to the device to be compressed, you could compress it and push it across using only half as much bandwidth. Otherwise you're just trading CPU saturation for I/O saturation.

      • by tlhIngan ( 30335 )

        Yeah. SSDs could use it to minimise writes though. e.g. a "500GB" drive with 500GB (or more) actual storage could compress data to improve endurance.

        Most SSDs already use compression. Again, they use it to enhance endurance by reducing write amplification. This has been going on for years.

        The OS still sees the guaranteed space - your 500GB SSD still has 512GiB of storage (the extra is used for bookkeeping as well as holding spare sectors - as much as 2% of flash can be bad from the get-go). But the SSD comp

    • I wonder if there are filesystems that accept variable storage size from the hardware.

  • by N_Piper ( 940061 ) on Saturday November 09, 2019 @07:11PM (#59398468)
    Hasn't this been a feature of tape drives since time immemorial? Like mid 1980's? I'm pretty sure my "What Is A Computer?" book from like 1990 listed native compression as an advantage of tape drives next to cheap MO disks and CD WORM disks being the way of the future.
  • by gweihir ( 88907 ) on Saturday November 09, 2019 @07:15PM (#59398476)

    In most cases you have a trade-off. Say you are storing code. Using XZ is slow but compresses really well. Using lzop compresses badly, but you get amazing speed (for a compressor). Your system cannot really know what trade-off you want. Also, hardware compression is never really good, because it needs to use small block sizes.
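
    A quick sketch of that trade-off using only the Python standard library (lzop itself isn't in the stdlib, so zlib at level 1 stands in for the fast/weak end, and the lzma module for xz):

      import lzma, time, zlib

      # Any largish text or source file will do; this path is just an example.
      data = open("/usr/share/dict/words", "rb").read()

      for name, compress in [
          ("fast/weak  (zlib level 1)", lambda d: zlib.compress(d, 1)),
          ("slow/strong (xz via lzma)", lambda d: lzma.compress(d, preset=6)),
      ]:
          start = time.perf_counter()
          out = compress(data)
          elapsed = time.perf_counter() - start
          print(f"{name}: {len(out) / len(data):.1%} of original size in {elapsed:.3f}s")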

    That said, there are filesystems and storage devices that do compression.

  • Well LTO tapes have completely wrong sizes printed on their case and labels (and the real size in a much smaller font size). That is because they do use hardware compression.

    Besides all the other problems of tape drives, the problem with compression is that it's unpredictable and most of the time useless anyway.

    What do you want to compress? What's the point? Anything big enough to be worth compressing is compressed already. I have never seen a drive with so many compressible files that it would be worth the

  • It is called RLL encoding. Hard drives (spinning rust) have done it for a quarter century or more.

    • RLL is, as you said, encoding. It's not compression. It doesn't reduce the number of bytes. What it does is make sure that on the medium, a series of a million zeroes will be stored as a million bytes that are mixed ones and zeroes. That makes it easier to tell the difference between a million zeroes and a million and one zeroes. Your length measurement doesn't have to be as precise.

      • The proletariat often mean "compression" as storing "more" in "less". By that token, RLL is compression. On a given magnetic substrate you can "store more" when the data is RLL encoded than when it is (for example) MFM encoded. About 35 years ago Spinning Rust switched from FM encoding to MFM encoding, resulting in the ability to store "more" in the same space. 25 years ago MFM encoding yielded to RLL encoding, resulting in the ability to store yet "more" in the same space. This process has repeated ev

        • On a given magnetic substrate you can "store more" when the data is RLL encoded than when it is (for example) MFM encoded.

          That's only because "MFM" is crap and expands rather than compresses. It was, however, better than NRZI, which was worse crap.

          The compression in the more recent LTO drives works quite well - and all LTO drives try compressing, and store the uncompressed data if that turns out smaller.

          Compression impacts reliability - maybe not a lot, but it does. You can afford to have two tapes, but not two platters,

  • Check your CPU Usage (Score:4, Informative)

    by jellomizer ( 103300 ) on Saturday November 09, 2019 @08:13PM (#59398594)
    Hardware compression isn't really going to help you.

    For the most part, the speed of your CPU has gotten faster at a higher rate than the speed at which the storage media can work. Using the CPU to compress your data is actually faster for most use cases, because you are writing fewer bits to the storage, making the write much faster. Likewise on reads: you read less data, and your CPU can decompress it rather fast.

    Oddly enough, I rarely see a PC that, when running properly, sits at 100% CPU, which is where compression would become a factor.

    Having hardware compression will probably be slower, as the drive's processor will not be as fast as your main CPU, and you will not have the ability to manage the compression algorithm.
    • by glitch! ( 57276 )

      Ditto. Your CPU is so much faster than the controller on the hard drive. Also factor in the speed limit of the bus between the CPU and the drive. Software compression makes so much more sense. Choose ZFS and compression and you are done.

      • by Bert64 ( 520050 )

        Your CPU has other things to do, the controller does not. Using CPU cycles for compression takes performance away from other activities and will be less power efficient than a dedicated controller... Plus the controller only has to be as fast as the media it's writing to - any faster is pointless.

        • by Jeremi ( 14640 ) on Sunday November 10, 2019 @12:40AM (#59398950) Homepage

          Your cpu has other things to do, the controller does not.

          On a modern desktop/laptop/cell-phone, that is generally no longer true. CPUs have become so insanely fast relative to the rest of the hardware on the machine, that scenarios where the CPU is the limiting factor in performance are uncommon.

          Instead, system bottlenecks are usually found in RAM or I/O bandwidth -- so it's usually faster to reduce the amount of data the CPU pumps out (e.g. by compressing it on the CPU) than to lighten the CPU's workload, since the CPU would only use those extra cycles to wait for I/O to complete anyway.

          As for relative power efficiency -- well, maybe, but mobile CPUs benefit from dumptrucks full of R&D money every year to make them more power-efficient, while a one-off "intelligent" I/O controller probably won't.

  • Storage with compression is a bad idea. You cannot further compress data which has already been compressed. Compression is a method of representing the data with a number of bits closely approaching the entropy of the data, so the hardware would have to measure whether the data is already compressed. Here we are coming to a problem - it is impossible to properly specify the size of a drive with compression, because this requires knowledge of the data being written. That is just asking for problems.
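
    Something like this per-block entropy estimate is the kind of check such hardware would need. It is only a rough heuristic (it misses patterns longer than one byte), and the sample blocks are made up:

      import math, os
      from collections import Counter

      def entropy_bits_per_byte(block: bytes) -> float:
          # Byte-level Shannon entropy; close to 8.0 means the block looks like
          # random (already compressed or encrypted) data, not worth compressing.
          counts = Counter(block)
          n = len(block)
          return -sum(c / n * math.log2(c / n) for c in counts.values())

      text_block = b"the quick brown fox jumps over the lazy dog " * 100
      random_block = os.urandom(len(text_block))

      print(f"text block:   ~{entropy_bits_per_byte(text_block):.2f} bits/byte")
      print(f"random block: ~{entropy_bits_per_byte(random_block):.2f} bits/byte")
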
    • Oversubscription helps with that: the drive reports itself as having an actual capacity, a used capacity and a subscribed capacity ... so long as used doesn't hit actual it doesn't really matter what the subscribed capacity is.
      Arrays that do compression tend to compress the data then compare it to the original to see if it was worth writing the compressed version to the media.
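
      A minimal sketch of that compress-compare-then-store decision, assuming hypothetical fixed-size blocks and a one-byte flag in front of whatever lands on the media:

        import zlib

        def encode_block(block: bytes) -> bytes:
            # Store compressed only if it is actually smaller; the flag byte records which.
            compressed = zlib.compress(block, 6)
            if len(compressed) < len(block):
                return b"\x01" + compressed
            return b"\x00" + block

        def decode_block(stored: bytes) -> bytes:
            flag, payload = stored[0], stored[1:]
            return zlib.decompress(payload) if flag == 1 else payload

        block = b"hello world " * 300
        stored = encode_block(block)
        assert decode_block(stored) == block
        print(f"{len(block)} logical bytes -> {len(stored)} bytes on the media")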

  • Naively, I don't think this approach would work well. Why? Storage devices work with fixed-length blocks of data at given locations. The problem with compression is that the compressed size is variable: A text file compresses well, but media, like video and audio files, are usually already compressed.

    I could see an operating system working with hardware-assisted compression, but honestly, given the tradeoffs, I suspect that a bigger drive is probably cheaper and faster in the long run.

  • by thegarbz ( 1787294 ) on Saturday November 09, 2019 @09:12PM (#59398682)

    The SATA link is a bottleneck on modern drives. Having data compressed before passing over that link is a benefit to speed, and software compression often gives speed boosts.

    The law of diminishing returns applies, and in this case likely won't benefit much on modern NVMe drives.

  • by JBMcB ( 73720 ) on Saturday November 09, 2019 @09:13PM (#59398684)

    For various and sundry reasons, you generally don't want your storage device doing this for you. Use the built-in filesystem compression. Unless you have some strange workload, a compression task eating up one of your cores won't be a big deal usually.

    If you are on an industrial scale, you want a dedicated file-server doing this for you. Ideally, use ZFS with its built-in compression. It works great. If you are moving a ton of stuff over and overwhelm the compression algorithms, you can use an uncompressed SSD to cache the writes to the array and maintain performance.

  • By definition, there is no compression algorithm that will work well with every single type of file. In fact, every compression algorithm must have a worst-case performance of doing nothing with the file (in most cases, it's often worse than that).

    It's obvious why. If a compression algorithm like that did exist, you could just feed its output back into it over and over again until it was a single bit of data. But that means you could repeat the decompression to recreate every file that could ever exist fro
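
    You can see this with any real compressor: feeding zlib's own output back into itself stops helping almost immediately, and random data comes out slightly larger than it went in.

      import os, zlib

      data = b"compressible " * 10000
      print("pass 0:", len(data), "bytes")
      for i in range(1, 4):
          data = zlib.compress(data, 9)      # re-compress the previous pass's output
          print(f"pass {i}:", len(data), "bytes")

      random_data = os.urandom(100_000)
      print("100000 random bytes ->", len(zlib.compress(random_data, 9)), "bytes")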

  • I can drop a 12TB HDD in a desktop. And probably still have room to put in a second drive.

    SSDs now hold multiple terabytes, so performance and capacity have been addressed.

    I don't even give a shit about compressing music files anymore. Native Blu-Ray rips? Sure, why not. Who the hell is bothering to count gigabytes these days? Is storage really a problem anymore for more than 2% of computing professionals who already rely on RAID-based solutions to maximize capacity and data protection? I highly doub

    • A compressed file is more efficient to move around. An MP3 gets 20-to-1 compression easily for normal music listening. Network I/O, disk file copies, cache usage - everything goes down when using compressed files. Also the wear and tear on physical components like disk drives falls, and hence the time spent replacing broken hardware. All those backup sync systems (like cloud sync) will work more efficiently with smaller file sizes.
      • MP3 does not get 20 to 1 compression. MP3 is a lossy compressor. It obtains almost all of its compression by throwing away data that you will never get back.

  • Back in the days of good ol' DoubleSpace, compression made sense, since most of your hard drive was full of EXE, COM, DLL, TXT, DOC, and lots of other uncompressed files. But now, there's no point in trying to add some sort of hardware abstraction layer for compression.

    First off, the files that suck up the most hard drive space these days -- music, videos, photos -- are *already* compressed. Same goes with modern Microsoft Office files. Compressing them again just isn't worth the effort, and results in litt

  • There are some situations in which you don't want to compress a file, so there really should be software control over it, even if the calculations are done in hardware.

  • My understanding was the compression happens faster than the writing, so it's faster to compress and write the smaller files than write the big uncompressed ones.

  • There would be no speed hit. As it turns out, decompression is much faster than the bottleneck of the data pipe, so it's faster to push compressed data through it and then decompress it in the processor than to stream it uncompressed. It's a rare win-win situation.

    I assume the same would apply here, but as most data is already compressed, the space savings would be nominal.

  • Drive compression with RLL [ebay.com] instead of MFM controllers

  • has had hardware compression since forever.

  • Your OS sees the drive as a block device. It's just a big array of storage blocks. If the drive compresses the data destined for a block, the whole block is still used as far as the OS/filesystem is concerned. The compressed data doesn't create more space; it just creates little bits of wasted space.

    The only benefit would be fewer writes to the physical medium. Only really an advantage for flash memory.
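
    A toy illustration of that "fewer physical writes" point, with made-up block and allocation sizes; the logical space consumed stays the same, only the bytes hitting the NAND shrink:

      import os, zlib

      BLOCK = 4096        # hypothetical logical block size
      ALLOC = 512         # hypothetical NAND allocation unit

      def physical_bytes_written(logical_blocks):
          total = 0
          for block in logical_blocks:
              compressed = zlib.compress(block, 1)
              payload = min(len(compressed), BLOCK)   # fall back to raw if compression loses
              total += -(-payload // ALLOC) * ALLOC   # round up to allocation units
          return total

      text_like = [b"A" * BLOCK for _ in range(100)]        # trivially compressible
      video_like = [os.urandom(BLOCK) for _ in range(100)]  # already-compressed-looking

      for name, blocks in [("text-like", text_like), ("video-like", video_like)]:
          print(f"{name}: {len(blocks) * BLOCK} logical bytes "
                f"-> {physical_bytes_written(blocks)} physical bytes")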

  • Encrypted data cannot be effectively compressed. Storage device-level compression's big shortcoming is that it's fundamentally incompatible with encryption (except for storage device-level encryption). In other words, to get any benefit from compression you have to make sure that whatever you're sending to the storage device is UNencrypted. Do you really want to shut off all or most higher level encryption? To disable dm-crypt/LUKS2, FileVault, BitLocker, and PGP, as examples?

    If you want to see where compu

  • You cannot compress encrypted data (since it becomes random). Therefore all self encrypting drives also do compression as part of the encryption process.
  • >"Are There Storage Devices With Hardware Compression Built In?"

    Why? There really isn't much need for it. We store most files already compressed. Video, audio, photos, log files... anything of any reasonable size is already being compressed before storage and very efficiently, based on the TYPE of file- something compressed drives would not be able to do. Even LibreOffice files are all compressed as part of their native format (and those usually aren't even very large). And since probably 90+% of th

  • Disks/storage/SSDs work at the block level, not the file level; they also don't have intelligence about what the file represents and what kind of redundancy can be eliminated. Also, storage compression can't be lossy; the various compression schemes for smooth data like image/video/audio get a big bang for the buck (like 1-to-20 compression ratios) by eliminating high-frequency data that is not that perceptible to a human (using FFT/Fourier) - like in jpg/mp3/mp4. A block-level compression can at most do run-length encoding, which overa
  • Some time ago, Seagate experimented with Kinetic object storage devices where each drive ran its own operating system and therefore file system. In this case, a later version could have included hardware-accelerated compression.

    The problem is that compression of block storage requires an underlying file system or block database. To manage this, each block is stored similar to a file or a database record on another disk. This is also how we manage deduplication at a block level.

    Remember that compressed data
  • Hardware compression is usually external. [youtube.com]

    In my experience, hard drives work best in their uncompressed state.

  • Since a hard drive or SSD is a block device, the only way that would work is if it presented more blocks to address than are available. So if the amount of data you have written can't be compressed and you fill the drive up, then what?
    You still need a storage system (not device) to do the compression and/or deduplication, so you can overprovision the space available using compression, deduplication and thin provisioning.
    With mixed data workloads I have had no problems with 200%-300% overprovisioning in the

  • The only way this makes sense is if you have large files that compress well.

    Drive space has become soooooo cheap that compressing stuff for storage is nearly pointless, and of course there's the fact that many file formats are already compressed.

    You're spending a dollar to save a nickel. Just buy bigger drives and stop worrying about saving 0.00002% of your storage.

  • ... storage devices there are. Tape drives have had hardware-based compression since the DLT days (at least).

    But disk drives? Haven't heard of any with embedded compression.

  • Even SSD is cheap. Don't use hardware compression, just buy a bigger drive. Compression reduces your MTBF. If you want to compress to increase your IO throughput, do it in the CPU.
  • ZFS compression usually uses less than 10% of a modern CPU core per writer thread. The CPU compression is faster than the disk bus so you actually lose I/O performance if you don't have compression on, for the average workload.

    If there were some reason that 10% of one core (say 1% of an average server) was too much or if say there were a hundred concurrent write threads then most people would use a NAS with gigabit or 10Gbe connections.

  • Not worth it, IMHO. The device's internal write speed is going to be constant throughput; however, depending on the compressibility of the content, you may see higher write speeds. So if your goal is to increase transfer rates, this will only work for compressible text, etc.

    OTOH, the compression also cannot use more complex analyses of the content, as it only sees a few kb of the bytestream at a time. On top of which, as newer compression techniques develop you won't have access.

    I'm curious what the use
