Ask Slashdot: Are There Storage Devices With Hardware Compression Built In? 120
Slashdot reader dryriver writes:
Using a compressed disk drive or hard drive has been possible for decades now. But when you do this in software or the operating system, the CPU does the compressing and decompressing. Are there any hard drives or SSDs that can work compressed using their own built in hardware for this?
I'm not talking about realtime video compression using a hardware CODEC chip -- this does exist and is used -- but rather a storage medium that compresses every possible type of file using its own compression and decompression realtime hardware without a significant speed hit.
Leave your best thoughts and suggestions in the comments. Are there storage devices with hardware compression built in?
Here you go ... (Score:5, Funny)
Are There Storage Devices With Hardware Compression Built In?
Here are a few [homedepot.com], though some are a bit pricey.
Re: (Score:2)
Re: (Score:2)
Garbage in, smaller garbage out. Thus a net improvement on the amount of garbage in the world.
Re: (Score:2)
Smaller maybe, but the mass certainly is the same.
No. (Score:2)
Re: (Score:2)
Re: No. (Score:2)
When the dedup block size is 8k, and gzip brings that to 4k on average, compression on dedup helps.
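The parent's numbers can be sketched empirically (an illustrative example, not a measurement of any real dedup system): an 8 KiB block of repetitive, text-like data often halves or better under DEFLATE, which is exactly the case where compressing dedup blocks pays off.

```python
import zlib

# Hypothetical 8 KiB block of repetitive, text-like data, standing in
# for the kind of content that compresses well at dedup block size.
block = (b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n" * 200)[:8192]
compressed = zlib.compress(block, 6)

# Repetitive data routinely drops well under half its original size.
print(len(block), len(compressed))
```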
Re:No. (Score:5, Informative)
Re: (Score:2)
Re: No. (Score:2)
There was the Sandforce SSD controller (Score:4, Informative)
Re:There was the Sandforce SSD controller (Score:4, Interesting)
Same with $1,000 raid cards (Score:2)
I had $1,000 enterprise RAID cards, so of course I used the hardware RAID. Until I actually compared the performance to Linux RAID. The RAID provided by Linux delivered better throughput (using the $1,000 cards as dumb controllers), with CPU usage very close to 0.
Ps when the card died, the raid didn't (Score:2)
Btw, the other advantage I discovered is that when a hardware device wore out, I had to replace it with the same series in order to read the data. With software RAID (or compression), I can use any hardware, and any Linux of that age or newer.
Re: (Score:2)
Pretty sure your settings were wrong and your cache battery was dead.
Hardware RAID accelerators always beat software RAID unless you've done something wrong.
Re: (Score:2)
Here's an old Anandtech article talking about the Sandforce controller.
https://www.anandtech.com/show... [anandtech.com]
Re: (Score:2)
That's a ten year old article for those who don't bother to RTFA.
Re: (Score:3)
The problem is that these days much of the data tends to be non-compressible.
Most media is compressed already. A lot of stuff is encrypted which increases entropy to the point where it can't be compressed.
There is just no justification for implementing a complex compression scheme. For all that extra work and energy consumption (i.e. heat) you get some marginal space saving that may let you reduce the amount of reserved space on the SSD a little, but you can't rely on it because the customer might be using
Seagate Nytro (Score:4, Insightful)
Some Seagate Nytro drives have hardware compression built in.
The old Sandforce controllers did hardware compression. LSI acquired Sandforce. Avago acquired LSI. Avago spun-off Sandforce division to Seagate. The Seagate Nytro drives contain the legacy of the Sandforce technology.
Re: (Score:2)
Re: (Score:2)
They didn't "take him down". They extracted him and brought him home before the FSB could catch him and parade him at the UN.
Re: Seagate Nytro (Score:2)
Some Seagate Nytro drives have hardware compression built in.
Didn't get the memo?? Seagate doesn't count.
Tape drives (Score:4, Informative)
Tape drives do this. The quoted capacity of e.g. an LTO cart is generally twice the physical capacity, assuming that the drive controller will be able to achieve a 2:1 compression ratio. Which it generally doesn't. You tend to get faster throughput if you disable hardware compression.
Re: Tape drives (Score:3)
I don't see how you can get better throughput by disabling tape drive compression, it's not like the drive is going to spin the tape any faster. The drive has variable speeds, not THAT variable, but I've been out of the game a while.
With compression you get higher throughput because the drive is moving the tape at a constant speed but you can send 1-3x more data over the wire. Your compression ratio directly affects your throughput, so 1.1x compression is a 10% increased throughput.
You can get better comp
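The parent's arithmetic can be sketched with a hypothetical native tape speed (the number below is assumed for illustration, not a spec): with a fixed transport rate, effective throughput scales directly with the compression ratio achieved.

```python
# Assumed LTO-class native rate, MB/s (hypothetical figure).
native_mb_s = 300

def effective_throughput(ratio: float) -> float:
    # Tape moves at a constant speed; compression lets more host-side
    # bytes map onto the same length of tape per second.
    return native_mb_s * ratio

print(effective_throughput(1.1))  # 1.1:1 compression -> ~10% faster
print(effective_throughput(2.0))  # 2:1 -> double the native rate
```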
Re: (Score:2)
This all depends on the size of the buffer in the drive, and the ratio of compression. If you don't do any software compression, and are sending highly compressible data, say, a large file made of all 0s, then you might fill the drive's buffer and/or bottleneck at the drive's I/O interface, and the tape drive is going to stop streaming while waiting for more data from the host.
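The "file made of all 0s" scenario is easy to demonstrate (a sketch using zlib; the drive's hardware compressor would behave comparably): a run of zeros is pathologically compressible, so the host link must feed the drive far faster than the tape's native rate to keep it streaming.

```python
import zlib

# 1 MiB of zeros: an extreme best case for any compressor.
zeros = b"\x00" * (1 << 20)
packed = zlib.compress(zeros)
ratio = len(zeros) / len(packed)

# Hundreds-to-one: the drive writes almost nothing, so the host-side
# I/O path becomes the bottleneck and the tape may stop streaming.
print(f"compresses {ratio:.0f}:1")
```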
I think what the parent was trying to say is, better to enable compression in the software, and send already compressed data to the
Re: (Score:2)
I don't see how you can get better throughput by disabling tape drive compression, it's not like the drive is going to spin the tape any faster.
Frankly, I don't know why it was doing that either, but that's what I was seeing. Big increases by switching off hardware compression. As you likely know, trying to compress data that's already compressed will tend to result in the output becoming bigger instead of smaller, so it's possible it was filling up the buffer more quickly and stalling.
Re: (Score:2)
OK, ignore that - I just checked my scripts and it looks like I'm misremembering things. I do in fact have the compression turned on. (And yes, 256K blocks).
Re: (Score:2)
Compression for tape drives should always be turned on. Even if lots of data doesn't compress, the other data that does compress makes up for it 50-100x when backing up to a tape drive.
Source: operated and owned many different types of tape drives both personally and professionally back when tape drives still mattered. All kinds of QIC formats, Travan, Sony Data8 and DLT, Sony AIT (based on DAT), the LTO variants, then to present-day (non-tape) nearline low-spindle-RPM hard drive storage when many companie
Re: Tape drives (Score:2)
I don't see how you can get better throughput by disabling tape drive compression,
Simple: attempting to compress [that which has already been compressed] results in larger file sizes.
Re: (Score:2)
Depends on the backup program. If you have something that can do deduplication, the hardware compression may not matter as much. However, for a lot of things, the hardware compression is good enough, especially if one is using a USB LTO-8 drive with their laptop and LTFS for drag and drop.
The problem is predictability of size (Score:5, Insightful)
Re: (Score:2)
Re: (Score:2)
If that is the intent, you want to do the compression before sending the data down a pipe that, compared to the bandwidth to RAM, is rather slow. In the time you need to push the data out to the device to be compressed, you could compress it and push it across using only half as much bandwidth. Otherwise you're just trading CPU saturation for I/O saturation.
Re: (Score:2)
Most SSDs already use compression. Again, they use it to enhance endurance by reducing write amplification. This has been going on for years.
The OS still sees the guaranteed space - your 500GB SSD still has 512GiB of storage (the extra is used for bookkeeping as well as holding spare sectors - as much as 2% of flash can be bad from the get-go). But the SSD comp
Re: (Score:2)
Re: (Score:2)
I wonder if there are filesystems that accept variable storage size from the hardware.
Old Tech (Score:3)
Re: (Score:2)
Re: Old Tech (Score:2)
Does not really make much sense (Score:5, Informative)
In most cases you have a trade-off. Say you are storing code. Using XZ is slow but compresses really well. Using lzop compresses badly, but you get amazing speed (for a compressor). Your system cannot really know what trade-off you want. Also, hardware-compression is never really good, because it needs to use small block-sizes.
That said, there are filesystems and storage devices that do compression.
Yes, tapes. But it makes no sense (Score:2)
Well LTO tapes have completely wrong sizes printed on their case and labels (and the real size in a much smaller font size). That is because they do use hardware compression.
Besides all the other problems of tape drives, the problem with compression is that it's unpredictable and most of the time useless anyway.
What do you want to compress? What's the point? Anything big enough to be worth compressing is compressed already. I have never seen a drive with so many compressible files that it would be worth the
Yes. (Score:2)
It is called RLL encoding. Hard drives (spinning rust) have done it for a quarter century or more.
RLL isn't compression (Score:2)
RLL is, as you said, encoding. It's not compression. It doesn't reduce the number of bytes. What it does is make sure that, on the medium, a series of a million zeroes will be stored as a million bytes that are mixed ones and zeroes. That makes it easier to tell the difference between a million zeroes vs a million and one zeroes. Your length measurement doesn't have to be as precise.
Re: (Score:2)
The proletariat often mean "compression" as storing "more" in "less". By that token, RLL is compression. On a given magnetic substrate you can "store more" when the data is RLL encoded than when it is (for example) MFM encoded. About 35 years ago Spinning Rust switched from FM encoding to MFM encoding, resulting in the ability to store "more" in the same space. 25 years ago MFM encoding yielded to RLL encoding, resulting in the ability to store yet "more" in the same space. This process has repeated ev
Re: (Score:2)
That's only because "MFM" is crap and expands rather than compresses. It was, however, better than NRZI, which was worse crap.
The compression in the more recent LTO drives works quite well - and all LTO drives try compressing, and store the uncompressed data if the compressed version comes out larger.
Compression impacts reliability - maybe not a lot, but it does. You can afford to have two tapes, but not two platters,
Check your CPU Usage (Score:4, Informative)
For the most part, CPU speed has increased at a higher rate than the speed at which storage media can work. Using the CPU to compress your data is actually faster for most use cases, because you are writing fewer bits to the storage, making the write much faster. On reads, you likewise pull less data off the medium, and your CPU can decompress it rather fast.
Oddly enough, I rarely see a properly running PC sitting at 100% CPU, which is where compression overhead would become a factor.
Hardware compression will probably be slower, as the drive's embedded CPU will not be as fast as your main computer's, and you also lose the ability to manage the compression algorithm.
Re: (Score:3)
Ditto. Your CPU is so much faster than the controller on the hard drive. Also factor in the speed limit of the bus between the CPU and the drive. Software compression makes so much more sense. Choose ZFS and compression and you are done.
Re: (Score:2)
Your CPU has other things to do; the controller does not. Using CPU cycles for compression takes performance away from other activities and will be less power efficient than a dedicated controller... Plus the controller only has to be as fast as the media it's writing to - any faster is pointless.
Re:Check your CPU Usage (Score:5, Informative)
Your cpu has other things to do, the controller does not.
On a modern desktop/laptop/cell-phone, that is generally no longer true. CPUs have become so insanely fast relative to the rest of the hardware on the machine, that scenarios where the CPU is the limiting factor in performance are uncommon.
Instead, system bottlenecks are usually found in RAM or I/O bandwidth -- so it's usually faster to reduce the amount of data the CPU pumps out (e.g. by compressing it on the CPU) than to lighten the CPU's workload, since the CPU would only use those extra cycles to wait for I/O to complete anyway.
As for relative power efficiency -- well, maybe, but mobile CPUs benefit from dumptrucks full of R&D money every year to make them more power-efficient, while a one-off "intelligent" I/O controller probably won't.
Storage with compression is a bad idea (Score:2)
Re: (Score:2)
Oversubscription helps with that; the drive reports itself as having an actual capacity, a used capacity, and a subscribed capacity ... so long as used doesn't hit actual, it doesn't really matter what the subscribed capacity is.
Arrays that do compression tend to compress the data then compare it to the original to see if it was worth writing the compressed version to the media.
I think it would be difficult (Score:2)
Naively, I don't think this approach would work well. Why? Storage devices work with fixed-length blocks of data at given locations. The problem with compression is that the compressed size is variable: A text file compresses well, but media, like video and audio files, are usually already compressed.
I could see an operating system working with hardware-assisted compression, but honestly, given the tradeoffs, I suspect that a bigger drive is probably cheaper and faster in the long run.
Do it on the CPU. (Score:3)
The SATA link is a bottleneck on modern drives. Having it get compressed before passing over that link is a benefit to speed and software compression often gives speed boosts.
The law of diminishing returns applies, and in this case likely won't benefit much on modern NVMe drives.
Do Not Want (Score:3)
For various and sundry reasons, you generally don't want your storage device doing this for you. Use the built-in filesystem compression. Unless you have some strange workload, a compression task eating up one of your cores won't be a big deal usually.
If you are on an industrial scale, you want a dedicated file-server doing this for you. Ideally, use ZFS with its built-in compression. It works great. If you are moving a ton of stuff over and overwhelm the compression algorithms, you can use an uncompressed SSD to cache the writes to the array and maintain performance.
Such a thing can't exist... (Score:2)
By definition, there is no compression algorithm that will work well with every single type of file. In fact, every compression algorithm must have a worst case performance of doing nothing with the file (in most cases, it's often worse than that).
It's obvious why. If a compression algorithm like that did exist, you could just feed its output back into it over and over again until it was a single bit of data. But that means you could repeat the decompression to recreate every file that could ever exist fro
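The counting argument above is easy to see empirically (a sketch using zlib; any general-purpose compressor behaves the same way):

```python
import os, zlib

# High-entropy input stands in for an "already compressed" file.
data = os.urandom(4096)
once = zlib.compress(data)
twice = zlib.compress(once)

# Random input comes out slightly larger (format overhead), and
# feeding the output back in does not keep shrinking it - so no
# compressor can shrink every possible file.
print(len(data), len(once), len(twice))
```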
Uh, what are we solving for again? (Score:2)
I can drop a 12TB HDD in a desktop. And probably still have room to put in a second drive.
SSDs now hold multiple terabytes, so performance and capacity have been addressed.
I don't even give a shit about compressing music files anymore. Native Blu-Ray rips? Sure, why not. Who the hell is bothering to count gigabytes these days? Is storage really a problem anymore for more than 2% of computing professionals who already rely on RAID-based solutions to maximize capacity and data protection? I highly doub
Re: (Score:2)
Re: (Score:2)
MP3 does not get 20 to 1 compression. MP3 is a lossy compressor. It obtains almost all of its compression by throwing away data that you will never get back.
Why? (Score:2)
Back in the days of good ol' DoubleSpace, compression made sense, since most of your hard drive was full of EXE, COM, DLL, TXT, DOC, and lots of other uncompressed files. But now, there's no point in trying to add some sort of hardware abstraction layer for compression.
First off, the files that suck up the most hard drive space these days -- music, videos, photos -- are *already* compressed. Same goes with modern Microsoft Office files. Compressing them again just isn't worth the effort, and results in litt
should be optional (Score:2)
There are some situations in which you don't want to compress a file, so there really should be software control over it, even if the calculations are done in hardware.
"without a speed hit" (Score:2)
My understanding was the compression happens faster than the writing, so it's faster to compress and write the smaller files than write the big uncompressed ones.
Once upon a time, there was this (Score:2)
Cat (Score:2)
There would be no speed hit. As it turns out, decompression is much faster than the bottleneck of the data pipe, so it's faster to push compressed data through it and then decompress it in the processor than to stream it uncompressed. It's a rare win-win situation.
I assume this would also be the same, but as most data is compressed it would be of nominal space savings.
RLL controllers (Score:2)
Drive compression with RLL [ebay.com] instead of MFM controllers
Linear Tape Open (Score:2)
has had hardware compression since forever.
Not practical (Score:2)
Your OS sees the drive as a block device. It's just a big array of storage blocks. If the drive compresses the data destined for a block, the whole block is still used as far as the os/filesystem is concerned. The compressed data doesn't create more space, it creates little bits of wasted space.
The only benefit would be fewer writes to the physical medium. Only really an advantage for flash memory.
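The block-granularity point can be sketched in a few lines (the 4 KiB block size below is assumed for illustration): the host always addresses whole blocks, so slack inside a block is invisible to the filesystem.

```python
BLOCK = 4096  # assumed logical block size, for illustration

def blocks_used(nbytes: int) -> int:
    # A payload always occupies whole blocks as far as the host
    # can see; ceiling division counts them.
    return -(-nbytes // BLOCK)

print(blocks_used(4096))  # 1 block
print(blocks_used(1500))  # still 1 block: in-drive compression
                          # can't hand the saved space back to the OS
```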
Drive-Level Compression Incompatible w/ Encryption (Score:2)
Encrypted data cannot be effectively compressed. Storage device-level compression's big shortcoming is that it's fundamentally incompatible with encryption (except for storage device-level encryption). In other words, to get any benefit from compression you have to make sure that whatever you're sending to the storage device is UNencrypted. Do you really want to shut off all or most higher level encryption? To disable dm-crypt/LUKS2, FileVault, BitLocker, and PGP, as examples?
If you want to see where compu
All Self Encrypting Drives do Compression also (Score:2)
Re: (Score:2)
No they don't.
Why? (Score:2)
>"Are There Storage Devices With Hardware Compression Built In?"
Why? There really isn't much need for it. We store most files already compressed. Video, audio, photos, log files... anything of any reasonable size is already being compressed before storage and very efficiently, based on the TYPE of file- something compressed drives would not be able to do. Even LibreOffice files are all compressed as part of their native format (and those usually aren't even very large). And since probably 90+% of th
Not possible - compression needs higher level view (Score:2)
Bad idea (Score:2)
The problem is that compression of block storage requires an underlying file system or block database. To manage this, each block is stored similar to a file or a database record on another disk. This is also how we manage deduplication at a block level.
Remember that compressed data
Not really (Score:2)
Hardware compression is usually external. [youtube.com]
In my experience, hard drives work best in their uncompressed state.
Not a good idea (Score:2)
Since a hard drive or SSD is a block device, the only way that would work is if it presented more blocks to address than are physically available. So if the data you have written can’t be compressed and you fill the drive up, then what?
You still need a storage system(not device) to do the compression and/or deduplication so you can overprovision the space available using compression, deduplication and thin provisioning.
With mixed data workloads I have had no problems with a 200%-300% over provisions in the
What's the point? (Score:2)
The only way this makes sense is if you have large files that compress well.
Drive space has become soooooo cheap that compressing stuff for storage is nearly pointless, and of course there's the fact that many file formats are already compressed.
You're spending a dollar to save a nickel. Just buy bigger drives and stop worrying about saving 0.00002% of your storage.
If you broaden your definition of ... (Score:2)
... storage devices there are. Tape drives have had hardware-based compression since the DLT days (at least).
But disk drives? Haven't heard of any with embedded compression.
Buy a bigger drive (Score:2)
zfs set compress=lz4 dataset (Score:2)
ZFS compression usually uses less than 10% of a modern CPU core per writer thread. The CPU compression is faster than the disk bus so you actually lose I/O performance if you don't have compression on, for the average workload.
If there were some reason that 10% of one core (say 1% of an average server) was too much or if say there were a hundred concurrent write threads then most people would use a NAS with gigabit or 10Gbe connections.
Variable, unpredictable throughput (Score:2)
OTOH, the compression also cannot use more complex analyses of the content, as it only sees a few kb of the bytestream at a time. On top of which, as newer compression techniques develop you won't have access.
I'm curious what the use
Re: (Score:3)
you should be sorry. what, are you trying to save slashdot? It's dead man. It's run by like one guy in his spare time now.
Re: (Score:2)
It's run by like one guy in his spare time now.
So, just like in the beginning?
Re: Ready? (Score:2)
what, are you trying to save slashdot?
Save it from these reject editors?
Re: (Score:2)
Re: (Score:2)
It's not really an "article" but an "Ask Slashdot" post.
Slashdot is a bit more than "News for Nerds" nowadays.
Re: (Score:3)
Yes, but compression is pointless for the types of files that are probably overflowing your PC (JPEG, AVI, MKV, etc.)
Re: (Score:3)
Re: (Score:3)
That depends a lot. Mostly it depends on the tradeoffs of the compression algorithms. Many compression algorithms for common end-user files, especially ones that have existed for a long time now, are biased more towards speed than towards size. The resulting files, when compressed with an algorithm biased towards size rather than speed, (especially a newer algorithm) can often be shrunk further.
That's not to say that it makes sense to do so though. At this point generic storage is super cheap, paying to get
Re: Yep (Score:2)
Re: (Score:2)
Yep [netapp.com], although pricey for non-enterprise customers.
Among other things:
"Shrink your data footprint with inline deduplication, compression, and compaction."
That's software compression (Score:3)
Note NetApp is software compression. Software running on a $4,000 server you bought for $40,000 after they spent $4,000 to fly two salespeople out to sell you on it.
ZFS is a similar integrated software storage suite with all of the same features. Salespeople and ridiculous invoices are optional with ZFS.
The same features are also in the Linux storage stack, mostly md and lvm. The Linux approach is different in that raid, volume management, dedupe, etc are independent modules that you can mix and match any w
Re: (Score:2)
The OP asked for a storage device, so I pointed him at a storage device. Sure, you could build your own dedicated storage device as well, although the netapp devices have some decent features built-in which make a nice out of the box setup. The compression, deduplication, etc... runs on a general purpose cpu, but at least it's dedicated for the storage device, rather than relying on the storage client to do everything.
Re: (Score:2)
The OP asked for a storage device, so I pointed him at a storage device. Sure, you could build your own dedicated storage device as well, although the netapp devices have some decent features built-in which make a nice out of the box setup. The compression, deduplication, etc... runs on a general purpose cpu, but at least it's dedicated for the storage device, rather than relying on the storage client to do everything.
The dude is asking about a drive that does the compression on its own (Without using any processor time).
Re:That's software compression (Score:5, Insightful)
The dude is asking about a drive that does the compression on its own (Without using any processor time).
Yes, but Slashdot is a discussion site, not an "answers" site. So rather than just giving him what he thinks he needs, you should explain why it is likely NOT what he needs.
In-device compression is dumb for the following reasons:
1. Many files are already compressed, including photos, movies, pdf, zip, tgz, etc.
2. By sending uncompressed files to/from the device, you need more I/O bandwidth, which is likely to be a bigger bottleneck than the CPU.
3. Your main CPU is likely FAR more powerful than a small embedded coprocessor in the drive. So the coprocessor may be compressing/decompressing while the CPU is idle, waiting for it to finish.
4. Your main CPU can use the latest algorithms for compression, can easily update those algorithms, and has more information about the filesystem and the types of files being compressed. A "smart drive" co-processor may only have a block-level view.
5. If you have N dollars to spend, instead of buying a "smart drive" that will speed up only one aspect of your system (and likely not even that if you are I/O bound), it is usually better to spend the same N dollars on a BETTER CPU or MORE CORES, which will speed up many more aspects of your system.
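Point 2 can be sketched with back-of-the-envelope numbers (all figures below are assumed for illustration, not benchmarks): if the host compresses faster than the link moves bytes, sending compressed data wins even before counting the drive-side work.

```python
# Hypothetical figures, chosen only to illustrate the bottleneck shift.
link_mb_s = 600        # assumed effective SATA-class link bandwidth
compress_mb_s = 1500   # assumed lz4-class software compressor speed
ratio = 2.0            # assumed achievable compression ratio
data_mb = 1000

# Uncompressed: the raw link is the bottleneck.
t_uncompressed = data_mb / link_mb_s
# Compressed and pipelined: bounded by the slower of compressing
# the data and moving half as many bytes over the link.
t_compressed = max(data_mb / compress_mb_s, (data_mb / ratio) / link_mb_s)

print(t_uncompressed, t_compressed)
```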
Re: (Score:2)
The dude is asking about a drive that does the compression on its own (Without using any processor time).
Yes, but Slashdot is a discussion site, not an "answers" site. So rather than just giving him what he thinks he needs, you should explain why it is likely NOT what he needs.
In-device compression is dumb for the following reasons:
1. Many files are already compressed, including photos, movies, pdf, zip, tgz, etc.
2. By sending uncompressed files to/from the device, you need more I/O bandwidth, which is likely to be a bigger bottleneck than the CPU.
3. Your main CPU is likely FAR more powerful than a small embedded coprocessor in the drive. So the coprocessor may be compressing/decompressing while the CPU is idle, waiting for it to finish.
4. Your main CPU can use the latest algorithms for compression, can easily update those algorithms, and has more information about the filesystem and the types of files being compressed. A "smart drive" co-processor may only have a block-level view.
5. If you have N dollars to spend, instead of buying a "smart drive" that will speed up only one aspect of your system (and likely not even that if you are I/O bound), it is usually better to spend the same N dollars on a BETTER CPU or MORE CORES, which will speed up many more aspects of your system.
Thank you for explaining how everything works.
The "answer" i provided was part of a "discussion" surrounding a "question" and a discussion "response".
Someone specifically asked about "x" and the responding party addressed "y".
compression needs some "real" advances. Shorthand was a neat trick. Is anything further even possible without serious loss?
Re: That's software compression (Score:2)
compression needs some "real" advances
What the fuck do you think h.265 is??
Re: That's software compression (Score:2)
Tldr - Just buy another CPU core or 5. AMD gives them out by the dozen.
Meta again (but need a joke subject) (Score:2)
The question was ill posed and I strongly concur with your comments and with the insightful mod points you received. Unfortunately, I more strongly concur with other comments about what a poor question this was in particular and about the decline of Slashdot discussions in general.
Seeking a better solution approach, I'll describe my current one. I go through the front page looking for stories that interest me or that have a lot of comments. These days I rarely see interesting stories, but if I find one, I'l
Re: (Score:2)
Maybe the storage device is meant to be for archival purposes. Hardware compression would be useful in that scenario.
Archival storage is the worst place to use hardware compression. You would have to use the exact same hardware, with the exact same firmware, to retrieve your data, maybe 10 or 20 years in the future.
Far better to compress on the host with a standard program like zip or gzip, and include an uncompressed copy of the source code at the beginning of the archive.
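The host-side approach above can be sketched with Python's standard gzip module (file names are hypothetical): a standard, openly documented format stays readable decades later on any machine, independent of drive firmware.

```python
import gzip
import shutil

# Hypothetical archival payload, written out as a plain file first.
payload = b"id,value\n" + b"1,hello\n" * 1000
with open("records.csv", "wb") as f:
    f.write(payload)

# Compress on the host with a standard format (gzip/DEFLATE).
with open("records.csv", "rb") as src, gzip.open("records.csv.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

# Reading it back needs nothing vendor-specific.
with gzip.open("records.csv.gz", "rb") as f:
    restored = f.read()
print(restored == payload)  # True
```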
Absolutely true. Unless it's hardware gzip (Score:2)
That's a great point. I mentioned that when comparing it to raid.
Well, if the drive CONTROLLER is doing it. If the drive itself used compression - well, yeah, you need the drive to get the data off it. So nothing lost aside from the possibility of a fried compression chip on an otherwise healthy drive.
Of course you could do gzip in hardware, such that reading it through the controller is equivalent to gunzip /dev/sdb.
I wouldn't in any normal circumstance but you could. It could be used in an iot data lo
Re: That's software compression (Score:2)
The dude is asking about a drive that does the compression on its own (Without using any processor time).
I didn't get the impression that the dude even knew what he was asking. I did get the impression that this was a "fake" submission on the part of a /. "editor" to drum-up "debate" on a weekend.