Samsung Announces Standards-Compliant Key-Value SSD Prototype (anandtech.com) 74
Samsung has announced a new prototype key-value SSD that is compatible with the first industry standard API for key-value storage devices. "Earlier this year, the Object Drives working group of Storage Networking Industry Association (SNIA) published version 1.0 of the Key Value Storage API Specification," reports AnandTech. "Samsung has added support for this new API to their ongoing key-value SSD project." From the report: Samsung has been working on key-value SSDs for quite a while, and they have been publicly developing open-source software to support KV SSDs for over a year, including the basic libraries and drivers needed to access KV SSDs as well as a sample benchmarking tool and a Ceph backend. The prototype drives they have previously discussed have been based on their PM983 datacenter NVMe drives with TLC NAND, using custom firmware to enable the key-value interface. Those drives support key lengths from 4 to 255 bytes and value lengths up to 2MB, and it is likely that Samsung's new prototype is based on the same hardware platform and retains similar size limits.
Samsung's Platform Development Kit software for key-value SSDs originally supported their own software API, but now additionally supports the vendor-neutral SNIA standard API. The prototype drives are currently available for companies that are interested in developing software to use KV SSDs. Samsung's KV SSDs probably will not move from prototype status to being mass production products until after the corresponding key-value command set extension to NVMe is finalized, so that KV SSDs can be supported without needing a custom NVMe driver. The SNIA standard API for key-value drives is a high-level transport-agnostic API that can support drives using NVMe, SAS or SATA interfaces, but each of those protocols needs to be extended with key-value support.
I really want to like this (Score:2)
But I know how many shops write code. 2MB should be enough for anyone?
Re: I really want to like this (Score:3)
The simplest approach is just to brute force it by deciding on a block size smaller than the limit for value sizes and then defining a bunch of keys that are LBA addresses for the 'block' stored in the
Re: (Score:2)
And you have precisely described what I expect to happen with it. Well done, sir.
Re: (Score:2)
This is just a flashback to those days.
Just make all your keys "1", "2", "3" ... and pretend they're sector numbers and your traditional filesystems will work with them just fine.
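For anyone who wants to see how little it takes, here is a minimal sketch of the "keys as sector numbers" trick. It assumes a 4096-byte block size, and `KVStore` is a hypothetical in-memory stand-in for the drive, not the SNIA or Samsung API; `BlockOnKV`, `write_block` and `read_block` are likewise names made up for illustration.

```python
# Hypothetical sketch: a block device emulated on top of a key-value store.
# KVStore stands in for a KV SSD; it is NOT the SNIA or Samsung API.

BLOCK_SIZE = 4096  # assumed block size, well under the 2MB value limit


class KVStore:
    """In-memory stand-in for a key-value drive."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)


class BlockOnKV:
    """Treats each logical block address (LBA) as a key."""

    def __init__(self, store: KVStore):
        self.store = store

    @staticmethod
    def _key(lba: int) -> bytes:
        # 8-byte big-endian LBA; fits the 4-to-255-byte key range quoted in the summary.
        return lba.to_bytes(8, "big")

    def write_block(self, lba: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.store.put(self._key(lba), data)

    def read_block(self, lba: int) -> bytes:
        # Unwritten blocks read back as zeros, as on a fresh disk.
        value = self.store.get(self._key(lba))
        return value if value is not None else bytes(BLOCK_SIZE)
```

Whether doing this on a real KV drive would perform acceptably is a separate question; the sketch only shows that the mapping itself is trivial.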
Re: (Score:2)
FTFA:
This allows a key-value drive to be used more or less as a drop-in replacement for software key-value databases like RocksDB, and as a backend for applications built atop key-value databases.
Re: I really want to like this (Score:2)
I totally understood it. I fully expect it to be abused as described by another poster above.
I think there's plenty of actual use cases. Like the subject says, I want to like it.
But then you give a mouse a cookie, and next thing you know, it's come down from on high that, "You will store 10 MB photos in those values, or else". Not in every shop, but certainly some.
Re: (Score:2)
To wit, my comments amount to, "it's a wonderful idea for the right application, but there's a fresh stream of daily WTF posts coming."
Re: (Score:2)
There seems to be some confusion about the justification for KV SSD technology.
So here:
https://www.snia.org/sites/def... [snia.org]
The point? (Score:1)
What is the point? This is a solution in search of a problem. The young-uns seem to like solutions to non-existent problems -- maybe it is because they are completely at a loss how to solve problems that actually exist.
Re: (Score:2)
Re:The point? (Score:4, Insightful)
Contents Addressable Memory
Of course. But why implement it in a hardware API?
Why not just use a "normal" SSD, and store the keys in an index that can be cached in RAM?
Then when you need keys longer than 255 bytes, or values over 2MB, you can just upgrade your software.
Commodity off-the-shelf drives will certainly be cheaper, and can work with existing backup software.
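A rough sketch of what that software approach could look like, assuming an append-only value log on ordinary storage and a plain dict as the RAM index. `LogKV` is a made-up name for illustration, and the index here is not persisted, so a real implementation would also need to write keys into the log (or checkpoint the index) to survive a restart.

```python
import io


class LogKV:
    """Append-only value log on ordinary storage, with the key index kept in RAM."""

    def __init__(self, log):
        self.log = log      # any seekable binary file object
        self.index = {}     # key -> (offset, length) of the latest value

    def put(self, key: bytes, value: bytes) -> None:
        # Append the value at the end of the log and remember where it landed.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(value)
        self.index[key] = (offset, len(value))

    def get(self, key: bytes):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)
        return self.log.read(length)
```

Nothing in this layout caps key or value sizes, which is the flexibility argument above; the cost is that overwritten values stay in the log until something compacts it.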
Re: (Score:2)
Re: (Score:2)
Did you read the article or other articles on it?
Hah, you expected informed grumpy bitching? This is slashdot, where it's a point of pride to ramble uninformed when the answers are right in TFA. I've scanned through the comments, and already found about a half-dozen "what's the point of this?" comments.
Re: (Score:3)
It's so weird, it's like these people think everyone else is just a complete retard. Samsung has about 500 people who are just too stupid to realize that this was supposedly a bad idea and they should have just "used a normal SSD". If only this lone voice of reason had been there at Samsung HQ to convince all of their apparently very stupid engineers and executives who greenlit this project that they are wrong and can just "use a normal SSD".
SlashDot: Unique Home to the Voice of Reason.
Re: (Score:2)
Did you read the article or other articles on it? They implemented it because it will provide considerably better performance for this narrow purpose.
Yes, I RTFA, and it does NOT say it gives better performance. What it says is that it has "the potential to offload significant work from a server's CPUs". That is a BS justification. The server CPU is going to be FAR more powerful than an embedded CPU in the drive, and is very unlikely to be compute bound.
Rather than spending money on some custom drive, it is almost certainly better to invest that money on a faster main CPU and more RAM for indexing.
Re: The point? (Score:2)
the people this is targeted at already have the fastest highest core count cpu money can buy and a couple of terabytes of ram.
still not enough.
Obviously if all you do all day is play csgo and post on slashdot, this api is not the api you are looking for.
Re: (Score:2)
That is a BS justification. The server CPU is going to be FAR more powerful than an embedded CPU in the drive
Will it also be far more powerful than the embedded ASIC in the drive?
Re: (Score:3)
Since you can't figure out that offloading significant KV searches from the server CPU to a KV coprocessor gives better performance, this would be a good place to stop posting and making yourself look even more clueless.
Re: (Score:2)
The server CPU is going to be FAR more powerful than an embedded CPU in the drive, and is very unlikely to be compute bound.
And for people looking at KV SSD the CPU is also likely to be saturated with requests and lagging as a result.
Rather than spending money on some custom drive, it is almost certainly better to invest that money on a faster main CPU and more RAM for indexing.
The point is that the KV SSD will be less expensive at some point compared to your suggested upgrade. Perhaps even eventually allowing for saving on the base CPU and RAM compared to today's requirements.
Basically the KV SSD is offering more parallelism. The other poster with GPU analogy was spot on.
Re: (Score:2)
Re:The point? (Score:5, Informative)
Re: (Score:3)
Re: (Score:2)
So what? This isn't that. This is flash, not dram. It can't do that.
Re: (Score:2)
Re: (Score:2)
Anything you cook up to "instantly" search across its contents can be done on a normal disk with a normal filesystem, too. Indexing, for example. DRAM can be accessed in a very parallel fashion, Flash RAM can't.
Re: (Score:2)
Thank you. Now everyone knows what ridiculous belief you hold that makes you form such absurd conclusions. But as usual your stupidity just keeps getting deeper. What makes you think that a disk drive can access the data it has spread all over its various platters in "massively parallel fashion"? I
Re: (Score:2)
What makes you think that a disk drive can access the data it has spread all over its various platters in "massively parallel fashion"?
The only person in this discussion who has mentioned spinning rust is you. You're going to have to figure out what the rest of us are talking about if you want to make a useful contribution.
Re: The point? (Score:2)
Re: (Score:2)
It would have been obvious to an intelligent person what i meant. QED, you are not that person. Now run along and let the smart people have the discussion, as you have nothing to add.
Re: The point? (Score:2)
Re: (Score:2)
The point is to make database operations faster. If it speeds up databases enough to offset the additional hardware cost (relative to traditional/commodity hardware), that's the point. If not, there is no point.
By analogy: what's the point of a GPU when you already have a perfectly good, Turing-complete CPU that can already perform all the same calculations?
Re: (Score:2)
The point of a GPU is to offload the processing of the display from the main CPU. The point of I/O Channel Processors is to offload I/O processing from the main CPU (bittyboxen don't have these yet, though they did exist at one time for SCSI using a WD7000 SCSI controller). This is nothing more than an offloading of some very specific I/O to a co-processor.
Re: (Score:2)
A hardware implemented hash table is kind of cool. I don't think laptops are going to be coming with them standard any time soon, but I'm sure there are lots of uses for it.
Like "Ethernet" & "Fibrechannel" drives no fut (Score:2)
They'll basically be used for some marketing and concept systems, but what is the value-add to a system like this? It's locking you into a specific vendor, it's locking in your software system to the hardware limitations.
There is a reason we want Software-Defined. Hardware anything is too slow and inflexible these days.
Re: (Score:2)
I don't get it (Score:2)
My SSD is performing as a key-value device right now. The keys are filenames. The values are the contents of files.
Re:I don't get it (Score:5, Informative)
Re: (Score:2)
Content-Addressable Memory (CAM)
No way. This isn't a CAM, it's literally flash with a different interface that isn't even remotely on the same planet as DRAM let alone SRAM in terms of performance.
Re: (Score:3)
No, it is not.
It is a block device.
It has numbers which address a block.
On top of that is a filesystem. That loosely resembles a key/value thing. However the key again does not map to a value, but a list of blocks.
The difference might be subtle ... but it is there. Especially when one of the above is solved by the hardware itself and does not need an OS or an API or a filesystem.
Re: (Score:2)
"On top of that is a filesystem. That loosely resembles a key/value thing. However the key again does not map to a value, but a list of blocks."
Yeah, and the SSD does all the same shit internally.
The difference is that i don't need to support any new standard to read my disk. Just the old ones.
If support for this new standard is as spotty as i expect, it will create the same kind of problems as hardware raid.
Re: (Score:2)
Details differ a bit depending on just what kind of filesystem you're using, but generally looking up a file name is a matter of searching a sorted list. Average and worst case lookup are order N operations.
Lookup times in a properly implemented hash table are constant complexity.
In most file systems you could probably get constant order lookup times by implementing the hashing algorithm yourself using a predefined directory tree structure, but the filesystem would still have lots of overhead having to do a
Count Key Data format from 1964 IBM mainframes! (Score:2)
Why is this needed? (Score:3)
Are people now too incompetent to put a hash-table on raw storage or what?
Re:Why is this needed? (Score:5, Funny)
To implement a hash table in 2019 you need to download 200MB of JAR files.
Re: (Score:2)
It is utterly pathetic, but I completely agree. For the average moron coder that is. My last custom hash-table keeps a fortune-100 in business.
Re: (Score:2)
Re: (Score:2)
No, I mean a hash table on a raw block device. You may have heard of those. Some people are capable of reading data-sheets or even doing their own benchmarks today. Yes, I do know it is an almost lost black art, but not everybody has gone completely incompetent and only capable of coding for high-level interfaces.
Re: (Score:2)
Re: (Score:2)
You think universal storage specialized to a single purpose is an _advantage_? Talk about being incompetent and pathetically grateful for some vendor catering to your incompetence.
Re: Why is this needed? (Score:2)
Re: (Score:2)
Re: (Score:2)
I, on the other hand, think this is basically a scam and only possible because more and more coders do not even understand the basics of their trade.
Obligatory (Score:2)
https://imgs.xkcd.com/comics/s... [xkcd.com]
So what are the stats? (Score:2)
Software vs Hardware Performance (Score:5, Insightful)
Re: (Score:2)
Re: (Score:2)
Around 2014, I think, when "bootcamps" started to catch on and everyone decided they wanted to be a "coder."
Re: (Score:2)
My linux box is hip? When did that happen?
In every year Linux failed to take over the desktop and remained the edgy outsider. :-)
Symlinks are "key-value" storage (Score:2)
The symlink's name is the key, what it points at is the value. Any modern filesystem has them (including NTFS apparently [wikipedia.org]), and they all use hashing to speed up directory lookups, making this a perfectly practical alternative to dedicated libraries with complicated file-formats.
Maybe, by doing it all in dedicated hardware you can gain some speed, but your filesystem can do it already. I'd be curious in performance benchmarks of this device vs. the same SSD formatted and mounted as filesystem.
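To make the symlink idea concrete, here is a minimal sketch assuming a POSIX filesystem. `kv_put` and `kv_get` are names invented for this example. Note the limits it inherits: keys must be valid file names (no "/"), values cannot be empty or contain NUL bytes, and value length is capped by the platform's path length limit, which is far below the 2MB the drive is said to support.

```python
import os
import tempfile


def kv_put(directory: str, key: str, value: str) -> None:
    """Store value as the target of a symlink named key."""
    path = os.path.join(directory, key)
    tmp = path + ".tmp"
    os.symlink(value, tmp)   # the target does not need to exist
    os.replace(tmp, path)    # rename over any old link; atomic on POSIX


def kv_get(directory: str, key: str):
    """Return the stored value, or None if the key is absent."""
    try:
        return os.readlink(os.path.join(directory, key))
    except FileNotFoundError:
        return None
```

For example, after `d = tempfile.mkdtemp()` and `kv_put(d, "color", "blue")`, calling `kv_get(d, "color")` reads the value back without opening any file.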
Re: (Score:2)
This SSD is not a hard drive, but a special kind of memory aka RAM extension.
Re: (Score:2)
This SSD is not a hard drive, but a special kind of memory aka RAM extension.
No, it's an SSD full of flash memory. It's not special.
Personally I'm amused by the whole thing after reading Samsung's benchmark experiment which basically exclusively measured writes to show substantial performance improvements.
How many minutes could you sustain that kind of load before flash cells self destructed? 5? 10? 1 day? A week? Pick a number you think is fair. Whatever it is the benchmark itself has no purchase on reality for the simple reason if you did that kind of thing in the real world u
Re: (Score:3)
No, it's an SSD full of flash memory. It's not special.
Personally I'm amused by the whole thing after reading Samsung's benchmark experiment which basically exclusively measured writes to show substantial performance improvements.
I'm not sure which benchmark you're referring to, but I found for example this one: https://www.samsung.com/semico... [samsung.com] - where they state that the performance scales almost linearly with adding KV-SSDs and they mention a 15x improvement in QPS.
How many minutes could you sustain that kind of load before flash cells self destructed? 5? 10? 1 day? A week? Pick a number you think is fair. Whatever it is the benchmark itself has no purchase on reality for the simple reason if you did that kind of thing in the real world underlying storage hardware would self destruct quicker than it would be feasible to replace.
Well, they also mention a much lower write amplification, so at the same amount of operations, they will last longer. I don't know about any later endurance tests of SSDs, but at least the Samsung 850 Pro in one test lasted 9100 TB of writing. That means that ev
Re: (Score:2)
The performance of normal disks in a stripe also scales almost linearly. So what? As for SSD degradation testing, in the real world devices have data on them, and degrade a lot quicker than in ideal test conditions.
Re: (Score:2)
So much for end to end data integrity... (Score:1)
Moving a key-value store behind a hardware API implies a large amount of trust in the vendor. After skimming the spec, there does not appear to be a single word about data integrity. This is a complex API, hiding an even more complex implementation, which does not inspire confidence. What are the chances that the internals will be as solid as ZFS, within commodity devices? If integrity must be implemented on top of this new API, what has been gained?
This appears to be backwards; if performance is critical,
Seagate Kinetic was the first a long time ago (Score:3)
[1] https://www.storagereview.com/... [storagereview.com]
[2] https://www.crn.com.au/news/se... [crn.com.au]
Imagine (Score:1)
Re: (Score:2)
Would that imply that malware has value?
.
.
.
.
I'll see myself out.