Gaining RAM For Free, Through Software
wakaramon writes with a piece from IEEE Spectrum about an experimental approach to squeezing more usable storage out of a device's existing RAM; the researchers were using a Linux-based PDA as their testbed, and claim that their software "effectively gives an embedded system more than twice the memory it had originally — essentially for free." "Although the price of RAM has plummeted fast, the need for memory has expanded faster still. But if you could use data-compression software to control the way embedded systems store information in RAM, and do it in a way that didn't sap performance appreciably, the payoff would be enormous."
Stack is back (Score:3, Informative)
That's an old idea - using transparent compression to gain more memory...
Re:Stack is back (Score:5, Informative)
If you had RTFA, you'd have found that the difference they've made is they've developed a compression scheme that doesn't have the huge performance penalty that old techniques had. (Specifically, they claim "0.2 percent on average and 9.2 percent in the worst case.")
Stack is back indeed (Score:5, Informative)
>If you had RTFA, you'd have found that the difference they've made is they've developed a compression scheme that doesn't have the huge performance penalty that old techniques had.
Back then, the real Stack (not Microsoft's poorly implemented and unstable clone) didn't have a huge impact on performance either, as it used a big cache and had significantly less data to transfer to and from a hard disk, which back then didn't shine bandwidth-wise.
The reason stack died isn't the performance hit. The reason stack died is a combination of:
- Microsoft managing to instill paranoia about RT-compression thanks to their double-crap
- huge drops in storage price which made on-the-fly compression irrelevant
- newer data formats which are hard to compress anyway (Stack could be efficient back then when most graphics were RLE-encoded bitmaps. Now that everything is stored as JPEGs and MP3s, there's not much an additional layer of compression could do).
Ultra-fast compression algorithms like LZO aren't something new, and could easily be implemented in a hardware chip for even faster performance.
Such compression *could* have been useful a decade ago, when PDAs still had limited memory and did cost a lot.
Now, with the price drops of memory and the increased popularity of solid state memory (micro-SD cards have just insane capacity these days), it's hard to be short on memory even on an embedded device.
So it's nice that they have gone through all the technical difficulties to have real-time compression at RAM-level bandwidth.
But they developed it a decade too late to have any marketable product.
Re: (Score:3, Informative)
Such compression *could* have been useful a decade ago, when PDAs still had limited memory and did cost a lot.
"Limited memory" is relative to the number of programs I want to be running.
Now, with the price drops of memory and the increased popularity of solid state memory (micro-SD cards have just insane capacity these days), it's hard to be short on memory even on an embedded device.
1. memory aka RAM uses electricity, which is an issue in portable devices
2. micro-SD is not RAM, making it irrelevant to the discussion.
But they developed it a decade too late to have any marketable product.
Did you RTFA?
They've already licensed it to NEC for commercial use.
Re: (Score:2)
"Limited memory" is relative to the number of programs I want to be running.
no, it's relative to the design.
Re: (Score:2)
They've already licensed it to NEC for commercial use.
They might have a patent for an actual implementation of the compression algorithm, but those kinds of patents are a dime a dozen: http://www.gzip.org/#faq11 [gzip.org]
For the actual technique, there's copious prior art, for example here [unipi.it]:
In this paper we discuss the use of compression techniques to make a more effective use of the
available RAM, thus reducing the need for secondary storage. We show that in many cases memory
pages contain highly compressible data, with a very large amount of zero-valued elements. This
suggests the use of very fast compression algorithms based on static Huffman codes, rather than
adaptive algorithms such as LZRW1 and similar ones
[...]
Using the proposed compression algorithm, it becomes possible to use a small portion of the
physical memory as a fast, low latency, compressed swap device. This technique can greatly reduce
the latency of page faults, thus improving program's performance
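To make the point about zero-heavy pages concrete, here is a minimal sketch of a fixed (static) prefix code: one bit for a zero byte, nine bits for anything else. This is not the code table from the quoted paper; the function names `encode_page` and `decode_page` and the synthetic 4 KB page (about 90% zeros) are assumptions made purely for illustration.

```python
from typing import Tuple


def encode_page(page: bytes) -> Tuple[bytes, int]:
    """Encode with a fixed prefix code: '0' for a zero byte, '1' + 8 bits otherwise.

    Returns the packed bytes and the number of meaningful bits.
    """
    bits = []
    for b in page:
        if b == 0:
            bits.append("0")
        else:
            bits.append("1" + format(b, "08b"))
    bitstring = "".join(bits)
    nbits = len(bitstring)
    # Pad to a whole number of bytes so the bit string can be packed.
    padded = bitstring + "0" * (-nbits % 8)
    data = int(padded, 2).to_bytes(len(padded) // 8, "big") if padded else b""
    return data, nbits


def decode_page(data: bytes, nbits: int) -> bytes:
    """Invert encode_page, ignoring the padding bits at the end."""
    bitstring = "".join(format(b, "08b") for b in data)[:nbits]
    out = bytearray()
    i = 0
    while i < nbits:
        if bitstring[i] == "0":
            out.append(0)
            i += 1
        else:
            out.append(int(bitstring[i + 1:i + 9], 2))
            i += 9
    return bytes(out)


# Hypothetical 4 KB page in which every tenth byte is non-zero.
page = bytearray(4096)
for i in range(0, 4096, 10):
    page[i] = (i % 255) + 1

data, nbits = encode_page(bytes(page))
print(len(page), "bytes ->", len(data), "bytes")
```

Because the code table is fixed, nothing has to be learned per page, which is the appeal of a static code over an adaptive one; the cost is that a page with few zeros grows by up to one bit per byte.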
Re: (Score:1)
Back then, the real Stack (not Microsoft's poorly implemented and unstable clone) didn't have a huge impact on performance either, as it used a big cache and had significantly less data to transfer to and from a hard disk, which back then didn't shine bandwidth-wise.
The reason stack died isn't the performance hit. The reason stack died is a combination of:
- Microsoft managing to instill paranoia about RT-compression thanks to their double-crap
This is nothing to do with Stac or Microsoft. That was on the fly compression of a FAT filesystem, this is on the fly compression of RAM in a PDA.
Re: (Score:2)
This is nothing to do with Stac or Microsoft. That was on the fly compression of a FAT filesystem, this is on the fly compression of RAM in a PDA.
Yeah, and in both cases it's real-time compression, either hardware-assisted (old stac) or pure software (newer stac), of chunks of memory, done on the fly as data is accessed in small chunks.
Those chunks are either "clusters" (harddisk) or "memory pages" (RAM). But both situations are terribly similar. Probably even both systems try hard to group together small batches of data to better avoid wasting space. (tail packing in the hd world. Dunno the equivalent name in RAM compressed-pages)
The only difference are in
Re: (Score:2)
Sounds like the old scam that occurred in the end of the 80's.
A piece of software that was installed on MS-DOS claiming to increase the amount of available memory but all it did was to change how much free memory that was reported.
You will get better bang for the bucks with better written applications.
You mean like RAMDoubler (Score:5, Funny)
The 80s and 90s called - they want their technology back.
Re: (Score:3, Insightful)
Software patents (Score:4, Interesting)
I haven't managed to find the patent application yet, but I wonder if Connectix's RAM Doubler product would be considered prior art.
Timothy was mistaken (Score:2)
Ram Doubler's primary techniques were to (a) more tightly define the memory space a given application was allocated and (b) adding virtual RAM functionality to the old Mac OS. It had some rudimentary compression techniques as well, but nothing formidable.
I'd actually like to see this work. Hell, it might even make the PSP usable for web browsing if they can get it right.
Re: (Score:3, Funny)
I haven't managed to find the patent application yet, but I wonder if Connectix's RAM Doubler product would be considered prior art.
If not, this [imdb.com] probably is...
Re:You mean like RAMDoubler (Score:5, Funny)
There, fixed it for ya!
Actually, that's informative (Score:5, Informative)
Actually, I'd rate that informative, rather than funny. I've actually tried a couple of such programs back then, and invariably it was just a fancy way to slow your computer down. (Mildly.)
Basically, the way it worked was:
1. Report more RAM to the OS. That's actually what your swap file does too. Virtually any modern processor has _some_ way to pretend it has more memory than physically present, with the extra bytes being in a swap file.
2. Set aside half the memory as a kind of compressed, virtual (in memory) swap file.
So at this point, let's say your computer had 4 MB RAM (hey, back then we didn't measure RAM in Gigabytes). So now you'd only have 2 MB of it free as physical memory for your programs, and 2 MB set up as a compressed swap file. But your OS thought you have 8 MB, with 2 MB being the free RAM left and 6 MB of it being swap space.
3. However, you typically still wanted some actual swap space, because you don't know, and can't guarantee, how well that swap space compresses. If you swap out, say, a table of random numbers, you may not be able to compress at all. Funky things can happen when the OS thinks it has room to swap a page out, but it turns out that it doesn't fit there. The actual HDD swap file would be, at the very least, the safety net to catch whatever doesn't fit into that RAM buffer.
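The arithmetic in the 4 MB example can be written out as a small sketch. The function name `ram_doubler_layout` is made up for illustration; the only inputs are the figures given above (4 MB physical, 2 MB set aside, 8 MB reported), and the "required ratio" is simply what those figures imply.

```python
def ram_doubler_layout(physical_mb, reserved_mb, reported_mb):
    """Split reported memory into real free RAM and compressed 'swap'.

    Returns (free_mb, virtual_swap_mb, required_ratio), where required_ratio
    is the average compression ratio the reserved region must achieve for
    the reported total to actually fit in memory.
    """
    free_mb = physical_mb - reserved_mb
    virtual_swap_mb = reported_mb - free_mb
    required_ratio = virtual_swap_mb / reserved_mb
    return free_mb, virtual_swap_mb, required_ratio


free_mb, swap_mb, ratio = ram_doubler_layout(physical_mb=4, reserved_mb=2, reported_mb=8)
print(free_mb, swap_mb, ratio)
```

With these numbers the 2 MB buffer has to hold 6 MB of pages, i.e. an average 3:1 ratio, which is why a disk swap file is still needed as the safety net for data that compresses worse than that.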
Now the thing is:
A. That virtual compressed swap space was typically faster than HDD (we didn't have 15,000 RPM drives with huge caches, back then), but, here's the important part, _much_ slower than just plain old free RAM. ("Free" as in "available to the OS as it is.") Even the page fault itself, never mind the compression, was _much_ slower than the few cycles required to just read a memory page.
Compression didn't make it much better. Almost any decent compression algorithm is fast when decompressing, but slow when compressing. When handling a page fault in that context, you had to do both: compress the page you want swapped out, and decompress the page you want swapped in. Not only did that take time, but it was CPU time, unlike IO time, which happens on DMA in an ideal world and lets your CPU schedule some other task in the meantime.
B. However, now you had less free RAM _and_ were encouraged to load more into it. If you had 5 MB of memory in use on the above described computer, without RamDoubling scams, you'd have 4 MB of physical memory in use and 1 MB swapped to disk. With such a RamDoubling scheme, you had 2 MB in actual normal RAM, and 3 MB swapped out.
In almost all cases, the "ram doubling" inherently increased the number of pages swapped in and out per second. In some cases, dramatically. (E.g., Java's GC didn't play nice at all with swapping anyway. It already tended to push everything else out. Play with it in even less space, and things could get funny.)
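The effect of halving the uncompressed RAM can be sketched with a toy LRU simulation. Everything here is an assumption for illustration: pages are treated as 1 MB units to match the example above, the access trace is invented (four hot pages, with a fifth touched occasionally), and real page replacement policies are more elaborate than plain LRU.

```python
from collections import OrderedDict


def count_page_faults(trace, frames):
    """Count faults for an access trace under LRU with `frames` resident pages."""
    resident = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


# 5 "MB" in use: pages 0-3 are hot, page 4 is touched once in a while.
trace = ([0, 1, 2, 3] * 5 + [4]) * 3

faults_plain = count_page_faults(trace, frames=4)    # 4 MB of ordinary RAM
faults_doubled = count_page_faults(trace, frames=2)  # 2 MB left after "doubling"
print(faults_plain, faults_doubled)
```

In this trace the hot set fits in four frames but not in two, so the smaller resident set faults on every access. Each of those faults also pays for a compress and a decompress, which is the cost the reported "8 MB" figure hides.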
So a lot of the time, sometimes even most of the time, all you'd get for your efforts was slowing your computer down. And a useless number telling you "now you have 8 MB RAM!!!!11oneeleventeen", but not what the cost there is, or even what it really means.
Re:Actually, that's informative (Score:4, Informative)
I've briefly tried RamDoubler on Windows, but on MacOS, RamDoubler actually was very effective. On Windows, I didn't seem to notice much of a difference, but on MacOS, it was essential.
The reason is that MacOS's memory manager, to be honest, sucked. The basis of the memory management scheme was "Minimum Memory"/"Preferred Memory" - the OS will check RAM free, and if it was greater than minimum, would start the app. If not, it wouldn't. If there was more RAM free than Preferred, it would give the app that amount, if not, it would give it somewhere in-between. Mac apps were responsible for monitoring how much RAM was free, and to not do operations when they were running low. Problem is, if you have a big document, or big dataset (email inbox, say), the "minimum" wasn't often enough, and enlarging both was a task that was necessary.
So now you have the problem in MacOS - if you set it too big, you can't launch the app when you need it. If you set it too small, it crashed or error'd out. Too big and the app doesn't need it, and it becomes wasted.
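The launch rule described above can be sketched as a single function. This is a reading of the comment, not Apple's actual implementation: the name `launch_partition_kb` is hypothetical, and "somewhere in-between" is modeled here as "all the free memory there is", which is an assumption.

```python
from typing import Optional


def launch_partition_kb(free_kb: int, minimum_kb: int, preferred_kb: int) -> Optional[int]:
    """Return the fixed partition size an app gets at launch, or None if it cannot start."""
    if free_kb < minimum_kb:
        return None          # not enough free RAM: the app refuses to launch
    if free_kb >= preferred_kb:
        return preferred_kb  # plenty of RAM: the app gets its preferred size
    return free_kb           # in between: assumed to get whatever is free


print(launch_partition_kb(free_kb=1000, minimum_kb=2000, preferred_kb=4000))
print(launch_partition_kb(free_kb=3000, minimum_kb=2000, preferred_kb=4000))
print(launch_partition_kb(free_kb=8000, minimum_kb=2000, preferred_kb=4000))
```

The partition is decided once, at launch, which is why both failure modes exist: a setting that is too large blocks the launch or wastes RAM, and one that is too small leaves the app unable to grow later.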
Using swap wasn't always an option, for it inevitably made MacOS slower, so many people ran MacOS without swap.
What RAMDoubler did, effectively, was manage this more effectively. It first created a swapfile as big as RAM (so it can "double" it if things went badly), then managed the free space more effectively - if an app wasn't using the RAM it was allocated, RD would reclaim it as its buffers to compress unused pages. It worked remarkably well - if you kept the app's "minimum" and "preferred" sizes to under the physical memory size, it had a very small impact on system performance (much less than MacOS' swapfile). You only got thrashing when you tried to use an app that had its memory allocations higher than physical memory.
In the end, the general recommendation was that the tool was to let you keep more apps running at once, rather than let you launch apps with less physical RAM than you actually had. In that regard, it actually succeeded fairly well, but only because of the general awfulness of the MacOS memory model. Not entirely MacOS' fault, for it ran on systems without an MMU.
Windows 3.1 (protected mode), I'm not so sure if it had that great of an effect - sure the Windows memory manager was horrible, but RD on Windows didn't actually improve matters much.
MacOS though, it was needed. There was even a hack that you could apply to it to change the multiplier it used, so you could use it for boasting. Of course, it used more disk space as swapfile and caused more slowdowns, but still an improvement. Its life ended shortly after Apple moved to the PowerPC chips, where to the surprise of most people, you wanted to turn ON the swapfile on MacOS if you had a PowerPC machine - the system ran markedly faster (but only if the swapfile was between a certain range of values). RD worked (I believe it was available as a fat binary), but it didn't accomplish much over what PowerPC MacOS could do, and the benefit wasn't worth the cost.
Re: (Score:2)
Compression didn't make it much better. Almost any decent compression algorithm is fast when decompressing, but slow when compressing. When handling a page fault in that context, you had to do both: compress the page you want swapped out, and decompress the page you want swapped in. Not only did that take time, but it was CPU time, unlike IO time, which happens on DMA in an ideal world and lets your CPU schedule some other task in the meantime.
Obviously, it begs for a hardware solution, simple coprocessor of sorts, which would seize the memory bus (while CPU grazes in its cache), perform the comdec algorithm and mem-to-mem DMA copying.
It could probably fit in an ASIC, FPGA, whatever, but it would be even better if it was integrated with MMU.
Compression could be done in advance, because of the selection criteria for page swap: the page in working memory which was least recently accessed is the most probable candidate for it.
Pages containing 'code'
Re: (Score:2)
Re:You mean like RAMDoubler (Score:5, Funny)
Customer: "32 megs."
Tech Support: "Are you using any RAM doubling software?"
Customer: "Yes."
Tech Support: "So you have 16 megs of actual, physical RAM?"
Customer: "No. I have 8 megs. I installed [a RAM expanding product], and that gave me 16. I liked it so much I went out and got [another RAM expanding product]. So now I have 32."
source [rinkworks.com]
Re:You mean like RAMDoubler (Score:5, Funny)
The 80s and 90s called - they want their technology back.
The 80s and 90s called again - they want their joke back.
News? (Score:5, Informative)
These days, with RAM bandwidth being a major bottleneck, it might actually make a lot of sense if you could do the compression / decompression in hardware between the cache controller and the RAM - you'd get more bandwidth to RAM at the cost of slightly more latency.
Re: (Score:3, Informative)
It depends on the application.
If you want a stream of data to come through at a certain rate, and you don't care whether it starts now or 1 second later, just as long as it comes through at that rate, then latency isn't a concern.
On the other hand, if you want to read something from memory and have it available in a few clock cycles, then latency is a concern.
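The bandwidth-versus-latency trade-off can be put into a one-line model: time = latency + bytes actually moved / bandwidth. All of the numbers below (1 GB/s bus, 100 ns base latency, 200 ns added by a hypothetical hardware compressor, 2:1 ratio) are invented for illustration and do not describe any real memory controller.

```python
def transfer_time_s(size_bytes, bandwidth_bytes_per_s, latency_s, compression_ratio=1.0):
    """Time to fetch `size_bytes` of data when only size/ratio bytes cross the bus."""
    return latency_s + (size_bytes / compression_ratio) / bandwidth_bytes_per_s


BANDWIDTH = 1e9         # assumed: 1 GB/s memory bus
BASE_LATENCY = 100e-9   # assumed: 100 ns without compression
COMP_LATENCY = 300e-9   # assumed: extra 200 ns spent decompressing

# A long stream: latency is negligible, so halving the bytes moved wins.
stream_plain = transfer_time_s(1_000_000, BANDWIDTH, BASE_LATENCY)
stream_comp = transfer_time_s(1_000_000, BANDWIDTH, COMP_LATENCY, compression_ratio=2.0)

# A single 64-byte read: latency dominates, so compression loses.
line_plain = transfer_time_s(64, BANDWIDTH, BASE_LATENCY)
line_comp = transfer_time_s(64, BANDWIDTH, COMP_LATENCY, compression_ratio=2.0)

print(stream_plain, stream_comp)
print(line_plain, line_comp)
```

Under these assumptions the streaming case roughly halves its transfer time while the small read roughly doubles, which is the "it depends on the application" point in numbers.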
Re: (Score:2)
Re: (Score:1)
Well, much of data processing is processing massive amounts of it in a similar fashion. This is done in an assembly-line fashion, with multiple stages of operation (called pipelining in computer science) with a chunk of data being worked on at each different stage. The stages typically involve pieces of loading, massaging, and unloading* the data.
So if the whole process takes 5 minutes, who cares if there's a 10 second overhead time prepping and kicking off the process.
But if they're trying to do it in ge
Re: (Score:2)
Using MetaRAM hardware to double or triple the physical RAM each slot will accept makes more sense in a desktop or server situation, I think. This might be handy in the embedded and pocket computer spaces, which is what they're talking about in TFA.
Even Johnny Mnemonic already did this, though, and even in the crappy movie. If they get a patent, it'd better be on a specific method of increasing the performance of something like RAMDoubler.
Re: (Score:3, Interesting)
When I was at Kodak in the mid 80s we were actively investigating lossless image compression to get by disk speed bottlenecks. Sadly, our lossless algorithms weren't hitting the compression ratios we needed. This was not for consumer product
Re: (Score:2)
So which do you think is the deciding difference today - better compression algorithms, or faster CPUs? I've been amazed to realize how much of my "conventional wisdom" was based on the assumption that "lots of CPU is slower than lots of disk". I grew up believing that compressed data would always be slower than, but smaller than, raw data; it was a tradeoff of space vs. time. RAMDoubler let you run larger programs, but at a huge performance hit - usually, IIRC, worse than just using virtual memory and l
Re: (Score:2)
I would think the variable size of compressed memory blocks would throw a monkey wrench in trying to recover a decent amount of free RAM, without killing cache performance, which relies on fixed-size memory blocks at fixed locations, after all.
You might be able to work out a solution with some additional indirection, but you're talking about adding more and more cycles of latency.
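The indirection mentioned above might look something like the following sketch: fixed-size pages go in, variable-size compressed blobs are packed into an arena, and a table maps each page number to its (offset, length). The class name `CompressedPageStore`, the append-only arena, and the use of zlib are all assumptions for illustration; a real design would also have to reclaim the holes left when a page is rewritten.

```python
import zlib


class CompressedPageStore:
    """Fixed-size pages stored as variable-size compressed blobs behind a lookup table."""

    PAGE_SIZE = 4096

    def __init__(self):
        self.arena = bytearray()  # compressed blobs packed back to back
        self.table = {}           # page number -> (offset, length) in the arena

    def store(self, page_no, data):
        if len(data) != self.PAGE_SIZE:
            raise ValueError("pages must be exactly PAGE_SIZE bytes")
        blob = zlib.compress(data)
        # Append-only: rewriting a page leaves its old blob behind as a hole.
        self.table[page_no] = (len(self.arena), len(blob))
        self.arena += blob

    def load(self, page_no):
        # The extra step: consult the table before the data can be touched.
        offset, length = self.table[page_no]
        return zlib.decompress(bytes(self.arena[offset:offset + length]))


store = CompressedPageStore()
zero_page = bytes(4096)
pattern_page = bytes(range(256)) * 16
store.store(0, zero_page)
store.store(1, pattern_page)
print(store.table)
```

Every access now costs a table lookup plus a decompress, and the blobs no longer sit at predictable addresses, which is exactly the extra latency and cache-unfriendliness the comment is worried about.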
Re: (Score:2)
Ram encryption would be kind of awesome. I wouldn't mind a hardware chip that encrypts/decrypts ram to the system and performs some kind of slight compression. It would prevent the full disk encryption hack of freezing ram chips to extract data from them after system shutdown.
Re: (Score:2)
You're saying every time I plan to walk away from my office I need to shut down my computer, unplug it from the docking station, remove the battery, press the power button, put the battery back, and finally plug it back into the docking station?
I think encrypted ram would be much simpler.
Re: (Score:1)
If I told you that is what a TPM module can do for you right now, would you still think it's awesome?
It also comes with full bus encryption...
Re: (Score:2)
I'm not against the technology. I am against using that technology to limit what software or operating systems I can run on my purchased hardware.
Re: (Score:2)
Yes, IBM shipped one server with MXT memory compression.
SoftRAM (Score:3, Insightful)
SoftRAM claimed to do this, but the product didn't do anything except report to the user that it was doing something.
I didn't realize there were similar products that actually worked; I thought the whole concept was snake oil.
Re:SoftRAM (Score:5, Informative)
Well, it can work -- sort of. You lose CPU cycles to do the compression and decompression. You have to use part of your uncompressed memory to store the compression and decompression routines.
Not everything compresses very tightly. In the case of lots of audio, video, and still graphics formats you're trying to compress an already compressed format, and decompress it twice for working with it (once from the RAM compression, which barely compressed anything anyway, and once in the viewing or editing package) instead of once.
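The "compressing an already compressed format" problem is easy to check with a second pass over compressed data. This sketch uses zlib as a stand-in for whatever the RAM compressor would be, and zlib output as a stand-in for JPEG or MP3 data; the word list and sizes are invented.

```python
import random
import zlib

# Build some compressible "document" text from a small, made-up vocabulary.
rng = random.Random(0)
words = [b"invoice", b"customer", b"order", b"address", b"total", b"date", b"item", b"paid"]
text = b" ".join(rng.choice(words) for _ in range(5000))

once = zlib.compress(text)   # first pass: plays the role of the media codec
twice = zlib.compress(once)  # second pass: plays the role of the RAM compressor

first_pass_ratio = len(once) / len(text)
second_pass_ratio = len(twice) / len(once)
print(round(first_pass_ratio, 2), round(second_pass_ratio, 2))
```

The first pass shrinks the text substantially; the second pass has essentially nothing left to remove, so the CPU time spent on it buys no memory back.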
The thing is, if you're already tight on memory, using a good portion of it to double what's left may not gain you much. Something like this should cost a lower percentage of memory than on a 1 megabyte or 4 megabyte system, because the compression and decompression itself shouldn't need to be that much bigger than it was ten or twenty years ago.
The thing is, even if you're getting double the effective RAM, you're still burning all kinds of cycles if you're doing this on the CPU. If you're doing it in the controller, you'd better be able to do it faster than the CPU needs the bits, because memory throughput is already the bottleneck for most applications.
People put more memory into their machines for better performance. The virtual memory swapping to disk/flash is a problem long solved, after all. For this scheme to be worthwhile, several things have to work out:
Re: (Score:2)
You're not just looking at media encoding, but also media decoding. That's a pretty common task these days. Hell, media encoding is pretty common, too. YouTube is evidence of that.
The typical hand-held computing device these days is the cell phone. Voice calls, multimedia messaging, video recording using the camera, and mobile video streaming (like VCast) all have to do with encoding and/or decoding streams of data.
Database applications can search through thousands of rows even in something as simple as an
Re: (Score:2)
SoftRAM/SoftRAM95 was non-diluted snakeoil. [ddj.com]
RAM Doubler for the Mac was a real and non-snakeoil product. That was mostly due to the "classic" Mac OS doing a horrible job of managing memory, so the potential for improvement was huge.
There were similar products available for win3.x and Win9x, some of which at least tried to do what they advertised. The performance benefit of using them (at least the RAM-compression [ora.com]) was pretty much non-existent though.
Linux / OS X / WinNT already has quite decent virtual memor
Not free (Score:3, Insightful)
All you have to say is... (Score:1)
Old tech, with one glaring flaw... (Score:3, Interesting)
Memory compression had one major drawback, aside from CPU use (which I suspect we would notice less today, with massively more powerful CPUs which tend to sit at 5-10% load the vast majority of the time)... It makes paging (in the 4k sense, not referring to the pagefile) into an absolute nightmare, and memory fragmentation goes from an intellectual nuisance that only CS majors care about, to a real practical bottleneck in performance. Consider the behavior of a typical program - Allocate a few megs on startup and zero it out - That compresses down to nothing. Now start filling in that space, and your compression drops from 99.9% down to potentially 0%.
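The drop-off described above can be measured with a short sketch: take a zeroed buffer, overwrite a growing prefix with incompressible bytes, and see how much space the compressed form needs. Pseudo-random bytes stand in for the worst-case program data, zlib stands in for the memory compressor, and the function name `compressed_fraction` is hypothetical.

```python
import random
import zlib


def compressed_fraction(buffer_size, filled_bytes, seed=0):
    """Compressed size divided by original size for a partly filled, zeroed buffer."""
    rng = random.Random(seed)
    payload = bytes(rng.getrandbits(8) for _ in range(filled_bytes))
    buffer = payload + bytes(buffer_size - filled_bytes)  # the rest stays zeroed
    return len(zlib.compress(buffer)) / buffer_size


SIZE = 64 * 1024
for filled in (0, SIZE // 4, SIZE // 2, SIZE):
    print(filled, round(compressed_fraction(SIZE, filled), 3))
```

The space needed for the compressed copy grows roughly in step with the filled portion, so a compressed store sized while the buffer was still empty runs out of room as the program does its work; that is where the fragmentation and repacking trouble comes from.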
Personally, I think it could work as an optional (to programs) OS-level alternative to memory allocation... The programmer can choose to use slightly slower compressed memory where appropriate (loading 200MB of textual customer data, for example), or full-speed uncompressed memory by default (stack frames, hash tables, digital photos, etc).
Well for a mobile device (Score:2)
It might work. Since you are talking about a system where you design all the hardware and customize the OS, might be something that isn't as problematic to integrate. Also might be perfectly possible to do as you suggest and only compress some things.
That seems to be the market they are talking about. I doubt such a product would seriously have much of a desktop market today. I mean you are going to have a hard time convincing someone to spend $50 on RAM doubling software when $50 or less gets you 2GB of ph
Cautionary Note (Score:2)
Others have noted that this is old technology, but I will point out that if you overload your doubled-RAM for long periods of time, you risk cerebral hemorrhage and data loss.
Fools, download more! (Score:3, Funny)
Re: (Score:3, Funny)
I've GOT to show this to our CEO. I will get promoted for sure!
Uh? (Score:2)
IBM MXT (Score:2)
There have been such products before:
see http://www.pcmag.com/article2/0,2817,54458,00.asp [pcmag.com]
The funny thing is that this technology has not hit the mainstream PC market yet. Maybe it was not quite as successful as expected?
Re: (Score:3, Interesting)
If you d
I remember that... (Score:3, Funny)
You just use a hole punch on your page file, and you can write to it from the other side!
Embedded Systems Only (Score:2, Insightful)
It appears that the key phrase here is "embedded systems".
FTA, they appear to be making use of the regularity of certain patterns of data found commonly in embedded systems, and tailoring their compression algorithm to it.
I'm not sure that it is really a great feat to engineer a special-purpose compression algorithm that out-performs general-purpose algorithms.
Re: (Score:3, Insightful)
Notice 99% of the posts ignore that crucial word.
In the larger context around this issue, "embedded" means "mass produced" means "tremendous pressure to reduce per-unit costs" means "cheaper parts, plzzz!" means "32k chip is much better than 64k chip".
So that's what's going on here. It's not about the $3000 car super-radio. It's about the millions of $14-at-cost standard basic installed AM/FM radios. Or processing units in a handheld computer, watch, or game, etc. Or whatever.
The Black Viper (Score:2)
Linux RAM Compression? (Score:2)
I was talking about this a few weeks ago with friends, and some of us thought there used to be RAM compression support for Linux, but none of us could find it. Anybody?
Also of discussion was whether the relative speed of the CPU would improve RAM access time, or whether DMA trumps it.
For free what? (Score:2)
For free hat? For free Tibet? For free space?
Sorry, pet peeve. You wouldn't say "for expensive", so don't say "for free".
The kernel I'm running already does this (Score:1)
Here [google.com]. This is basically a ramdisk with lzo (or something; I'm too lazy to check) to be used as swap space.
The neat thing about doing it this way is that you can just set vm.swappiness=100 and let the kernel juggle stuff between fast and slow areas however it thinks best. It automatically resizes the ramdisk to whatever size it needs. I think I've seen something that does compression on the disk cache too - that might be better but it all depends on usage patterns.