Slashdot
Is Video RAM a Good Swap Device?
Posted by
kdawson
on Thu Oct 11, 2007 10:37 AM
from the it's-there-why-not-use-it dept.
sean4u writes "I use a 'lucky' (inexplicably still working) headless desktop PC to serve pages for a low-volume e-commerce site. I came across a gentoo-wiki.com page and this linuxnews.pl page that suggested the interesting possibility of using the Video RAM of the built-in video adapter as a swap device or RAM disk. The instructions worked a treat, but I'm curious as to how good a substitute this can be for swap space on disk. In my (amateurish) test, hdparm -t tells me the Video RAM block device is 3 times slower than the aging disk I currently use. If you've used this technique, what performance do you get? Is the poor performance report from hdparm a feature of the hardware, or the Memory Technology Device driver? What do you use to measure swap performance?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
AGP or PCI-Express (Score:5, Informative)
Re:AGP or PCI-Express (Score:5, Informative)
Re:Built in still uses the bus.... (Score:5, Informative)
I think the difference might be as noticeable as turning DMA (direct memory access) on and off. And yes, that makes a big difference. It was actually worth buying new drives just to get DMA when it first started becoming available. I remember that earlier versions of Windows 98 (and 95, I think) wouldn't turn it on by default. After making sure the drives supported it and enabling it, people would almost think they had a new computer. There was that much of a difference in performance.
Re:I'd Say...Neither (Score:5, Insightful)
I think one of the points of confusion here is that most people don't realize that just because something is built into a motherboard, it doesn't have some magical interface that makes the bits fly differently than if it were in a slot. I think that is what the multiple replies to this comment are trying to say.
Probably a good idea, provided you have PCIe (Score:5, Informative)
PCIe will likely give you performance more in line with main memory (most implementations now are hitting 1-2 GB/s).
Re:Probably a good idea, provided you have PCIe (Score:5, Informative)
And I used to regularly get sustained 25-30 MB/s from single drives (40 GB or so) on ATA 33 interfaces. Going to ATA 66, 100 or 133 may increase the speed when hitting the on-drive cache, but the drives themselves usually can't go that fast. How fast are the fastest IDE drives nowadays for sustained, sequential transfers -- 50 MB/s or so?
Re:Probably a good idea, provided you have PCIe (Score:5, Informative)
On a current-model 7200RPM SATA drive, you can expect to see around 80MB/sec at the outer edge of the disk. And the rule of thumb is, you see half that at the inner edge, and three-quarters in the middle. So call it a (nearly) guaranteed 40MB/sec, and an average of 60MB/sec.
These are not hard-and-fast numbers, but it's a pretty good estimate for a modern drive.
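The rule of thumb above is simple enough to write down. This is only a sketch of the poster's arithmetic, not a measurement; the function name `zone_throughput` is made up for illustration, and the 80 MB/s starting figure is the one quoted in the comment.

```python
def zone_throughput(outer_mb_s: float) -> dict:
    """Apply the rule of thumb from the comment:
    inner edge is about 1/2, middle about 3/4 of the outer-edge rate."""
    return {
        "outer": outer_mb_s,
        "middle": outer_mb_s * 0.75,
        "inner": outer_mb_s * 0.5,
    }


# 80 MB/s at the outer edge is the figure quoted in the comment.
rates = zone_throughput(80.0)
for zone, mb_s in rates.items():
    print(f"{zone:>6}: {mb_s:.0f} MB/s")
```

Plugging in a different outer-edge figure gives the matching estimates for another drive, with the same caveat that these are not hard-and-fast numbers.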
Re:Probably a good idea, provided you have PCIe (Score:4, Insightful)
I hope so, because that's where I like to keep my swap partition.
IIRC, NTFS has some of its main data structures in the middle of the partition for that reason.
size (Score:4, Informative)
Re:size (Score:5, Informative)
If it is really old it may be running on one of the early Intel Pentium Triton chipsets. The TX will not cache any memory above 64MB and the HX needs to be reconfigured to cache above 64MB. Even after reconfiguration it will just about work for 512MB. There are other similar vagaries related to most old hardware. ALi, depending on release and version, tanks at 384 or 768 and so on. Even chipsets as recent as the Intel 815e, while capable of 2G, were deliberately bastardised to support only 512MB in order not to undercut the nonexistent market for high-end Rambus/i810 based workstations.
So there are quite a few cases where it is more cost effective to use an old high-end video card, long past its heyday, as a swap device. That holds all the way up to around 2001-2002. From there onwards nearly everything supported sane memory sizes, so it is pointless.
Maybe, but need GPU specs (Score:5, Interesting)
But there is a fundamental problem: vidRAM is optimized for writes from main RAM. Not reads. In many cases, reading vidram is extremely slow because the raster generator is busy reading it. Writes are buffered. Reads cannot be.
Performance != Stability (Score:5, Insightful)
Are you looking at the right timings? (Score:5, Informative)
One of the biggest advantages of using VRAM for disks is the nearly 0 seek latency.
As a result even if the card is slower than disk on read you are still likely to have an overall performance gain.
In addition to that there are a number of architectural vagaries to consider. AGP is asymmetric. Reading is considerably slower than writing (can't find anywhere by how much. Damn...).
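To put rough numbers on the seek-latency point, here is a back-of-the-envelope model of fetching one random swap page. Everything in it is an assumption for illustration: the 8 ms disk seek is a guess, 60 MB/s is the average disk rate quoted elsewhere in this thread, 20 MB/s stands in for "3 times slower" from the submission, and `random_page_time_ms` is a made-up helper name.

```python
def random_page_time_ms(seek_ms: float, throughput_mb_s: float, page_kib: int = 4) -> float:
    """Time to fetch one randomly placed page: seek latency plus transfer time."""
    transfer_ms = (page_kib / 1024) / throughput_mb_s * 1000
    return seek_ms + transfer_ms


# Assumed figures, not measurements:
#   disk: 8 ms average seek (a guess), 60 MB/s sequential rate
#   VRAM: no seek, 20 MB/s (a third of the disk rate)
disk = random_page_time_ms(seek_ms=8.0, throughput_mb_s=60.0)
vram = random_page_time_ms(seek_ms=0.0, throughput_mb_s=20.0)

print(f"disk: {disk:.3f} ms per 4 KiB page")
print(f"vram: {vram:.3f} ms per 4 KiB page")
```

Under these assumptions the seek dominates the disk figure, which is why a sequential benchmark like hdparm -t may not say much about swap behaviour. The model ignores bus and driver overhead entirely.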
video RAM (Score:5, Insightful)
yeah, I know it means no screen (Score:5, Funny)
Useful even if not so fast (Score:4, Insightful)
Heck, I remember RAM expansion cards for ISA slots. I'm sure this is faster, though I didn't get any meaningful boost when I tried this once. Nevertheless, if you're running a headless system, it's better IMHO to get some use out of the display hardware rather than none, even if it's a little slow. You shouldn't rely on swap as a memory expansion anyway; it's just a way to gracefully degrade performance when you hit the limit.
I think it's also nice to have swap on a different physical device/bus from your main hard drive. Maybe the swap isn't any faster, but at least it isn't slowing down any other hard drive usage.
Works even better with really old video cards (Score:5, Funny)
Now if you want truly blazing speed, you can track down some of that dual-ported static RAM that came in 40-pin DIPs. Full random access on both ports would let you serve dynamic web pages while you run customer transactions, all with zero wait states on the ISA bus!
PS3 VRAM Swap (Score:4, Interesting)
Just misinformed (Score:4, Interesting)
Re:Just misinformed (Score:4, Insightful)
Video RAM is designed for performance, not for stability. If a bit flips in your video RAM, a pixel is going to be bad or a texture will be slightly different. You're not going to notice.
A bit flip in your swap space (or main RAM), now that is something you really don't want to happen....
Re:Just misinformed (Score:5, Informative)
Bzzt - wrong.
Even high-end systems use swap space, because it allows for swapping out parts of memory that aren't being used, freeing up that memory for things like disk cache, which does have a positive effect.
Doing "free" on a system here, I see that there's 886492 kB of free memory, of which 879896 kB is used for disk cache. 72892 kB is swapped to disk, and if there were no swap, the disk cache would have been that much smaller. Even if I had umpteen gigabytes of RAM free, that still would be 70 MB of extra cache by using a swap partition. That's a Good Thing.
What's a Bad Thing is when swap is used because you run low on memory -- then you get thrashing and a seriously slow system. But on a healthy system with enough free memory, where the kernel can swap out pages not because it has to, but because it makes sense, using swap is a Good Thing.
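For anyone who wants the same cache and swap figures without eyeballing "free", here is a minimal sketch that parses text in the format of Linux's /proc/meminfo. The helper names are made up for illustration. The sample reuses the cache and swapped-out figures from the comment above; the SwapTotal value is invented so that the subtraction works out.

```python
def parse_meminfo(text: str) -> dict:
    """Parse '/proc/meminfo'-style lines ('Name:   12345 kB') into {name: kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Name:' prefix
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(fields: dict) -> int:
    """Swap in use is total swap minus free swap."""
    return fields["SwapTotal"] - fields["SwapFree"]


# Sample input: Cached and the resulting swap usage match the comment;
# SwapTotal is an invented value.
SAMPLE = """\
Cached:           879896 kB
SwapTotal:       1048568 kB
SwapFree:         975676 kB
"""

info = parse_meminfo(SAMPLE)
print("disk cache:", info["Cached"], "kB")
print("swap used: ", swap_used_kb(info), "kB")
```

On a Linux box, passing the contents of /proc/meminfo instead of SAMPLE should give the live figures.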
Re:Just misinformed (Score:4, Insightful)
Swap is great for a server or workstation that, once set on a single task, never needs to do anything else until shut down. But for a Windows PC that could have anything run on it at any time (not to mention a sub-standard disk cache system), having parts swapped out to make room for a disk cache that doesn't do a whole awful lot is less than optimal.
This is of course the point where you point out that converting all your junk to Windows Vista and training all your staff to use the new Office 2007 "ribbon" is about the same cost as training them to use Linux and OOo, the latter being a lot cheaper too.
Re:Just misinformed (Score:5, Interesting)
Uh... no. I've heard that argument before, but I don't buy it. An ideal system is one that doesn't page anything out to disk. In fact, I make it a point to always have enough memory that my pageout count does not increase during normal use. As soon as you page out anything, you're taking a performance hit. Period. Paging out data to disk in order to make room for disk cache is almost never a good idea, as the chances of needing to later access a data page in an application are typically far greater than the chances of reusing a randomly read block on disk.
A disk cache (a limited amount of readahead notwithstanding) is only useful for data that is used more than once, which makes it a highly transient data store. Storing most data in cache longer than a few minutes usually doesn't buy you anything in terms of performance because if data isn't reused fairly soon after initial use, odds are it won't ever be.
By contrast, data explicitly loaded into RAM by an application (assuming the app is reasonably well written) is in memory for a reason, and if the data were transient, the app would have repeatedly reused a single chunk of temporary storage instead of keeping the data around. The odds that any data won't ever be used again should be vanishingly small unless an app is written poorly. For example, in a word processor, the majority of memory pages associated with a file will probably get touched when you save changes to disk even if you never actually scroll to the end of the file. Yes, there are ways to avoid that by manually organizing your data structures in memory, but it usually doesn't make sense to optimize memory organization that heavily.
In any case, regardless of memory organization, it is safe to assume that the vast majority of application data pages (not the actual executable code pages) will be reused at some point in the future. As such, paging out any of this data will require that the data be paged back in at some point, causing a noticeable stall for the user. This is slightly less significant for background daemons, but still true.
Thus, cache is a great example of the principle of diminishing returns. Doubling cache does not necessarily double the benefits. Once cache gets to a certain point, doubling it no longer significantly increases the number of additional hits in the cache. Every increase beyond that point will likely hurt performance by increasing the management overhead without actually increasing the number of successful hits.
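The diminishing-returns claim can be poked at with a toy simulation. This is a sketch under loud assumptions: block popularity follows a made-up 1/rank distribution, the cache is plain LRU, and `lru_hit_rate` and `skewed_trace` are illustrative names. It only measures hit rate and does not model the management overhead mentioned above.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity: int) -> float:
    """Replay a block-access trace through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace)


def skewed_trace(n_blocks: int, length: int, seed: int = 0) -> list:
    """Assumed workload: block popularity proportional to 1/rank."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(n_blocks)]
    return rng.choices(range(n_blocks), weights=weights, k=length)


trace = skewed_trace(n_blocks=4096, length=50_000)
for capacity in (64, 128, 256, 512, 1024, 2048):
    print(f"{capacity:>5} blocks: hit rate {lru_hit_rate(trace, capacity):.3f}")
```

How quickly the gains flatten out depends entirely on the assumed access pattern, so a different workload could tell a different story.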
Indeed, the only thing that makes sense to not keep in core is infrequently used code text, but that can be thrown away without ever paging it out; it can always be paged back in from the original executable if needed. Even then, an optimal system should not throw out anything except in small physical memory configurations. If you don't have enough RAM, increasing the inherently small disk cache by a small amount actually will result in a significant increase in hits and throwing out infrequently used code pages won't result in a significant performance penalty by comparison. In a large memory configuration, increasing the amount of disk cache won't have much benefit at all, and throwing out those pages probably has a much greater chance of resulting in a performance hit than throwing out a previously read random block on disk; if a page has been used once, the odds are better that it will be used again.
Note: this all assumes that your OS is smart enough to only load in application pages as they are used rather than loading the entire app in at launch. If it isn't, then you have bigger problems, of course.... :-)
Re:Just misinformed (Score:4, Interesting)
That's true, but only because a programmer's usage patterns are highly atypical. Typical usage patterns for normal users do not involve lots of short-lived processes that read in a chunk of data, process it, and exit; the UNIX way of programming (small tools with pipes between them) just didn't catch on in the general computing space, and for good reason---it's an excellent design for programmer tools, but a poor design for end user tools because users generally prefer to work on a task to completion.
Indeed, it could be argued that this is the result of a toolchain that is not well designed. One could easily imagine an IDE that memory maps the source tree and binary build results into RAM, sharing those pages with the short-lived compiler. This would end up being a much faster workflow because instead of having to go through filesystem lookups and read the blocks (which is relatively slow even from cache), the compiler would simply be handed the data it needs, or at least a series of VM pages that will contain the needed content after it is paged in. You would pay the penalty once at launch time (or better yet, defer mapping until the data from a particular file is needed) and never pay the lookup penalty again. That would make it more in line with a typical user's usage patterns.
The typical computer user (not programmer) reboots their computer every few days, whether because they need to swap batteries in a laptop, because the kernel is leaking memory, etc. For them, that gives a fairly short upper bound to the lifespan of data in the cache. They typically run an application that loads or memory maps a file into RAM, work with it for a period of minutes or even hours, then close the file and work on something else. They don't open the file and close it repeatedly, and neither do their applications. That's a very degenerate usage pattern.... :-)
Not at all. If you double the size of the cache, you double the size of the data structure that maintains the cache. If you are looking up a block in a tree, for example, that means the average time to do the lookup will be log_2(2n) instead of log_2(n). While in theory, that's not a big difference, in practice, that translates to an average of one extra tree node before you get to the node you're looking for. Multiply that times the number of cache lookups, and it adds up. Such a hit must be justified by the increased size of the cache. If the computer reboots before that data is needed again, you've just made every disk operation take an extra few dozen CPU cycles with no benefit.
Put another way, if you have 512 megs of cache and are working with 512KB blocks, you have 1024 blocks cached, and your tree depth is 10 nodes (log_2 of 1024). If you double the size of the cache, you are adding one extra hop, so you have increased the amount of cache lookup time by an average of 10% for every lookup, including those that get served from the cache. That means that your cache is now effectively 10% slower than it was before. If most of your data is already cached, that's a pretty huge impact, and if you are pulling data from cache frequently, it can easily exceed the gains you'd get from occasionally saving a disk read.
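The arithmetic in that example can be checked directly. This sketch assumes, as the comment does, that the cache index is a balanced binary tree whose lookup depth is log_2 of the number of cached blocks; `lookup_depth` is a made-up helper name. Real kernels may use other index structures, in which case the numbers differ.

```python
import math

KIB = 1024
MIB = 1024 * KIB


def lookup_depth(cache_bytes: int, block_bytes: int) -> float:
    """Depth of a balanced binary tree indexing every cached block (assumed model)."""
    blocks = cache_bytes // block_bytes
    return math.log2(blocks)


before = lookup_depth(512 * MIB, 512 * KIB)   # 1024 blocks
after = lookup_depth(1024 * MIB, 512 * KIB)   # 2048 blocks
increase = (after - before) / before

print(f"depth before doubling: {before:.0f}")
print(f"depth after doubling:  {after:.0f}")
print(f"extra lookup work:     {increase:.0%}")
```

With smaller blocks the tree starts out deeper, so the one extra hop from doubling is a smaller fraction of each lookup.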
Also, bear in mind that there is a psychological advantage to using the smaller cache in such a case. Pausing once a minute for a full second to read bits in from disk is far less annoying to the user than adding a .1 second delay every time the user pulls down a
Re:Short answer: (Score:5, Insightful)
You seem to be advocating wasting perfectly good VRAM in favor of buying more system RAM. If the VRAM is essentially free (i.e. comes with the system no matter what), there is no good reason not to try to put it to good use.
Also, your "No" is completely unqualified. You offer no details of how VRAM performs worse as swap space than hard drives, let alone actual benchmarks or citations. (And I have the feeling that most graphics memory would be significantly better than your average IDE hard drive for swapping.)
Mod parent overrated.