Is Video RAM a Good Swap Device?
sean4u writes "I use a 'lucky' (inexplicably still working) headless desktop PC to serve pages for a low-volume e-commerce site. I came across a gentoo-wiki.com page and this linuxnews.pl page that suggested the interesting possibility of using the Video RAM of the built-in video adapter as a swap device or RAM disk. The instructions worked a treat, but I'm curious as to how good a substitute this can be for swap space on disk. In my (amateurish) test, hdparm -t tells me the Video RAM block device is 3 times slower than the aging disk I currently use. If you've used this technique, what performance do you get? Is the poor performance report from hdparm a feature of the hardware, or the Memory Technology Device driver? What do you use to measure swap performance?"
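For reference, the recipe from that gentoo-wiki page boils down to something like the following sketch. The base address and size here are purely illustrative; you have to read yours from lspci -v, and getting them wrong can hang the box:

  # Map the card's memory aperture as a fake RAM device via MTD
  # (address and length are examples only -- take yours from lspci -v)
  modprobe phram phram=vram,0xd8000000,0x8000000

  # Expose the MTD device as a block device, format it, and enable it as swap
  modprobe mtdblock
  mkswap /dev/mtdblock0
  swapon /dev/mtdblock0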
AGP or PCI-Express (Score:5, Informative)
Re: (Score:2)
What would be the bottlenecks if he were using PCI video RAM?
Re: (Score:2)
Re: (Score:2)
Re:AGP or PCI-Express (Score:5, Informative)
Re: (Score:3, Insightful)
If it's lower bandwidth and higher latency than the rest of system memory, then it makes perfect sense NOT to use it as primary storage, but as secondary storage. Currently, swap is the only straightforward mechanism Linux offers for doing so.
Re: (Score:3, Informative)
Re: (Score:3, Informative)
However (Score:2, Informative)
Re: (Score:2)
No, nearly all CURRENT built-in adapters use system memory. In the past (think Pentium/Pentium II era), most on-board graphics adapters used dedicated VRAM soldered to the motherboard. The OP said it was older hardware, so it's reasonable to assume it is separate memory rather than assume he is using system memory.
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
Re: (Score:2)
Re: (Score:3, Informative)
The PCI bus and its derivatives (AGP, PCI-X, PCI-Express) are basically send-only. If you try to fetch across them, transfers can't burst properly and access is very slow. Most PCI and AGP graphics cards don't have proper DMA engines for sending data back, so the CPU has to do the fetching. Many PCI-Express graphics cards can be told to do DMA.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I'd Say...Neither (Score:3, Funny)
Re: (Score:2, Informative)
Re: (Score:2)
The bus defines that; an onboard chip still sits on one bus or the other.
"Neither" is only a valid option if it is really old hardware on yet another bus (VESA? EISA?)... but I haven't seen those in ten years by now.
Re:I'd Say...Neither (Score:5, Insightful)
I think one of the points of confusion here is that most people don't realize that just because something is built into the motherboard, it doesn't get some magical interface that makes the bits fly any differently than if it were in a slot. I think that's what the multiple replies this comment has generated are trying to say.
Built in still uses the bus.... (Score:2)
Re:Built in still uses the bus.... (Score:5, Informative)
I think the differences might be as noticeable as turning DMA (direct memory access) on and off. And yes, you can see a big difference. It was actually worth buying new drives just to have DMA access when it first became available. I remember that earlier versions of Windows 98 (and 95, I think) wouldn't turn it on by default. After making sure the drives supported it and enabling it, people would almost think they had a new computer. There was that much of a difference in performance.
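You can still reproduce that experiment on Linux; a quick sketch (IDE device name hypothetical):

  # Show whether DMA is currently enabled for an IDE drive
  hdparm -d /dev/hda

  # Time buffered reads with DMA off, then with DMA on
  hdparm -d0 /dev/hda && hdparm -t /dev/hda
  hdparm -d1 /dev/hda && hdparm -t /dev/hda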
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
The most I have ever seen was the Dell C400 laptop, which has 64MB of onboard (non-shared) video memory... now if only it had had a better video chipset than the cheap Intel one...
Re: (Score:2)
Certainly the actual swap performance would be greatly affected by where the swap file is on the disk, and what else the disk is used for. If the disk is being called on to do other things at the same time the swap performance would take a huge hit. Then there's the matter of just how often he needs to use the swap, and how much is needed. If the system simply is over-taxed and has too litt
Re: (Score:3)
Re: (Score:3, Informative)
Re: (Score:2)
Tom
Re: (Score:2)
i have this lucky/still working... (Score:2, Interesting)
I also have this 12-15 year-old beater of an Intergraph box. I think it has something like 64MB of VRAM.
Just 'cause it's old doesn't mean it was or is lame.
That beater of an Onyx can still thrash your SLI setup.
And the Intergraph's video card was EISA or Micro Channel, I can't remember which.
But mostly I'm just pointing out corner cases because other repliers to the parent felt it necess
Probably a good idea, provided you have PCIe (Score:5, Informative)
PCIe will likely give you performance more in line with main memory (most implementations now are hitting 1-2 GB/s).
Re: (Score:2)
It's hard to imagine a PCIe card being slower.
Re: (Score:2)
Re:Probably a good idea, provided you have PCIe (Score:5, Informative)
And I used to regularly get sustained 25-30 MB/s from single drives (40 GB or so) on ATA 33 interfaces. Going to ATA 66, 100 or 133 may increase the speed when hitting the on-drive cache, but the drives themselves usually can't go that fast. How fast are the fastest IDE drives nowadays for sustained, sequential transfers -- 50 MB/s or so?
Re: (Score:2)
Some reach >60. So even UDMA66 would limit them non-trivially.
Re: (Score:2)
I've seen newer 500GB IDE drives do 80MB/s at the start of the drive.
IDE and SATA are still equivalent for sequential transfers. Until drives reach sustained 100MB/s or 133MB/s, IDE won't be a bottleneck in that regard.
Re: (Score:2)
Re:Probably a good idea, provided you have PCIe (Score:5, Informative)
On a current-model 7200RPM SATA drive, you can expect to see around 80MB/sec at the outer edge of the disk. And the rule of thumb is, you see half that at the inner edge, and three-quarters in the middle. So call it a (nearly) guaranteed 40MB/sec, and an average of 60MB/sec.
These are not hard-and-fast numbers, but it's a pretty good estimate for a modern drive.
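If you want to check the inner-versus-outer-edge effect on your own drive, a rough sketch with dd (device name and offsets are examples for a 500GB drive):

  # Sequential read rate at the outer edge (start of the disk)
  dd if=/dev/sda of=/dev/null bs=1M count=256 iflag=direct

  # Same test near the middle of the drive (skip counts 1MB input blocks)
  dd if=/dev/sda of=/dev/null bs=1M count=256 skip=250000 iflag=direct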
Re: (Score:2)
I hope so, because that's where I like to keep my swap partition.
Your answer doesn't need to be universal; I use Seagate drives almost exclusively, most often FC or LVD SCSI, and often Sun-branded.
Re:Probably a good idea, provided you have PCIe (Score:4, Insightful)
I hope so, because that's where I like to keep my swap partition.
IIRC, NTFS has some of its main data structures in the middle of the partition for that reason.
Re: (Score:2)
Sustained transfer is a function of data density and spin rate - it's all about the rate at which bytes pass under the read heads. Spin rates are more or less fixed by mechanical limitations, with consumer disks running at 7.2k RPM and high-end disks running at up to 15k, so you'll get about twice the sustained transfer rate from the really expensive models.
Aside from that, it basically depends on the size o
Re: (Score:2)
size (Score:4, Informative)
Re:size (Score:5, Informative)
If it is really old it may be running one of the early Intel Pentium Triton chipsets. The TX will not cache any memory above 64MB, and the HX needs to be reconfigured to cache above 64MB; even after reconfiguration it will only just cope with 512MB. There are other similar vagaries in most old hardware: ALi, depending on release and version, tanks at 384MB or 768MB, and so on. Even chipsets as recent as the Intel 815E, while capable of 2GB, were deliberately bastardised to support only 512MB in order not to undercut the nonexistent market for high-end Rambus (i820) workstations.
So there are quite a few cases, all the way up to around 2001-2002, where it is more cost-effective to use an old, long-past-its-heyday high-end video card as a swap device. From then on nearly everything supported sane memory sizes, so it is pointless.
Re: (Score:2)
Re: (Score:3, Informative)
Howeve
Re: (Score:2)
In that case, the best advice would be "Go into the BIOS and dial it down to 1MB."
Re: (Score:2)
Maybe, but need GPU specs (Score:5, Interesting)
But there is a fundamental problem: video RAM is optimized for writes from main RAM, not reads. In many cases, reading video RAM is extremely slow because the raster generator is busy scanning it. Writes can be buffered; reads cannot.
Re: (Score:2)
But given that the machine is a server running an e-commerce website that, presumably, should be up most of the time, I would suggest buying a cheap entry-level server from Dell with enough memory and just forgetting about it. Running a server on luck generally indicates someone will have to deal with an emergency migration on
Re: (Score:2)
Seriously, you write as though your brain hasn't been used since the heyday of Windows 3.1. Go learn what a real operating system is these days.
Performance != Stability (Score:5, Insightful)
Re: (Score:2)
This is more an exercise for novelty/home enthusiasts.
Are you looking at the right timings? (Score:5, Informative)
One of the biggest advantages of using VRAM for disks is the nearly zero seek latency.
As a result, even if the card is slower than the disk on reads, you are still likely to see an overall performance gain.
In addition, there are a number of architectural vagaries to consider. AGP is asymmetric: reading is considerably slower than writing (I can't find anywhere by how much. Damn...).
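If you want to put a number on that asymmetry yourself, a rough sketch against the VRAM block device (device name follows the gentoo-wiki setup; sizes illustrative):

  # Write: push 64MB from main RAM into VRAM
  dd if=/dev/zero of=/dev/mtdblock0 bs=1M count=64 oflag=direct

  # Read: pull the same 64MB back across the bus
  dd if=/dev/mtdblock0 of=/dev/null bs=1M count=64 iflag=direct

  # If the driver rejects O_DIRECT, use conv=fdatasync on the write and
  # drop the page cache before the read: echo 3 > /proc/sys/vm/drop_caches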
video RAM (Score:5, Insightful)
Re: (Score:2)
Re: (Score:3, Informative)
Not always. For example, I have a machine that has 32MB of video RAM, and can use additional system RAM if necessary.
yeah, I know it means no screen (Score:5, Funny)
Re: (Score:2, Funny)
There, fixed that for you.
Re: (Score:2)
Useful even if not so fast (Score:4, Insightful)
Heck, I remember RAM expansion cards for ISA slots. I'm sure this is faster, though I didn't get any meaningful boost when I tried it once. Nevertheless, if you're running a headless system, it's better IMHO to get some use out of the display hardware rather than none, even if it's a little slow. You shouldn't rely on swap as a memory expansion anyway; it's just a way to degrade performance gracefully when you hit the limit.
I think it's also nice to have swap on a different physical device/bus from your main hard drive. Maybe the swap isn't any faster, but at least it isn't slowing down any other hard drive usage.
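One nice touch if you keep the disk swap around as backup: give the VRAM device a higher priority so the kernel fills it first (device names hypothetical):

  # Prefer the VRAM device; fall back to the disk partition when it fills
  swapon -p 10 /dev/mtdblock0
  swapon -p 1 /dev/sda2

  # Confirm both devices and their priorities
  cat /proc/swaps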
Re: (Score:2)
Re: (Score:3, Informative)
Oh, a friend of mine had a rather rich father who bought such a thing for a 286. Don't know how they used it exactly, as a RAM drive, swap space or even some sort of main memory (I always thought the latter, but that seems extremely unlikely for a 286).
It was main memory, but back then we had the XMS/EMS hack instead of flat memory.
http://en.wikipedia.org/wiki/Expanded_Memory_Specification#Expansion_boards [wikipedia.org]
Works even better with really old video cards (Score:5, Funny)
Now if you want truly blazing speed, you can track down some of that dual-ported static RAM that came in 40-pin DIPs. Full random access on both ports would let you serve dynamic web pages while you run customer transactions, all with zero wait states on the ISA bus!
Optimizations are necessary (Score:3, Insightful)
Memory architecture on a GPU is very different from system memory. Memory there is not linear and the video memory controller will go through a lot of remapping to present it as such, something that's probably very slow because of the VBIOS. Then there's the issue of tuning the bus so that reads and writes are using its full bandwidth, and again a poor VBIOS implementation may be the bottleneck.
The best but hardest solution would be a way to program the video memory controller directly to map pages of system memory and do all the copying and moving itself. Of course, this is hardly ever going to happen, but some improvements can still make it into the VBIOS, and some probably will once GPGPU-style programming starts getting more attention, as both nVidia and AMD/ATI seem interested in pushing things like CUDA [nvidia.com] and Stream Computing [amd.com].
The concept as it is now, however, remains extremely cool. It might still be orders of magnitude slower in latency and throughput than system memory, but it should be a lot more responsive than a hard drive simply because there are no seek times involved. That said, hdparm -t may not be the best tool for measuring performance, so I'd be more interested in a random access benchmark, since that might make some use of the parallel memory architecture inherent in a video card.
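Something like fio would be a better fit than hdparm -t's single sequential read, since it can do small random reads (device names hypothetical):

  # Random 4K reads from the VRAM block device, bypassing the page cache
  fio --name=vram --filename=/dev/mtdblock0 --rw=randread --bs=4k --direct=1 --size=64M

  # Same test against the disk swap partition for comparison
  fio --name=disk --filename=/dev/sda2 --rw=randread --bs=4k --direct=1 --size=64M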
Ha! (Score:3, Funny)
Use a SATA Ramdisk... (Score:2, Informative)
It works wonderfully for the Windows swap file (and better still for the Photoshop/Premiere swap file). It is limited to 4GB, draws power from the PCI bus, and is driverless (it works with any OS the motherboard supports).
It connects to the PC over a SATA1 link (but a continuous 1.5 Gb/s is still better than most HDDs) and takes 4x 1GB DDR1 RAM modules.
There is a future 8GB DDR2 SATA2 3.0 Gb/s model (allegedly) coming out soon that fits in a 5.25
DMA? (Score:3, Informative)
There isn't enough memory on a video card to really make it worth doing the right way, either. Mine only has 256MB, which is not that big compared to systems with 1-2GB of system RAM and 1-4GB of swap.
Re: (Score:2)
PS3 VRAM Swap (Score:4, Interesting)
Just misinformed (Score:4, Interesting)
Re:Just misinformed (Score:4, Insightful)
Video RAM is designed for performance, not for stability. If a bit flips in your video RAM, a pixel is going to be bad or a texture will be slightly different. You're not going to notice.
A bit flip in your swap space (or main RAM), now that is something you really don't want to happen....
Re: (Score:3, Insightful)
Although, I can only imagine the senior engi
Re: (Score:2)
What if that flipped bit is in the data structure that defines a collection of surfaces? How about when the flipped bit is in the header data for a texture image? A 100x100 pixel texture now reads as 2,147,483,748x100 pixels. Wonder what happens when that corrupted texture is swapped out for another texture. Ouch.
Yeah, I'm sure you'll hardly notice when the rendered scene look
Re: (Score:2)
I still doubt the RAM is overly faulty (if it's even separate from system RAM), since most video memory is soldered in place. A random bit flipping in an image is one thing, but enough of those and it's a whole card or motherboard being RMAed, not just a chip. Reliability would still seem to b
Re: (Score:2)
Re: (Score:3, Interesting)
Swap is more than that... It also allows you to recover a machine that runs out of memory due to a runaway process. Logging in remotely won't work if no process can get memory anymore, so you can't kill the runaway process. With swap, you'll be able to log in, kill the process and recover the machine. That said, it won't be fast, but at least you've got an option.
Read up on Virtual Memory [wikipedia.org], because there is much more behind it than just "dumping memory that's not used to disk".
Re:Just misinformed (Score:5, Informative)
Bzzt - wrong.
Even high-end systems use swap space, because it allows parts of memory that aren't being used to be swapped out, freeing up that memory for things like disk cache, which does have a positive effect.
Doing "free" on a system here, I see that there's 886492 kB of free memory, of which 879896 kB is used for disk cache. 72892 kB is swapped to disk, and if there were no swap, the disk cache would have been that much smaller. Even if I had umpteen gigabytes of RAM free, that still would be 70 MB of extra cache by using a swap partition. That's a Good Thing.
What's a Bad Thing is when swap is used because you run low on memory -- then you get thrashing and a seriously slow system. But on a healthy system with enough free memory, where the kernel can swap out pages not because it has to but because it makes sense, using swap is a Good Thing.
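On Linux you can even tune how eagerly the kernel does this; the swappiness knob trades application pages against disk cache (value shown is just an example):

  # 0-100: higher values make the kernel more willing to swap out idle
  # pages in favour of disk cache
  cat /proc/sys/vm/swappiness
  sysctl -w vm.swappiness=60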
Re:Just misinformed (Score:4, Insightful)
Swap is great for a server or workstation that, once set on a single task, need never do anything else until shutdown. But for a Windows PC that could at any time have anything run on it (not to mention a sub-standard disk cache system), having parts swapped out to make room for a disk cache that doesn't do a whole awful lot is less than optimal.
This is of course the point where you point out that converting all your junk to Windows Vista and training all your staff to use the new Office 2007 "ribbon" costs about the same as training them to use Linux and OOo, the latter being a lot cheaper too
Re:Just misinformed (Score:5, Interesting)
Uh... no. I've heard that argument before, but I don't buy it. An ideal system is one that doesn't page anything out to disk. In fact, I make it a point to always have enough memory that my pageout count does not increase during normal use. As soon as you page out anything, you're taking a performance hit. Period. Paging out data to make room for disk cache is almost never a good idea, as the chances of later needing to access a data page in an application are typically far greater than the chances of reusing a randomly read block on disk.
A disk cache (a limited amount of readahead notwithstanding) is only useful for data that is used more than once, which makes it a highly transient data store. Storing most data in cache longer than a few minutes usually doesn't buy you anything in terms of performance because if data isn't reused fairly soon after initial use, odds are it won't ever be.
By contrast, data explicitly loaded into RAM by an application (assuming the app is reasonably well written) is in memory for a reason, and if the data were transient, the app would have repeatedly reused a single chunk of temporary storage instead of keeping the data around. The odds that any data won't ever be used again should be vanishingly small unless an app is written poorly. For example, in a word processor, the majority of memory pages associated with a file will probably get touched when you save changes to disk even if you never actually scroll to the end of the file. Yes, there are ways to avoid that by manually organizing your data structures in memory, but it usually doesn't make sense to optimize memory organization that heavily.
In any case, regardless of memory organization, it is safe to assume that the vast majority of application data pages (not the actual executable code pages) will be reused at some point in the future. As such, paging out any of this data will require that the data be paged back in at some point, causing a noticeable stall for the user. This is slightly less significant for background daemons, but still true.
Thus, cache is a great example of the principle of diminishing returns. Doubling cache does not necessarily double the benefits. Once cache gets to a certain point, doubling it no longer significantly increases the number of additional hits in the cache. Every increase beyond that point will likely hurt performance by increasing the management overhead without actually increasing the number of successful hits.
Indeed, the only thing that makes sense to not keep in core is infrequently used code text, but that can be thrown away without ever paging it out; it can always be paged back in from the original executable if needed. Even then, an optimal system should not throw out anything except in small physical memory configurations. If you don't have enough RAM, increasing the inherently small disk cache by a small amount actually will result in a significant increase in hits and throwing out infrequently used code pages won't result in a significant performance penalty by comparison. In a large memory configuration, increasing the amount of disk cache won't have much benefit at all, and throwing out those pages probably has a much greater chance of resulting in a performance hit than throwing out a previously read random block on disk; if a page has been used once, the odds are better that it will be used again.
Note: this all assumes that your OS is smart enough to only load in application pages as they are used rather than loading the entire app in at launch. If it isn't, then you have bigger problems, of course.... :-)
Re: (Score:3, Interesting)
That depends on how you use your computer. Think about a programmer's workload. You start the compiler, it processes your code along with a ton of other files it depends on, then you get the results. And usually a few minutes later you'll have another build
Re:Just misinformed (Score:4, Interesting)
That's true, but only because a programmer's usage patterns are highly atypical. Typical usage patterns for normal users do not involve lots of short-lived processes that read in a chunk of data, process it, and exit; the UNIX way of programming (small tools with pipes between them) just didn't catch on in the general computing space, and for good reason---it's an excellent design for programmer tools, but a poor design for end user tools because users generally prefer to work on a task to completion.
Indeed, it could be argued that this is the result of a toolchain that is not well designed. One could easily imagine an IDE that memory maps the source tree and binary build results into RAM, sharing those pages with the short-lived compiler. This would end up being a much faster workflow because instead of having to go through filesystem lookups and read the blocks (which is relatively slow even from cache), the compiler would simply be handed the data it needs, or at least a series of VM pages that will contain the needed content after it is paged in. You would pay the penalty once at launch time (or better yet, defer mapping until the data from a particular file is needed) and never pay the lookup penalty again. That would make it more in line with a typical user's usage patterns.
The typical computer user (not programmer) reboots their computer every few days, whether because they need to swap batteries in a laptop, because the kernel is leaking memory, etc. For them, that gives a fairly short upper bound to the lifespan of data in the cache. They typically run an application that loads or memory maps a file into RAM, work with it for a period of minutes or even hours, then close the file and work on something else. They don't open the file and close it repeatedly, and neither do their applications. That's a very degenerate usage pattern.... :-)
Not at all. If you double the size of the cache, you double the size of the data structure that maintains the cache. If you are looking up a block in a tree, for example, that means the average time to do the lookup will be log_2(2n) instead of log_2(n). While in theory, that's not a big difference, in practice, that translates to an average of one extra tree node before you get to the node you're looking for. Multiply that times the number of cache lookups, and it adds up. Such a hit must be justified by the increased size of the cache. If the computer reboots before that data is needed again, you've just made every disk operation take an extra few dozen CPU cycles with no benefit.
Put another way, if you have 512 megs of cache and are working with 512KB blocks, you have 1024 blocks cached, and your tree depth is 10 nodes (log_2 of 1024). If you double the size of the cache, you are adding one extra hop, so you have increased the amount of cache lookup time by an average of 10% for every lookup, including those that get served from the cache. That means that your cache is now effectively 10% slower than it was before. If most of your data is already cached, that's a pretty huge impact, and if you are pulling data from cache frequently, it can easily exceed the gains you'd get from occasionally saving a disk read.
Also, bear in mind that there is a psychological advantage to using the smaller cache in such a case. Pausing once a minute for a full second to read bits in from disk is far less annoying to the user than adding a .1 second delay every time the user pulls down a
Re: (Score:2)
What he said. I'm 184MB into swap on a mostly-idle system with 120MB free out of 1280MB installed. That's a good thing because that old data isn't competing for RAM with new processes.
Re: (Score:2)
Bzzztt wrong.
Given effectively unlimited RAM for the task at hand, swap won't matter even a little bit.
Bzzzt yourself (Score:2)
No more juddering, no more half-burnt CDs because you dared to move the mouse around while it was burning
Guess which feature they took out of XP?
That's right - the ability to limit the size of the disk cache!
So we're back to using stone-age machines which grind to a halt for half a minute
Re: (Score:2)
In fact, Level 2 cache is for... Oh well, nevermind.
Re: (Score:2)
Because your idea of "unused" and Linux's idea of "unused" don't always match up well.
Re: (Score:2)
Re: (Score:3)
Re: (Score:3, Funny)
You are not very good at this. People on Slashdot are more or less immune to trolls that use racial slurs. In Troll 101, you should have at least learned to disparage an OS or programming language if you really want to rile people up. Here is a good troll that is also topical:
"Linux is not a very good OS to use for swapping to the video card. Its video bus support is hopelessly dated and slow, though you can use the experimental driver if you patch the kernel."
That simple statement will get you
Re: (Score:3, Insightful)
Besides, Slashdotters have never bought the "why are you running if you have nothing to hide" argument.
Re:Quote: Direct Rendering or fast swap. Your choi (Score:2)
He also stated the video card was built in, so he can't even put it to good use in another machine.
Re: (Score:2)
Re:Short answer: (Score:5, Insightful)
You seem to be advocating wasting perfectly good VRAM in favor of buying more system RAM. If the VRAM is essentially free (i.e., it comes with the system no matter what), there is no good reason not to try to put it to good use.
Also, your "No" is completely unqualified. You offer no details of how VRAM performs worse as swap space than hard drives, let alone actual benchmarks or citations. (And I have the feeling that most graphics memory would be significantly better than your average IDE hard drive for swapping.)
Mod parent overrated.
Re: (Score:2)
Video RAM is used to store everything, including shader programs. Card makers don't like their hardware getting a reputation for crashing people's games when a shader divides by zero. Plus, RAM manufacturers make a commodity; they don't have a bin of "low-grade flaky RAM that will just go into video cards". Well, maybe Acer does.