Data Storage Operating Systems Software Hardware

Is Video RAM a Good Swap Device? 235

Posted by kdawson
from the it's-there-why-not-use-it dept.
sean4u writes "I use a 'lucky' (inexplicably still working) headless desktop PC to serve pages for a low-volume e-commerce site. I came across a gentoo-wiki.com page and this linuxnews.pl page that suggested the interesting possibility of using the Video RAM of the built-in video adapter as a swap device or RAM disk. The instructions worked a treat, but I'm curious as to how good a substitute this can be for swap space on disk. In my (amateurish) test, hdparm -t tells me the Video RAM block device is 3 times slower than the aging disk I currently use. If you've used this technique, what performance do you get? Is the poor performance report from hdparm a feature of the hardware, or the Memory Technology Device driver? What do you use to measure swap performance?"
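A rough way to compare two devices, in the spirit of the submitter's `hdparm -t` test, is to time a sequential read directly. This is a minimal sketch, not a replacement for `hdparm`: the function name and the default sizes are made up for illustration, and repeat runs can be inflated by the page cache. Swap traffic is also mostly small page-sized transfers, so a sequential figure is only a first approximation of swap performance.

```python
import time


def sequential_read_mb_per_s(path, total_bytes=64 * 1024 * 1024, chunk_size=1024 * 1024):
    """Read up to total_bytes sequentially from path and return throughput in MB/s."""
    read_so_far = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffering; the kernel page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while read_so_far < total_bytes:
            chunk = f.read(min(chunk_size, total_bytes - read_so_far))
            if not chunk:
                break  # end of file or device reached early
            read_so_far += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0 or read_so_far == 0:
        return 0.0
    return (read_so_far / (1024 * 1024)) / elapsed
```

Usage would look like `sequential_read_mb_per_s("/dev/mtdblock0")`, run as root. The device path is a hypothetical example; use whatever block device your setup actually creates.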

  • by redelm (54142) on Thursday October 11, 2007 @10:43AM (#20939913) Homepage
    This is certainly a clever idea for small amounts of swap (~256 MB). But to make it work well, you'd have to find the GPU commands for block moves from main RAM to vidRAM. That's the only way to activate the AGPx2 and higher modes.

    But there is a fundamental problem: vidRAM is optimized for writes from main RAM. Not reads. In many cases, reading vidram is extremely slow because the raster generator is busy reading it. Writes are buffered. Reads cannot be.

  • Just misinformed (Score:4, Interesting)

    by spun (1352) <loverevolutionary@@@yahoo...com> on Thursday October 11, 2007 @10:48AM (#20939983) Journal
    It doesn't come across as troll or offtopic, just misinformed. If you can swap out an unused page of code or data to provide more room for disk cache, why not do it? You should take a look at what your OS is actually doing with memory some time.
  • Re:Just misinformed (Score:3, Interesting)

    by Corporate Troll (537873) on Thursday October 11, 2007 @11:17AM (#20940423) Homepage Journal

    Swap is more than that... It also allows you to recover a machine that runs out of memory due to a runaway process. Logging in remotely won't work if no process gets memory anymore, so you can't kill the runaway process. With swap, you'll be able to log in, kill the process and recover the machine. That said, it won't be fast, but at least you've got an option.

    Read up on Virtual Memory [wikipedia.org], because there is much more behind it than just "dumping memory that's not used to disk".

  • Re:Just misinformed (Score:5, Interesting)

    by dgatwood (11270) on Thursday October 11, 2007 @12:36PM (#20941641) Journal

    Uh... no. I've heard that argument before, but I don't buy it. An ideal system is one that doesn't page anything out to disk. In fact, I make it a point to always have enough memory that my pageout count does not increase during normal use. As soon as you page out anything, you're taking a performance hit. Period. Paging out data to disk in order to make room for disk cache is almost never a good idea, as the chances of needing to later access a data page in an application are typically far greater than the chances of reusing a randomly read block on disk.

    A disk cache (a limited amount of readahead notwithstanding) is only useful for data that is used more than once, which makes it a highly transient data store. Storing most data in cache longer than a few minutes usually doesn't buy you anything in terms of performance because if data isn't reused fairly soon after initial use, odds are it won't ever be.

    By contrast, data explicitly loaded into RAM by an application (assuming the app is reasonably well written) is in memory for a reason, and if the data were transient, the app would have repeatedly reused a single chunk of temporary storage instead of keeping the data around. The odds that any data won't ever be used again should be vanishingly small unless an app is written poorly. For example, in a word processor, the majority of memory pages associated with a file will probably get touched when you save changes to disk even if you never actually scroll to the end of the file. Yes, there are ways to avoid that by manually organizing your data structures in memory, but it usually doesn't make sense to optimize memory organization that heavily.

    In any case, regardless of memory organization, it is safe to assume that the vast majority of application data pages (not the actual executable code pages) will be reused at some point in the future. As such, paging out any of this data will require that the data be paged back in at some point, causing a noticeable stall for the user. This is slightly less significant for background daemons, but still true.

    Thus, cache is a great example of the principle of diminishing returns. Doubling cache does not necessarily double the benefits. Once cache gets to a certain point, doubling it no longer significantly increases the number of additional hits in the cache. Every increase beyond that point will likely hurt performance by increasing the management overhead without actually increasing the number of successful hits.
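The diminishing-returns claim can be explored with a small simulation. This is a sketch under loud assumptions: the cache is plain LRU, and the access pattern is a synthetic heavy-tailed one invented here (`skewed_accesses` is a hypothetical helper), not a trace from any real system. It measures hit rate only and does not model the management overhead the comment mentions.

```python
import random
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Replay block ids through an LRU cache of `capacity` blocks; return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses) if accesses else 0.0


def skewed_accesses(n_accesses, n_blocks, seed=0):
    """Assumed workload: a few hot blocks and a long tail of rarely used ones."""
    rng = random.Random(seed)
    # paretovariate yields heavy-tailed values >= 1; the modulo folds them into block ids.
    return [int(rng.paretovariate(1.0)) % n_blocks for _ in range(n_accesses)]


if __name__ == "__main__":
    trace = skewed_accesses(100_000, 10_000)
    for capacity in (64, 128, 256, 512, 1024, 2048):
        print(capacity, round(lru_hit_rate(trace, capacity), 3))
```

With a skewed pattern like this one, the gain from each doubling typically shrinks as the cache grows. How quickly that happens depends entirely on the workload, which is the point the replies below argue over.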

    Indeed, the only thing that makes sense to not keep in core is infrequently used code text, but that can be thrown away without ever paging it out; it can always be paged back in from the original executable if needed. Even then, an optimal system should not throw out anything except in small physical memory configurations. If you don't have enough RAM, increasing the inherently small disk cache by a small amount actually will result in a significant increase in hits and throwing out infrequently used code pages won't result in a significant performance penalty by comparison. In a large memory configuration, increasing the amount of disk cache won't have much benefit at all, and throwing out those pages probably has a much greater chance of resulting in a performance hit than throwing out a previously read random block on disk; if a page has been used once, the odds are better that it will be used again.

    Note: this all assumes that your OS is smart enough to only load in application pages as they are used rather than loading the entire app in at launch. If it isn't, then you have bigger problems, of course.... :-)

  • by conspirator57 (1123519) on Thursday October 11, 2007 @12:58PM (#20941951)
    giant, slow, purple monster of a computer that is like 10 years old. It has the name Onyx on the front. i think it has like 1GB of vram.

    I also have this 12-15 year-old beater of an Intergraph box. i think it has like 64MB of vram.

    Just 'cause it's old doesn't mean it was or is lame.

    That beater of an Onyx can still thrash your SLi.

    And the Intergraph's video card was EISA or microchannel, i can't remember which.

    But mostly i'm just pointing out corner cases because other repliers to the parent felt it necessary to trash old gear. remember, in 10 years today's gear will be bupkus.

  • Re:Just misinformed (Score:3, Interesting)

    by edwdig (47888) on Thursday October 11, 2007 @01:50PM (#20942609)
    Paging out data to disk in order to make room for disk cache is almost never a good idea, as the chances of needing to later access a data page in an application are typically far greater than the chances of reusing a randomly read block on disk.

    That depends on how you use your computer. Think about a programmer's workload. You start the compiler, it processes your code along with a ton of other files it depends on, then you get the results. And usually a few minutes later you'll have another build with more changes.

    By contrast, data explicitly loaded into RAM by an application (assuming the app is reasonably well written) is in memory for a reason, and if the data were transient, the app would have repeatedly reused a single chunk of temporary storage instead of keeping the data around. The odds that any data won't ever be used again should be vanishingly small unless an app is written poorly.

    Again, programmer's workflow. Take a program like Visual Studio. It'll read in and parse your entire program and all the headers it's dependent on. It'll store a tree in memory of all the symbols within the scope of the project. On a large project, only a very small portion of that symbol data will be useful to what you're currently working on, but the IDE has no way of knowing what code you're going to write, so it can't trim down that data.

    Thus, cache is a great example of the principle of diminishing returns. Doubling cache does not necessarily double the benefits. Once cache gets to a certain point, doubling it no longer significantly increases the number of additional hits in the cache. Every increase beyond that point will likely hurt performance by increasing the management overhead without actually increasing the number of successful hits.

    You probably need a ridiculous amount of RAM and an extreme habit of opening files once and never using them again before that would have a detectable impact.

    Indeed, the only thing that makes sense to not keep in core is infrequently used code text, but that can be thrown away without ever paging it out; it can always be paged back in from the original executable if needed. ...
    Note: this all assumes that your OS is smart enough to only load in application pages as they are used rather than loading the entire app in at launch.


    You're forgetting about one big factor. Code relocation. Any time code calls a shared library, the program loader has to fill in the appropriate memory address at load time. The code loaded does not directly match the code on disk anymore. In a C++ program, there's a LOT of relocations that need to be done. They're slow. Look into all the "Why does KDE take so long to start up?" stories if you don't believe me. The relocations have two side effects relevant to this discussion:

    1) A much larger portion of the executable gets loaded at startup than you would otherwise think.

    2) Sections of code with relocations can't simply be reread from disk and must instead be swapped as needed, unless you move the relocation logic from userspace and into the kernel.
  • PS3 VRAM Swap (Score:4, Interesting)

    by Doc Ruby (173196) on Thursday October 11, 2007 @02:56PM (#20943693) Homepage Journal
    Since the PlayStation 3 has only a small main memory that's hardwired and nonexpandable (Sony's lamest design decision of all), the Linux that runs on it is severely constrained. PS3 Linux is constantly swapping to compensate for the small memory. But the PS3 does have another small VRAM bank (that's extremely fast XDR). PS3 Linux hackers are working on using VRAM as swap, out of necessity. Their design analysis is probably instructive for anyone considering any platform's VRAM as swap.
  • Re:Just misinformed (Score:4, Interesting)

    by dgatwood (11270) on Thursday October 11, 2007 @03:43PM (#20944505) Journal

    That depends on how you use your computer. Think about a programmer's workload. You start the compiler, it processes your code along with a ton of other files it depends on, then you get the results. And usually a few minutes later you'll have another build with more changes.

    That's true, but only because a programmer's usage patterns are highly atypical. Typical usage patterns for normal users do not involve lots of short-lived processes that read in a chunk of data, process it, and exit; the UNIX way of programming (small tools with pipes between them) just didn't catch on in the general computing space, and for good reason---it's an excellent design for programmer tools, but a poor design for end user tools because users generally prefer to work on a task to completion.

    Indeed, it could be argued that this is the result of a toolchain that is not well designed. One could easily imagine an IDE that memory maps the source tree and binary build results into RAM, sharing those pages with the short-lived compiler. This would end up being a much faster workflow because instead of having to go through filesystem lookups and read the blocks (which is relatively slow even from cache), the compiler would simply be handed the data it needs, or at least a series of VM pages that will contain the needed content after it is paged in. You would pay the penalty once at launch time (or better yet, defer mapping until the data from a particular file is needed) and never pay the lookup penalty again. That would make it more in line with a typical user's usage patterns.
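The basic mechanism behind that idea, reading a file through a memory mapping instead of explicit read calls, can be sketched as follows. This is only an illustration: it maps a single file in a single process, and the function name is made up. It does not implement the IDE-to-compiler page sharing imagined above.

```python
import mmap


def count_lines_mapped(path):
    """Count newline bytes in a file through a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            count = 0
            pos = mapped.find(b"\n")
            while pos != -1:
                count += 1
                pos = mapped.find(b"\n", pos + 1)
            return count
```

Pages of the mapping are faulted in from the file as they are touched, which matches the "defer mapping until the data is needed" behaviour the comment describes.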

    The typical computer user (not programmer) reboots their computer every few days, whether because they need to swap batteries in a laptop, because the kernel is leaking memory, etc. For them, that gives a fairly short upper bound to the lifespan of data in the cache. They typically run an application that loads or memory maps a file into RAM, work with it for a period of minutes or even hours, then close the file and work on something else. They don't open the file and close it repeatedly, and neither do their applications. That's a very degenerate usage pattern.... :-)

    You probably need a ridiculous amount of RAM and an extreme habit of opening files once and never using them again before that would have a detectable impact.

    Not at all. If you double the size of the cache, you double the size of the data structure that maintains the cache. If you are looking up a block in a tree, for example, that means the average time to do the lookup will be log_2(2n) instead of log_2(n). While in theory, that's not a big difference, in practice, that translates to an average of one extra tree node before you get to the node you're looking for. Multiply that times the number of cache lookups, and it adds up. Such a hit must be justified by the increased size of the cache. If the computer reboots before that data is needed again, you've just made every disk operation take an extra few dozen CPU cycles with no benefit.

    Put another way, if you have 512 megs of cache and are working with 512KB blocks, you have 1024 blocks cached, and your tree depth is 10 nodes (log_2 of 1024). If you double the size of the cache, you are adding one extra hop, so you have increased the amount of cache lookup time by an average of 10% for every lookup, including those that get served from the cache. That means that your cache is now effectively 10% slower than it was before. If most of your data is already cached, that's a pretty huge impact, and if you are pulling data from cache frequently, it can easily exceed the gains you'd get from occasionally saving a disk read.
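The arithmetic in that example can be checked directly. The sketch below uses the comment's own model, which is an assumption rather than a description of any particular kernel: a balanced binary tree whose lookup cost is log_2 of the number of cached blocks. The function name is invented here.

```python
import math

KIB = 1024
MIB = 1024 * KIB


def lookup_depth(cache_bytes, block_bytes):
    """Tree depth for a cache, assuming a balanced binary tree over its blocks."""
    blocks = cache_bytes // block_bytes
    return math.log2(blocks)


before = lookup_depth(512 * MIB, 512 * KIB)   # 1024 blocks -> depth 10.0
after = lookup_depth(1024 * MIB, 512 * KIB)   # 2048 blocks -> depth 11.0
relative_increase = (after - before) / before  # 0.1, the 10% from the comment

if __name__ == "__main__":
    print(before, after, relative_increase)
```

The relative cost of the extra hop depends on block size. With 4 KiB blocks (a figure assumed here, not taken from the comment), 512 MiB is 131072 blocks and depth 17, so doubling the cache adds one hop in 17, about 6%.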

    Also, bear in mind that there is a psychological advantage to using the smaller cache in such a case. Pausing once a minute for a full second to read bits in from disk is far less annoying to the user than adding a .1 second delay every time the user pulls down a

  • Who uses swap? (Score:1, Interesting)

    by Anonymous Coward on Thursday October 11, 2007 @08:24PM (#20948031)
    No one seems to be asking the question as to why swap is even being used these days. I have dozens of servers and the swap space is barely touched. Main memory is cheap so just forget about swap. It's a hack from the days when memory wasn't cheap.
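Whether swap is "barely touched" on a Linux machine can be checked from `/proc/meminfo`, which reports `SwapTotal` and `SwapFree` in kB. A minimal sketch, with hypothetical function names and a parser that takes the file's text so it can be tried on any sample:

```python
def swap_usage_kb(meminfo_text):
    """Return (total_kb, used_kb) parsed from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free


def read_swap_usage_kb(path="/proc/meminfo"):
    """Linux only: read the live values from the proc filesystem."""
    with open(path) as f:
        return swap_usage_kb(f.read())
```

A snapshot like this shows how much swap is occupied right now. It does not show how often pages move in and out, which is what actually affects performance.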
