
ARM Designer Steve Furber On Energy-Efficient Computing

Posted by timothy
from the tell-us-how-it's-done dept.
ChelleChelle writes "By now, it has become evident that we are facing an energy problem — while our primary sources of energy are running out, the demand for energy is greatly increasing. In the face of this issue, energy-efficient computing has become a hot topic. For those looking for lessons, who better to ask than Steve Furber, the principal designer of the ARM (Acorn RISC Machine), a prime example of a chip that is simple, low power, and low cost. In this interview, conducted by David Brown of Sun's Solaris Engineering Group, Furber shares some of the lessons and tips on energy-efficient computing that he has learned through working on this and subsequent projects."
  • by Darkness404 (1287218) on Thursday February 25, 2010 @05:54PM (#31278356)

    (record the uptime with a pen+paper if you want to keep a running total)

    ...That isn't true uptime. The point of uptime being a bragging right isn't that you have an APC, but rather that your computer is configured correctly so it doesn't randomly run out of memory and crash, your hardware doesn't overheat after a week of constant usage, etc. Almost -any- computer can work 12 hours and not fail. Finding a computer that will last a year or more without rebooting is hard. Simply adding up the amount of time the computer has been on is not uptime, not in the least.

  • by Michael Kristopeit (1751814) on Thursday February 25, 2010 @06:17PM (#31278722)
    my mac mini only pulls 14W and isn't far behind your current desktop in performance. my fit-pc2 only pulls 6W. relative to power consumption, both of those machines are easily besting supercomputers from the 80s as well as your current desktop.
  • Re:Bull... (Score:4, Informative)

    by Colin Smith (2679) on Thursday February 25, 2010 @06:46PM (#31279106)

    Look up "Energy Return on Energy Invested".

    Saudi oil has been 100:1.

    Shale... 5:1 maybe, 3:1.
    http://en.wikipedia.org/wiki/Oil_shale [wikipedia.org]

    When they really start using shale, you know the shit is really hitting the fan.

    And no matter how much is left (quoted in the reserve figures as recoverable), even if it's a trillion trillion barrels, nobody is going to bother trying to get it out when it takes a unit of energy in to get a unit of energy out.

  • by farble1670 (803356) on Thursday February 25, 2010 @07:13PM (#31279350)

    Computers have -always- tried to be energy-efficient in the portable sector. And quite honestly, it's about the only sector that needs work on energy-efficiency to gain any benefit.

    that couldn't be further from the truth. energy costs are just going up. for households it's mildly important as they can usually sleep their computers when not in use. for businesses, energy efficiency ranges from very important to critical when they have massive server rooms full of tens of thousands of CPUs powered and busy 24x7.

    moreover, for developing countries, it's again critical. while the L in OLPC stands for laptop, so it technically qualifies as mobile, it's more about having a battery to deal with locales where the electric grid is often shut down, either on purpose to save energy or inadvertently because of a poor / out-of-date infrastructure.

  • by farble1670 (803356) on Thursday February 25, 2010 @07:18PM (#31279404)

    I can 'suspend' but none of that junk ever works properly on WinTel

    every computer i've owned or used in the last 10 years has been able to hibernate or sleep. that includes macs, linux and win98 to win 7. if you buy a computer that can't reliably sleep, you should return it and get your $ back.

  • by BestNicksRTaken (582194) on Thursday February 25, 2010 @07:37PM (#31279548)

    It was Acorn RISC Machine for almost a decade before it became Advanced RISC Machines Ltd, back when Furber ran the show.

  • by h4rr4r (612664) on Thursday February 25, 2010 @07:43PM (#31279594)

    Nasty, dirty shitty coal. Coal power should just be illegal already.

    Nuke, wind, solar, and natural gas are all alternatives with far less pollution and CO2 release.

  • by emt377 (610337) on Thursday February 25, 2010 @07:48PM (#31279644)

    Just last year they found a new oilfield off of Brazil bigger than anything found yet. Last year. After everyone said no new large fields would ever be found.

    The Tupi field is estimated to hold 8 billion barrels of oil. Given our current global consumption, that's a three-month extension. It's the biggest field discovered in 30 years - which is pretty telling. Find ten of these and we've got a few extra years. Find only another one or two and it makes no difference. Meanwhile, when the global business cycle points up again, our oil consumption is going to climb with it - again. Prices will rocket, and economic growth will be choked. Oil really is a limited resource, and the way we've built our entire economy around it is going to limit our capacity for global growth.
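
    (Rough arithmetic behind the three-month figure, assuming roughly 85 million barrels per day of global consumption around 2010: 8,000,000,000 barrels / 85,000,000 barrels per day is about 94 days, i.e. roughly three months.)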

  • by TheRaven64 (641858) on Thursday February 25, 2010 @07:53PM (#31279678) Journal
    Probably from a reliable source. The chip that he designed was the Acorn RISC Machine. When ARM was spun out as a joint venture with Apple, it was renamed. Advanced RISC Machines is a backronym intended to keep the same initials but remove the Acorn branding (which Apple didn't want).
  • by TheRaven64 (641858) on Thursday February 25, 2010 @07:59PM (#31279746) Journal

    Actually, you're only half right. On ARM, there is typically no double-precision hardware, so you get a very slow path for 64-bit floating point arithmetic. On your Core 2, it's more complicated. The x87 unit only supports 80-bit floating point values. This means that any float or double is extended to 80 bits when it is loaded into a register. You gain slightly better cache usage from using 32-bit floats, but that's it.

    On both, however, if you compile with SSE then you will be using the vector unit for all floating point operations. With floats, the compiler can pack concurrent operations on four of them into a single instruction; with doubles it can only pack two. I'm not sure about the Atom, but I vaguely remember that it splits SSE ops in half, so a 128-bit operation really executes as two 64-bit halves. Either way, you can do twice as many float operations in the same power envelope, as long as your code is suited to vectorisation.

    Modern compilers prefer to target SSE instead of x87, because register allocation with x87 is painful. Most operations only work between the top two registers in the 'stack', so you need a lot of register-register copies in a typical bit of x87 code (which burns i-cache too). This is one of the main reasons why you see a performance improvement in x86-64; if you have a 64-bit chip you can guarantee the presence of SSE, so the compiler will always use SSE instead of x87 when compiling 64-bit code. If you're someone like Apple and don't support pre-SSE chips, you can also do this and get the same benefit in 32-bit mode.
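
    A minimal sketch of the float-vs-double packing point, assuming GCC or Clang targeting x86-64 (where SSE2 is always present) with optimisation enabled, e.g. gcc -O3; the function names here are made up for illustration:

    /* With 32-bit floats, the compiler can pack four adds into one
     * 128-bit SSE instruction (addps); with 64-bit doubles it can
     * only pack two per instruction (addpd). */
    void add_floats(float *restrict dst, const float *restrict a,
                    const float *restrict b, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] = a[i] + b[i];
    }

    void add_doubles(double *restrict dst, const double *restrict a,
                     const double *restrict b, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] = a[i] + b[i];
    }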

  • by Cajun Hell (725246) on Thursday February 25, 2010 @08:02PM (#31279786) Homepage Journal

    And quite honestly, it's about the only sector that needs work on energy-efficiency to gain any benefit.

    Google disagrees with you, in a really big way.

    Also, anyone who has hooked up a Kill-A-Watt to their computer, and then calculated how much money per year they're spending on it, disagrees with you.
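
    (For a rough sense of scale, assuming a desktop drawing 100 W around the clock at $0.12/kWh: 0.1 kW x 24 x 365 is about 876 kWh, or roughly $105 a year, before you count the air conditioning.)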

    This one asshole [nbc-2.com] spent an estimated half a million dollars (of someone else's money) on electricity (which is probably the main reason he really got in trouble), not counting the harder-to-measure increased electric bill for the air conditioning (he was doing this in Arizona).

    Energy costs money. People care about money.

  • by marcansoft (727665) <hector.marcansoft@com> on Thursday February 25, 2010 @11:31PM (#31281190) Homepage

    Run PowerTOP on Linux (and use a tickless kernel, of course). There are some offenders, but most of those background services aren't using any power. As long as the processes are sleeping most of the time and don't wake up often (once every few seconds at most), they aren't going to increase your power usage.

    There are a few slightly annoying ones (ntp tends to wake up once per second, and I think mysql wakes up twice per second), but most of the crap comes from poorly-written GUI apps that poll for stuff or feel the need to wake up tens or hundreds of times per second. Bad user preferences also don't help (hint: anything that's moving on the screen at any sort of framerate while the computer is otherwise idle is going to massively increase power usage over a truly quiescent CPU).
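
    A rough illustration of the difference, assuming Linux/POSIX and a hypothetical daemon watching a file descriptor: the first loop wakes the CPU a hundred times a second whether or not there is any work, which is exactly the kind of thing PowerTOP flags; the second blocks in the kernel until something actually arrives.

    #include <poll.h>
    #include <unistd.h>

    /* Bad: wakes up 100 times per second even when idle, keeping the
     * CPU from settling into its deep sleep states. */
    void busy_poll(int fd)
    {
        for (;;) {
            usleep(10000);             /* 10 ms nap, then look for data */
            /* ... check fd, usually finding nothing ... */
        }
    }

    /* Better: sleep in the kernel until the descriptor is readable;
     * zero wakeups while idle. */
    void event_driven(int fd)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        for (;;) {
            if (poll(&pfd, 1, -1) > 0) {   /* -1 = wait indefinitely */
                /* ... handle data on fd ... */
            }
        }
    }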

  • by i8degrees (410294) on Friday February 26, 2010 @04:20AM (#31282532)

    Well, by computer, I meant not a server, and one that has constant general use. Of course it's easy to find a server that will run for years. It's a lot harder to find a computer being constantly used, updated and on the internet that hasn't been rebooted in a year.

    Huh? You just described the very piece of computer technology otherwise known as a "server". I don't post here often, but this comment just makes my jaw drop unusually low for some reason, so I apologize ahead of time if I sound offensive...

    1) A server, as described in a TCP/IP client & server topology, is a computer that is "always on" the internet, or so you very much hope, especially if it is being used for "business" oriented tasks. Those packets get served to your "clients", who serve you back, and this process cycles over until satisfied. Does this sound like another real world scenario to you?

    2) A server is often updated at least once a week, sometimes more, depending on the sort of update. You could even argue that a server is often the most frequently updated piece of equipment, generally speaking. (Some development systems could certainly be an exception, for instance!)

    3) A server, by definition, is tasked with the role of serving, or "hosting", information back to said requesting clients, who can be, and often are, making all sorts of requests at the same time (concurrently), hitting your resources significantly as the volume of concurrency rises. This often taxes the hardware in several key areas, sometimes very much like a video game (on steroids, even!), especially if you do not have a network topology that allows those serving tasks to be better distributed.

    In summary, have you ever tried keeping one, much less several, parallel video surveillance systems up for months to years at a time, with minimal downtime, when any outage could leave you liable to your real-life business clients? Sometimes you don't have the luxury of redundancy to save your ass. When incoming bandwidth is not your bottleneck, you've got to keep your disks seeking as fast as they can, keep your concurrency fast, and still leave enough time for the distributed mass A/V encoding of each day's surveillance footage before you can consider it a solid "tape" and push it off to another system for backup, which, mind you, is getting hit by many other requests in parallel to this first example.

    Okay, so I did a terrible "summary", especially now as I am continuing onward with this new paragraph, but in other words ... real-life servers do not often have it easy! Their CPU, memory and disk I/O are often pushed near or at maximum utilization if you are trying to get every dollar out of your time & money spent. I push my personal workstations plenty hard, but not anywhere close to what some of my servers have gone through on a daily basis!

    P.S. My examples are only from my own personal real-life business experience in the "small time" playing league -- maybe 0.01% of what the big name-brand servers running the web sites most of us depend on every day, free or paid, go through. Yes, my "servers" were indeed a mixture of the real stuff and commodity desktop-class hardware. While I cannot prove this, I've got both win32 and personal *nix workstations that have indeed stayed up without a reset for a year. When your hardware does not fail, and you have a settled setup, plenty of other systems to run your testing on, and the right style of administration, you can easily scale that to years. ksplice can be used on personal desktops / workstations, too.

    You only *simply* have to devote a lifetime to computer science to learn the trade down to a true art! :-)

  • by Anonymous Coward on Friday February 26, 2010 @06:12AM (#31283088)

    Nice overview, but I do not believe it to be completely correct.

    Although it is technically possible to execute 4 float or 2 double operations in a single SSE operation, the user usually has to code explicitly for those operations to take full advantage. Vectorization (combining multiple float operations into a single instruction) is often difficult because it is only possible with the same operation on "adjacent" values; it is not possible to combine arbitrary operations. As a result, making efficient use of this capability often requires a rewrite of the algorithm, and compilers are usually not very good at it.
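
    As a small sketch of what "adjacent" means in practice (SSE intrinsics; this assumes the arrays are 16-byte aligned and n is a multiple of 4), _mm_add_ps only combines lane 0 with lane 0, lane 1 with lane 1, and so on, so the values you want added together have to sit next to each other in memory:

    #include <xmmintrin.h>   /* SSE intrinsics */

    /* c[i] = a[i] + b[i], four adjacent floats per instruction. */
    void add4(float *c, const float *a, const float *b, int n)
    {
        for (int i = 0; i < n; i += 4) {
            __m128 va = _mm_load_ps(a + i);   /* load 4 adjacent floats */
            __m128 vb = _mm_load_ps(b + i);
            _mm_store_ps(c + i, _mm_add_ps(va, vb)); /* 4 adds at once */
        }
    }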

    SSE1 only operated on float values, but SSE2 added support for doubles as well, and the instruction set specifically makes it possible to also operate on single float or double values. This is indeed due to the clumsiness of the x87 FP stack.

    However, the difference in performance between the FP stack and scalar SSE2 instructions is not always as huge as might be expected. Underneath the FP-stack-based ISA, modern x86 processors implement a register-based architecture. And while the stack only allows operations on the stack top, the ISA offers an exchange operation to swap the top with an arbitrary stack element. Because the underlying architecture is register based, such an exchange just reassociates a different register with a different stack location, and no data is moved. As a result, this operation is extremely efficient (practically free, apart from the opcode bytes).

    There are, however, other advantages to using SSE instead of x87. For example, rounding a float to an integer is extremely expensive on the FP unit. This is because a global mode switch has to be performed to change the rounding mode to truncation, and back again, around each such conversion. Switching this mode basically stalls the FP pipeline. In SSE, on the other hand, a dedicated instruction was added to truncate to an integer for exactly this reason.
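
    For what it's worth, the instruction in question is cvttss2si; with SSE math enabled a plain C cast already compiles to it, and the intrinsic below (a minimal sketch, not from the original comment) simply requests it explicitly, with no rounding-mode switch involved:

    #include <xmmintrin.h>

    /* Truncating float-to-int conversion. With SSE enabled, (int)f
     * compiles to cvttss2si; in pure x87 code the same cast needs an
     * fldcw rounding-mode switch before and after. */
    int truncate_to_int(float f)
    {
        return _mm_cvttss_si32(_mm_set_ss(f));
    }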

    As for the performance benefit of x86-64, I believe this to be largely caused by the larger register set. In IA-32, there are only 7 registers that are sort-of general purpose. The hardware has to spend quite some effort tracking loads and stores to figure out that it can actually keep such values in registers too (underneath, a modern x86 processor has lots of registers). x86-64 makes it a lot easier for the compiler to express what it wants.

"A mind is a terrible thing to have leaking out your ears." -- The League of Sadistic Telepaths

Working...