ARM Offers First Clockless Processor Core
Sam Haine '95 writes "EETimes is reporting that ARM Holdings have developed an asynchronous processor based on the ARM9 core. The ARM996HS is thought to be the world's first commercial clockless processor. ARM announced they were developing the processor back in October 2004, along with an unnamed lead customer, which it appears could be Philips. The processor is especially suitable for automotive, medical and deeply embedded control applications. Although reduced power consumption due to the lack of clock circuitry is one benefit, the clockless design also produces a low electromagnetic signature because of the diffuse nature of digital transitions within the chip. And because clockless processors consume zero dynamic power when there is no activity, they can significantly extend battery life compared with clocked equivalents."
Soooo... (Score:5, Funny)
Re:Soooo... (Score:2, Informative)
According to http://www.arm.com/products/CPUs/ARM996HS.html [arm.com], 50-70 MHz. Plenty fast for embedded applications.
Re:Soooo... (Score:5, Funny)
Re:Soooo... (Score:5, Funny)
The correct answer is: 50000000000 - 70000000000 mHz.
Re:Soooo... (Score:2, Funny)
Just don't tell this to marketing or your boss. They might have a brain dump.
Re:Soooo... (Score:5, Funny)
Re:They come in; fast, faster, fastest and OMGspee (Score:2, Funny)
Re:They come in; fast, faster, fastest and OMGspee (Score:5, Funny)
I thought they came in Light Speed, Ridiculous Speed and LUDICROUS SPEED!
Re:Soooo... (Score:2)
Synchronisation? (Score:5, Interesting)
Re:Synchronisation? (Score:5, Informative)
This is not really any different from the way a clocked core synchronises with peripherals. These days, devices like the PXA255 used in PDAs run independent clocks for the peripherals and the CPU, which allows for things like speed stepping to save power.
Re:Synchronisation? (Score:3, Informative)
Horrible summary (Score:5, Informative)
VAX 8600 (Score:5, Interesting)
Re:VAX 8600 (Score:3, Interesting)
The VAX 8600 was produced by a team at DEC that had a heritage doing large computers (PDP-10, DECSYSTEM-20). It was competing, internally, with a different group with a "midrange" (VAX) heritage, who produced the VAX 8800 and some other machines. There was
Re:Horrible summary (Score:5, Informative)
http://en.wikipedia.org/wiki/CPU_design#Clockless
Yes, they are based on asynchronous digital logic, but calling them clockless is ok. They do NOT have a clock signal.
One of the top problems in CPU design is distributing the clock signal to every gate. It is very wasteful. Clockless CPUs are a revolution waiting to happen. And it will. The idea is just better in every respect. It will take effort to re-engineer design tools and retrain designers, but clockless designs are far superior (now that we really know how to make them, which is a recent development).
Re:Horrible summary (Score:4, Informative)
You are confused (Score:5, Interesting)
Unfortunately, self-clocked design (like the reported ARM uses) is also sometimes called "asynchronous" logic design; however, this is a completely different kind of thing than the "asynchronous" combinatorial logic used in clock-based design. Self-clocked design also does combinatorial logic in latched stages, but uses a self-timed asynchronous protocol to run the latches instead of a synchronous clock. Basically, the combinatorial logic figures out when it's finished, and tells both the next stage ("data's ready, latch it") and the input latch from the previous stage ("I'm done; gimme some more data").
To close the loop, each stage can wait until there's new data ready at its inputs, and space to put the output data. Thus, in absence of some bottleneck, your chip will simply run as fast as it can.
To overclock a self-timed design, you simply increase the voltage. No need to screw around with clock multipliers; as long as your oxide holds up, your traces don't migrate, and the chip doesn't melt...
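For the curious, here's a minimal sketch (not from TFA) of that handshake in Python, modelling each stage as a thread and each latch as a one-slot queue. The stage delays and the whole setup are invented for illustration; real hardware does this with request/acknowledge wires:

import threading, queue, time

def stage(delay_s, inbox, outbox):
    # A self-timed stage: block until the previous stage hands us data
    # ("data's ready, latch it"), spend delay_s "computing", then block
    # until the next stage has room -- that blocking put() is its ack.
    while True:
        data = inbox.get()
        if data is None:          # shutdown marker
            outbox.put(None)
            return
        time.sleep(delay_s)       # combinatorial logic settling
        outbox.put(data + 1)      # "I'm done; gimme some more data"

a, b, c = queue.Queue(1), queue.Queue(1), queue.Queue(1)
threading.Thread(target=stage, args=(0.01, a, b), daemon=True).start()  # fast stage
threading.Thread(target=stage, args=(0.05, b, c), daemon=True).start()  # slow stage

for i in range(5):
    a.put(i)                      # backpressure: put() blocks if the pipe is full
a.put(None)
while (out := c.get()) is not None:
    print("result:", out)         # overall rate is set by the slowest stage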
Re:You are confused (Score:3, Interesting)
That sounds a bit like a dataflow language [wikipedia.org]. Maybe you could make a program that automatically converts a program written in such a language into a chip design? Then we'd only need desktop chip manufacturing to make true open-sourced computing a reality...
But no, such chips would be illegal, since they wo
Asynchronous == Clockless (Score:5, Interesting)
But your assertion about the critical path is slightly off. Async processors still have a critical path. If you imagine the components as a bucket brigade and the data as the buckets, then they may not all be heaving the buckets at exactly the same time anymore, but they will still be slowed down by the slowest man in the line. The difference is that the critical path is now dynamic. You don't have to time everything to the static, worst-case component on your chip. If you consistently don't use the slowest components (say, the multiply unit), then you will get a faster IPT (instructions per unit time) on average.
And yes, you no longer have clock skew, which is nice, but you now have to handshake data back and forth across the chip. Of course, putting decoupling circuitry in can help.
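To make that concrete, here's a toy back-of-the-envelope in Python. The per-operation latencies and instruction mix are invented, and it deliberately ignores pipelining (one op at a time), but it shows why a dynamic critical path beats timing everything to the worst case:

import random

DELAYS_NS = {"add": 1.0, "load": 2.0, "mul": 5.0}   # invented per-op latencies

random.seed(0)
program = random.choices(["add", "load", "mul"], weights=[70, 25, 5], k=10_000)

clocked_time = len(program) * max(DELAYS_NS.values())  # every op pays worst case
async_time = sum(DELAYS_NS[op] for op in program)      # each op pays its own delay

print(f"clocked: {clocked_time:,.0f} ns")
print(f"async:   {async_time:,.0f} ns "
      f"(speedup {clocked_time / async_time:.1f}x on this mix)")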
Imminent (Score:2)
Re:Imminent (Score:3, Interesting)
Re:Horrible summary (Score:2)
All High School English Teachers Must Die! (Score:2)
Re:Horrible summary (Score:3, Interesting)
The most glaring is that you assume that synchronous processors can only have one clock - that's incorrect. While the clock tick is of fixed length (by design), the global clock (as seen by external parties) may run at a different speed than internal clocks.
If a path of logic takes 5 ns to complete, and its clock matches exactly, then you are perfectly optimized. You are hampered not by the clock, but by the transistors' switching speed. This path will have the
Re:Horrible summary (Score:5, Informative)
Re:Horrible summary (Score:2)
Re:Horrible summary (Score:3, Informative)
P_avg = C_eff x V_dd^2 x f
Which of course means my original comment was correct. Rather than citing confidential material and skirting the issues, how about backing up your assertions with real facts.
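To put numbers on that formula: the effective capacitance and operating points below are assumptions, just ballpark figures to show why the squared voltage term dominates:

def dynamic_power(c_eff_farads, vdd_volts, f_hertz):
    # P_avg = C_eff * Vdd^2 * f, straight from the formula above
    return c_eff_farads * vdd_volts**2 * f_hertz

C_EFF = 1e-9                                # 1 nF switched capacitance (assumed)
base = dynamic_power(C_EFF, 1.8, 70e6)      # 1.8 V at 70 MHz
slow = dynamic_power(C_EFF, 1.2, 35e6)      # lower voltage *and* lower rate

print(f"1.8 V @ 70 MHz: {base * 1e3:.1f} mW")
print(f"1.2 V @ 35 MHz: {slow * 1e3:.1f} mW ({base / slow:.1f}x less)")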
timing (Score:2, Interesting)
Re:timing (Score:4, Informative)
Re:timing (Score:2)
Re:timing (Score:3, Informative)
On the PIC series of microcontrollers, you can time any code simply by adding up the clock cycles taken by each instruction and figuring in your clock rate. There's even a nice tool to do this for you. This is often handy for simple delays; sometimes you're using all the timers or you don't want to stick stuff into a bunch of configuration registers just to slow down a loop. I don't see this sort of timing being as easy w
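Something like this, in miniature. The cycle costs below are from memory of the classic PIC16 family (most instructions 1 cycle, skips and taken branches 2), so treat them as assumptions:

CYCLES = {"movlw": 1, "movwf": 1, "decfsz": 1, "decfsz_skip": 2, "goto": 2}

def delay_loop_cycles(n):
    # The canonical "load N, then decrement-and-loop" software delay.
    setup = CYCLES["movlw"] + CYCLES["movwf"]
    body = (n - 1) * (CYCLES["decfsz"] + CYCLES["goto"])  # loop taken n-1 times
    exit_ = CYCLES["decfsz_skip"]                         # final skip out
    return setup + body + exit_

F_CYC = 1_000_000   # 1 MHz instruction rate (4 MHz crystal / 4), assumed
n = 200
cycles = delay_loop_cycles(n)
print(f"{cycles} cycles = {cycles / F_CYC * 1e6:.0f} microseconds")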
This thing sounds fast (Score:2, Funny)
I worked for ARM... (Score:5, Interesting)
Truly wonderful and a very special company for the first two of those years; then it slowly but surely went downhill - these days, it's just another company. ARM's culture didn't manage to survive its rapid growth in those few years from fewer than two hundred people to more than seven hundred.
Re:Livin' large and in charge (Score:3, Interesting)
Re:Livin' large and in charge (Score:3, Informative)
> their investors eventually demanded that they make a profit. Was that about the time
> when you left?
Your view in this matter is utterly unlike the reality of events.
ARM was an exceedingly hard-working place, and to begin with something like half the staff had PhDs. What (IMHO) happened was that with rapid growth the quality of lower and middle management in particular was diluted, and also politics, the rot of all companies,
ARM? (Score:5, Funny)
Re:ARM? (Score:2, Offtopic)
Re:ARM? (Score:3, Funny)
Re:ARM? (Score:5, Funny)
Sun was talking about clockless chips in 2001 (Score:3, Insightful)
Re:only talk (Score:4, Interesting)
Sun has clockless chips up and running (real silicon, not sims) and they have done some interesting things, but they don't have a complete system that's ready to ship. And there are other components out there that use the clockless philosophy to do certain things, but they're not CPUs in any sense. To give credit where credit is due, as the parent post points out, ARM beat Sun out the door with a clockless CPU that is a drop-in replacement (to some degree, anyway -- not clear how much) for an existing, established architecture. But that wasn't/isn't Sun's goal (although perhaps it should be...). They're pushing in new directions, not using this to reimplement current architectures.
Other Uses (Score:2, Interesting)
Re:Other Uses (Score:2)
OSX takes away the geekiness flair from the image (Score:2)
The summary (Score:5, Funny)
Those damn young'uns and their newfangled clockless clocks.
Not That Difficult (Score:5, Interesting)
One of the neatest things about async processors is their ability to run over a large range of voltages. You don't have to worry that lowering the voltage will make you miss gate setup timing, since the thing just slows down; raising the voltage shortens rise and propagation times and speeds it up. The grad students had a great demo where they powered one of their CPUs with a potato with some nails in it (like from elementary school science class). They called it the 'potato chip'.
Re:Not That Difficult (Score:5, Interesting)
depends on your pipeline (Score:2)
The alternative proposed by the research community is GALS [boisestate.edu] - globally asynchronous, loca
Re:Not That Difficult (Score:3, Interesting)
It didn't really work out. While we could easily get prototypes to work well over rated temperature ranges, getting the production version to work reliably was an order of magnitude more effort than the clocked version. As the complexity of the logic increases, the number of potential race con
First? (Score:2)
cool. I want one. Or two...
Overdoing it (Score:2, Insightful)
Current processors are clocked at whatever speed they can safely run at and many of them automatically underclock themselves if they overheat.
Without a clock, what keeps the speed at a safe level?
Re:Overdoing it (Score:2)
Interesting question. Maybe the instructions are clocked (slowed) locally within the chip.
Re:Overdoing it (Score:3, Insightful)
So, no more System clock in the corner of my OS? (Score:5, Funny)
Re:So, no more System clock in the corner of my OS (Score:2)
Computer Science has lost its history somewhere.. (Score:5, Informative)
Gads. Now that I'm "overqualified" to write software (i.e., employers don't seem to think experience is worth paying any extra for), the geek world has completely forgotten that it even has a history.
How fast is it? (Score:3, Interesting)
Re:How fast is it? (Score:5, Informative)
So this core wouldn't be designed for speed.
Also, for many embedded platforms CPU speed matters less than power consumption and bus contention.
Tom
Clockless chip overview (Score:5, Interesting)
This seems to be a good overview of clockless chips. I can't vouch for its accuracy (not my area), but the source - IEEE Computer Magazine - should be good. The article was published March 2005.
(warning: PDF) http://csdl2.computer.org/comp/mags/co/2005/03/r3018.pdf [computer.org]
Transmeta's Crusoe was supposed to be clockless (Score:5, Interesting)
What did I miss? I remember the hype, the early diagrams of how it was all supposed to weave through without the need for a clock. Would someone care to elaborate on the post-mortem of what was supposed to be the first clockless processor, 4 years ago?
Re:Transmeta's Crusoe was supposed to be clockless (Score:3, Interesting)
Re:Transmeta's Crusoe was supposed to be clockless (Score:3, Interesting)
"1997 - Intel develops an asynchronous, Pentium-compatible test chip that runs three times as fast, on half the power, as its synchronous equivalent. The device never makes it out of the lab."
So why didn't Intel's chip make it out of the lab? "It didn't provide enough of an improvement to justify a shift to a radical technology," Tristram says. "An asynchronous chip in the lab might be years ahead of any synchronous design, but the design, testing and manufacturing systems that
Sweet! (Score:5, Funny)
Why is async good (Score:5, Informative)
1. Good power consumption characteristics, i.e. low power drawn, not just because of the built-in power-down mode, but also because of the voltage the chips can run at. By pulling the voltage lower than a synchronous equivalent could, it is simpler to achieve greater power savings. This becomes possible if you are willing to sacrifice speed, and in async devices the switching speed adapts dynamically, since each block waits until the previous one is done, not until some outside clock has ticked.
2. Security: async designs give protection against side-channel power analysis attacks. Since all gates must switch (standard async design usually uses dual-rail encoding, so gates on both the +ve and -ve rails switch), differential power attacks become much harder. Thus async designs are perfect for crypto chips (hardware AES, anyone?)
3. Elegance of solution: the world is generally async. Key presses are, memory accesses are, so why not the processor?
But they have several points of disadvantage:
1. They are hard to do, especially using the synchronous design flow that most of the world uses. Synchronous tools assume, especially in RTL, that the world is combinational and that sequential bits are simply registers clocked once per cycle (not true for full-custom designs like Intel's and AMD's, but it holds for ASIC design).
2. The tools that exist now can either produce good implementations of small functions (a few gates), or implementations of larger functions that are poor - in the worst case as slow as their synchronous equivalents. Tools like Petrify (http://www.lsi.upc.edu/~jordicf/petrify/ [upc.edu]) exist, but they become unusable for circuits with more than ~50 gates.
3. Async designs are usually large. This is not always true, but standard async designs are usually implemented as dual-rail or using 1-of-M encoding on the wires. The main overhead, though, comes from the handshaking circuitry. For really fine-grained pipelining, the output of each stage must be acknowledged to the previous stage. This adds a massive overhead, as it requires a device called the Muller C element (sketched below), which drives its output to match the inputs only when they agree, and holds its previous value otherwise. Many copies of this element are usually required, and it's this that adds area: for example, a simple 1-bit OR gate that would usually take 4 transistors needs 16 transistors in a dual-rail async implementation.
For the time being, I think they will find a lot of use in low-power applications - such as embedded microcontrollers/processors in things like wireless sensor networks, and in security processors. However, I believe full-scale processor designs are still very far off.
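For anyone curious what the C element actually does, here's a minimal behavioural model in Python - just the truth-table-with-memory, not a transistor-level design:

class CElement:
    # Output follows the inputs only when they agree; otherwise it holds
    # its last value. This hysteresis is what lets async handshakes wait
    # for *all* inputs to arrive before signalling completion.
    def __init__(self):
        self.out = 0

    def update(self, a, b):
        if a == b:           # inputs agree: output follows them
            self.out = a
        return self.out      # inputs differ: hold previous value

c = CElement()
for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    print(f"a={a} b={b} -> out={c.update(a, b)}")
# prints 0, 0 (held), 1, 1 (held), 0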
Re:Why is async good (Score:2, Interesting)
Re:Why is async good (Score:5, Interesting)
You're right about that. I research side-channel attacks on crypto hardware, and my first response to this was: well, this would make EM analysis more complicated. For those not familiar with the general approach, in side-channel attacks you don't try to do anything as complicated as breaking the underlying math of the crypto. Instead you observe the hardware for emissions that can give some clues as to the instructions being carried out. If your observations give you any information about what the chip is processing, you might learn parts of keys or gain a statistical advantage in other attacks. So if it's harder to observe the signals emitted electromagnetically from the chip, then attacking the hardware is harder.
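For the curious, here's a toy Python sketch of the statistical idea. The S-box, leakage model, and noise level are all invented for illustration - not any real cipher or measurement setup:

import random

random.seed(1)
SBOX = list(range(256))
random.shuffle(SBOX)                 # stand-in for a cipher S-box

SECRET_KEY = 0x3A
hw = lambda x: bin(x).count("1")     # Hamming weight

# "Measure" noisy power traces: leakage modelled as the Hamming weight
# of the S-box output plus Gaussian noise.
plaintexts = [random.randrange(256) for _ in range(2000)]
traces = [hw(SBOX[p ^ SECRET_KEY]) + random.gauss(0, 2.0) for p in plaintexts]

def corr(xs, ys):
    # Plain Pearson correlation, no numpy needed.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# For each key guess, predict the leakage and correlate with measurements;
# the correct guess correlates best.
best = max(range(256),
           key=lambda k: corr([hw(SBOX[p ^ k]) for p in plaintexts], traces))
print(f"recovered key byte: {best:#04x} (actual {SECRET_KEY:#04x})")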
Re: (Score:2)
Overclocking (Score:4, Funny)
The perfect chip for .. (Score:2)
Uhhh, this isn't ARM's first clockless offering... (Score:3, Informative)
Price (Score:2, Funny)
Finally! (Score:3, Funny)
I will be happy to have a CPU without one.
Origins of this technology (Score:3, Informative)
Re:That's odd (Score:3, Informative)
Re:That's odd (Score:2)
Re:That's odd (Score:5, Informative)
Re:That's odd (Score:3, Informative)
No kidding. When I took a digital systems lab class, we had to do one simple asynchronous circuit. The corresponding state machine only had four states (compared to a computer processor, which might have a hundred states or more), but it was probably the most difficult circuit to design. Basically, you have to make sure that as you're transitioning between states, you always end up in the correct one, no matter where you may be in between.
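Here's a tiny Python illustration of that race (the encodings are made up): if a transition flips two state bits and they don't flip at exactly the same instant, the machine passes through a state you never intended.

from itertools import permutations

def intermediate_states(src, dst, nbits=2):
    # All states reachable mid-transition if bits flip one at a time
    # in some order, excluding the final destination.
    flipped = [i for i in range(nbits) if (src >> i) & 1 != (dst >> i) & 1]
    seen = set()
    for order in permutations(flipped):
        s = src
        for bit in order[:-1]:       # every partial flip is a real state
            s ^= 1 << bit
            seen.add(s)
    return seen

# Bad encoding: going from 0b01 to 0b10 flips two bits...
print(sorted(f"{s:02b}" for s in intermediate_states(0b01, 0b10)))
# -> ['00', '11']: the machine may glitch through either of these, and
#    whichever arrives first decides where it ends up.
# Gray-adjacent encoding: 0b01 -> 0b11 flips one bit, no intermediates.
print(sorted(f"{s:02b}" for s in intermediate_states(0b01, 0b11)))
# -> []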
Re:That's odd (Score:5, Informative)
Hence, large-scale async work is often based on every data transfer between modules being sent along with a PULSE or READY signal. Of course, every module has to be designed so that its output is ready when it propagates the pulse... otherwise there's bogus output into the next module. Basically, one module having the propagation delay timed incorrectly can kill the whole system. BUT, with fast logic, your system will simply run as fast as the hardware can handle...
Commercial async processors have been around for AGES [multicians.org] -- but modern logic IC-based processors are rarely built and sold on a large scale, being mostly experimental designs.
Re:That's odd (Score:2)
However, I could be mistaken, and info related to the 6502 is a little hard to come by. There's plenty of it from hobbyists, though not all of it entirely accurate, and reaching CBM these days to ask is a little difficult
I think they're still in use in positioning devices that point things like satellite dishes and on microwave hops that auto
Re:That's odd (Score:5, Informative)
When I read the article what popped into mind was low consumption while doing nothing, which is what made me think of it. So now I've shown my age and made quite the ass of myself, but what else is
So, not the same thing. Sorry for the ruckus.
Re:That's odd (Score:3, Informative)
of course (Score:3, Informative)
ARM, followed by PowerPC, is the most common core for embedded Linux, and embedded Linux boxes far outnumber servers and desktops (where x86 rules).
Re:Obligatory (Score:2)
Actually (Score:2)
Re:Actually (Score:2)
certainly not (Score:3, Interesting)
Linux is more portable. Linux runs on the original 68000. Linux was just ported to the Blackfin DSP. There seem to be about a dozen crappy little no-MMU processors that can run Linux.
Linux requires a gcc-like compiler, but not necessarily gcc. IBM and Intel have both produced non-gcc compilers that are able to compile Linux.
Re:The next palm pilot? (Score:5, Interesting)
Basically, a good asynchronous chip would draw almost no power while it's waiting for something (like I/O events from the network, keyboard, timers, etc.), and it would instantly ramp up and handle the event as fast as it possibly could. The speed is generally a function of voltage and temperature - how fast the gates can switch and perform interlocks under current conditions, rather than the rate at which a clock drives everything.
It's going to be interesting to see what performance metric is used on these "clockless" chips by the industry and by the marketing/sales types. MIPS? FLOPS? SPECmark? Not that MHz was ever a good benchmark, but things like MIPS are a lot easier to manipulate to make your product appear faster than your competitors'.
Re:The next palm pilot? (Score:2)
Now if your design goes even further and automatically turns off circuitry when it's not in use (beyond just have it hold a 0) then you will have a delay as you p
Re:The next palm pilot? (Score:3, Informative)
Re:The next palm pilot? (Score:2)
Re:my god, you people are stupid (Score:4, Funny)
Long live digg? I think you mean long live reddit, bitch!
Re:All I want to know... (Score:2)
Re:All I want to know... (Score:3, Insightful)
Can people please remember the computer industry does not start and stop with the latest bit of kit for playing DOOM3 or surfing the ruddy internet....
Synchronous vs. Asynchronous (Score:5, Informative)
Clocks help by allowing the designer to effectively freeze the state of the logical circuit on a regular basis. This way, all the signals in a chip can propagate to where they are supposed to go before the next round of logical operations occurs. This process repeats on every clock pulse.
The problems with using clocks are pretty significant, however. First, you need to add a lot of additional circuitry to implement a clock. Another problem is that, generally, A LOT of changes happen on every clock tick, which means a large spike in electrical current (because that current is what actually changes the state of all of the digital circuits). This spike also causes what is known as noise, and with higher-frequency circuits the noise can actually interfere with other, unconnected electronics (this is known as EMI). And another problem with a clock is that you generally need to keep it running all of the time for it to be useful, which means using electrical power even when no changes are occurring.
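A toy model of that freeze-and-advance scheme, if it helps - the three-stage pipeline below is invented, but it shows every register latching simultaneously on the tick, which is exactly the simultaneous-switching spike described above:

def tick(regs):
    # One clock cycle: compute all next-state values from the *frozen*
    # current state, then update every register at once.
    return {
        "stage1": regs["input"] + 1,    # combinational logic between regs
        "stage2": regs["stage1"] * 2,
        "stage3": regs["stage2"] - 3,
        "input": regs["input"],
    }                                    # simultaneous update = the spike:
                                         # every flop switches on the edge

regs = {"input": 10, "stage1": 0, "stage2": 0, "stage3": 0}
for cycle in range(4):
    print(f"cycle {cycle}: {regs}")
    regs = tick(regs)
# After 3 ticks the input has propagated through all three stages, one
# register boundary per clock, no matter how fast each stage settles.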
So, the asynchronous CPU is a significant engineering feat. It is very difficult to design, but it is probably much smaller and more efficient than any equivalent clocked ARM core. That said, I wonder how you actually evaluate the performance? With synchronous CPUs, it is simply a function of the clock speed and architecture. In addition, all of these devices need to be tested so that they are guaranteed to work - I wonder how they do that.
Re:Synchronous vs. Asynchronous (Score:3, Interesting)
Dear Poseur nerd (Score:3, Funny)
Dear Poseur nerd;
Your Slashdot post has been audited by a nerd committee and has been found to be lacking in both quality and substance. Normally this would only result in downward moderation. However, in this instance it grossly lacks nerdly appreciation of the subject matter presented, indicating that you are not a true nerd. If you were a true nerd, you would have instead made a post about where one of the said clockless processors might be obtained, or maybe indicate how it mig
System Clock (Score:3, Informative)
Essentially (as an example), when a processor wants to copy something from a register to memory, it puts
Not really (Score:4, Informative)
You'd have to specify a specific benchmark... (Score:5, Insightful)
Given an equivalent process, layout technology, and number of transistors, an async design will be at least somewhat faster and vastly more power-efficient than a clocked design.
But none of those things are going to be equivalent in the real world - except possibly the process that ARM designs to. So comparisons will be difficult.