Forgot your password?
typodupeerror
Hardware

Clockless Computing 342

Posted by michael
from the beat-of-a-different-drummer dept.
ender81b writes "Scientific American is carrying a nice article on asynchronous chips. In general, the article advocates that eventually all computer systems will have to move to an asynchronous design. The article focuses on Sun's efforts but gives a nice overview of the general concept of asynchronous chip design." We had another story about this last year.
This discussion has been archived. No new comments can be posted.

Clockless Computing

Comments Filter:
  • HOW???? (Score:2, Funny)

    by Anonymous Coward
    But how will we be able to tell time with our computers! Dear god no!
  • by nick-less (307628) on Wednesday July 17, 2002 @02:52PM (#3903613)
    for the first guy who overclock's it ;-)
    • by sketchkid (555690) on Wednesday July 17, 2002 @02:54PM (#3903621) Homepage
      larry ellison? is that you? :)
    • by npietraniec (519210)
      You actually could "overclock it" because such computers would have a maximum speed... Instead of spinning their wheels like todays computers do, they would only clock when they needed to. They'd be able to achieve quicker bursts because all that wheel spinning wouldn't melt the processor.
      • Re:1 Million reward (Score:5, Informative)

        by ZeLonewolf (197271) on Wednesday July 17, 2002 @03:32PM (#3903932) Homepage

        You actually could "overclock it" because such computers would have a maximum speed... Instead of spinning their wheels like todays computers do, they would only clock when they needed to. They'd be able to achieve quicker bursts because all that wheel spinning wouldn't melt the processor.
        Er...no.

        That's one of the key benefits of clockless computing: an instruction runs through the processor as quickly as the electrons can propagate through the silicon. In other words, the processor is ready to accept the next instruction at the exact instant it's available. You just can't pump it any faster...

        HOWEVER,

        Electricity propagates through Silicon faster when the temperature drops. Thus, the COOLER an asychnronous chip runs, the FASTER it gets! This opens up alot of exciting doors....and will certainly ignite hordes of development in the CPU cooling industry if async chips ever get off the ground. For an async chip overclocking = overcooling.
        • "Electricity propagates through Silicon faster when the temperature drops. Thus, the COOLER an asychnronous chip runs, the FASTER it gets!"

          Does this mean that the speed varies based on temperature?

          Initially this idea bugs me a bit because it means that a computer would have 'moods' based on the temperature. I could see that being a little problematic. The nice thing about a clock is that you can reasonable expect things to be done within a certain number of ticks. With variable speed processors, some synchronization issues will definitely arise that'll need solving.

          On the other hand, lots of work has already been done that way. Look at Quake 3 played over the internet. Lotsa people connect to a server with variable speed connections and response times, but the game manages to remain playable.

          Maybe I'm worried about nothing. For the uninitiated it'll take a bit to wrap their minds around.

          I do have a feeling though that ther'll be a market for both types of processors for the forseable future.
    • by Midnight Thunder (17205) on Wednesday July 17, 2002 @03:17PM (#3903815) Homepage Journal
      Intel marketing wouldn't like clockless chips as it would cause them massive headaches in the Mhz FUD. For once real world performance comparisons would have to matter.
    • by Mike_K (138858)
      Simple. Crank up the voltage.

      One huge advantage of asynchronous circuits is that you can turn the power down, and the chip simply slows down (up to a point, but you see the point). You turn power up (increase Vcc) and the chip runs faster. Same principles apply in overclocking your desktop chip, except here you don't need to crank voltage AND clock :)

      Of course doing this could ruin your chip.

      m
    • by jncook (4617) on Wednesday July 17, 2002 @06:20PM (#3905127) Homepage
      In 1993 I was a graduate student in the Caltech asynchronous circuit design group [caltech.edu]. That year we had a prototype asynchronous microprocessor that implemented a subset of the MIPS instruction set.

      The guys in the lab used to demo this by hooking up an oscilloscope to show the instruction rate. They would then get out a can of liquid nitrogen, and pour it on the CPU. The instruction rate would climb right up... This lead to many jokes about temporary cooling during heavy loads. "Hey, get the ice cubes... He's starting gcc!" :-)

      I believe our group used a different basic latch design than Sutherland describes. We handled all bits asynchronously using three wires, one that went high for 0, one that went high for 1, and a feedback wire for "got it". His design looks like it could latch a bus of wires simultaneously. Forgive me if I'm wrong... it's been almost a decade.

      One of the nice features of these chips is that they are tolerant of manufacturing errors. Often impurities in the silicon will change the resistance or capacitance of a long wire. In asynchronous designs, this just means operations that need that wire will be a little slower. In the synchronous world, either the whole chip fails or you have to underclock it.

      A group of ex-Caltech graduate students started a company to sell these asynchronous processors. Details at Fulcrum Microsystems [fulcrummicro.com].

      (For those at Caltech: Yes, that's me on the asynch VLSI people page. And yes, I wrote prlint. What an awful piece of software that was.)
    • You can guarantee that Tom's won't think much of this. I can see it now "Overclocking with this chip is impossible, a huge drawback, meaning the the 1423 fps is all you will get in Quake III."
    • To aveclock an asynchronous cpu you spray it with CFC. We have a couple in the office and its cool to see them speed up.
      Or you can attach a faster hamster. [man.ac.uk]
  • Yeah and... (Score:2, Funny)

    by xbrownx (459399)
    ...maybe some day we'll actually get 64 bit processors for home use
  • For any large project (such as an MPU), using asynchronous logic isntead of synchronous for the entire thing means it goes from being "merely" really-really-hard to damn-near-impossible.
  • by gyratedotorg (545872) on Wednesday July 17, 2002 @02:59PM (#3903672) Homepage
    they can't take away the clock. how will i know what time it is?
  • by Anonymous Coward on Wednesday July 17, 2002 @02:59PM (#3903680)
    Yet another old idea revived. The Amiga's Zorro expansion bus was asyncronous and plug n play in the 80s (although the rest of the machine was clocked).
  • Explanation, sorta (Score:3, Interesting)

    by McCart42 (207315) on Wednesday July 17, 2002 @02:59PM (#3903681) Homepage
    To clear a few things up, just because a processor/motherboard is "clockless" does not mean it won't be able to tell time. They can still use the 60 Hz AC signal for ticks.

    This is really cool. I was learning a little about asynchronous systems in my Logic Design and Computer Organization class last fall...they seemed pretty cool on a small scale, however they could get really difficult to work with when you're dealing with something as complex as a processor.
    • by barawn (25691) on Wednesday July 17, 2002 @03:17PM (#3903812) Homepage
      er... doubt they'd use that, to be honest. :) It's extremely unlikely that the entire system would be clockless: you'd have to redesign almost every peripheral. In any case, there'd be a clock SOMEWHERE for them to use.

      For me, this is kindof amusing: asynchronous logic is where you start out - it's more basic (shove signal in, get signal out). You then move to synchronous logic to eliminate glitches and possible race conditions (clocked flipflops, etc.). Apparently now you move BACK to asynchronous logic to gain performance. I can't disagree : working with synchronous systems, I've always been annoyed that certain combinations couldn't be used because they were too close to a clock edge, and it could miss the latch. If you can eliminate the glitches and race conditions, asynchronous logic would be faster. Of course, that's like saying "if software bugs didn't occur, you'd never need to look over code." Yes, true, valid, but not gonna happen.

      Of course, they're not talking about a true full asynchronous design: just getting rid of external clock slaving. The external clock would still BE there, for the external architecture - it's just that the internal architecture wouldn't all move to the beat of a clock.

      For instance, I'm pretty sure that no one is suggesting getting rid of synchronous memory design: it's just easier that way.
      • Ok, I don't know anything about this, but...

        Isn't that exactly what Sun supposedly did in order to get faster RAM without using RAMBUS? Here's a link. [theregister.co.uk] Maybe it was just in the memory interface? I don't really follow it.
    • by autopr0n (534291)
      Well, its not like they are going to have no clock, obviously they'll have some crystals in there to keep other things synchronized. And lots of thing, especially interactive stuff needs accurate timekeeping (not to mention most of the chips in the overall system will need a clock)
  • by imta11 (129979)
    I have read this article, and it is a very cool technology. Portions of the UltraSparc IIi [sun.com] have gates (capacitance) in the datapath that let the chip capture state of a computation and finish the result later. This lets the chip do other calculations during the time it would usually be in the wait state. Like compile the bytecode to optimize a loop after iterations have begun, but before it terminates.
  • Return of the 68000? (Score:2, Interesting)

    by vanyel (28049)
    Wasn't the 68000 asynchronous?
    • Hardly.

      It was the original CPU chosen for the Amiga 1000 and several of the Atari machines. It was the first consumer 32 bit device (16/24 bit address bus, 32 bit for everything else).

      It had a clock speed of 7.14 MHz, or 0.000714 GHz.
    • by Baki (72515)
      In a way yes. If I remember well, it's memory addressing and I/O bus system was asynchronous (not the clock of the CPU itself), meaning no 'wait states'. It would request a memory location and react as soon as the memory came up with the result. I forgot the details though.

      • by martinde (137088)
        > It would request a memory location and react as soon as the memory came up with the result.

        Well, kind of. A bus cycle completed when someone signaled "data transfer acknowledge" (DTACK) - then the CPU would read the data off of the bus. Most systems understood where in the address space the memory request was going, how fast that device was, and had logic to count system clocks to trigger DTACK when the data "should be" ready. (In fact, most memory devices have no way of signaling when a read has completed - they just guarantee in in a certain amount of time.)

        On the other hand, if you didn't hit DTACK in time, a bus error was generated and an exception routine triggered. Ahhh, the good old days ;-)
        • by AstroJetson (21336) <gmizell AT carpe-noctum DOT net> on Wednesday July 17, 2002 @03:45PM (#3904026) Homepage
          Exactly right. Nowdays, most of the Motorola embedded processors (many of which use 68000 or 68020 cores) can generate their own DTACK signals. For example, the 68302 has four CS (chip select) lines that you can internally map to whatever address ranges you want. You specify how many wait states are required and the DTACK and CS signals get generated automagically. This cuts down dramatically on on-board glue logic and address decoding logic, which is important for (typically small) embedded designs.
          • Brrrr, wait states. I remember I was programming 68020 on VME bus in that time (>10 years ago). When I first met an Intel with Multibus, I was shocked by the concept of wait states.
            • I designed a piece of equipment once that had a touchpad and an LCD (and a few other odds and ends) that were all run from the same board. This board was connected to the CPU board with a flat ribbon cable (I wanted a backplane, but economics won out). Anyway the devices on the front panel board were much slower than the CPU and due to that and also the lag on the cable I had to insert wait states for the block of memory that mapped the front panel. The 302 made it real easy, there was virtually nothing on the CPU board other than the CPU itself, some RAM, ROM, and drivers for the ribbon.
    • by isorox (205688) on Wednesday July 17, 2002 @03:16PM (#3903800) Homepage Journal
      Wasn't the 68000 asynchronous?

      No, it was so slow it just seemed that way.
  • by The Fun Guy (21791) on Wednesday July 17, 2002 @03:07PM (#3903738) Homepage Journal
    The article talks about an advantage of clockless chips being the fact that you can do away with all of the overhead in sending out the clock signal to the various parts of the chip. It also discusses what kind data processing activities are more suited for clocked vs. clockless chips. To get a best-of-both-worlds chip design, what about farming out various responsibilities on the chip to clockless sub-sections? The analogy I have in mind is when I drop my laundry off at the dry cleaners. I am on a tight schedule, and I have a lot of things to do in a certain sequence, while the dry cleaners collects laundry and does it at various rates during the course of the day. This particular laundry of mine can be done at any point over the next 4 days, and held afterwards, just so long as I have the finished product exactly when I need it, Thursday at 4:15 p.m. Different people assign different limits on the time-sensitivity of their laundry. The clocked sections can drop off their data for processing, and pick it up when they need it, and what happens inbetween is up to the clockless subchip, which does more-or-less FIFO, but can be flexible based on the time-sensitivity of the task.
  • by zulux (112259) on Wednesday July 17, 2002 @03:11PM (#3903764) Homepage Journal
    withoutaclocksignal,howcanyoutellwhenoneinstructio nstopsandanotherbegins?

    (kidding)

    • by A nonymous Coward (7548) on Wednesday July 17, 2002 @03:29PM (#3903902)
      withoutxxxx axxxxxxxxxx clockxxxxxx signalxxxxx ,xxxxxxxxxx howxxxxxxxx canxxxxxxxx youxxxxxxxx tellxxxxxxx whenxxxxxxx onexxxxxxxx instruction stopsxxxxxx andxxxxxxxx anotherxxxx beginsxxxxx ?xxxxxxxxxx

      Because rephrasing your question as above is what synchronous looks like; every word has to be padded to the longest word length. Asynchronous is like normal written language; words end when they end, not when some 5 char clock says so. Another crude analogy is sync vs async serial comm, except using hoffman(sp?) encoded chars, so async can use variable length chars, but sync has to padd the short ones out to the length of the longest.

      I tried underline instead of x but the stupid lameness filter objected/
    • withoutaclocksignal,howcanyoutellwhenoneinstructio nstopsandanotherbegins?

      By the spaces inserted randomly by Slashcode.

      (Sorry, but it's true!)
  • by McCart42 (207315) on Wednesday July 17, 2002 @03:15PM (#3903787) Homepage
    After reading the article, I have to wonder why asynchronous processors (or smaller logic devices, such as ALUs) haven't been considered before. The ideas have certainly been around for awhile--and in fact, asynchronous is intrinsically simpler than synchronous logic. The only conclusion on this I can reach is that while asynchronous designs may be "simpler" in theory, in that they don't include a clock pulse, they are much more difficult to work with in practice. Here's an example for those of you that have worked with logic design: try creating the logic for a simple vending machine that dispenses the product whenever a combination of coins (triggered by 3 switches, quarter, dime, and nickel) adds up to $0.50. Which would you prefer to use--synchronous or asynchronous logic? I know when I did this example I got myself stuck by using asynchronous logic, because while asynchronous logic meant less memory states (all states above $0.50 were treated the same), it also meant lots of added complexity, which I didn't need for the problem at hand.

    I foresee lots of bugs, but if they can pull this off, more power to them.
    • It's also tough to test and verify. I do ASIC timing analysis for a living, and most test methodologies (including IEEE JTAG boundary scan) rely on some sort of test clocks to pre-load test patterns into the chip before throwing on the system clock. I'm sure you could do this with global async. control signals, but it would be harder.

      I also haven't seen or heard of any large-scale software tools for doing this sort of analysis (as opposed to classic synchronous design, where one can pick from at least half a dozen static timing analyzers on the market today). This is probably at least a big a gate as anything else.

    • by Salamander (33735) <jeff AT pl DOT atyp DOT us> on Wednesday July 17, 2002 @03:38PM (#3903989) Homepage Journal

      You've hit the nail right on the head. Async circuits aren't harder to design; they're harder to verify and debug. Historically the tools just haven't been up to it and, despite some recent breakthroughs, I'm not sure they are now. Check out the work at CalTech [caltech.edu], Manchester [man.ac.uk], and Theseus Logic [theseus.com] for the current state of the art.

    • The big issue in going asynchronous is scale. It's easy to get small logic blocks to work properly, but there are timing issues that exist when building something complex like a processor, that are very difficult to address.

      The logic inside of standard clocked logic is asynchronous, but the clock is used to make sure you look at the result only when it is known to be valid. The clock rate is limited by how long it takes to assure the logic state to be stable.

      The timing and scaling issues exist even with clocked logic, which is why it took so long to make high clock rate motherboards. The data transfers on a modern motherboard are happening at well above the frequency of the FM radio band (tops out at around 108 Mhz), which makes the physical design of the board very interesting. You need to make sure that signals travel the same distance if they are supposed to be evaluated together, like the address or data buss.

      The change to asynchronous logic means you have to change the way you design your logic, change all of your CAD software you use to design the chips, and change all of your automated test equipment you use to certify which chips are good. This is a massive conversion for the chip industry, taking a great deal of time, and a great deal of money.
  • But... (Score:3, Insightful)

    by ZaneMcAuley (266747) on Wednesday July 17, 2002 @03:15PM (#3903791) Homepage Journal
    ... won't the buss and storage devices be a bottleneck still?

    Bring on the solid state storage.
  • Tools (Score:3, Insightful)

    by loli_poluzi (593774) on Wednesday July 17, 2002 @03:20PM (#3903844)
    Kevin Nermoyle (Sun VP) advocated asynch at the 2001 uProcessor Forum. The biggest and most daunting objection I heard in response was that tool support would be a killer. There is no tool support for asynch design at the complexity level needed to do a processor. You're left to a bunch of Dr. Crays using the length of their forearm to resolve race conditions with wiring distance. Since a large portion of the industry would have to make the leap to get the tool guys to invest in development, this kills any realistic possibility of an overnight asynch revolution. Small niche applications will have to get the ball rolling on this. Even still, designer's would need to get a lot smarter to think asynch. Think of how many chip protocols rely on a clock. How do you do even simple flow control in a queue for example? Pipelining goes to pot - its a whole different world. My two-cents.. Loli
  • by nuggz (69912)
    Why not have several smaller sections passing messages to each other? They could be clocked unclocked or just picking their noses.

    Right now a network is a bunch of arbitrary speed systems just passing messages, can't we scale that down to the computer level?
    It wouldn't even involve anything overly revolutionary.
  • ...but what scale will I use to justify how cool I'm trying to be!?!?! How will people be able to judge just how much more successful they are than their fellow human beings.

    I guess I'll just go back lifting weights, over compensating cars and the ruler. *sigh* I hate analog.
  • I have a professor here who swears by this asynchronous stuff. He told us that Intel actually developed an asynchronous version of the Pentium (I) processor. It worked just fine, and used less power than a normal clocked processor. Unfortunately, all of the processes and designs for everything else they were doing had to be redesigned for it, and it would have ended up costing a bundle in retraining and redesign in order to mass market the chip.

    It seems to me that clockless chips like these would seem to work very well with MIPS style processors - where you have lots of little instructions. However, you can't take advantage of the extreme pipelining features that chips like the Pentium 4 use when you don't have a clocked design. It would take a lot of research and a lot of re-education to get the design engineers to start thinking asynchronously instead of clocked, but my professor seems to think that eventually there will be no other way to speed things up.

    Its also like you'd be trading in one problem for a host of others. I remember doing tests on 1GHz clock chips, and those things had to be absolutely PERFECT in order work correctly on the motherboard. They ate up a lot of power too. However, an asynchronous design would have its own traps. You can design a state machine for it an then minimize the states, but glitches will do a lot more harm on a chip that is running asynchronously. Plus you have to take into account that chips run at different speeds at different temperatures. I think we have a long way to go in the quality of individual electronic components before we can actually implement a modern processor that is asynchronous.

    By the way, that Professor's name is John McDonald, and he's here at Rensselaer Polytechnic Institute.

    -Montag
    • by William Tanksley (1752) on Wednesday July 17, 2002 @04:47PM (#3904548)
      It's amusing to read the claim that an asychronous chip couldn't take advantage of pipelining. You see, the thing is that pipelining exists ONLY to control two of the disadvantages of clocked processors.

      First, it allows different instructions to complete in different amounts of time. An asynchronous chip wouldn't have that disadvantage.

      Second, it allows 'idle' portions of the chip to be used by other instructions whose time hasn't come. Asynchronous chips are vulnerable to that as well, but they can be much less vulnerable than even the most pipelined architecture, because dataflow can completely guide the chip: you can hammer in more data as soon as the previous data's been slurped in.

      So far from not taking advantage of pipelining, asynchonous chips naturally have one of the advantages of pipelining, and can be built to have the only other.

      -Billy
  • Armada (Score:2, Funny)

    Hopefully Sun's "ships" and "flotillas" won't go the way of the Spanish Armada;-) Will this be the new way to measure? "Sure this one has 10k ships, but they're only frigates. Even though this one only has 5k ships, they're all ships of the line."
  • by Alien54 (180860) on Wednesday July 17, 2002 @03:29PM (#3903904) Journal
    So ...

    if we have clockless computers for the desktop, HOW will Intel and AMD market them?

    After all, a large quick and dirty rating they have used for decades is the clock speed. Throw that away and what do you have?

    I can see the panic in their faces now...

    • An industry standard benchmark (SPEC CPU benchmark for example) will be used.
      This of course has problems because a lot factors into the speed of a computer. For instance motherboard chipsets will become increasingly important.
    • Intel won't market clockless chips. It will continue to market its overclocked Pentiums and run ad campaigns ridiculing AMD chips for running at 0 mhz.
      "Dude! You don't want one of those Thunderchunks. Those things run at like zero megahertz. The heck with that... Dude, you're getting a Pentium."
  • It would be interesting to see some real-world speed results comparing an asynchronous and synchronous circuit with identical functionality, fab process, transistor size, transistor switching speed, etc.
  • I am just curious. Would the way an asynchronous chip worked dictate anything about the instruction set of the chip? Would it be possible to use today's instruction sets in an asyn chip? Would you have to come up with something different? Would someone writing an asyn compiler have any special issues or optimisation techniques they would have to be aware of that would be inherent to the concept of the asyn chip itself?

    Are there any "features" related to the asynchronocity of the chip that it would be possible to add to the assembly language of an asyn chip? Becuase individual sectors of the chip can function independently and not have to synchronize, can you kind of get a multiprocessing-within-a-single-chip effect? I.E. can you create a singular asyn chip split up into separate sectors, each of which functions as if it were an autonomous processor? Can you have one chip concurrently execute single threads?

    If the answer to this last question is "yes", do you have to do this by organizing the chip such that the different sectors are basically seperate chips on the same cast, or can you just have it so that the exact borders of the the chip area working on a certain thread at a certain moment is reconfigured dynamically? Would it be possible someday to create a microchip whose internal execution model is somewhat like that of Cilk [mit.edu]?

    How does asynchronous design fit in with atomic-execution technologies like VLIW and EPIC?
  • by Dielectric (266217)
    A while back I saw a whitepaper on an asynchronous design, but it was being done for low power applicatios. Basically, you had two lines for each bit. Condition 00 wasn't allowed and could be used to detect faults. 10 was one, 01 was zero, and 11 was idle. Nothing would happen until one of the lines dropped, so there was no clock but the CPU still knew when it was time to do something. It was a fully static design where no power was being used unless there was some user interaction. You could run it off a few nanoamps, so a piece of citrus fruit would run it until the fruit rotted. Simple chemistry.

    I think this was from Seiko-Epson. I might have the states screwed up but that's the idea.
  • Today, we get to benchmark a system using measurements such as TpM, MIPS, FLOPS, etc. How do you quantify how fast a clockless machine is? Yes, they're supposedly faster with fewer transistors, but how do you sell a clockless computer to somebody who asks you how much faster a system is than the old one it replaces?

    The Pentium IV is supposed to be partially clockless, but to the outside world, all the I/O is clocked, making it easy to benchmark. If the I/O, logic, memory, etc., were ALL clockless, how fast is the machine?

    Government contracts of big systems are really picky about things like this.

    I think marketing will be the most likely problem for this technology. (Interfacing to clocked equipment won't be.)

  • by tlambert (566799) on Wednesday July 17, 2002 @03:43PM (#3904015)
    If you have looked at the "bucket brigade" graphic in the article, then you will know what I'm talking about...

    Is it just me, or does that picture seem to imply that you get a lower "buckets per unit time" throughput from asynchronous processing?

    I know that this is not the claim of the article... but it's still my gut reaction to the graphic.

    "Gandy Dancers" (railroad manual track laying and repair teams) were so-called because the first part of their name was the Chicago tool maker that made track laying tools, and the second part of their name came from the fact that they worked to a rhythm.

    A better analogy would be a work-content based multipath route, where the amount of time is based on the type of work to be performed.

    This would have implied (correctly) that, in an synchronous system, you should be able to "make up for" slow elements by doubling them up: i.e., when you are faced with a slow section of pipe, rather than bottle-necking, make it wider, instead.

    Or to use their analogy, if you have a slow guy, then get another slow guy to stand next to him so he doesn't bottlneck the brigade.

    Probably a more apt analogy would be nice: it's hard to show throughput increases, except by number of buckets in the hands of the people.

    -- Terry
    • by Rupert (28001) on Wednesday July 17, 2002 @04:21PM (#3904320) Homepage Journal
      It's more accurate if you think of the amount of water getting to the other end. If the water supply is irregular, the synchronous bucket chain will sometimes be sending empty buckets. The asynchronous bucket chain only has to send full buckets. If one person is 1% slower than the others, the other people on the synchronous bucket chain have to wait a whole extra cycle, reducing throughput by 50%. Throughput on the asynchronous bucket chain is reduced by just 1%.
      • All the benefits this article touts about asynchronous designs are almost totally bogus, except the claim about power consumption.

        As it stands now, it is more difficult to keep things happening in the right order in an asynch circuit than to route a clock.

        The idea that clock has to operate only as fast as the slowest component--Well this is true, but it doesn't matter. The last design I did had numerous clocks in it. The fastest being in the tens of MHz (I know, not that fast), the slowest being less than 1KHz. The portions of my design that were able to run at high speed did. The portions that needed a slower clock got one.

        And has anyone heard of a multi-cycle path? Just because a circuit can't complete its objective in one clock cycle doesn't mean you have to slow down the whole boat. If it needs more than one, give it two.

        There are a lot of other aspects to be concerned about too... design validation (on paper, before building it), static timing analysis, fault coverage...
  • Clockless primer (Score:3, Informative)

    by CommieLib (468883) on Wednesday July 17, 2002 @03:49PM (#3904052) Homepage
    Here's a somewhat shorter primer from Wired:

    http://www.wired.com/news/topstories/0,1287,6179,0 0.html
  • The famous PDP-6 was asynch logic. It made a very fast machine out of very few transistors, but was a nightmare to maintain. The follow-on PDP-10 was syncronous logic.

    There must be some history out there somewhere of the problems DEC had with the asynchronous logic. Any old MIT research notes?

  • Clockless issues (Score:2, Interesting)

    by KeggInKenny (593779)
    Despite the marketing problems associated with clockless machines (as a one-time computer retail sales guy, it was easy to talk a 1st-time-buyer into upgrading from the 800MHz system for the identical 900MHz for $50 difference) there are some other asynchronis aspects which may throw a wrench into design. First, if the chip is asynchronous, there must be a way to signal that the chip is ready for the next instruction - i.e. a "ready" line. Similarly since everyonw is pumped about the chips coming out in the last few years which execute multiple instructions simutaneously in different parts of the chip (think pipelines) there would have to be several of these signal lines. This will require additional logic circuits simply to decode these lines and figure out what instuctions can be executed, where they can, when they can, and so on. A related problem occurs with large, time consuming instructions which require the majority of the chip to execute. It would be difficult to implement a system which is by its nature asynchronous with a system of semaphores and time-estimating logic. And how much faster would this type of design actually be than a RISC based chip with a fast floating point unit. Since flops are typically the operation bottleneck in a new breed of processor, are we really saving time? I realize that the whole presumption of an asynchronous chip is that in traditional clocked chips we have to wait a finite amount of time (usually determined by the most complex operation) for every cycle, even if the op can be completed in less time. But personally I don't think that we'll see these on the market for at least ten years (well... maybe in a few microcontrollers, but only for really, really specialised tasks). Still it's good to see a few novel ideas (or in this case applying an old idea on a completely different scale) in this industry of re-packaging buzzwords.
  • by Alomex (148003) on Wednesday July 17, 2002 @04:14PM (#3904244) Homepage
    In the past I've mentioned here the role that popular publications like Scientific American have in creating hype. Be it the semantic web, nanotechnology, AI or asynchronous circuits, SciAm seems to focus on pie-in-the-sky ideas with a very small chance of success.

    That would be fine if they acknowledged this in the text, but more often than not they take an extremely bullish approach and echo the wildest promises by the researchers as if they were to happen tomorrow.

    Very smart people have been working for many years in asynchronous circuits, yet the likeliest scenario are hybrid designs mixing synch and asynch circuits (the asynch circuit stops the clock from propagating).

    Why do SciAm and other such publications do this? According to Chomsky because they are told so by the trilateral comission. Personally, I think they do it because it sells magazines.

  • The problem with slapping active cooling on an asynchronous chip is that the chip will *stop* working if it gets too cold, just like if it gets too hot.

    Here's why:

    There are two main aspects to consider in an asynchronous chip, gate delay (the time for a gate to open/close) and propagation delay (the time it takes for a signal to go from one gate to the next).

    Asynchronous logic works by carefully arranging the length and geometry of the wiretraces between gates, so that the signals coming from those traces all hit their target gate (nearly) simultaneously.

    The problem is that gate delays are affected by temperature differently than propagation delays. They both get faster with cooling, and slower with heating, but they do so nonlinearly, and at *different rates*. And asynchronous logic requires those rates to be carefully matched. Change the rates too much, and the chip breaks.

    Synchronous logic doesn't have this problem (as much), because the whole point of latching everything between clock cycles is to give the slower signals time to catch up to the faster ones, and to force them all to wait up until everybody is ready (at which point the clock releases the latch, and the next cycle starts). But this has the downside of the extra wiring, circuitry, and power required to run all the clock lines and latches.

  • Real-Time (Score:3, Interesting)

    by Amazing Quantum Man (458715) on Wednesday July 17, 2002 @05:28PM (#3904837) Homepage
    How would an asynchronous process affect determinism requirements, such as those of a hard real-time system?
  • The story spends a lot of words discussing Arbiters and the Buridan's Ass "paradox". A quote from the story: "Although Arbiter circuits never grant more than one request at a time, there is no way to build an Arbiter that will always reach a decision within a fixed time limit."

    I don't understand this at all - is it just Scientific American oversimplification? Why can't an Arbiter simply decide that if two pieces of data need to pass through the same component, it will let the left one through first this time, and next time there is a conflict it will let the right one through first (in order to avoid systematic "discrimination" against one part of the chip). This decision making process will always take the same amount of time.

    Can anybody explain to me what I have misunderstood - I'm sure there must be something I'm not getting, otherwise Sun wouldn't be researching this one piece so deeply.

    • I'm no expert but here's my best shot. Well, what is an arbiter? An arbiter is something will let a signal through unless there is a collision with another signal. But there is a place on the border of being a collision and going a certain way, it's hard to know which way to go. But what is a collision? If they both get there within 20 picoseconds? Say the left signal gets there first, but the right signal gets there 19 picosecond later. Do you let the left one pass? Or do you let the right one go because the left one got it last time?

      This is the meta-stable line that the article refers to. So set up another state (a "close collision state)that detects the meta-stable case and alternate letting the signals pass. But by doing this you create another meta-stable line between the "go left/right" state and new "close collision". Say the time difference falls in the new meta-stable line. How do you decide? This is where it gets tricky because you always have a boundary between states, and the more logic you throw in an arbiter the longer it takes to process the common "only the right/left signal is here so i'll let it pass" state and all other states.

      Also, the penalty for landing on the smaller meta-state will be proportional to how much more unlikely you made it to land on.

      So instead of best case of 200 ps and a 10% chance of 300ps, your suggestion might be a best case of 220 ps and a 5% chance of 600ps. Remember, when talking pico seconds, nothing is free.
  • by BitMan (15055) on Wednesday July 17, 2002 @06:20PM (#3905132) Homepage

    Unless I missed it, there was no mention of Theseus Logic's [theseus.com] Null Convention Logic [theseus.com] at all which is a real disappointment. Theseus has one of the few approaches that doesn't require a PhD-level of education to understand and design in.

    • This might be because theseus have no novel stuff. They copied all their work from a paper by muller from 1958.
      • This might be because theseus have no novel stuff. They copied all their work from a paper by muller from 1958.

        Huh? I'm not talking the packetized boolean stuff of Amulet that Furber came up with. I'm talking Karl Fant's NCL approach which he developed while at Honeywell in the late 60s through the 80s and took commercial when he created Theseus in the 90s.

        • Well NCL or "4 phase return to zero one of n codes" was invented by Muller in 1958. Theseus might act like they invented async but people were doing dual rail well before they came along.

          Their only comtribution is trying to use threshold "gates" (with hysteresis) which might be very nice isn't usefull for anyrthing except for adders.
          Their implementations use DIMS gates (again from 1958) and when they dont they have a non DI (Delay Insensitive) implementation with orphan hazards.

          I am doing research on simmilar systems and have found their buziness stratergy to make patents well after they were invented and then make every one beleive that they invented it. They do have some very nice mind melting presentations (Im guessing you went to one).
  • Asynchronous logic always appears to be better than synchronous at first glance, but when you do the math, sometimes it isn't so clear...

    Have you wondered why when traffic gets heavy on a freeway, it slows down? This is sort of like an asynchronous processor where every instruction is trying to get processed as quickly as they can (every driver is independent) but they need micro-synchronization to prevent collisions (brake lights, gas pedals). When the freeway is mostly empty, micro-synchronization works fine. However, as you approach the capacity limit, sometimes a global clock helps, sometimes it doesn't...

    If you have a pipeline, you get back pressure waves as you approach the capacity which can make things slower than a synchronous system. If the processing topology is more complicated, it becomes even more difficult to analyze...

    This effect is well known and affects things like processors and networks. Lookup articles on slotted ALOHA (a packet radio protocol) for some of the math if you are interested in some of the math behind this...

  • "Cockless Computing"?

    There goes my sales to lonely Nunns.
  • I would think that pipelining would be a difficult feature to uphold in an asynchonous system. You can't really divide instrucitons into little bits of undetermined size...or mayeb you can, if anyoen can explain how one pipelines an asynchonous processor I would be grateful
  • I have written a paper and am writing my thesis on synchronous to asynchronous conversion. You simply write your design in vhdl/verilog/schematics and you get a self timed design out. I placed a mips processor through it and it was about 30% faster.
  • Sorry I didnt post these sooner but I am at a asynchronous computing workshop [tima.imag.fr].
    If you are intrested in async then here is a list of cool websites:
    Async home [man.ac.uk] is the main website with resources events and background.
    Amulet group [man.ac.uk] have a selection of resources and news.
    And if you want a laugh then check out rat powered cpus [man.ac.uk]
  • ...if you can't define CLOCKS_PER_SEC

Receiving a million dollars tax free will make you feel better than being flat broke and having a stomach ache. -- Dolph Sharp, "I'm O.K., You're Not So Hot"

Working...