
AMD's Piledriver To Hit 4GHz+ With Resonant Clock Mesh

Posted by Unknown Lamer
from the inventing-a-better-sliderule dept.
MojoKid writes about some interesting news from AMD. From the article: "Advanced Micro Devices plans to use resonant clock mesh (PDF) technology developed by Cyclos Semiconductor to push its Piledriver processor architecture to 4GHz and beyond, the company announced at the International Solid State Circuits Conference (ISSCC) in San Francisco. Cyclos is the only supplier of resonant clock mesh IP, which AMD has licensed and implemented in its x86 Piledriver core for Opteron server processors and Accelerated Processing Units. Resonant clock mesh technology will not only lead to higher-clocked processors, but also significant power savings. According to Cyclos, the new technology is capable of reducing power consumption by 10 percent or bumping up clock speeds by 10 percent without altering the TDP." Unfortunately, aside from a fuzzy whitepaper, actual technical details are all behind IEEE and other paywalls with useless abstracts.
  • vaporware (Score:5, Insightful)

    by networkBoy (774728) on Monday February 27, 2012 @06:53PM (#39179813) Homepage Journal

    it's all vaporware till they ship, and it works.
    if they pull it off though, might give Intel a run for their money again, it's about time!

    • Re:vaporware (Score:5, Informative)

      by Anonymous Coward on Monday February 27, 2012 @07:04PM (#39179955)

      This is an ad. What is a "resonant clock mesh"? That sounds really cool. So I started RTFA (I know, sorry). You don't have to chastise me that much, because I stopped reading soon. Right after

      An average Google search is reported to
      require ~ 0.3 watts, about the same amount of power that it takes for a 100 watt light
      bulb to be lit for 10 seconds.

      Which was obviously not written by anybody who has any clue what they are talking about.

      • Agreed. It's a breathlessly ebullient press release sales pitch. That said, I hope AMD is able to get back into the game to keep Intel honest, and I own an Intel processor (the last four or five machines I built before it were AMD-based).
        • Re: (Score:2, Insightful)

          The only workstation class machine with which I have been completely happy is powered by an AMD 4 way Phenom II. Quiet, powerful, cheap, pick all three. And looking around, I would say that its successor is highly likely to be an AMD 6 way, 45 nm process chip. Best value by far for my money.

          Today I can choose slightly less latency with Intel or significantly more value with AMD. Call me cheap, but I will take the value, thank you.

          • You first say you picked all three: "quiet", "powerful", "cheap". Then you say you dropped the powerful to get the cheap. I'm confused.
          • Re: (Score:2, Interesting)

            by Anonymous Coward

            80% of Intel performance at 12% of the cost.

          • by billcopc (196330)

            Yep. This is where AMD lives and dies: the budget segment. That's where they stomp Intel, which prefers to keep its high margins and the mindshare that comes along with having the fastest chip of them all.

            For myself, AMD would have to push out very affordable 4-socket and 8-socket Opteron solutions, like they did in the K8 days. These days, it's a better value for me to spend the big bucks on Intel workstations and ride them out for an extra year.

            And when i say workstation, I'm thinking "server board wit

            • These days, it's a better value for me to spend the big bucks on Intel workstations and ride them out for an extra year.

              Your strategy confuses me. In the "extra" year you will lose big.

            • by haruchai (17472)
              There was a time 8-12 years ago when it looked like AMD could have snatched the performance crown. But without the fab expertise to match Chipzilla, it just never happened, and nothing short of a fantastic screwup by Intel or an astonishing breakthrough by AMD will close the gap. But AMD has been rock-solid for my personal needs and makes it so easy to keep migrating to newer CPUs / mainboards that I haven't run an Intel desktop, at home, in 10 years.
              • Re:vaporware (Score:5, Informative)

                by Nursie (632944) on Monday February 27, 2012 @09:46PM (#39181459)

                Err, there was a time 8-12 years ago when AMD *did* snatch the performance crown.

                Around about the time of the Athlon 64's appearance, when Socket 939 came along, they were actually both faster and cheaper than Intel. Nothing Intel had could match the FX range on the desktop, and nothing Intel was doing in the server room could match Opteron at the time. Intel was struggling with its NetBurst architecture (IIRC), which had high clock speeds and performed slightly better under some loads (video encoding, IIRC) but markedly worse for pretty much everything else.

                It didn't last long, Intel took back the performance crown, and after a few years made serious inroads into the budget sector as well. But for a brief, shining moment (around the time the FX-55 and 57 were released) AMD held the crown.

                • by haruchai (17472)
                  That's what I was referring to. Apart from solving the fab issues, I don't know what else AMD could have done. They had ongoing problems with yields and there were initial problems with power consumption. I vaguely recall that you had to use either (or both) certified coolers or power supplies or the warranty was void. Intel took FOREVER to get their version of HyperTransport and the Alpha-derived designs to market, but once they did, with Nehalem, they've not looked back. If they ever revive and perfect
                  • by Nursie (632944)

                    "They had ongoing problems with yields and there were initial problems with power consumption. I vaguely recall that you had to use either ( or both ) certified coolers or power supplies or the warranty was void."

                    I have no idea about Opterons or the server room, but in the land of the desktop that wasn't so. The FX chips may well have needed a good power supply and decent cooling (especially if you were going to take advantage of their clock-unlocked features), but in general the high-end gamer PC world was

                  • by slydder (549704)

                    They could have left ATI the fuck alone and concentrated on doing that which they were really good at. Chip design.

                    Once they started messing with ATI and GPUs and automated chip design it all started to go downhill.

                    Here lately they are moving back to manual chip design which is why it's taking a bit to get back into gear. I only hope they can get it working before it's too late.

                    • Re:vaporware (Score:5, Informative)

                      by hairyfeet (841228) <bassbeast1968@@@gmail...com> on Tuesday February 28, 2012 @07:13AM (#39183785) Journal

                      Actually I'd say buying ATI was one of the smartest things they ever did. One can argue that if they had waited until the market tanked they could have gotten it cheaper, but hindsight and all that. But have you tried Bobcat? Less than 18W for a dual core with an HD6310 GPU, and it often runs at less than 12W. Hell, AMD had to slow down their desktop production simply because they didn't have enough capacity to meet demand for the Brazos platform. If that's failure, I'll take two please. Go to someplace like Tiger and see how many units you have with the E350; we are talking netbooks and laptops, HTPCs and all-in-ones, and the OEMs are cranking out new designs to use those chips as fast as they can. I walked into my local Wally World the other day and fewer than 4 units were Intel. The rest? All AMD Fusion. And don't forget this is still running on VLIW GPUs; the next revs will replace them with vector units, which should behave like a hyper-powerful FPU when not needed for graphics.

                      So I'd say that while AMD has made some SERIOUS mistakes (killing the AM3 line and Stars arch before getting the bugs fixed in the BD/PD design, or better yet replacing it for the consumer chip, and trying to push a server chip like BD/PD as a desktop chip), frankly the APUs created thanks to the merger have been one of the few smart moves they've made. With Brazos they have a unit that stomps Intel+ION while often costing less than Intel alone, and thanks to Intel shooting themselves in the face by killing the Nvidia chipsets there won't be any new ION designs. With Brazos you have a unit that sips power, is quiet, and produces little enough heat that it can be passively cooled, while still able to do 1080p over HDMI. If you haven't tried one you really should; it's a sweet chip.

                • by billcopc (196330)

                  The Athlon 64 was indeed awesome. I was a full-on raging AMD fan back then, eventually culminating in an 8-way Opteron workstation: the good old Tyan Thunder K8QW. Only problem was, AMD stagnated for way too long. When I upgraded from the A64 to the X2, it was a huge leap (obviously), stomping all over Intel's overpriced Pentium-D. But then, Intel came out with the Core 2 series, and AMD just kept releasing die-shrinks of the same old CPUs. I had nothing to upgrade to. I eventually tired of waiting fo

            • Re:vaporware (Score:4, Informative)

              by level_headed_midwest (888889) on Monday February 27, 2012 @10:56PM (#39181891)

              AMD *does* push out affordable 4-socket Opteron setups: the Opteron 6000 series CPUs. They are selling those for a whole lot less than they did in the K8 days. The least-expensive Opteron 6000s sell for $266 each and the most-expensive ones are around $1200-1500, compared to starting around $800 each and going on up to close to $3000 for the K8-era 4-way-capable Opterons. Considering a 4-way-capable Intel Xeon still costs close to $2000 and goes on up to near $5000 (and is based on two-year-old technology), the Opterons are that great deal you were wishing for.

              However on the desktop, Intel has gotten much better in their pricing (i.e. they don't cripple lower-end chips as severely as they used to) and is giving AMD a real run for their money.

          • The only workstation class machine with which I have been completely happy is powered by an AMD 4 way Phenom II.

            My last box was a quad-core Phenom II. It served me well. There's no denying though, that Intel's current i7s (I have a 2600K) blow everything else out of the water. I fervently hope AMD will come up with something to challenge it. Competition is good.

          • Re:vaporware (Score:4, Informative)

            by hairyfeet (841228) <bassbeast1968@@@gmail...com> on Tuesday February 28, 2012 @07:00AM (#39183745) Journal

            May I make a suggestion? Tiger has been selling their remaining stocks of 95W Thubans (in case you haven't heard, in a serious "WTF are they thinking?" move AMD has killed AM3 for two sockets that have less than a year of life in them, FM1 and AM3+) for around $100. Sign up for their emails; that is where they have been offering it as of late. I got one, and with the money I saved I upgraded my ECS board to a nicer ASRock, and I must say I couldn't be happier. The 1035T is not only around 40% faster than my 925 Deneb, but whereas the Deneb would max out at around 139°F doing transcodes, with the Hyper N520 cooler I paired with the Thuban I'm getting a max of 114°F, and that's after seven and a half hours of slamming the CPU with VirtualDub. At idle this baby is literally below room temp; no shit, looking at Coretemp my chip is at 67°F and the room is 72°F. Frankly I've never been happier with a chip upgrade in my life, and it's just a damned shame AMD has killed AM3, but their loss is your gain if you jump on it and snatch one while they're cheap. I mean, 6 cores for $109? How can you beat that? Paired with 8GB of RAM and a CF-enabled board I figure this baby will last me until 2020 easy. What a sweet chip.

            But for everyone that wants to save some money and have a nice chip, snatch one of the AM3s NOW before the stock runs out, because when they are gone, that's it. I went ahead and built my GF a new Athlon X3 box and gave the Deneb to my youngest, and as soon as this next batch of laptops gets sold I'll be building the oldest an X3 or X4 before supplies run out. The really nice AM3 boards have never been cheaper, and paired with 4-8GB of DDR3 and a Hyper 212 or Hyper N520 they make pretty badass desktops: plenty of OCing headroom if you desire, and easy to unlock, so that X3 can easily be the cheapest quad you'll ever buy. But for me, that X6 so cheap? Hell, how could you not love getting 6 cores for $109 shipped? That's a no-brainer.

        • by Ungrounded Lightning (62228) on Tuesday February 28, 2012 @12:41AM (#39182361) Journal

          Agreed [that it looks like vaporware]. It's a breathlessly ebullient press release sales pitch.

          Agreed it's a sales pitch. But not vaporware at all. Very neat solution. (I saw another with similar properties a couple years ago but this one is 'way better.)

          The issue is the power consumption of the clocking of the chip. Modern designs are primarily layers of D-type flip-flop registers separated by small amounts of random logic and all the flip flops are clocked simultaneously, all the time. The clock signal is input to ALL the flipflops and a bit of the random logic. I'm guessing somewhere between one in five and one in ten gate inputs are driven about equally by CLK or ~CLK. Further, the other signals flip between one and zero once, sometimes, on each cycle. ALL the CLK signals flip from zero to one and back to zero EVERY cycle. So there's a lot of activity on the clock.

          In CMOS the load on the clock is primarily capacitive - the stray capacitance of the CMOS gates and wiring - plus some losses, mainly due to the resistance of the wiring. The stray capacitance has to be charged and discharged every cycle. The charge represents energy. In a conventional design the clock drivers are essentially the same thing as logic gates (inverters). New energy is supplied from the power supply (and about half of it, excluding signal-line resistive losses, dumped as heat in the pullup transistors of the drivers) every cycle as the lines are charged. Then the charge is dumped to ground (and the rest of the energy dumped as heat in the pulldown transistors). All that energy gets lost as heat every cycle, and it represents about 30% of the power consumed by the chip. It would be nice to scavenge it and reuse most of it for the next tick.
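          The charge-and-dump arithmetic above can be sketched with toy numbers. The capacitance, supply voltage, and frequency below are illustrative guesses, not figures for any real chip:

```python
# Back-of-the-envelope dynamic power of a conventionally driven clock
# network: the total clock capacitance C is charged to Vdd and dumped
# to ground once per cycle, so P = C * Vdd^2 * f.

def clock_power_watts(c_farads, vdd_volts, f_hertz):
    """Power burned charging/discharging the clock load every cycle."""
    return c_farads * vdd_volts ** 2 * f_hertz

# Assumed values: ~2 nF of clock capacitance, 1.0 V supply, 4 GHz clock.
p = clock_power_watts(2e-9, 1.0, 4e9)
print(f"{p:.1f} W dissipated in the clock network")  # 8.0 W
```

With numbers anywhere in that ballpark it is easy to see why the clock alone can account for a large slice of total chip power, and why recycling that charge is attractive.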

          A previous invention used a half-wave transmission line looped around the chip and connected plus-to-minus. A big mobius strip. The CLK and ~CLK loads acted as distributed capacitance around the transmission line. A clock waveform circulated continuously, twice per cycle. Instead of a sea of drivers providing new energy and then throwing it away every cycle, the transmission ring had a few drivers distributed around it, keeping the wave circulating and correctly formed, and pumping in enough energy to replace the resistive losses while the bulk of the energy went round-and-round. Result: Most of the clock power requirements and heating load go away.

          Unfortunately, the circulating clock wave meant the region completing a computation ALSO went round-and-round, rather than everything switching at the same time. Stock design tools assume CLK/~CLK is simultaneous (except for minor variations) across the whole chip. So using that earlier system would require a major rewrite on the stock tools and new design methodologies.

          THIS system does a similar hack energetically, but with everything in sync. Instead of a sea of drivers driven by a carefully-balanced tree of pre-drivers, the CLK and ~CLK are constructed as a pair of heavy-conductor meshes - like two stacked layers of flattened-out window screens. These form two plates of a capacitor. These plates are connected by an inductor, forming a resonant "tank circuit". When this is "pumped up" by a few drivers and is "ringing", energy alternates between being an electric field between the screens and a magnetic field in the inductor coil, twice (once for each polarity) each cycle. Again the bulk of the energy is reused over and over while the drivers only have to replace the (mostly) resistive losses (and pump it up initially, over a number of cycles). Again the bulk of the clock power and heating is gone. But this time the whole chip is switching essentially simultaneously, so the stock design tools just work.

          Neat!
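          The tank-circuit behaviour also pins the clock rate to the resonance, f = 1 / (2 * pi * sqrt(L * C)). A rough sketch with made-up component values, not anything from the Cyclos design:

```python
import math

# LC "tank" sketch: the clock mesh capacitance C resonates with an
# on-die inductor L at f = 1 / (2*pi*sqrt(L*C)). Energy sloshes between
# the electric field of the mesh and the magnetic field of the inductor,
# so the drivers only have to top up resistive losses.

def resonant_frequency(l_henries, c_farads):
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2 * math.pi * math.sqrt(l_henries * c_farads))

def inductance_for(f_hertz, c_farads):
    """Inductance needed so a tank with capacitance C rings at f."""
    return 1.0 / (c_farads * (2 * math.pi * f_hertz) ** 2)

c_mesh = 2e-9                       # assumed 2 nF mesh capacitance
l = inductance_for(4e9, c_mesh)     # ~0.8 pH to ring at 4 GHz
print(f"L = {l * 1e12:.3f} pH, f = {resonant_frequency(l, c_mesh) / 1e9:.2f} GHz")
```

Note how any change to f requires changing L or C, which is exactly the downside described below: you cannot quickly retune the clock away from the tank's natural frequency.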

          Downside (of both inventions): You can't quickly start and stop the clock in a given area or run it more than a few percent off the speed set by the resonance of the tank circuit or transmission line. No overclocking. Also no clock gating to save power on quiesc

      • Re:vaporware (Score:4, Informative)

        by hawguy (1600213) on Monday February 27, 2012 @07:55PM (#39180495)

        This is an ad. What is a "resonant clock mesh"? That sounds really cool. So I started RTFA (I know, sorry). You don't have to chastise me that much, because I stopped reading soon. Right after

        An average Google search is reported to
        require ~ 0.3 watts, about the same amount of power that it takes for a 100 watt light
        bulb to be lit for 10 seconds.

        Which was obviously not written by anybody who has any clue what they are talking about.

        I think it was a typo (or edit by someone who doesn't know what they are talking about). They should have said 0.3 watt-hours (and should have said "energy" instead of "power")

        Google says they use 0.0003 kWh of energy per search [blogspot.com].

        A 100W bulb uses .1 kWh in an hour, or .0000278 kWh in a second, or .000278 kWh in 10 seconds. (or .278 Wh)

        Therefore, a 100W bulb running for 10 seconds uses about the same amount of energy as an average Google search. Which is a lot higher than I thought it would be: since I use 20W CFLs, each Google search is the equivalent of 50 seconds of light. Just while typing this reply, I did enough Google searches to light up my room for about 15 minutes.
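        The unit conversion is easy to check; this little sketch just redoes the parent's arithmetic:

```python
# Sanity check of the watts vs. watt-hours mixup: 0.3 W is a power,
# but Google's published figure (0.0003 kWh per search) is energy.
# Compare it to a 100 W bulb lit for 10 seconds.

joules_per_search = 0.0003 * 3.6e6   # 0.0003 kWh -> joules (1 kWh = 3.6 MJ)
bulb_joules = 100 * 10               # 100 W * 10 s = 1000 J (~0.28 Wh)

print(joules_per_search)             # ~1080 J per search
print(bulb_joules)                   # ~1000 J for the bulb: same ballpark
```

So the whitepaper's numbers are roughly right; only the unit ("watts" instead of "watt-hours") is wrong.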

        • by PitaBred (632671)

          I think they take all the routers, networking, cooling, etc. into account as well. Not just the CPU power.

      • This article [hothardware.com] has an informative diagram.
    • Not really (Score:5, Insightful)

      by Sycraft-fu (314770) on Monday February 27, 2012 @07:18PM (#39180093)

      Intel is already running at 4GHz+. OK, not officially, but it is almost impossible to find a Sandy Bridge K series that won't easily overclock to 4GHz or more. I bumped my 2600K to 4GHz. No voltage increase, no messing around, just turned the multiplier up. Zero stability issues, and it doesn't even draw a ton more power. Basically they are just being conservative for thermal reasons.

      The 22nm Ivy Bridge is soon to launch as well. Never mind any potentially better OCing; it is faster per clock than SB. And SB is a good bit faster than Bulldozer (whose architecture Piledriver uses) per clock, sometimes more than a bit (depends on what you are doing).

      So no, they'd need way more speed to give Intel any kind of run for their money, unfortunately. What they really need is a better design, something that does better per clock, but of course new designs take a long time and BD itself was quite delayed.

      Remember the one and only time AMD did eclipse Intel was during Intel's P4 phase. Intel had decided to go for low work per clock, high clock speed. Well speeds didn't scale as they'd hoped and the P4 was not as powerful for it. AMD chips were tops. However the Core architecture turned all that around. It was very efficient per clock, and each generation just gets better. Meanwhile AMD stagnated on new architectures, and then released Bulldozer which is not that great.

      Also they have to fight the losing fab battle. They spun off their fabs and as such aren't investing tons of R&D in it. Well Intel is, and thus are nearly a node ahead of everyone else. Other companies are just in the last few months getting their 32nm node and 28nm half-node production lines rolling out products to retail channels. Intel has their 22nm node process complete and is fabbing chips for retail release in a couple months. So they've got that over AMD, until other fabs catch up, by which time Intel will probably have their 14nm half-node process online in Chandler (the plant construction is in full swing).

      Sadly, things are just not good in the x86 competition arena. AMD competes only in a few markets, and Intel seems to edge in more and more. Servers with lots of cores for reasonable prices seems to be the last place they really have an edge, and that is a small market.

      I don't want to see a one player game, but AMD has to step it up and this unfortunately is probably not it. If they make it work, expect Intel to just release faster Core i chips with higher TDP specs. The massive OCing success shows they could do so with no problem.

      • Re:Not really (Score:5, Insightful)

        by networkBoy (774728) on Monday February 27, 2012 @07:32PM (#39180245) Homepage Journal

        I'm a diehard Intel fanboi. My last AMD was an 80286; I owned an AMD 80386DX40 but never used it (acquired it at a swap meet after the P60s had just launched).
        Prescott had a use case where it outperformed AMD, but it was very narrow: if your load was highly predictable and did not cause cache misses or branch prediction failures, it owned the AMD. Sadly, every workload except straight-up numerical number crunching was not so good. I used my 3.6GHz P4 for transcoding video. It was the first machine I owned where I could encode faster than real time (i.e. movie is 60 min, I could encode in 50).

        I really hope this pans out for AMD and brings them a little up into Intel's game. While as you said there has only been one time where AMD flat out bested Intel, there have been several cases where AMD has nailed a particular segment:
        * Low cost many cores (data compute clusters).
        * Low cost reasonable performance for most end user loads.
        * Downright cheap CPU for entry machines.
        Every time they've done something they have forced Intel to step up to that segment and improve.
        In this case I hope to see not the high spec CPU improvement, but rather the mid-range CPU segment get a very low power option. Somewhere in the i5 equivalent range, but giving desktop performance while sipping mobile levels of power.
        It would make building a poor man's compute cluster more feasible from a power and cooling standpoint.
        -nB

      • Re:Not really (Score:4, Insightful)

        by Intropy (2009018) on Monday February 27, 2012 @07:51PM (#39180445)

        It's not true that the AMD lead was that short. The Athlon came out and was immediately on par with or better than Intel's Pentium IIIs. By the time it was Thunderbird vs. Coppermine/Tualatin the lead was pretty sizable. That lasted throughout the Athlon 64/Pentium 4 period and into the Core's run, until the Core 2 Duos arrived. The gap was close for a while, with Intel's multi-core processors generally superior, but as little as about a year and a half ago, AMD's X3 was a better offering than Intel's Core i3. Competition is tight, which has been good for the rest of us.

        • If you're factoring cost in, AMD's lead dates back to the K6-2. Clock for clock they were slower, but I could get a 400MHz K6-2 and motherboard for less than a 266MHz Pentium 2 and motherboard back then and the K6-2 was a lot faster - especially since it ran the memory 50% faster than the P2.
      • by PitaBred (632671)

        You can probably go higher. I've got a 2500K that's running 4.8GHz on stock voltages. Basically all K series chips can reliably hit 4.5GHz on stock voltages with adequate cooling.

      • And, as always, so can many of AMD's latest offerings (exceed 4Ghz).

      • by Nemyst (1383049)

        It's even more ridiculous than that. My motherboard automatically overclocked my 2500K to 4.3GHz. From what I can tell, that 1GHz increase over the stock value isn't even pushing it (temperatures are still ridiculously low, with a 7-Zip benchmark hitting 55C). Granted, aftermarket coolers probably help, but I believe a 0.5-0.75GHz bump on a stock cooler is entirely reasonable.

        I have a feeling that Intel might actually be downplaying their default clocks; even under the most terrible conditions, I can't see

      • by dbIII (701233)
        Yes, but can a Xeon do it and how many cores can you have in a box at a sane price?
        Currently there are AMD systems with 64 cores going for under US$10k. For some CPU bound tasks such things are wonderful. Any speed increase makes it even better.
        Intel are catching up with 10 core CPUs that are faster than the currently available Opterons but for tasks with a LOT of threads the AMD CPUs still outperform for the same number of sockets. It may look like a "small market" to you but there's still a huge number
      • Also they have to fight the losing fab battle. ... Other companies are just in the last few months getting their 32nm node and 28nm half-node production lines rolling out products to retail channels. Intel has their 22nm node process complete and is fabbing chips for retail release in a couple months.

        However this technology lets AMD get rid of most of the clock drivers and most of their power consumption and waste heat. That means the rest of the logic can be pulled closer together in a given technology, s

    • by ackthpt (218170)

      it's all vaporware till they ship, and it works.
      if they pull it off though, might give Intel a run for their money again, it's about time!

      Intel is pretty good at catching up. Even after saying nobody needed 64-bit processors and nobody needed multi-core processors, they're right there on top.

    • by AbRASiON (589899) *

      "might give Intel a run for their money"
      I'm sorry to inform you but you're a little (lot) out of the loop on the current state of Intel and AMD processors available. I'm sure most people here don't want to hear this but the little guy is well and truly down on the ground being kicked in the stomach.

      I wouldn't be surprised if one of these CPUs at 5GHz would barely compete with Intel's current top-shelf items, let alone at 4GHz.

      • Re: (Score:2, Interesting)

        by networkBoy (774728)

        Is AMD really doing that badly?
        Seriously I am out of the loop from an AMD perspective*, but I assumed they were still rocking the cost/performance on the low end of the CPU ranges, and was hoping this would allow them to push into the mid-range i5 territory.
        -nB

        *all I work on at work & at home is Intel stuff, so I don't have any relevant AMD info.

        • Re: (Score:2, Informative)

          Bulldozer - their current architecture - was really bad. Slow, mediocre price/performance ratio, and power-hungry. It remains to be seen if Piledriver can make it all better.
          • I got a Bulldozer 8250 for $179 and an AM3+ motherboard for $139.
            It runs everything I want well, and I never have to kill anything before starting a demanding game.
            It compiles speedily enough that my Vertex 2 is now the bottleneck when I run Maven.

            So please, tell us what Intel could have offered me in terms of performance at a set price of $318?

            • by kyrio (1091003)
              You have to ignore the people who go on about AMD not being worth the money (though I have to admit that Bulldozer was a huge flop). Last year I got my 955BE and motherboard for $200 total. Nothing Intel offers can come close to that for a CPU and Mobo. The CPU alone would be at least $150, to match the Phenom II X4 955BE. I got a high quality motherboard and high quality CPU for about the cost of Intel's lower end CPUs.
        • Re:vaporware (Score:5, Informative)

          by ifiwereasculptor (1870574) on Monday February 27, 2012 @07:48PM (#39180413)

          Well, here's AMD in a nutshell:

          Brazos, the ultra low power processor, is a success.

          Llano, the A series, is actually a very solid product. For the cost of an i3, you get a quad core that is about 1/4 slower overall, but whose integrated graphics are about 3 times faster. It's actually selling very well.

          Bulldozer is a disaster unless all you do is video encoding.

          Now, here's the puzzling part: they want to use bulldozer, the failure, as the new core for the A series, the success. I hope they find a way to fix it, otherwise my next rig will have an Intel for the first time in ten years.

          • Re:vaporware (Score:5, Interesting)

            by tyrione (134248) on Monday February 27, 2012 @08:24PM (#39180807) Homepage
            You must not work in parallel programming or do any heavy engineering analysis/modeling. Taking advantage of all those threads and cores within Bulldozer and utilizing them with OpenCL along with the GPGPUs is a dream come true. More and more modeling environments are leveraging all that this architecture offers, but to you, if your game doesn't presently use it, it's worthless. To each their own.
          • Re:vaporware (Score:5, Interesting)

            by Anthony Mouse (1927662) on Monday February 27, 2012 @09:34PM (#39181341)

            Now, here's the puzzling part: they want to use bulldozer, the failure, as the new core for the A series, the success. I hope they find a way to fix it, otherwise my next rig will have an Intel for the first time in ten years.

            I think the people calling bulldozer a failure have the wrong expectations. The core used in the existing A series is a direct descendant of the original Athlon from 1999, which itself was very similar to (and designed by the same people as) the DEC Alpha introduced in 1992, predating even the Pentium Pro. Suffice it to say that there isn't a lot of optimizing left to be done on the design.

            Bulldozer is a clean slate. The current implementation has some obvious shortcomings, not least of which is that the cache architecture is lame. (The L1 is too small and the L2 latency is too high. They might actually do pretty well to make a smaller, lower latency, non-exclusive L2 and use the extra transistors for a bigger L3 or even an L4.) But that's not a bad thing. It's something they can fix to make future generations faster than the current generation. Which is the problem with the old K10 -- there are no easy little changes left to be made to make it substantially faster than it is now.

            The other part of the problem is that people want Bulldozer to be something it's not. It isn't designed for first in class single thread performance. It's designed to have adequate single thread performance while reducing the number of transistors per core so that you can have a lot of cores. It's designed for the server market, in other words. And to a lesser extent the workstation market. They designed something that would let them compete in the space that has the highest margins. So now all the high-end gamers who only care about single thread performance are howling at the moon because AMD concluded it couldn't compete with Intel in that sector and stopped trying.

            What you have to realize is that it isn't that the design is flawed. It's that you aren't the target market. They could have built something that achieved 90-100% of Intel's best on single threads instead of 60-80% by doubling the number of transistors per thread and halving the number of threads and cores, but think about who would buy that. PC enthusiasts who comprise about 0% of the market. It wouldn't sell in the server market because the performance per core * number of cores would be lower. It wouldn't sell in the budget market because it would require too many transistors per thread and therefore cost too much to manufacture.

            Instead, with Bulldozer they can use more modules and sell to the server market or anyone else with threaded software, then use fewer modules in combination with a GPU and sell to the budget market and the midrange gaming market, and leave the six dozen howling high-end PC gamers to Intel.

            • by nadaou (535365)

              hear, hear

            • by wanzeo (1800058)

              Fantastic explanation.

          • Re:vaporware (Score:4, Interesting)

            by TheLink (130905) on Tuesday February 28, 2012 @02:35AM (#39182767) Journal

            This might be enlightening: http://hardforum.com/showpost.php?p=1037482638&postcount=88 [hardforum.com]

            What did happen is that management decided there SHOULD BE such cross-engineering, which meant we had to stop hand-crafting our CPU designs and switch to an SoC design style. This results in giving up a lot of performance, chip area, and efficiency. The reason DEC Alphas were always much faster than anything else is they designed each transistor by hand. Intel and AMD had always done so at least for the critical parts of the chip. That changed before I left - they started to rely on synthesis tools, automatic place and route tools, etc. I had been in charge of our design flow in the years before I left, and I had tested these tools by asking the companies who sold them to design blocks (adders, multipliers, etc.) using their tools. I let them take as long as they wanted. They always came back to me with designs that were 20% bigger, and 20% slower than our hand-crafted designs, and which suffered from electromigration and other problems.

            That is now how AMD designs chips. I'm sure it will turn out well for them [/sarcasm]

            And that comment was back in 2010. No surprise now that Bulldozer is slower and uses more power, and the only advantage is that it has more cores (meh, any idiot can add more cores; in the worst case you just add another computer[1]).

            [1] The same embarrassingly parallel tasks that do well on multiple cores will do well on multiple computers.

        • by AbRASiON (589899) *

          Yes, they are doing that badly. Bulldozer was a giant disappointment. They have nothing on the table for the desktop crowd. At almost all price points it's silly to buy AMD at this time, unfortunately. Especially for heat / power usage etc.

      • *shrugs*

        AMD's strategy was to switch to milling out 2 cores or so per unit, aka the Bulldozer architecture, and then stitching them together into a processor. I guess it makes the design more compact / easier to fab.

    • Yes, because everyone knows that having more cycles is the way to win the processor war. That's why the Pentium 4 was so dominant.
  • But how will it scale? How many FLOPS can it pull? GHz doesn't mean squat.....

    • Re:That's nice (Score:4, Insightful)

      by networkBoy (774728) on Monday February 27, 2012 @06:55PM (#39179841) Homepage Journal

      For a single executing thread of a specific bit width, GHz means everything.
      The trick is whether they can scale it to multiple cores/threads while lowering their power to match Intel's performance/Watt at the high end of the compute arena. If they can do that they will once again pull in DC customers.
      -nB

      • by mhajicek (1582795)
        Single core performance is all that matters when processing a toolpath for CNC machining. I don't care about power consumption, just higher clock speed and fast memory access (large cache).
        • by Kaenneth (82978)

          And for Dwarf Fortress.

        • Re:That's nice (Score:5, Insightful)

          by Daniel Phillips (238627) on Monday February 27, 2012 @07:39PM (#39180317)

          Single core performance is all that matters when processing a toolpath for CNC machining.

          Rubbish. There is no way your CNC machining app will even get close to the minimum latency that a single AMD core is capable of. What you are really saying is that your vendor is slow to get a clue about parallel programming.

           

          • by Jeremi (14640)

            What you are really saying is that your vendor is slow to get a clue about parallel programming.

            Maybe there are CNC algorithms that aren't easily parallelizable. Or (more likely) they can be parallelized, but the CNC development teams haven't got around to doing that yet. It doesn't really matter which as far as the consumer is concerned -- in either case, they will want a chip that maximizes single-threaded performance. Finger-pointing doesn't help them one bit, but fast CPUs might.

            • Maybe there are CNC algorithms that aren't easily parallelizable.

              I doubt that, being somewhat familiar with the problem space.

              It doesn't really matter which as far as the consumer is concerned -- in either case, they will want a chip that maximizes single-threaded performance.

              Speak for yourself. I prefer to keep the money in my pocket, and spend it on more frequent full-box upgrades. This keeps me ahead of the curve on average. Example: in a past gig where money was no object I started life with a Core2 class desktop which was state of the art at the time, but no, even when money is no object the beancounters will reject the idea of a new box every six months. In short order my onetime shiny Intel box was being smoked

          • Re:That's nice (Score:4, Informative)

            by Shark (78448) on Monday February 27, 2012 @08:24PM (#39180803)

            He's not talking about running the g-code, he's talking about generating it from a model. Most CAM software is very CPU-intensive for toolpath generation.

            • Most CAM software is very CPU-intensive for toolpath generation.

              All the more reason to parallelize it, cutting latency drastically in the process.
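To put rough numbers on the parallelization argument, here's a quick Amdahl's-law sketch. The fractions below are made up purely for illustration; nothing in the thread says what portion of toolpath generation actually parallelizes:

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Amdahl's law: best-case speedup when only part of a job parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_workers)

# If 95% of the work parallelizes, 8 cores give roughly a 5.9x speedup;
# a stubborn 50% serial portion caps you below 2x no matter how many cores.
print(round(amdahl_speedup(0.95, 8), 2))  # prints 5.93
print(round(amdahl_speedup(0.50, 8), 2))  # prints 1.78
```

Which is why the two camps here can both be right: the payoff of parallelizing CAM code depends entirely on how much of it is inherently serial.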

            • by scsirob (246572)

              So you have a team of engineers working on a design for weeks or even months. It's finally done, and now you are arguing that the CPU to translate to G-code takes an hour extra? Rrright...

      • by DJRumpy (1345787)

        Agree. The multi-core trend was more to address inefficiency in CPU design, as well as technological limitations in clock speed. In short, GHz is important, as long as it's efficient.

        • Agree. The multi-core trend was more to address inefficiency in CPU design, as well as technological limitations in clock speed.

          More precisely, it is about seeking the best tradeoff in the Latency*Heat*Cost equation.

          In short, GHz is important, as long as it's efficient.

          Interesting proposition. I think it's a little more complex than that. The main use of GHz today is to paper over the inefficiencies of current-generation single-threaded software.

      • See Athlon vs P4. Both were best for single-threaded stuff, owing to a single core. However the Athlon did more with less, getting better performance at lower clocks. Why? It could do more per clock, or more properly took fewer clocks to execute an instruction.

        IPC matters and the Core i series is really good at it. Bulldozer, not as good. What that means is that all other things being equal, BD needs to be clocked higher than SB to do the same calculations in the same time.

        Well that is also a problem because the
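The IPC-versus-clock tradeoff described above reduces to simple arithmetic; a rough sketch with made-up workload and IPC numbers (illustrative only, not measured figures for any real chip):

```python
def exec_time_seconds(instructions: float, ipc: float, freq_hz: float) -> float:
    """Time to run a workload: instructions / (IPC * clock frequency)."""
    return instructions / (ipc * freq_hz)

# Illustrative numbers only: a chip with lower IPC must clock higher
# to finish the same work in the same time.
work = 1e10  # 10 billion instructions (assumed)
t_high_ipc = exec_time_seconds(work, ipc=2.0, freq_hz=3.4e9)
t_low_ipc = exec_time_seconds(work, ipc=1.5, freq_hz=4.0e9)  # would need ~4.5 GHz to match
print(round(t_high_ipc, 3), round(t_low_ipc, 3))  # prints 1.471 1.667
```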

        • by PitaBred (632671)

          If you go aftermarket cooling, you can almost certainly hit 4.5GHz on stock voltages. Right now I'm doing 4.8GHz on stock voltage with a 2500K.

  • details? (Score:5, Insightful)

    by rudy_wayne (414635) on Monday February 27, 2012 @07:11PM (#39180021)

    Unfortunately, aside from a fuzzy whitepaper, actual technical details are all behind IEEE and other paywalls with useless abstracts.

    So why post an article that contains no meaningful information?

    Oh wait . . . never mind. I forgot where I was.

  • awesome (Score:4, Insightful)

    by smash (1351) on Monday February 27, 2012 @07:40PM (#39180347) Homepage Journal
    Maybe it will catch up to the Sandy Bridge Core i5 now?
  • by viperidaenz (2515578) on Monday February 27, 2012 @08:12PM (#39180681)
    but clock resonance sounds like it wouldn't play well with changing the clock frequency.
  • resonate clock mesh (Score:5, Informative)

    by slew (2918) on Monday February 27, 2012 @08:21PM (#39180763)

    Quick background: Currently, clocks on most generic chips are structured as trees. As you can imagine, the fan-out of the clock trees is pretty large and thus requires clock buffer/driver circuits which need to be balanced so that the clock signal gets to the leaves at about the same time (in a typical design where you don't use a lot of physical design tricks). To ease balancing the propagation delay, the clock tree often physically looks like a fractalized "H" (just imagine the root clock driving from the center of the crossbar out towards the leaves at the corners of the "H", so the wire lengths of the clock tree segments are the same, then the corners of the big H driving the center of a smaller "H", etc., etc.). Of course at the leaves there can be some residual imbalance due to small manufacturing variations and wire loading, and that has to be accounted for in closing the timing for the chip (to avoid short paths); ultimately these imbalances limit the upper frequencies achievable by the chip.

    Additional background: In any electrical circuit, there are some so-called resonant frequencies because of the distributed (or lumped) inductance and capacitance in the network. That is, some frequencies experience a lot less energy loss than average (for the car analogy buffs: you can get your car to "bounce" quite easily if you bounce it at its resonant frequency).

    The basic idea of the Cyclos technology is to "short-circuit" the middle of the clock tree on the chip with a mesh, to make sure the whole middle of the clock tree is coordinated to the same clock (as opposed to a typical H-tree clock, where jitter builds up from the root at every stage). That way you avoid some of the imbalances that limit the upper frequencies achievable by the chip. The reason I put "short-circuit" in quotes is that it really isn't a short circuit. If you just arbitrarily put a mesh in the middle of a clock tree, although it would tend to get the clocks aligned, it would present a very large capacitive and inductive load to drive and would likely increase power greatly. **Except** if that mesh was designed so that it resonated at the frequency you were going to drive the clock at, then you can get the benefit of jitter reduction w/o the power cost. Since you get to pick the physical design parameters of the mesh (wire width, length, grid spacing, and external tank circuit inductance) and the target frequency, theoretically you can design that mesh to be resonant (well, that remains to be seen).

    The reason this idea hasn't been used to date is that it's a hard problem to create the mesh with the proper parameters and now the processor really has to just run at that frequency all the time (well, you can do clock cycle eating to approximate lower frequencies). Designers have gotten better at these things now and the area budgets for these types of things have gotten in the affordable range as transistors have gotten smaller.

    FWIW, in a pipeline design (like a CPU), sometimes it's advantageous to have a clock-follows-signal clocking topology or even an async strategy instead of a clock tree, but there is of course a complication if there is a loop or cycle in the pipeline (often this happens at, say, a register file or a bypass path in the pipeline), so that trick is limited in applicability, whereas the mesh idea is really a more general solution to clock network jitter problems.

    Here's a white paper that describes this idea... http://www.cyclos-semi.com/pdfs/time_to_change_the_clocks.pdf [cyclos-semi.com]
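For the curious, the resonant frequency of such an LC tank is just f = 1/(2*pi*sqrt(L*C)); a quick sketch, with the inductance and mesh capacitance values made up purely to land in the ballpark the article talks about (AMD's actual numbers aren't public):

```python
import math

def tank_resonant_freq(L_henries: float, C_farads: float) -> float:
    """Resonant frequency of an ideal LC tank: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(L_henries * C_farads))

# Hypothetical ballpark values: a couple of nH of on-chip tank inductance
# against the effective mesh capacitance seen by one tank segment.
L = 2e-9     # 2 nH (assumed)
C = 800e-15  # 800 fF (assumed)
f = tank_resonant_freq(L, C)
print(f"resonant frequency ~ {f / 1e9:.2f} GHz")  # prints ~3.98 GHz
```

Note the catch slew mentions: the mesh resonates at one designed-in frequency, which is why dynamic frequency scaling gets awkward.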

    • by TheSync (5291) on Monday February 27, 2012 @08:47PM (#39181023) Journal

      How can the mesh be resonant to a square wave (with lots of high frequency harmonics over a huge band)?

      I can imagine it being resonant to a single frequency sine wave.

      But if the clock mesh is powered by a sine wave, you have to turn it back into a square wave to drive gates, and to do that you have to compare the clock voltage level with some known voltage levels, and there you may have process inaccuracies.

      • by subreality (157447) on Monday February 27, 2012 @10:15PM (#39181665)

        How can the mesh be resonant to a square wave (with lots of high frequency harmonics over a huge band)?

        There's no such thing as a square wave at 4GHz. You can draw them like that on paper, but in reality the edges smear into a pretty good approximation of a sine wave.

        Regardless, it will still have some higher frequency components, but you don't have to worry about them. The resonance won't help generate nice sharp edges, but that's the line driver's job. The resonance is just to save energy by helping pump the voltage at the fundamental frequency.

        (Disclaimer, not an EE, but I've looked over their shoulders a bunch of times)
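To see why only the fundamental matters: an ideal square wave contains only odd harmonics, each with amplitude 4/(n*pi) relative to a unit square wave. A quick sketch of the idealized Fourier math (not a circuit simulation):

```python
import math

def square_wave_harmonic_amp(n: int) -> float:
    """Amplitude of the n-th harmonic of an ideal unit square wave.
    Only odd harmonics are present, each with amplitude 4/(n*pi)."""
    return 4.0 / (n * math.pi) if n % 2 == 1 else 0.0

# At a 4 GHz fundamental, the 3rd harmonic sits at 12 GHz, where on-chip
# wiring attenuates heavily -- so the mesh mostly sees the fundamental.
for n in (1, 2, 3, 5):
    print(n, round(square_wave_harmonic_amp(n), 3))
# prints: 1 1.273 / 2 0.0 / 3 0.424 / 5 0.255
```

The harmonics fall off as 1/n even before the wiring filters them, which is why the "smeared into a sine wave" description above holds up.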

    • by kermidge (2221646)

      Thanks to you and PatPending, I'm now read into something that seems mighty interesting and far over my head. I'm wondering if, or how, this might be applied to the "3d" chips that IBM is/was working on. (Btw, I re-read James P. Hogan's "Inherit the Stars" on Saturday - he mentions stacked chips with internal cooling channels, in 1978.)

  • by PPH (736903) on Monday February 27, 2012 @08:38PM (#39180929)

    ... inside the processor? Sounds like the end of overclocking.

  • by Gordo_1 (256312) on Monday February 27, 2012 @09:00PM (#39181125)

    and has been for at least 5 years. A theoretical 10% performance boost? Gimme a break. I upgraded from a Core2Duo E6600 @ 2.4GHz to a quad-core i7 2600K which runs at an overclocked 4.5GHz on air... Day to day, the new rig delivers a *mostly* perceptible performance advantage, but nothing earth-shattering... I give you several recent changes that felt bigger:

    1. Moving from hard drive to SSD
    2. Moving from a DirectX9 class GPU to a DirectX 11 GPU (at least in games).
    3. Move from pre-JIT JS browser engine to a JIT-engined browser.

    As far as desktop CPU development goes, I think the future is largely about optimizing software for the multi-core architectures, not adding Gigahertz.

  • by DrJimbo (594231) on Monday February 27, 2012 @11:18PM (#39182005)

    link [cyclos-semi.com]

    Cyclos resonant clock mesh technology employs on-chip inductors to create an electric pendulum, or "tank circuit", formed by the large capacitance of the clock mesh in parallel with the Cyclos inductors. The Cyclos inductors and clock control circuits "recycle" the clock power instead of dissipating it on every clock cycle like in a clock tree implementation, which results in a reduction in total IC power consumption of up to 10%.

    Inductors save power because unlike most other circuit elements, inductors are able to store energy in a magnetic field so it can be used later on. This is part of how switching power supplies get their efficiency.
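For a sense of scale, the dynamic power a conventional clock network burns is roughly C*V^2*f, since the mesh capacitance is charged and discharged every cycle. A quick sketch with made-up ballpark values (actual mesh capacitance for Piledriver isn't public):

```python
def clock_tree_power(C_farads: float, V: float, f_hz: float) -> float:
    """Dynamic power of a conventional clock network: P = C * V^2 * f.
    Each cycle the clock capacitance is charged and discharged, and that
    energy is dissipated as heat in the clock drivers."""
    return C_farads * V * V * f_hz

# Assumed ballpark numbers for illustration only.
C_mesh = 1e-9  # 1 nF of total clock capacitance (assumed)
V = 1.0        # 1 V supply (assumed)
f = 4e9        # 4 GHz
print(f"tree clock power ~ {clock_tree_power(C_mesh, V, f):.1f} W")  # prints ~4.0 W
```

In the resonant scheme, most of that per-cycle energy sloshes back and forth between the mesh capacitance and the tank inductors instead of being burned, leaving only resistive losses to replenish, hence the quoted "up to 10%" total-IC savings.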

  • I don't even know what one is. And I haven't even glanced at the fine article. I just know I want one of those. Sounds so shiny. Just wanna say it over again and again and again...

    Resonant Clock Mesh

    Resonant Clock Mesh

    Resonant Clock Mesh...

  • by jfbilodeau (931293) on Tuesday February 28, 2012 @06:34AM (#39183649) Homepage

    I don't know about you, but I would be concerned about the effects of a resonance clock mesh cascade failure.

    I know a guy who had to deal with a resonance cascade and it wasn't pretty.
