
AMD Designing All-New CPU Cores For ARMv8, X86 181

Posted by samzenpus
from the brand-new dept.
crookedvulture (1866146) writes "AMD just revealed that it has two all-new CPU cores in the works. One will be compatible with the 64-bit ARMv8 instruction set, while the other is meant as an x86 replacement for the Bulldozer architecture and its descendants. Both cores have been designed from the ground up by a team led by Jim Keller, the lead architect behind AMD's K8 architecture. Keller worked at Apple on the A4 and A4 before returning to AMD in 2012. The first chips based on the new AMD cores are due in 2016."
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • by nitehawk214 (222219) on Monday May 05, 2014 @03:27PM (#46921579)

    Probably worked on the A4 and A4 and the A4, as well.

  • ... be common, and use something like code morphing - which Transmeta used - to come up w/ a solution that would work w/ both x64 and ARM 64? Thereby avoiding inventory mix issues during production?
    • How's Transmeta doing these days? Oh that's right they are defunct.

      That kind of thing doesn't work well for performance.

      • They were never fast; but they were pretty much the only game in town if you wanted x86 within tight thermal constraints, for a time after they launched. VIA was similarly tepid and a bit hotter and Intel was pretending that a "Pentium 4 Mobile" was something other than a contradiction in terms.

        Now, once Intel stopped pretending that Netburst was something other than a failure, and put some actual effort into lower power designs, it was Game Over; but they didn't do that overnight.
        • Intel has only shown you what's possible with a large number of advanced low-power transistors. That's still just one design (of the many possible ones) that uses this level of logic integration. Does that mean that it's impossible to do anything better with the same large number of advanced low-power transistors? Do you have any reason to believe that the Transmeta approach (that actually worked better back then) wouldn't work better now for some reason?
          • by amorsen (7485) <benny+slashdot@amorsen.dk> on Monday May 05, 2014 @04:22PM (#46922037)

            Transmeta was at the end of the era where decoding performance mattered. Keeping the translated code around was actually useful. These days decoding is approximately free on any CPU with half-decent performance -- the amount of extra die space for a complex decoder is not worth worrying about.

            You can save a bit of power with a simpler decode stage, but you are unlikely to beat ARM Thumb-2 on power by software-translating x86 the way Transmeta did. Besides, most of the interesting code for low power applications is ARM or MIPS already, so what is the point?
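The trade-off described here - paying the translation cost once and reusing cached translations on later executions - can be sketched in a few lines of Python. This is a toy illustration only; the "guest"/"host" instruction names are invented and bear no relation to real Crusoe internals:

```python
# Hypothetical sketch of Transmeta-style "code morphing": translate a block
# of guest instructions once, cache the translation, and reuse it on every
# later execution of the same block. All instruction names are invented.

translation_cache = {}  # guest block address -> translated host ops

def translate_block(block):
    """Expensive one-time translation of guest ops into host ops (simulated)."""
    return [("host_" + op, arg) for op, arg in block]

def execute(addr, block):
    # The first execution pays the translation cost; later ones hit the cache.
    if addr not in translation_cache:
        translation_cache[addr] = translate_block(block)
    return translation_cache[addr]

guest = [("add", 1), ("mul", 2)]
first = execute(0x400, guest)   # translates and caches
second = execute(0x400, guest)  # cache hit: no re-translation
assert first is second
```

The point of the amortization is visible in the `first is second` check: the second call returns the very same cached object instead of re-translating, which is exactly why keeping translated code around was useful when decoding was expensive.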

            • These days decoding is approximately free on any CPU with half-decent performance

              In what way? And what do you mean by "decoding"? Do you also include dependency solving, interlocking, reordering etc.? Because what I was thinking about was pushing even more to the SW component. The problem is, CPUs have been widening for quite some time because of our over-reliance on single-threaded SW. But even if it doesn't work nearly as well for eight-issue monsters, given that simple cores like Jaguar, which seem to be practicable if you have many more of them, push you back into the time of "quart

              • by amorsen (7485) <benny+slashdot@amorsen.dk> on Monday May 05, 2014 @04:49PM (#46922387)

                You cannot meaningfully do reordering and so on in software on a modern CPU. You do not know in advance which operands will be available from memory at which time. You have to redo that work every time you get to the code (unless it is in a tight loop, but modern x86's are REALLY good at tight loops) because circumstances will likely have changed -- and you cannot reorder in software every time, that is just too costly.

                If you want to see an architecture which looks like it has a chance of breaking the limits on single-threaded performance, look at the Mill [millcomputing.com]. In theory you could software-translate x86 to Mill code and gain performance, but it would be really tricky and no Mill implementations exist yet.

I guess you're right on the reorderings; there are unpredictable aspects to the execution trace. But then again, there's the engineering maxim that every extra component has to justify its value to be included in a system. Surely these circuits made sense when the Pentium III, the P4, and the K7 were all competing with each other. Whether their usefulness is undiminished in low-power parallel systems seems like the question to me, though. There appears to be a law of diminishing returns for everything.
                • Also, I forgot one thing...

                  You do not know in advance which operands will be available from memory at which time. ... If you want to see an architecture which looks like it has a chance of breaking the limits on single-threaded performance

Do I really need to know that, or can I just switch to a different thread of execution until then? And do we really need to care about single-threaded performance that much these days? What if I want to program in Go instead of C++? (E.g., what if Google wants 0.5M new servers for deploying Go services?) Perhaps some level of "out-of-orderiness" is desirable, but a lower one would do? I really don't care in what way the performance gets squeezed into my battery-powered devices.

                  • by amorsen (7485)

                    Do I really need to know that, or can I just switch to a different thread of execution until then?

                    Sun tried it, market penetration near zero. You can get 12 threads per socket on a desktop Intel CPU, good luck keeping 12 threads busy on mainstream workloads.

                    Single threaded performance is everything for a CPU; it is cheap to add sockets and cores for parallel workloads. For real parallel work you use the GPU anyway.

                    • Sun tried it, market penetration near zero.

                      That just might have something to do with the ridiculous price they were asking for it, doesn't it?

                      You can get 12 threads per socket on a desktop Intel CPU, good luck keeping 12 threads busy on mainstream workloads.

                      Doesn't seem like that much of an issue to me. I can't think of an application where they wouldn't come in very handy. And I really tried, but nothing came out of it.

                      Single threaded performance is everything for a CPU; it is cheap to add sockets and cores for parallel workloads. For real parallel work you use the GPU anyway.

                      Well, that's good for certain kinds of data parallelism, but probably not for all of them. At least, right now, even though AMD is trying to stretch its usefulness as much as they can.

                  • by UnknownSoldier (67820) on Monday May 05, 2014 @06:43PM (#46923569)

                    > And do we really need to care about single-threaded performance that much these days?

                    Not every task is parallelizable.

Second, are you going to pay an engineer to make their code multi-threaded for an X% run-time performance gain?

                    • Not every task is parallelizable.

                      That's a red herring. Many more tasks probably are than most people would think. See Guy Steele's work. I think I even came up with a scheme to run TeX passes using speculative execution (results always correct, and most of the times faster) the other day (the state to keep around fortunately isn't very large).
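The speculative-pass idea can be sketched roughly as follows. This is a hypothetical Python toy, not actual TeX: the pass functions and guessed state are invented for illustration. The result is always correct; it is merely faster when the guess matches:

```python
# Toy speculative execution of a two-pass computation: pass 2 normally needs
# the state produced by pass 1. We guess that state, run pass 2 speculatively
# in parallel with pass 1, and keep the speculative result only if the guess
# turns out to match. Results are always correct either way.
from concurrent.futures import ThreadPoolExecutor

def pass1(doc):
    return {"refs": len(doc)}          # the state pass 2 depends on

def pass2(doc, state):
    return doc.upper() + f"/{state['refs']}"

def run(doc, guessed_state):
    with ThreadPoolExecutor() as pool:
        real_state = pool.submit(pass1, doc)
        speculative = pool.submit(pass2, doc, guessed_state)
        state = real_state.result()
        if state == guessed_state:     # guess was right: reuse the result
            return speculative.result()
    return pass2(doc, state)           # guess was wrong: redo, still correct

assert run("abc", {"refs": 3}) == run("abc", {"refs": 0}) == "ABC/3"
```

When the guess is right, the two passes overlap; when it is wrong, the speculative result is simply discarded, which is the "results always correct, most of the time faster" property.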

Not to mention that on most 'desktop' or 'server' machines, the OS is constantly juggling hundreds or thousands of processes, so while an individual program may be single-threaded, the operating system can spread work across all available processors. The hard thing is knowing, for an individual process and core, when it is worth switching context - shunting it off to wait for I/O and shoveling a different process onto that core - or just idling that core for a while. IIRC (from _long_ ago), I/O typically cos

              • by unixisc (2429386)

                But that was a part of the very concept of VLIW, which both Crusoe & Efficeon were. But those processors were somewhat more RISC than VLIW, except that their integer units were 128-bit and 256-bit, as opposed to 32-bit or 64-bit. Essentially, the idea here was that the bottom core would be constant, and any time there was an instruction set upgrade in a CPU from Intel or AMD, the Transmeta CPU would implement those new instructions in terms of their own native instructions, which would presumably eith

            • by Carewolf (581105)

              Transmeta was at the end of the era where decoding performance mattered. Keeping the translated code around was actually useful. These days decoding is approximately free on any CPU with half-decent performance -- the amount of extra die space for a complex decoder is not worth worrying about.

              Actually Intel has recently returned to that. They now keep a small microinstruction cache of decoded instructions around so that loops can be executed more efficiently.

              • by amorsen (7485)

                Fair enough, but they still choose to have all decoding done in hardware, so they still pay the (rather small) die-space penalty of a complex decoder.

              • Ahh, the good old days. This reminded me of the Motorola 6800's Halt and Catch Fire [wikipedia.org] instruction. :D Tight loops can be ... interesting.

    • by LWATCDR (28044)

The Transmeta chip was not a smash hit, so probably not.
The really cool thing is that ARM and x86 will share parts. GPU cores are a no-brainer. Sharing things like cache and memory controllers could be a big deal.
ARM sharing a socket with x86 will be really cool IMHO.

There are hardly any ARM CPUs or MCUs around that will ever be inserted into a socket. Almost all of them are SMD chips.

        • by LWATCDR (28044)

AMD is targeting this at the server market. It is supposed to be socket compatible with the new x86 they are working on. It is not strictly targeting the mobile market.

    • But unixisc, that's a solved problem. We don't write software in Assembly Languages anymore.

      See, we can simply compile the program on the chip we want to use it on.

      The problem is that humans are stupid. Languages at the Human interface level should never compile down into machine code. All languages should compile down into bytecode. You should NEVER distribute programs as binaries (that would be dumb). Then the hardware abstraction layer (your OS) can compile the bytecode INTO OPTIMIZED machine code f

Did it say anywhere if they're going to be juiced with HSA?

Nanosecond latency to some bigass stream processors and no risk of memory-scribbling is too much to not want.

  • The last time I truly got excited about AMD was when the K6-2 came out. These days, I just wish AMD would put a focus on power consumption and high quality rather than simply trying to out-core Intel.

    • by werepants (1912634) on Monday May 05, 2014 @03:55PM (#46921789)

      The last time I truly got excited about AMD was when the K6-2 came out.

      What? During the P4 days AMD was ahead in almost every category in the benchmarks... did you miss that whole era? No denying the picture today is far less exciting, though.

      • by unixisc (2429386)
Actually, K7 - when Dirk Meyer's team left DEC to join AMD - was when they first made any technical challenge to Intel's CPUs. Until then, it was one mediocre challenge after another - first the Am386s & 486s, then the NexGen acquisition, then the K6. Finally, when AMD did the Athlon w/ the ex-Alpha team from DEC and extended CISC to 64-bit, that's when things started getting interesting.
        • My personal favorite was the Athlon XP 1700+. The best was date code JIUHB DLT3C, it had documented cases of getting above 4GHz - pretty good considering that it is still a feat to hit that 10 years later. Bought two or three 1700+'s on ebay before I hit the jackpot. Unfortunately, I never managed to put together the water cooling system I had planned, so I never got it over 3 GHz.
      • by Jaime2 (824950)

        But that's only because Intel let the marketing department make engineering decisions and kept making chips with higher and higher clock frequency. As soon as they regained their sanity, they once again dominated the benchmarks.

I do love how AMD brilliantly capitalized on the blunder by labeling their chips according to the clock speed of the performance-equivalent Intel chip - every time Intel put insane engineering effort into ratcheting the clock up 10% and only getting 1% better performance, AMD simply

      • by afidel (530433)

Yup, on the server side AMD was ahead from the first Opteron until Shanghai; then Intel launched Nehalem and they've been ahead ever since. On the desktop, Intel got competitive again with the Core 2, but on a performance-per-$ metric it wasn't until Nehalem that they dominated.

        • On a performance per $ metric, AMD are arguably still competitive, at the expense of selling cheaply and barely breaking even financially. They are currently not competitive in performance per watt and absolute performance (both on the desktop, mobile looks a bit better).

AMD really fucked up with Bulldozer, and while there have been modest improvements with Vishera and Steamroller, they were insufficient to close the gap with Intel.

      • by asmkm22 (1902712)

I didn't say the K6-2 was the peak of AMD; just that it was the last time I really got excited about anything they came out with. AMD did some good stuff during the mid-2000s, but there were other computer upgrades that had more impact on performance - particularly RAM. Those were the days when adding a stick of RAM was a legitimate way to do amazing things like browse the internet while listening to music... at the same time! Upgrading from 512MB to 2GB was a huge boost in productivity.

      • by Xest (935314)

        "During the P4 days AMD was ahead in almost every category in the benchmarks"

        It was ahead in many categories off the benchmarks too.

        Like how quickly it heats your room up, and how much power it drained.

I had one, but my god, in the summer months did I wish I'd gone Intel, as I sat sweltering from the heat of that computer on top of the already-high ambient temperature.

How about the AMD386? It ran at 40MHz. 40!
      • by DudemanX (44606)

I had an AMD486 at 80MHz. It was cheaper than an i486 at 66MHz and performed great. The Pentium had just come out at the time but was super expensive. I was able to find a late-model 486 board with PCI slots, though, and with the awesome value of the AMD chip I was able to put together a nice "budget" system for the time. It was even able to run Quake playably (a game which "required" the Pentium and its baller FPU).

        • I had an AMD 486 DX4 - 120 MHz. It beat the pants off the contemporary Pentium processors.

    • K6-2 was good, but the K6-III was much better. It was the first consumer-level CPU with on-die L2 cache. It scared Intel enough that they renamed the PII to PIII (because anything with a 3 in the name is clearly better than anything with a 2 in the name). The down side was that the K6-III overclocked for shit.

  • Serious Question (Score:4, Interesting)

    by Anonymous Coward on Monday May 05, 2014 @03:57PM (#46921801)

    Is AMD just around so Intel doesn't get bogged down by anti-monopoly or antitrust penalties?

    • Re: Serious Question (Score:4, Interesting)

      by Anonymous Coward on Monday May 05, 2014 @04:05PM (#46921857)

      64 cores per U, 80% intel performance per core, at 12% intel price.

      • by Junta (36770) on Monday May 05, 2014 @05:05PM (#46922579)

        Well, something of an oversimplification/exaggeration.

64 'cores' is 32 Piledriver modules, a gamble that by and large did not pan out as hoped; for a lot of applications, you really have to count those as 32 cores. Intel is currently at 12 cores per package versus AMD's 8 per package. Intel's EP line is less frequently found in a 4-socket configuration because, with Intel's QPI, a dual socket can perform much better than a 4-socket setup. AMD can't do that topology, so you might as well go 4-socket. Additionally, Intel's memory architecture tends to allow more DIMM slots on a board. AMD's thermals are actually a bit worse than Intel's, so it's not that AMD can be reasonably crammed in but Intel cannot. The pricing disparity is something Intel chooses at its discretion (their margin is obscene), so if Intel ever felt pressure, they could halve their margin and still be healthy margin-wise.

I'm hoping this lives up to the legacy of the K7 architecture, which left Intel horribly embarrassed and took them years to catch up with, finally doing so with Nehalem. Bulldozer was a decent experiment, and software tooling has improved utilization, but it's still rough. With Intel ahead in both microarchitecture and manufacturing process, AMD is currently left with 'budget' pricing out of desperation as their strategy. That's by no means something to dismiss, but it's certainly less exciting, and perhaps not sustainable, since AMD's per-unit costs are actually higher than Intel's (though Intel's R&D budget is gigantic to fuel that low-cost-per-unit advantage, so while the gap in gross margin between Intel and AMD is huge, the gap in net margin isn't as drastic). If the Bulldozer scheme had worked out well, it could have meant another era of AMD dominance, but it sadly didn't work as well in practice.

        • Intel is less frequently found with their EP line in a 4 socket configuration because the performance of dual socket can be much higher with Intel's QPI than 4 socket.

I've not heard of this before. Do you have a link? I'm guessing it's something about cross-connecting all QPI links between two sockets rather than four? Also, can't HT do that?

          Intel quad sockets do seem less popular recently. I think AMD are most competitive on that end of servers, especially after memory prices came down so you can get a 5

          • by Junta (36770)

            Unfortunately I do not have a link. I do however know some system designers.

            They designed a 4 socket Opteron system, and did not make a dual socket. It was peculiar to me so I asked why not a dual socket and they said there was no point in a dual socket because there was no performance advantage.

            They also designed both a 4 socket EP system and a 2 socket EP system. I asked why and they said that they could gang up the two QPI links between two sockets for better performance.

            I admittedly did not ask point

    • by tlhIngan (30335)

      Is AMD just around so Intel doesn't get bogged down by anti-monopoly or antitrust penalties?

      Somehow these days, I think it's yes. And I think Intel's lobbing customers AMD's way to ensure that AMD survives. E.g., the current generation of consoles now sport AMD processors. I'm sure Intel would be more than happy to have the business, but not only do they not need it, they see it as a way to give AMD much needed cash for the next few years.

      Hell, I'm sure part of the whole Intel letting others use their fabs

      • by Kjella (173770)

        Somehow these days, I think it's yes. And I think Intel's lobbing customers AMD's way to ensure that AMD survives. E.g., the current generation of consoles now sport AMD processors. I'm sure Intel would be more than happy to have the business, but not only do they not need it, they see it as a way to give AMD much needed cash for the next few years.

Consoles are primarily about graphics, not CPU power. While Intel's integrated graphics suck somewhat less than they used to, the PS4 has 1152 shaders backed by 8GB of GDDR5, and Intel has never had anything remotely close to that - maybe a third or a quarter of that, tops. An Intel CPU with AMD dedicated graphics would be very unlikely, since AMD would almost certainly price it so their CPU/GPU combo came out better. So realistically it was AMD vs Intel+nVidia, neither of which likes to sell themselves cheap. I don't

  • Best of luck to them (Score:5, Interesting)

    by Dega704 (1454673) on Monday May 05, 2014 @04:36PM (#46922227)
    I was such an AMD fanboy ever since I built my first (new) computer with a K6-II. I have to admit I miss the days of the Athlon being called "The CPU that keeps Intel awake at night." After Bulldozer bombed so thoroughly I just gave up and haven't followed AMD's products since. I definitely wouldn't mind a comeback, if they can pull it off.
    • by arbiter1 (1204146)
As of late, AMD has been hoping and claiming their new stuff would be great, but when it hits the market it turns out not as good as they hoped, and they back-track some of what they said.
    • by unixisc (2429386)
      Their manufacturing has always been their Achilles heel. If only they had the fabs that Intel has....
    • by Bryan Ischo (893) * on Monday May 05, 2014 @05:23PM (#46922811) Homepage

I don't get it. Do you, and just about everyone else who has posted in this discussion, only buy chips that cost > $200? Because AMD is, and always has been, competitive with Intel in the sub-$200 price range.

      Sub $200 chips have, for a very long time, been very fine processors for the vast majority of desktop computer tasks. So for years now, if you're anything close to a mainstream computer user, there has been an AMD part competitive with an Intel part for your needs.

      Of course, once you get to the high end, AMD cannot compete with Intel; but that's only a segment of the market, and it is, in fact, a much smaller segment than the sub $200 segment.

      I personally have a Phenom II x6 that I got for $199 when they first came out (sometime in 2011 I believe) that was, at the time, better on price/performance than any Intel chip for my needs (mostly, parallel compiles of large software products) and absolutely sufficient for any nonintensive task, which is 99% of everything else I do besides compiling.

      Anyway, if you only think of the > $200 segment, why stop there? I'm pretty sure that for > $10,000 there are CPUs made by IBM that Intel cannot possibly compete with.

      • by mbkennel (97636)
        | Of course, once you get to the high end, AMD cannot compete with Intel; but that's only a segment of the market, and it is, in fact, a much smaller segment than the sub $200 segment.

        From AMD's end, that's a critically important segment since it's where the most money is, and chip design and manufacturing are exceptionally expensive.
I don't get it. Do you, and just about everyone else who has posted in this discussion, only buy chips that cost > $200?

AMD are also competitive in the quad-socket server end of things. Those CPUs cost more like $1000 a pop.

    • by marsu_k (701360)
      I don't know how much of a profit they're making on their APUs, but they're the winners of the current console generation (somewhat surprisingly, the winner of the previous gen was IBM with PPC/Cell). I'm hoping they stay afloat - they may only be competitive (when it comes to general x86/x64) on very few tasks that require very many cores (and even then probably using more watts at that), but it's never healthy to have a monopoly.
      • by Kjella (173770)

        I don't know how much of a profit they're making on their APUs

Last quarter they lost $3 million on CPUs/APUs, so in practice they're breaking even, but revenue is going down, which means less and less goes to R&D. Their profits last quarter came partly from dedicated graphics cards but mostly from console chips. That is of course better than a loss, but consoles have a very special life cycle, with high launch and Christmas sales and little in between, so it's unclear how long that'll last.

  • It seems that it would be fertile territory for genetic algorithms to design the die. Sure, humans need to define the features, but run everything through a genetic algorithm, simulate and let the computer grow its own chips. Perhaps whole chips are not practical, but sub-processing units could do it.
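For what it's worth, the basic loop of such a genetic algorithm is easy to sketch. This is a toy Python illustration of placement-by-evolution; the blocks, nets, and fitness function are invented and nothing like a real EDA flow:

```python
# Minimal genetic-algorithm sketch (toy problem): evolve an ordering of
# blocks placed along a line so that the total wire length between
# connected blocks is minimized.
import random

random.seed(1)
blocks = list(range(6))
nets = [(0, 3), (1, 4), (2, 5), (0, 5)]   # pairs of connected blocks

def wirelength(order):
    """Fitness: sum of distances between connected blocks (lower is better)."""
    pos = {b: i for i, b in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def mutate(order):
    """Swap two placements to produce a slightly different child."""
    child = order[:]
    i, j = random.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

population = [random.sample(blocks, len(blocks)) for _ in range(20)]
for _ in range(100):
    population.sort(key=wirelength)
    survivors = population[:10]               # selection: keep the fittest
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(10)]
best = min(population, key=wirelength)
print(best, wirelength(best))
```

Real chip layout adds placement in two dimensions, routing, and timing/power constraints, but selection plus mutation against a fitness function is the whole conceptual core.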

     

    • by wiggles (30088)

      Pretty sure that firing all of the hot shot CPU designers and having such algorithms design their CPUs for them is how they wound up with the Bulldozer fiasco.

      Looky here. [xbitlabs.com]

      • by rrohbeck (944847)

        Nonsense. Code does routing and floor planning, it doesn't design two-core modules.

        Oh and in the current designs the automatic layout saved significant real estate and power compared to hand layouts.

        The article you refer to is utter bullshit.

Sounds useful, but for smaller cores. Having said that, the more you simplify the design, the better for certain smarter methods. For example, it's my understanding that Chuck Moore optimizes his Forth cores to expand the envelope of operating conditions to such an extent that AMD and Intel can't afford to, simply because their cores are too large to be understood. Too many state transitions to study, too many gates, etc., whereas CM can afford to simply run a full physical model including individual transistor temp
  • . . . . until the next generation knows not history and thinks they rediscovered RISC . . .
  • Intel is going to have something on the market that runs more efficiently and with better performance. Try as they might, AMD just can't seem to get their act together for producing a decently performing product since the Athlon II.
    • Re: (Score:2, Troll)

      by Tough Love (215404)

      Which is why consoles don't use AMD at all. Oh wait...

      • Consoles are using AMD because the parts are cheap, not because the performance/watt is fantastic. AMD hasn't been able to produce a CPU with amazing performance, decent thermals, and high power efficiency for years now. Why do you think gaming PCs and nearly all laptops use Intel? Because Intel offers all three with ease.
        • Re: (Score:2, Troll)

          by Tough Love (215404)

Excuse me for injecting a note of reality into your rant, but I thought consoles care about heat. Also, aren't "thermals" and "power efficiency" the same thing? Or does that get in the way of your rhetoric?

      • Which is why consoles don't use AMD at all. Oh wait...

        So... intel paying astromods now?

or at least give all CPUs 2-3 HT links so you can have 2 or more HT-to-chipset / HT-to-PCIe bridges on a 1-CPU board.
