AMD Hardware Technology

AMD Reveals Plans to Move Beyond the Core Race 227

J. Dzhugashvili writes "The Tech Report has caught wind of AMD's plans for processors over the coming years. Intel may be counting on cramming 'tens to hundreds' of cores in future CPUs, but AMD thinks the core race is just a repeat of the megahertz race that took place a few years ago. Instead, AMD is counting on Accelerated Processing Units, chips that mix and match general-purpose CPU cores with dedicated application processors for graphics and other tasks. In the meantime, AMD is cooking up some new desktop and mobile processors that it hopes will give Intel a run for its money."
This discussion has been archived. No new comments can be posted.
  • Same old. (Score:5, Insightful)

    by sam991 ( 995040 ) on Thursday December 14, 2006 @08:49PM (#17248184) Homepage
    Intel pushes the 'more power! faster!' philosophy while AMD just redesigns the architecture and it takes Intel a few years to catch up. Not much has changed since 2000.
    • Re:Same old. (Score:5, Interesting)

      by TrisexualPuppy ( 976893 ) on Thursday December 14, 2006 @08:53PM (#17248236)
      Intel pushes the 'more power! faster!' philosophy while AMD just redesigns the architecture and it takes Intel a few years to catch up. Not much has changed since 2000.
      Correct. Intel still has the lion's share of the market, and they want to keep it that way. It's interesting how they "cheat" and lock two dies together and call it dual-core or quad-core just to come out with the technology "first" to keep the investors happy.

      AMD is smaller obviously, so it has fewer resources...but with those Alpha scientists, they're going to keep going strong. It's just a matter of time with business directives like this before AMD takes over. They've been having some really cool ideas...give it a few more years and the innovators may win. And no, I'm not an AMD fanboi, but I have talked to some architects from IBM and Intel, and they do concur.
      • hyper transport (Score:5, Informative)

        by Joe The Dragon ( 967727 ) on Thursday December 14, 2006 @09:04PM (#17248356)
        Intel would be better off if they were to start using HyperTransport. Even having two cores on the same die, linked to each other by HyperTransport with one link to the chipset, is better than 2 cores sharing the FSB.

        What is the point of having 32 cores with only one link to the chipset?

        Even with the new Xeons there's still only one link per CPU, and the CPUs need to use it to get to RAM.

        AMD chips right now have up to 3 links; newer ones will have up to 5.
        • Re: (Score:2, Informative)

          by aquaepulse ( 990849 )
          This [wikipedia.org] is supposed to be the plan for Intel. Why would they use an existing standard?
        • Re:hyper transport (Score:5, Interesting)

          by ZachPruckowski ( 918562 ) <zachary.pruckowski@gmail.com> on Friday December 15, 2006 @12:55AM (#17250732)
          FSB 1333 (333 QDR) seems to be holding up well for 4 cores at the moment. It only really seems to be much of an issue in the 4+ socket world (which is admittedly lucrative). You are correct in the sense that Intel stands to gain from a move to a hypertransport-like system, but that only raises the counter-issue: If Intel has the performance lead now, even with the FSB issues, then AMD's 07-08 products have to hit it out of the park to beat Intel's chips without that handicap.

          The real issue is feature size. AMD is hurt badly by being consistently behind on that. Intel's been at 65 nm for a while now, and AMD is only now releasing 65 nm parts. Intel will be at 45 nm in some lines by this time next year, while AMD is a year behind them. Feature size brings with it higher yields (more chips per wafer) once you work the kinks out, lower heat, and more transistors per chip. That's the game winner right there, unless one of them shoots itself in the foot again, like with NetBurst.
      • Re:Same old. (Score:4, Informative)

        by Martin Blank ( 154261 ) on Friday December 15, 2006 @12:22AM (#17250458) Homepage Journal
        To be fair, the Merom architecture is a true dual-core architecture. The "quad-core" chip that Intel announced recently is simply glued together, and AMD's recent split design isn't terribly much better (though the two cores are linked by a dedicated datalink). AMD has a true quad-core design being prepped for next year, and Intel may have to follow suit, especially if AMD is able to show a decisive performance edge.
      • Re: (Score:2, Interesting)

        by SuluSulu ( 1039126 )

        AMD is smaller obviously, so it has fewer resources...but with those Alpha scientists, they're going to keep going strong. It's just a matter of time with business directives like this before AMD takes over. They've been having some really cool ideas...give it a few more years and the innovators may win. And no, I'm not an AMD fanboi, but I have talked to some architects from IBM and Intel, and they do concur.

        Doesn't that assume that Intel doesn't change their strategy? It seems to me that Intel

      • Re: (Score:3, Interesting)

        by teg ( 97890 )

        Correct. Intel still has the lion's share of the market, and they want to keep it that way. It's interesting how they "cheat" and lock two dies together and call it dual-core or quad-core just to come out with the technology "first" to keep the investors happy.

        Cheat? The result is 4 cores in one socket. Things like "they cheated!", how many nm the process is, etc., are really irrelevant. What matters is the end result: performance, power usage, memory bandwidth. That AMD can't do it yet and had gotten

    • Heads up, Nvidia. There will soon be a market for chipsets that support both CPU lines SIMULTANEOUSLY. Get the best of both worlds, literally.
    • "Intel pushes the 'more power! faster!' philosophy while AMD just redesigns the architecture and it takes Intel a few years to catch up. Not much has changed since 2000."

      The truth is there is no one way to design a CPU; what really happens is displacement according to what's possible in the possibility space at the time and what resources and solutions are available.

      For example, CPUs at some point may go back to a Pentium 4-style design if they ever design a better substrate that can withstand high frequenc
    • by rar ( 110454 )
      Instead, AMD is counting on Accelerated Processing Units, chips that mix and match general-purpose CPU cores with dedicated application processors for graphics and other tasks.

      Wasn't this more or less exactly how the Amiga worked? I think you are right about 'same old'...
  • Two different methods that achieve the same result are better than one.
  • This is the type of innovation that usually comes when there is true competition in the market. Imagine how much better the OS market would be with similar competition.
    • by Moridineas ( 213502 ) on Thursday December 14, 2006 @09:02PM (#17248328) Journal
      Like if there were hypothetical competitive operating systems like Mac OSX, Linux (and the competition therein--Ubuntu, Fedora, etc), FreeBSD, OpenBSD, Solaris, etc?
      • Mac OS X could be really competitive if Apple were to let it run on all hardware.
      • The thing is, I like to be able to play games on my home computer. Any old game I see at the store. I want to be able to play it. A home computer is half about entertainment. Windows has no competition in that area. They just don't.

        I use Linux daily at work, but, I have no driving need to have a Linux box at home. I don't do that much worky stuff at home. I'm already burned out after doing it all day at work (and if I need to do more work, I can ssh in with Cygwin from home). And you really can't game on L

    • by Anonymous Coward
      This is the type of innovation that usually comes when there is true competition in the market. Imagine how much better the OS market would be with similar competition.
      This is the type of innovation that happens when you abandon a crew on Ceti Alpha V. Imagine how much better the OS market would be with similar KHHHAAAAAAANNNNNN!!!
    • Re:Free Enterprise (Score:4, Insightful)

      by badboy_tw2002 ( 524611 ) on Thursday December 14, 2006 @09:54PM (#17248838)
      Of course, for the competition of the type that exists between Intel and AMD or AMD/Nvidia you need a common standard to compete with. If all apps ran on the same OS/GUI API then you'd have a true choice in operating systems (this one is more secure, this one faster, this one runs Word twice as fast and handles more DB load, etc). CPUs have x86, GPUs have DirectX/OpenGL; OSs need a standard application interface commonly accepted by software developers. Otherwise you're comparing not just the OS but all the stuff that goes with it (skins, music players, etc etc etc).
      • Re: (Score:2, Informative)

        It's called POSIX, and everyone supports it except Windows.
        • by MooUK ( 905450 )
          "Except windows". There you have the problem.
        • Simply false (Score:2, Informative)

          by DMiax ( 915735 )

          Windows supports POSIX: look here [wikipedia.org].

          In any case you have a point in that Microsoft does not really encourage programming for POSIX-compliant OSs, but just for Windows.

      • by cureless ( 35682 )

        OSs need a standard application interface commmonly accepted by software developers
        You mean like POSIX? (Portable Operating System Interface for uniX)
  • by mr_stinky_britches ( 926212 ) on Thursday December 14, 2006 @08:53PM (#17248240) Homepage Journal
    That sounds great and all, and the AI that the article mentions really does sound interesting...but I am not clear on how a processing unit for extremely specialized tasks is going to translate into significant performance gains? Is the current generation of CPU not optimized for mathematic operations? This seems the most direct way to get the best all around performance, to me. Also, isn't it kind of sucky to make the processor only good at a few good things rather than fast in a diverse range of applications?

    If anyone can give me any insight here...please speak up.

    Thanks

    - I post interesting things or short articles I write here [wi-fizzle.com]
    • Is the current generation of CPU not optimized for mathematic operations?

      You name the operations, and I'll tell you if it is optimized for it or not.

      Or, in other words, that's the difference between general-purpose and special-purpose.
    • by SQL Error ( 16383 ) on Thursday December 14, 2006 @09:06PM (#17248364)
      Is the current generation of CPU not optimized for mathematic operations?

      What do you want to run on a computer that isn't "mathematic operations"?

      More specifically:

      Are current CPUs optimised for physics simulations? No.
      For image processing? No.
      For data compression? No.
      For encryption? No.

      These are all areas where custom cores can provide enormous performance benefits (both in absolute terms, and in terms of performance per watt) over current CPUs, which are general purpose.
      • Physics simulation, image processing, and data compression all use math. For physics and images the math is matrix math. Look at what most benchmarks are made of, and you will see it's the same stuff.
        • by mabinogi ( 74033 )
          Saying that a CPU does "maths" and is therefore specialised for any mathematical task is like saying that someone knowing "IT" is therefore qualified to fix the printer.

          It entirely depends on _which_ mathematical operations you're talking about.
        • The TiVo (original Series1) used a 54 MHz PPC chip [wikipedia.org] with 16 MB of RAM. Seems sort of slim for all the work it could do? That's because a lot of the work was done by dedicated chips. Similarly, most MP3 players have tiny CPUs, since they're mostly run by dedicated MP3 or AAC or whatever decoder chips.

          If compression could be handled by a secondary dedicated chip, then that's an option to save a lot of space on hard drives, or with dedicated encryption cores, you get super-easy encryption of whatever you nee
          • One problem with specialized chips (or specialized silicon) is that you're always at least one step behind the state of the art. Sure, you've got silicon to help with DES processing... but oops, we've all moved to AES now!

            Now, I can also think of a few examples where that's not true, such as:

            - SSL co-processors
            - TCP off-load engines

            • Yeah, but adding the level of funding that could come from having an R&D sugar-daddy in the computing industry (be it AMD or someone looking for a Torrenza-based add-on) would help that out a bit. Additionally, there'd be a lot of incentive from the software makers to cooperate. If AMD has an H.264 encoder built-in, but not a DivX one, then people will jump to H.264 over DivX (or vice-versa) simply because the hardware-accelerated version is that much faster. I mean, if 20% of the market sees your
      • by megaditto ( 982598 ) on Thursday December 14, 2006 @09:32PM (#17248660)
        What I'd like to see is a couple of those field-programmable thingie cores that can reconfigure their circuits to a specialized calculation a program is doing... Wishful thinking but still...
        • Transputer (Score:4, Informative)

          by temojen ( 678985 ) on Thursday December 14, 2006 @10:04PM (#17248904) Journal
          Compaq used to sell those; they're called transputers and came as a PCI card with 4 FPGAs, some RAM, and a PowerPC CPU.
          • The transputer was an invention of Inmos; they were interconnected little CPUs designed for parallelizable tasks. Transputers had their own language, Occam, where for every block you had to specify whether its instructions were to be executed in a serial or parallel manner.
            It was a rather fascinating system (especially for its time) but it has died on the market.
            (Sorry for the off-topic ranting, but I programmed these during my studies and quite liked the concept)
        • Re: (Score:2, Informative)

          by jcasper ( 972898 )
          XtremeData (http://www.xtremedatainc.com/ [xtremedatainc.com]) has a board with a FPGA that plugs into an Opteron 940 socket. Not exactly what you asked for, but a step in that direction.
        • by Rakishi ( 759894 )
          As I understand it the problem is that FPGAs have a limited number of changes/uses/whatever before they crap out. My EE friends keep complaining about how the ones in their classes are barely working even when they're relatively new (but used so much that they're already way over the recommended number of uses).
        • Re: (Score:3, Interesting)

          "What I'd like to see is a couple of those field-programmable thingie cores that can reconfigure their circuits to a specialized calculation a program is doing... Wishful thinking but still..."

          This is what human minds do, but CPUs are far from this goal, not to mention the nightmare of managing it as complexity increases.
        • What I'd like to see is a couple of those field-programmable thingie cores that can reconfigure their circuits to a specialized calculation a program is doing... Wishful thinking but still...

          Here [theregister.co.uk] is one; there are a couple of other companies doing exactly the same thing.
      • by nr ( 27070 )
        And what value does that provide to Cray and other supercomputer/cluster builders? They don't have any use for graphics/encryption/compression stuff put into the generic processors they use to build their supercomputers; they want only floating-point operations and nothing else.
      • Re: (Score:3, Insightful)

        Remember, a large percentage of the time the entire system - including CPUs, chipsets, memory, disks, etc. - is just pushing data around without performing any calculations. We could all gain from better performance of these operations as well.
      • Re: (Score:3, Insightful)

        by GeffDE ( 712146 )
        Physics simulations and image processing can be (and are) done on GPUs. Same for any hardcore math stuff, like Folding Proteins [stanford.edu]. The problem with the AMD approach is that there are only so many (and I don't think it is many, but I really don't know, so if you do, please let me know) different kinds of operations. Like I said, the physics simulations and image processing are the same type of problem and also conveniently tackled very proficiently by graphics hardware.
        • Re: (Score:2, Interesting)

          by stigmato ( 843667 )

          Like I said, the physics simulations and image processing are the same type of problem and also conveniently tackled very proficiently by graphics hardware.
          Perhaps that's why AMD merged with ATI? Just a thought.
        • No, that's not a problem. That's actually the key strength. It means that a relatively small menu of super optimized special processors will be sufficient to satisfy most of their customer's needs at ludicrous speed.
          • by GeffDE ( 712146 )
            But what I was saying is that for most applications (like web-browsing, office documents, whatever), there is no "special processor" that will speed things up tremendously. Even for graphics, for games and media, if the graphics card is leveraged correctly, that is a super-optimized special processor. So like I was saying, the majority of uses (data compression, speech and encryption are the only applications I've seen that can be improved. But that definitely does not mean that there aren't others) of a
      • by Alef ( 605149 )

        Are current CPUs optimised for physics simulations? No.
        For image processing? No.
        For data compression? No.
        For encryption? No.

        Maybe not, but if you have specialized cores for each of these, you will have 4 cores idling when you don't do any of that. The alternative would be to have 5 general purpose cores. Each single one would be slower at a specific task, but the symmetric design would give better flexibility allowing all cores to operate all the time. It isn't a clear cut case which approach is the

      • "These are all areas where custom cores can provide enormous performance benefits (both in absolute terms, and in terms of performance per watt) over current CPUs, which are general purpose."

        The problem is, I think many developers would not like it; after all, if you're the loser or non-beneficiary of the specialized circuitry you're not going to be a happy camper.

        That and the article that was pulled from Anandtech about the highly specialized PlayStation 3 CPU said game devs were not happy with the degree o
      • "Are current CPUs optimised for physics simulations? No."

        No, but some add on cards, and in not too long a time, video processors may have physics simulators. That does not really

        "For image processing? No."

        True. If you don't count the various multi-media instructions.

        "For data compression? No."

        True.

        "For encryption? No."

        Except for the Sun T1 (aka Niagara) and VIA C3 processors, that is. Especially the C3 (in the Epia range of Mini-ITX motherboards) does SHA hashing operations, AES operations, faster RSA operations a
    • Much in the same way that a dedicated graphics chip can render so much quicker than a general-purpose CPU running at many times the speed. The problem is, the specialized units for certain problems may or may not be needed a year from now when a better way to solve a problem is found. Or new problems arise that don't fit the existing specialized units. General-purpose CPUs may be slower at a given task, but they can perform a much greater range of tasks. As such, I'm not calling a winner here.
    • by CAIMLAS ( 41445 ) on Thursday December 14, 2006 @09:20PM (#17248516)
      Imagine a processor with special circuitry routines which will speed up the operation of the following by a significant percent:
      - database servers
      - web servers
      - CAD and 3d programs (rendering)

      Basically, it's not much different than MMX or any other extension to a processor. The programmers can still code for the x86 (or whatever) architecture and the same operating system, but then shortcut those instructions when the additional instructions are found to be available. Or maybe they can work it transparently so programmers don't have to do anything additional - it'll optimize on the fly (provided they can figure out how to do that). Overall, I think the software headache will be worth it to companies, as they will be able to have substantial gains in performance in the hardware department, cutting cost while gaining performance. What datacenter wouldn't love to use half as many machines to provide access to the same amount of information; what animator wouldn't love to have their workstation be able to render things at twice the speed?
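
      A minimal sketch of that "detect at runtime, then shortcut" idea, assuming GCC on x86. encode_block_plain() and encode_block_accel() are hypothetical stand-ins for a portable code path and a path built on some extra instruction set (SSE3's CPUID bit is checked here only because it is well known, not because AMD has announced anything of the sort):

      /* Hedged sketch: runtime feature detection plus function-pointer dispatch.
       * The encoder functions are placeholders, not a real codec. */
      #include <cpuid.h>
      #include <stddef.h>
      #include <stdio.h>

      typedef void (*encode_fn)(const float *, float *, size_t);

      static void encode_block_plain(const float *in, float *out, size_t n)
      {
          for (size_t i = 0; i < n; i++)
              out[i] = in[i] * 0.5f;          /* placeholder work */
      }

      static void encode_block_accel(const float *in, float *out, size_t n)
      {
          encode_block_plain(in, out, n);     /* same result; pretend it's faster */
      }

      static encode_fn pick_encoder(void)
      {
          unsigned int eax, ebx, ecx, edx;
          /* CPUID leaf 1: ECX bit 0 advertises SSE3.  A future "application
           * processor" would presumably announce itself via a feature bit too. */
          if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & 1))
              return encode_block_accel;      /* accelerated path available */
          return encode_block_plain;          /* safe fallback everywhere else */
      }

      int main(void)
      {
          float in[4] = {1, 2, 3, 4}, out[4];
          pick_encoder()(in, out, 4);
          printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
          return 0;
      }

      The same pattern scales up to whole libraries choosing between code paths once at load time, which is essentially what the parent is describing.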
      • by Cyberax ( 705495 )
        There is such a computer for hobbyists: http://www.eng.petersplus.ru/sprinter/ [petersplus.ru] - it is a clone of the 8-bit Z80 with an FPGA. People were able to port Doom to this computer - the FPGA was used as a rendering accelerator.

        You could even reprogram it on the fly - I remember writing accelerated floating-point computations just for fun.
      • Re: (Score:3, Insightful)

        by kestasjk ( 933987 ) *
        As a database guy I really don't think processors would make a bit of difference to database speed in the vast majority of cases. Database design is usually what's at fault when you're shown a slow database, followed closely by query design, followed by memory, followed by hard disks, followed by processors. The same sort of thing applies to web servers; the bottleneck is never the processor.

        As for CAD, well I think that would be quite a waste. Remember that processor designers only have so many transist
        • The same sort of thing applies to web servers; the bottleneck is never the processor.

          I've seen a number of shared hosts with the CPU tapped out from PHP processes. It's also the primary reason that people get booted from shared hosts: using too much CPU.

          That said, I don't know if a specialized processor would help it any. Many shared hosts seem to be more interested in balancing the load with virtualization.
        • Web servers nowadays seem to have their bottleneck at memory, then processor, then access to the database server, and only then disk access. The days of static pages are gone.

    • Re: (Score:3, Interesting)

      by mo ( 2873 )
      The thing is, there's a huge number of applications that do the same basic computations over and over again.
      Just as the floating point coprocessor became the FPU section of the processor, it makes sense to give future processors the ability to do the common operations that are now done by graphics cards.
      Things like matrix multiplications (which will actually be a single processor operation in SSE3) are used all over the place in graphics, sound, and well, virtually anything that eats up CPU power these d
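
      For what it's worth, here is roughly what that kind of building block looks like with SSE intrinsics today: a sketch of a 4-element dot product using SSE3's horizontal add. This is just an illustration of the "common operation" idea, not anything AMD-specific:

      /* Sketch: 4-wide dot product with SSE/SSE3 intrinsics.
       * Build with something like: gcc -msse3 dot4.c */
      #include <pmmintrin.h>   /* SSE3: _mm_hadd_ps */
      #include <stdio.h>

      static float dot4(const float *a, const float *b)
      {
          __m128 va   = _mm_loadu_ps(a);
          __m128 vb   = _mm_loadu_ps(b);
          __m128 prod = _mm_mul_ps(va, vb);        /* a[i]*b[i] in four lanes */
          __m128 sum  = _mm_hadd_ps(prod, prod);   /* pairwise adds...        */
          sum         = _mm_hadd_ps(sum, sum);     /* ...down to one value    */
          return _mm_cvtss_f32(sum);
      }

      int main(void)
      {
          float a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8};
          printf("%f\n", dot4(a, b));              /* prints 70.000000 */
          return 0;
      }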
    • by jd ( 1658 )
      The short answer is no. The longer answer is that the current architectures are designed to solve the more common simple cases at a reasonable speed. They are not designed for complex common operations (which is why we use libraries like ATLAS and FFTW, rather than a simple opcode, for so many fundamental operations). They are totally incapable of rarer complex operations, which is why research facilities pour literally millions of dollars into developing high-performance maths toolkits, and hundreds of mil
    • So what you're saying is, you don't own a graphics card?

      Because that's what they're talking about, replacing your graphics card with a graphics-dedicated section on the CPU. So if you think dedicated cores are stupid, you must think graphics cards are stupid.

      Why do you think AMD just merged with ATI?

  • by GrEp ( 89884 ) <crb002@NOSPAM.gmail.com> on Thursday December 14, 2006 @08:58PM (#17248294) Homepage Journal
    Nvidia, please make a board for solving small instances of NP-complete problems. Mainly max-clique and graph coloring :)
    • by GrEp ( 89884 )
      Or "ATI" since this is an AMD post. :)
    • Re: (Score:3, Informative)

      by hritcu ( 871613 )
      Mainly max-clique and graph coloring
      Solving SAT in hardware would be enough, since you can reduce most NP-complete problems to it relatively easily, the SAT-solving algorithms are already highly optimized, and there have even been previous attempts to build special-purpose hardware for solving SAT.
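
      To make the "reduce it to SAT" point concrete, here is a toy sketch that prints a DIMACS CNF encoding of graph 3-coloring; the example graph and variable numbering are invented purely for illustration, but the output is the kind of file a SAT solver (or hypothetical SAT hardware) would consume:

      /* Sketch: graph 3-coloring -> SAT, emitted as DIMACS CNF on stdout. */
      #include <stdio.h>

      #define K 3                                   /* number of colors */

      /* DIMACS variable (1-based) for "vertex v has color c" */
      static int var(int v, int c) { return v * K + c + 1; }

      int main(void)
      {
          /* toy graph: a triangle plus one pendant vertex */
          int edges[][2] = { {0,1}, {1,2}, {0,2}, {2,3} };
          int n_vertices = 4, n_edges = 4;

          int n_vars    = n_vertices * K;
          int n_clauses = n_vertices * (1 + K * (K - 1) / 2) + n_edges * K;
          printf("p cnf %d %d\n", n_vars, n_clauses);

          for (int v = 0; v < n_vertices; v++) {
              /* each vertex gets at least one color... */
              for (int c = 0; c < K; c++) printf("%d ", var(v, c));
              printf("0\n");
              /* ...and at most one color */
              for (int c1 = 0; c1 < K; c1++)
                  for (int c2 = c1 + 1; c2 < K; c2++)
                      printf("-%d -%d 0\n", var(v, c1), var(v, c2));
          }
          /* adjacent vertices must not share a color */
          for (int e = 0; e < n_edges; e++)
              for (int c = 0; c < K; c++)
                  printf("-%d -%d 0\n", var(edges[e][0], c), var(edges[e][1], c));
          return 0;
      }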
  • Amiga? (Score:2, Informative)

    by vjl ( 40603 )
    That sounds like the Amiga's way of doing things...over 20 years ago! I'm glad it's catching on, and I'm glad AMD is doing it; AMD usually gets things right, and makes their products a lot more affordable than Intel...

    /vjl/
    • That sounds like the Amiga's way of doing things...over 20 years ago! I'm glad it's catching on, and I'm glad AMD is doing it; AMD usually gets things right, and makes their products a lot more affordable than Intel...

      /vjl/

      Actually, this is simply the latest iteration of a well-documented pattern going back forty-odd years known as the Cycle of Reincarnation [jargon.net].

    • It's not so much "catching on" as "these things go in cycles". For a while, the big thing was to have a separate math co-processor; now that's been folded back in. Now, the big thing is a separate graphics processor; this is about folding that back in.

      Why do you think they merged with ATI?

  • Sounds like OS-dependency and driver hell to me. Imagine if you had an MP3 decoding co-processor, an MPEG-2 encoding co-processor, an Excel co-processor, a GCC co-processor... getting it to all work seamlessly would make today's 200MB video card drivers pale in comparison. So you install WMP version 42 and you have to check "use dedicated MP3 coprocessor" in Tools->Options? The whole point of CPUs is that they are general purpose.
    • I was thinking more along the lines of how MMX/3DNow are implemented.. extra instructions.

      And maybe not such specific tasks like "mp3 decode".. but what about an FFT/IFFT instruction set extension? A matrix-multiply or matrix-inversion instruction set extension? The operating system could see these instructions and ensure they're executed on the correct processing unit (fast interconnects are of course needed here, which I believe is what HT3 is all about!)

      Hardware acceleration of these tasks would greatly
      • by Anonymous Coward on Thursday December 14, 2006 @09:32PM (#17248646)
        Don't forget crypto; the hardware AES on a 1GHz VIA C3 runs circles around an A64 X2 4800+ doing the same in software, at something like 10 vs. 80W power consumption.
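
        For the curious, this is roughly how that hardware AES gets used in practice through OpenSSL's engine interface, assuming a build that ships the "padlock" engine; if the engine isn't available, the very same EVP calls just fall back to software AES:

        /* Sketch: ask OpenSSL to prefer the VIA PadLock engine for ciphers.
         * Error handling trimmed for brevity. */
        #include <openssl/engine.h>
        #include <openssl/evp.h>
        #include <string.h>

        int main(void)
        {
            unsigned char key[16] = {0}, iv[16] = {0};
            unsigned char in[32], out[48];
            int outlen = 0, tmplen = 0;

            memset(in, 'A', sizeof in);

            ENGINE_load_builtin_engines();
            ENGINE *padlock = ENGINE_by_id("padlock");        /* NULL if absent */
            if (padlock && ENGINE_init(padlock))
                ENGINE_set_default(padlock, ENGINE_METHOD_CIPHERS);

            EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
            EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), padlock, key, iv);
            EVP_EncryptUpdate(ctx, out, &outlen, in, (int)sizeof in);
            EVP_EncryptFinal_ex(ctx, out + outlen, &tmplen);
            EVP_CIPHER_CTX_free(ctx);
            return 0;
        }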
      • well see, now what you're talking about is a digital signal processor. that's what those things are optimized for. and that's about all they do well. would be interesting to have a couple on-board a PC, though.
    • Yeah, you had better turn in that 3D video card you have there now. It just won't catch on.

      Consider this: a smart version of IBM's Cell chip, with the other cores designed for one task each, and two of the cores a generic CPU.
  • infinitely parallel. Gaming on one hand, can be very much parallelized. With physics and an ever increasing amount of vertices to transform and AI to calculate, and in general crap to render.

    A lot of other software is not. Such as: Office productivity, operating systems...(these can benefit, but ultimately they'll reach a limit).

    The other question is, when you put hundreds of cores on a chip, how do you handle logistics of accessing cache? Or cache coherency?(not required) They it'll go up to 16 or so
  • Where will all the optimised code come from?
    What will the cost be in making it all work 'just' for AMD?
    How locked in would any code be?
    Over the life of a project, will it be worth 'porting' code to AMD?
    • Re: (Score:2, Insightful)

      by elhedran ( 768858 )
      Where will all the optimised code come from?

      Believe it or not, 100 cores requires optimized code as well. Programs don't magically become multi-threaded; a developer has to work out how to split the work up into 2/4/100 threads and not lose performance due to locking/thread communication (a bare-bones example of splitting work by hand is sketched at the end of this comment).

      What will the cost be in making it all work 'just' for AMD?

      Probably about the same as making it work for a new graphics card

      How locked in would any code be?

      It sounds to me they are talking optimization. hence it would run
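
      The work-splitting sketch promised above: summing an array across a fixed number of POSIX threads. Thread count and array size are arbitrary; the point is how much of the partitioning, joining and merging is the programmer's job whether there are 2 cores or 100:

      /* Sketch: hand-partitioned parallel sum with pthreads.  Build with -pthread. */
      #include <pthread.h>
      #include <stdio.h>

      #define N        (1 << 20)
      #define NTHREADS 4

      static double data[N];

      struct chunk { int lo, hi; double partial; };

      static void *sum_chunk(void *arg)
      {
          struct chunk *c = arg;
          c->partial = 0.0;
          for (int i = c->lo; i < c->hi; i++)
              c->partial += data[i];
          return NULL;
      }

      int main(void)
      {
          for (int i = 0; i < N; i++) data[i] = 1.0;

          pthread_t tid[NTHREADS];
          struct chunk chunks[NTHREADS];
          int per = N / NTHREADS;

          for (int t = 0; t < NTHREADS; t++) {
              chunks[t].lo = t * per;
              chunks[t].hi = (t == NTHREADS - 1) ? N : (t + 1) * per;
              pthread_create(&tid[t], NULL, sum_chunk, &chunks[t]);
          }

          double total = 0.0;
          for (int t = 0; t < NTHREADS; t++) {
              pthread_join(tid[t], NULL);
              total += chunks[t].partial;   /* merge the per-thread results */
          }
          printf("sum = %f\n", total);      /* expect 1048576.000000 */
          return 0;
      }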
  • by lightversusdark ( 922292 ) on Thursday December 14, 2006 @09:33PM (#17248668) Journal
    The article is a bit light on detail; there's a webcast of the presentation on AMD's Investor Relations site [corporate-ir.net] (needs a login, BugMeNot doesn't work, and it's WMP or Real only). And it's apparently four hours long.
    The most interesting thing for me was the mention of "Hybrid Graphics":
    According to AMD, notebooks with hybrid graphics will include both discrete and integrated graphics processors. When such notebooks are unplugged, their integrated graphics will kick in and disable the discrete GPU. As soon as the notebook is plugged back into a power source, the discrete GPU will be switched on again, apparently without the need to reboot. AMD says this technology will enable notebooks to provide the "best of both worlds" in terms of performance and battery life.
    It also looks like they're extending the Fusion concept along Cell-like lines, with additional cores for non-CPU or GPU purposes.
    Their road map through 2008 only talks about up to quad core, although I assume this means CPU cores (I'm not sure that I would accept a CPU+GPU on a single die branded as a 'dual-core' chip). I think the Cell has eight cores, but due to yield issues not all are enabled in a PS3, and they are not all functionally equivalent. I don't know if this is the case for the Cell-based IBM blades, though.
    The roadmap basically looks like periodic refreshing of the product line reducing power consumption with each iteration, which is where I think Intel have got a head-start on AMD. However, if AMD can sort out the yield issues, and compilers and developers begin to take advantage of these "associate" cores in Cell and future AMD architectures, then maybe Intel will have turned out to have missed a trick, as they did with x86-64.
    • They wouldn't need to work on compilers, and developers wouldn't need to rewrite code if they encouraged people to use BLAS and then optimize BLAS. I think that a lot of this multi-core stuff will end up being matrix and vector math units with some kind of MIMD based on GPU style masking branches. If they wrap it in a special-purpose API, they only end up hurting their benchmark scores.
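
      The parent's point in miniature: if an application only ever talks to the standard BLAS interface, as in this sketch (assuming some CBLAS implementation such as ATLAS is installed), then whoever supplies the fastest dgemm wins without the application changing at all:

      /* Sketch: a 2x3 by 3x2 matrix multiply through CBLAS.
       * Link with whatever BLAS is installed, e.g. -lcblas -latlas or -lopenblas. */
      #include <cblas.h>
      #include <stdio.h>

      int main(void)
      {
          double A[2 * 3] = { 1,  2,  3,
                              4,  5,  6 };
          double B[3 * 2] = { 7,  8,
                              9, 10,
                             11, 12 };
          double C[2 * 2] = { 0 };

          /* C = 1.0 * A(2x3) * B(3x2) + 0.0 * C */
          cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                      2, 2, 3, 1.0, A, 3, B, 2, 0.0, C, 2);

          printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);   /* 58 64 / 139 154 */
          return 0;
      }

      Swap in a vendor BLAS, a GPU-backed BLAS, or a hypothetical accelerator-core BLAS, and the call site above stays identical.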
    • The Cell has 8 SPUs, which are stripped-down vector processors, and one PPU, which is an only mildly stripped-down PPC core. On the PS3 one of the SPUs is disabled to increase yield.

      The problem with the Cell is that the SPUs are hell to program if you have a problem that doesn't fit nicely in the 256k of RAM that an SPU has. And most programming tasks these days don't. If the programming is essentially DSP work then you are good to go, but hopefully AMD learns from Sony's mistake, and still allows all cores ran

  • Worth noting that this strategy may well have at least occurred to Intel already.

    It's not a new idea to mix lots of kinds of cores on one die: Intel's IXP network processors have been available for a number of years now. These combine an XScale (StrongARM) core with a number of specialised network-processing-oriented microengines. The XScale can run Linux and acts as a supervisor to the microengines, which do the fast-path work of actually processing the data. The microengines are streamlined to be able to d
  • Cue : (Score:5, Funny)

    by CODiNE ( 27417 ) on Thursday December 14, 2006 @09:48PM (#17248786) Homepage
    20 people asking "Why would anyone need this?"
    50 people replying "I encode video"
    45 people replying "Games"
    10 replying "Babes of course"
    1 karma whore incapable of making a decent top 10 list. :)
  • Does that mean they're finally going to hire some Pacific Islanders, Basque, Thai, Pygmies, or Mayans?
  • Comment removed based on user account deletion
  • Are there any big architectural differences between multiple, specialized cores within a die and multiple, specialized units within a core? If so, what are they?
  • by bcrowell ( 177657 ) on Thursday December 14, 2006 @11:07PM (#17249568) Homepage

    The two big obstacles to getting better performance from parallelization are that (1) some problems aren't parallelizable, and (2) programmers, languages, and development tools are still stuck in the world of non-parallel programming. So from that point of view, this might make more sense than simply making a computer with a gazillion identical, general-purpose CPUs.

    On the other hand, I'd imagine that most of these processors would sit idle most of the time. For instance, right now I'm typing this slashdot post. If I had a video card with a fancy GPU (which I don't), it would still be drawing current, but sitting idle 99.99% of the time, since displaying characters on the screen as the user types is something that could be done back in the days of 1 MHz CPUs. Suppose I have a special-purpose physics processor. It's also drawing current right now, but not doing anything useful. Ditto for the speech-recognition processor, the artificial intelligence processor, the crypto processor, ...

    There are also a lot of applications that don't lend themselves to either multiple general-purpose processors or multiple special-purpose CPUs. One example that comes to mind is compiling.

    On a server, you're probably either I/O bound, or you're running a bunch of CGI scripts simultaneously, in which case multiple general-purpose processors are what you need.

    For almost all desktop applications except gaming, performance is a software issue, not a hardware issue. I was word-processing in 1982 with a TRS-80, and it wasn't any less responsive than Abiword on my current computer. Since I'm not into gaming, my priorities would be (1) to have a CPU that draws a low amount of power, and (2) to have Linux do a better job of cooperating with my hardware on power management. I would also like to have parallelized versions of certain software, but that's going to take a lot of work. For example, the most common CPU-heavy thing I do is compiling long books in LaTeX; a massively parallel version of LaTeX would be very cool, but I'm not holding my breath.
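
    For a rough sense of how hard that wall in point (1) is, Amdahl's law: if a fraction p of a program parallelizes perfectly across N processors, the overall speedup is

    \[
      S(N) \;=\; \frac{1}{(1 - p) + p/N},
      \qquad
      S(100)\Big|_{p = 0.9} \;=\; \frac{1}{0.1 + 0.009} \;\approx\; 9.2,
      \qquad
      \lim_{N \to \infty} S(N) \;=\; 10 .
    \]

    So even with 90% of the work parallelizable, a hundred cores buy less than a 10x speedup, which is part of why special-purpose cores look attractive on paper.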

    • by Eideewt ( 603267 )
      I'm with you almost all the way, except on compiling not benefiting from parallelization. If that were the case, programs like distcc would be pointless.
    • by Coryoth ( 254751 )

      The two big obstacles to getting better performance from parallelization are that (1) some problems aren't parallelizable, and (2) programmers, languages, and development tools are still stuck in the world of non-parallel programming.
      Programmers might be stuck in the world of non-parallel programming, but there are plenty of languages that aren't: Ada, AliceML, Concurrent Clean, E, Eiffel + SCOOP, Erlang, Occam, Oz, and Pict all do concurrency remarkably well. Several of those, Ada, Eiffel, and Erlang in par

  • Intel may be early (Score:3, Interesting)

    by Eideewt ( 603267 ) on Thursday December 14, 2006 @11:19PM (#17249738)
    I do have my doubts about Intel's "more cores than you can shake a stick at" approach. I can't see the use in more than a few full-speed cores. They all have to be able to get at instructions quickly or most will just spin their wheels, so hundreds of cores are a big challenge in more than just a "making them fit and operate together" sense. How much can we parallelize before most of the cores are doing little to nothing because their caches are empty? For that matter, the average user doesn't usually utilize one CPU core fully. Even on dual-core (including actual dual-CPU) desktop machines, both cores are rarely needed for a responsive computer.

    Intel's standpoint seems to be that there's a world of data crunching lurking in all our computers (automated photo sorting, face recognition, and photo-realistic rendering), but none of these strike me as killer apps waiting to happen. All are things we could get used to and come to depend on, but I don't think any of them are being held back just because of our computing capacity, although photo-realistic rendering may be close. I'm pretty sure these aren't solved problems yet. Even if we were itching to do all this, one can only sort so many photos. It seems a bit wasteful to have all that power waiting around most of the time. Are we really nearly living in a world in which computing power is so plentiful that we can have that kind of ability even though we hardly ever use it?

    On the other hand, AMD's approach seems to have more immediate application. Video/audio encoding and other parallel processes are things that many of us do do frequently. A couple hundred cores could be pressed into use for this, but that seems much less elegant than purpose-built hardware.

    I don't know which approach will be best in the long-run. Probably both. It does seem to me that Intel is at best a few years too early to be hyping large numbers of cores.
  • The killer app, sadly, is probably going to be DRM. Look for some scheme where the encrypted video goes from the network port to the display without the bits ever being accessible from a user-programmed CPU. We already have Microsoft's scheme where video and audio are pumped around within the operating system kernel without ascending to the application level. Look for that to go entirely into a single IC.

  • with processors and such to control individual areas instead of one trying to do it all? We have PCs, minis, and mainframes where I work. The network (read:pc) based development groups used to laugh at us when they got a new system, usually with gigs of memory and multiple processors while our distro center production systems used 1g processors with 512mb ram.

    That was until they realized we were serving 50+ users and doing batch work at the same time. The volume of print alone was beyond their servers to
  • For this to work, there is a need to set a common API for the hardware acceleration. Otherwise, everybody has to do the same thing over and over again, or just ignore the instructions altogether.

    Take for instance the C3 processor from VIA. The latest already do SHA, AES, RSA and hardware random generation. Will AMD use similar instruction sets, or will they use completely different instructions or even processors? How am I, as a programmer, going to use these instructions? These instructions should also be
  • I think that a CPU company should attempt producing a CPU capable of running code written in functional programming languages. A CPU like that would contain thousands of arithmetic and bitwise operation units, all capable of operating in parallel. The O/S would be responsible for assigning jobs to those operation units, according to what the functional program was trying to achieve. This approach could speed up operations tremendously for functional programs, because usually functional programs greatly mini
