Intel Hardware

Intel Shows 48-Core x86 Processor (366 comments)

Posted by timothy
from the soon-will-be-in-calculators dept.
Vigile writes "Intel today unveiled a completely new processor design that the company is dubbing the 'Single-chip Cloud Computer' (previously codenamed Bangalore). Justin Rattner, the company's CTO, discussed the new product at a press event in Santa Clara and revealed some interesting information about the goals and design of the new CPU. While terascale processing has been discussed for some time, this new CPU is the first to integrate full IA x86 cores rather than simple floating point units. The 48 cores are set two to a 'tile', and each tile communicates with the others via a 2D mesh network capable of 256 GB/s rather than through a large cache structure."
This discussion has been archived. No new comments can be posted.

  • ...or perhaps a megacore?

  • Can someone elaborate on why you'd want 48 full processors, rather than a processor with two (dual) or four (quad) "cores" (I'm presuming core in this case == FPU in the article). Supposedly Win7's SMP support becomes much more effective at the 12-16 core threshold.

    • Re: (Score:2, Funny)

      by Anonymous Coward
      To enable system administrators to say "Fuck it, we'll go to one blade!"
    • by h4rr4r (612664)

      For a server.
      Probably not running Windows, as Linux and other *nix-type OSes already support monstrous numbers of CPUs.

    • Idle benchmarks (Score:5, Insightful)

      by Colin Smith (2679) on Wednesday December 02, 2009 @05:40PM (#30303154)

      With 48 processors you can have your system 98% idle running your typical application at full speed rather than just 50% or 75% idle as is the norm now.
       

      • by h4rr4r (612664)

        Please tell me where I can find boxes that would run 50% idle for my use. My company would pay handsomely for such CPUs. Current Quad Xeons fail to do this.

      • So would this have saved that guy's ass who spent $1M in electricity running SETI@Home on the school's computers?
    • Re: (Score:3, Insightful)

      by Locke2005 (849178)
      Current memory architecture has trouble keeping data fed to just 2 CPUs; unless each of the 48 cores has its own dedicated cache and memory bus, this is a pretty useless design.
      • by V!NCENT (1105021)

        Yes. Serial RAM access. Damn. When are people going to realise that RAM, which has a lot of, what are they called, banks?, physically separated from each other, could be made parallel?

      • by TheRaven64 (641858) on Wednesday December 02, 2009 @07:32PM (#30305074) Journal
        Processors access memory via a cache. When you load a word from memory to a register, it is loaded from cache. If it is not already in cache, then you get a cache miss, the pipeline stalls (and runs another context on SMT chips), and the memory controller fetches a cache line of data from memory. Cache lines are typically around 128 bytes. Modern memory is typically connected via a channel that is 64 bits wide. That means that it takes 16 reads to fill a cache line. If you have your memory arranged in matched pairs of modules then it can fill it in 8 pairs of reads instead, which takes half as long.

        On any vaguely recent non-Intel chip (including workstation and server chips for most architectures), you have a memory controller on die for each chip (sometimes for each core). Each chip is connected to a separate set of memory. A simple example of this is a two-way Opteron. Each will have its own, private, memory. If you need to access memory attached to the other processor then it has to be forwarded over the HyperTransport link (a point-to-point message passing channel that AMD uses to run a cache coherency protocol). If your OS did a good job of scheduling, then all of the RAM allocated to a process will be on the RAM chips close to where the process is running.

        The reason Intel and Sun are pushing fully buffered DIMMs for their new chips is that FBDIMMs use a serial channel, rather than a parallel one, for connecting the memory to the memory controller. This means that you need fewer pins on the memory controller for connecting up a DIMM and so you can have several memory controllers on a single die without your chip turning into a porcupine. You probably wouldn't have 48 memory controllers on a 48-core chip, but you might have six, with every 8 cores sharing a level-3 cache and a memory controller.
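The cache-line arithmetic in the comment above is easy to check. Here is a minimal sketch in Python; the function name `reads_to_fill` and its parameters are made up for illustration, and the 128-byte line and 64-bit channel are just the figures quoted in the comment, not measurements of any particular chip.

```python
def reads_to_fill(cache_line_bytes: int, channel_bits: int, channels: int = 1) -> int:
    """Number of bus transfers needed to fill one cache line.

    Hypothetical helper: `channels` models matched module pairs (2) that
    are read in lockstep, as described in the comment above.
    """
    bytes_per_read = (channel_bits // 8) * channels
    # Ceiling division, in case the line is not a whole multiple of a read.
    return -(-cache_line_bytes // bytes_per_read)


single = reads_to_fill(128, 64)      # one 64-bit channel
paired = reads_to_fill(128, 64, 2)   # matched pair of modules
print(single, paired)
```

With the quoted figures this gives 16 reads on a single channel and 8 on a matched pair, which is where the "half as long" comes from.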

      • by maraist (68387) * <michael@maraistNO.SPAMgmail@n0spam@com> on Wednesday December 02, 2009 @09:21PM (#30306254) Homepage
        What is worse is that they've done away with cache coherence. So I don't think you can take a 48-thread MySQL or Java process and just scale it. You COULD use forked processes that don't share much (i.e. Postgres/Apache/PHP).
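To make the "forked processes that don't share much" idea concrete, here is a rough sketch in Python. It is POSIX-only (it uses `os.fork`), and everything in it, including the names `partial_sum`, `split_range` and `forked_sum` and the sum-of-squares workload, is invented for illustration. It says nothing about how the Intel chip is actually programmed; it only shows the share-nothing, message-passing style that does not rely on coherent shared memory.

```python
import os
import struct


def partial_sum(lo: int, hi: int) -> int:
    """Work done privately by one process: sum of squares in [lo, hi)."""
    return sum(n * n for n in range(lo, hi))


def split_range(total: int, parts: int) -> list[tuple[int, int]]:
    """Split [0, total) into `parts` contiguous, non-overlapping slices."""
    step, extra = divmod(total, parts)
    bounds, start = [], 0
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def forked_sum(total: int, workers: int) -> int:
    """Fork one child per slice; each child sends its result back over a pipe."""
    readers = []
    for lo, hi in split_range(total, workers):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute on its private copy of memory, send one message, exit.
            os.close(r)
            os.write(w, struct.pack("!Q", partial_sum(lo, hi)))
            os._exit(0)
        os.close(w)
        readers.append((pid, r))
    result = 0
    for pid, r in readers:
        # Parent: the only data exchanged is the 8-byte result message.
        result += struct.unpack("!Q", os.read(r, 8))[0]
        os.close(r)
        os.waitpid(pid, 0)
    return result


if __name__ == "__main__":
    print(forked_sum(1_000_000, 4) == partial_sum(0, 1_000_000))
```

The children never write to memory the parent reads; the only communication is an explicit message per worker, which is roughly the model Apache prefork or Postgres backends follow.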
    • by V!NCENT (1105021)

      Yes. YES! Raytracing! And emulating a D3Dn card in software (Google: pixomatic) and run the latest game with acceptable framerates.

    • by Yaztromo (655250) <.moc.cam. .ta. .omortzay.> on Wednesday December 02, 2009 @05:59PM (#30303528) Homepage Journal

      Can someone elaborate on why you'd want 48 full processors, rather than a processor with two (dual) or four (quad) "cores" (I'm presuming core in this case == FPU in the article).

      Bad assumption. In this case, we're talking about (what you would consider) a 48 core CPU. Previous designs would have apparently contained only a small number of full processing cores, and a large number of parallel units suitable only for floating point calculations (which can be great for various types of scientific calculations and simulations). This new design contains 48 discrete IA x86 cores.

      Seems like the type of processor Grand Central Dispatch [wikipedia.org] was designed for.

      Yaz.

    • webserver on a high traffic site. Either serving up lots of db connections or a lot of http connections, either way, I can imagine this having specific uses.

    • by vertinox (846076) on Wednesday December 02, 2009 @06:34PM (#30304148)

      Can someone elaborate on why you'd want 48 full processors, rather than a processor with two (dual) or four (quad) "cores" (I'm presuming core in this case == FPU in the article). Supposedly Win7's SMP support becomes much more effective at the 12-16 core threshold.

      The first thought that comes to mind is video processing and CGI animation, because those applications are embarrassingly parallel [wikipedia.org].

      And those companies usually have the money to spend on top of the line hardware.

      Eventually this will trickle down to the consumer level, as always, and in 10 years people will be able to do real-time, movie-quality CGI on their home computers.

      • Re: (Score:3, Interesting)

        by Jeremy Erwin (2054)

        Embarrassingly parallel is right. Cache coherency was sacrificed in order to up the number of cores, though I suppose a Beowulf on a chip is still useful for some things.

        • Re: (Score:3, Interesting)

          by Bengie (1121981)

          I was recently reading an article about multi core designs and they said they'll have to drop cache coherency at some point soon and redesign locking a bit. Some other architectures don't use cache coherency to help with scaling, but that's not x86.

    • NUMA vs SMP (Score:3, Interesting)

      by mario_grgic (515333)

      In my experience, 64-bit Windows 7 is noticeably faster in a NUMA configuration (the Windows Experience Index is significantly higher because of improved memory throughput), and the majority of applications also run up to 10% faster.

      I don't know if this is because Nehalem Xeon CPUs have faster access to CPU-local memory in a NUMA configuration, or if Windows is also optimized for this.

  • Yet another cloud? (Score:5, Insightful)

    by Mortiss (812218) on Wednesday December 02, 2009 @05:36PM (#30303062)

    Why is everything called cloud these days? Yet another du jour buzzword. Is this really justified here?

    • by hibiki_r (649814) on Wednesday December 02, 2009 @05:38PM (#30303114)

      When it comes to marketing cliches, when it rains, it pours.

    • Re: (Score:3, Interesting)

      by Lord Ender (156273)

      The term "cloud" is over-used, but a 48-core chip is certainly a good match for anyone who uses virtualization, and cloud-style data services are absolutely big users of virtualization.

      Cloud computing is certainly a big deal. I recently explained to my boss that instead of spending weeks going through tickets, bureaucracy, approvals, and procurement to get a server in our own datacenter, we could go to Amazon, type the credit card number, and be up-and-running with a few clicks!

      I don't know if he understood

    • Why is everything called cloud these days? Yet another du jour buzzword. Is this really justified here?

      Given that making effective use of these cores would call for engineering code to work with any number of cores, as opposed to just 2, 4, or 8, then yes it is semi-justified, especially if aimed at the server market. I do say 'semi', though, because I partially agree with you about its silliness.

    • by V!NCENT (1105021)

      http://en.wikipedia.org/wiki/File:Cloud_computing_types.svg [wikipedia.org]

      Now imagine you had this 'cloud CPU' as your server at home, running apps that you could access with Google Chrome OS... Great family server... Or remote X and play Doom3 at work from your netbook.

      Sounds interesting now? ;)

    • by hazydave (96747)

      They're Intel... they have this buzzword department, and those kiddies have to make a living, too. Remember the Intel Pentium 4 "Netburst" architecture. Nothing whatsoever to do with nets, networking, the internet, etc.... other than the fact Intel Marketroids were trying to convince all the Mundanes (Muggles, to you kiddies) that this CPU would magically make their internet go faster. Yup, that's it.. not the fact you're on a frickin' POTS modem.

    • by zullnero (833754)
      No, it's just that it's a hot keyword, and a whole lot of people can't be bothered to look up what it really means. And knowing Intel pretty well, their guys most likely know full well what it is, and they took the name as a taunt to anyone who would dare consider distributing workload instead of buying more server hardware and doing it the way that benefits Intel's bottom line.
  • Only 48? (Score:5, Funny)

    by Kingrames (858416) on Wednesday December 02, 2009 @05:38PM (#30303094)
    Only 48 cores? I'd ask them to double that, but reasonably, 64 cores should be enough for anybody.
    • by Locke2005 (849178)
      You do know why Asynchronous Transfer Mode uses 48-byte payloads, don't you? The advocates of 32-byte and of 64-byte payloads could not reach agreement, so they compromised. Perhaps the Intel designers reached a similar accommodation. (As a software engineer, I too am frequently puzzled when hardware engineers do things that are not powers of 2, e.g. the triple-channel memory that Intel's socket 1366 chips currently use, forcing you to buy DDR RAM in multiples of 3.)
  • Imagine a Beowulf Cluster of These !!

  • by joeflies (529536) on Wednesday December 02, 2009 @05:43PM (#30303216)
    because now school administrators only have to install SETI@HOME on 100 48-core computers instead of 5000 standard computers.
  • Synergy! (Score:5, Funny)

    by HRbnjR (12398) <chris@hubick.com> on Wednesday December 02, 2009 @05:49PM (#30303356) Homepage

    This new Cloud processor should create synergies with my SOA Portal system and allow me to deploy Enterprise B2B Push based Web 2.0 technologies!

  • by Joe The Dragon (967727) on Wednesday December 02, 2009 @05:53PM (#30303440)

    Is there enough cpu to chipset bandwidth to make use of all this cpu power?

    • by V!NCENT (1105021)

      If you need very little data per core but are executing sick calculations, then yes. But probably not anything realistic...

    • by Angst Badger (8636) on Wednesday December 02, 2009 @07:34PM (#30305106)

      Is there enough cpu to chipset bandwidth to make use of all this cpu power?

      That's really going to depend on the intended use. And on whether the intended use involves problems that a) can be efficiently parallelized, and more importantly, b) actually have been efficiently parallelized. But unless each core gets its own memory bus and its own dedicated memory with its own cache, I rather expect that the only things that are going to be parallelized to their maximum potential are wait states. All that said, it will still probably run faster than a two- or four-core CPU for many tasks, but it won't be running 48 times faster. I would not, however, refuse a manufacturer's sample if one was handed to me. ;)

      On the positive side, if this beast actually makes it to market, it might help spur the development of new parallel software.
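The "faster, but not 48 times faster" point above is what Amdahl's law describes: if only a fraction of a program can be parallelized, the serial remainder caps the speedup. A minimal sketch in Python follows; the function name `amdahl_speedup` and the example fractions are illustrative assumptions, and the model ignores memory bandwidth and wait states entirely, so real numbers would likely be worse.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup from Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative fractions only; not measurements of any real workload.
for p in (0.50, 0.90, 0.95, 1.00):
    print(f"{p:.0%} parallel on 48 cores: {amdahl_speedup(p, 48):.1f}x")
```

Even with 95% of the work parallelized, the ideal speedup on 48 cores is only about 14x; the full 48x needs a program with no serial part at all.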

    • All Intel has to do is reimplement Hyper-Threading in each core.

      48 cores = 96 threads, IIRC.

    • Not the same thing (Score:4, Informative)

      by Sycraft-fu (314770) on Wednesday December 02, 2009 @07:04PM (#30304678)

      Sun's processors are heavily multi-threaded per core. It is an 8 core CPU where each core can handle 8 threads in hardware. Intel's solution is 48 separate cores, doesn't say how many threads per core.

      The difference? Well lots of threads on one core leads to that core being well used. Ideally, you can have it such that all its execution units are always full, it is working to 100% capacity. However it leads to slower execution per thread, since the threads are sharing a core and competing for resources.

      Something like Sun's solution would be good for servers, if you have a lot of processes and you want to avoid the context switching penalty you get from going back and forth, but no process really uses all that much power. Web servers with lots of scripts and DB access and such would probably benefit from it quite a lot.

      However it wouldn't be so useful for a program that tosses out multiple threads to get more power. Like say you have a 3D rendering engine and it has 4 rendering threads. If all those threads got assigned to one core, well it would run little faster than a single thread running on that core. What you want is each thread on its own core to give you, ideally, a 4x speed increase over a single thread.

      So in general, with Intel's chips you don't see a lot of threads per core. 1 and 2 are all they've had so far (P4s and Core i7s are 2 threads per core, Core 2s are 1 thread per core). They also have features such as the ability for a single core to boost its clock speed if the others are not being used much, to get more performance for one thread and still stay in the thermal spec. These are generally desktop or workstation oriented features. You aren't necessarily running many different apps that need power, you are running one or maybe two apps that need power.

      As for this, well I don't know what they are targeting, or how many threads/core it supports.

  • It does sound a lot like it. Truth is that it is probably a lot more like the old Pentium D packages, but still kind of interesting.
    So how many Cortex-A8 cores could you fit on one of these?
