Intel Hardware

Intel Details Upcoming Gulftown Six-Core Processor

Posted by samzenpus
from the give-me-the-numbers dept.
MojoKid writes "With the International Solid-State Circuits Conference less than a week away, Intel has released additional details on its upcoming hexa-core desktop CPU, next-gen mobile, and dual-core Westmere processors. Much of the dual-core data was revealed last month when Intel unveiled its Clarkdale architecture. However, when Intel set its internal goals for what it's calling Westmere 6C, the company aimed to boost both core and cache count by 50 percent without increasing the processor's thermal envelope. Westmere 6C (codename Gulftown) is a native six-core chip. Intel has crammed 1.17 billion transistors into a die that's approximately 240mm sq. The new chip carries 12MB of L3 cache (up from Nehalem's 8MB) and a TDP of 130W at 3.33GHz. In addition, Intel has built in support for AES encryption and decryption instructions as well as a number of improvements to Gulftown's power consumption, especially in idle sleep states."

  • by pointbeing (701902) on Thursday February 04, 2010 @09:21AM (#31021272)

    Can most programmes really be written to take advantage of so many cores?

    Yup.

    Got a Core i7-920 running at 3.2GHz at home - OS is 64-bit Kubuntu 9.10.

    Yesterday I had five two-hour videos I wanted to render to DVD5 format - four were .avi and one was .mp4.

    Launched five instances of DeVeDe to render the video and create the DVD file structure and did all five at the same time - then left for work. Took an hour and twelve minutes and the machine didn't melt, explode or let any of the magic smoke out of the box.

    Even if an application isn't multithreaded the OS is - so even running a single task a multicore processor will give you a performance boost.

    A Core i7 has four cores that'll run two threads each - presents as eight processor cores to the OS. I have no problem using them all ;-)
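
    The "launch five instances and walk away" approach can be sketched in a few lines. This is only an illustration: the child processes below are hypothetical stand-ins that burn a little CPU, not DeVeDe, and `run_jobs_in_parallel` is a made-up helper name. The OS scheduler spreads the independent processes over whatever logical CPUs it has.

```python
import os
import subprocess
import sys


def run_jobs_in_parallel(commands):
    """Start every command as its own OS process, then wait for all of them."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


# Hypothetical stand-in for a render job: a child Python process doing busy work.
BUSY_WORK = "sum(i * i for i in range(200_000))"
jobs = [[sys.executable, "-c", BUSY_WORK] for _ in range(5)]

# cpu_count() reports logical CPUs (hardware threads), so a four-core part
# with two threads per core would show up as 8 here.
logical_cpus = os.cpu_count()
exit_codes = run_jobs_in_parallel(jobs)
print(f"{logical_cpus} logical CPUs, exit codes: {exit_codes}")
```

    None of the jobs needs to be multithreaded itself; the parallelism comes from running several of them at once.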

  • Re:240mm square? (Score:3, Informative)

    by IBBoard (1128019) on Thursday February 04, 2010 @09:29AM (#31021344) Homepage

    Isn't it 240mm sq = 240mm x 240mm (as in (240mm) squared) and 240 sq mm is 240 x 1mm x 1mm (as in 240 x (square mms))? It's always an awkward one to represent and be clear on.
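
    The two readings differ by a large factor, which a quick calculation makes concrete. The helper names below are made up for illustration.

```python
import math


def squared_length_mm2(side_mm):
    """Reading 1: (240 mm) squared, i.e. a square with 240 mm sides."""
    return side_mm ** 2


def square_side_mm(area_mm2):
    """Reading 2: 240 square millimetres, expressed as the side of a square."""
    return math.sqrt(area_mm2)


print(squared_length_mm2(240))        # area in mm^2 under reading 1
print(round(square_side_mm(240), 2))  # side in mm under reading 2
```

    Reading 1 gives 57,600 square millimetres, a die 24 cm on a side. Reading 2 gives a square of roughly 15.5 mm per side, which is the only plausible size for a CPU die.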

  • by TheRaven64 (641858) on Thursday February 04, 2010 @09:33AM (#31021366) Journal

    Porting libdispatch requires a generic event delivery framework, where the userspace process can wait for a variety of different types of event (signals, I/O, timers). On Darwin, Apple used the kqueue() mechanism that was ported from FreeBSD, so it's quite easy to port the code to FreeBSD (just #ifdef the bits that deal with Mach messages appearing on the queue). Kqueue is also ported to NetBSD and OpenBSD, so porting it to these systems will be easy too.

    Solaris and Windows both have completion ports, which provide the same functionality but with different interfaces. Porting to Solaris would require replacing the kqueue stuff with completion port stuff. Porting to Windows would ideally also require replacing the pthread stuff with win32 thread calls. Even Symbian has a nice event delivery framework that could be used, although I'm not sure what the pthread implementation is like in the Symbian POSIX layer.

    Linux is the odd system out. All different types of kernel events are delivered to userspace via different mechanisms, so it's really hairy trying to block waiting until the next kernel event. This also makes it harder to write low-power Linux apps, because your app can't spend so long sleeping and so the kernel can't spend so much time with the CPU in standby mode.

    If you don't need the event input stuff (which, to be honest, you do; it's really nice), you can use toydispatch, which is a reimplementation that I wrote of the core workqueue model using just portable pthread stuff.

    It also adds some pthread extensions for determining the optimal number of threads per workqueue (or workqueues per thread, depending on the number of cores and the load), but these are not required. The FreeBSD 8.0 port doesn't have them; they were added with FreeBSD 8.1.
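
    To make "wait for several event sources in one blocking call" concrete, here is a minimal sketch using Python's standard `selectors` module, which picks kqueue on the BSDs and Mac OS X and epoll on Linux. It is not libdispatch code and covers only I/O readiness plus a timeout, not signals or Mach messages. The two socket pairs are hypothetical event sources.

```python
import selectors
import socket

# DefaultSelector chooses the best mechanism the platform offers
# (kqueue, epoll, poll, or select).
sel = selectors.DefaultSelector()

# Two hypothetical event sources, modelled as connected socket pairs.
a_recv, a_send = socket.socketpair()
b_recv, b_send = socket.socketpair()
sel.register(a_recv, selectors.EVENT_READ, data="source A")
sel.register(b_recv, selectors.EVENT_READ, data="source B")

b_send.send(b"ping")  # make exactly one source ready

# One blocking call covers every registered source; the timeout doubles
# as a simple timer.
ready = [key.data for key, _events in sel.select(timeout=1.0)]
print(ready)

sel.close()
for sock in (a_recv, a_send, b_recv, b_send):
    sock.close()
```

    The process sleeps inside `select()` until something happens, which is the property that lets the kernel keep the CPU idle in the meantime.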

  • by Anonymous Coward on Thursday February 04, 2010 @09:43AM (#31021446)

    With most browsers becoming multithreaded or multi process, even casual users will potentially gain from this.

    Not to mention your machine may have a truckload of background processes going on.

    Games etc. will definitely benefit from any extra cores you throw at them, since a fine-grained threading library like OpenMP will grab all available threads when sharing out work. Also, the Xbox 360 has 3 cores, so a lot of games will use at least 3 threads as a minimum (since most games are multiplatform now).

  • by physburn (1095481) on Thursday February 04, 2010 @09:56AM (#31021596) Homepage Journal

    Most programs are very much not written to take advantage of multiple cores. Even advanced 3D games, which might find the extra compute power useful, often can't deal with extra cores. E.g. I had to set the affinity of Borderlands to 1 CPU only to stop it crashing. Multithreaded programming is slowly getting easier as libraries that help with it become available. Java is particularly easy for this; have a look at java.util.concurrent, which I've just started using on the server side. But most programs are miles behind in the move to being able to work with multiprocessors. Right now 6 cores will have very little to offer the desktop. On the server side, however, I'm sure the extra cores will be useful, but only if the server is particularly loaded with transactions, which rarely happens.

    ---

    Multithreaded Programming [feeddistiller.com] Feed @ Feed Distiller [feeddistiller.com]

  • by kjart (941720) on Thursday February 04, 2010 @10:18AM (#31021820)

    Having a second core was handy for people who like to play world of warcraft in one window and surf web pages in the other (considering how much CPU modern web pages eat for some reason. yay flash?).

    Having two more cores beyond that is fairly useless for the vast majority of even power users except for very specific apps that even they are running a very small percentage of the overall time they are using their computers.

    Not that I particularly disagree with your conclusions overall, but WoW can actually be set to run on multiple cores and does get a performance benefit from doing so.

  • Re:DRM Support (Score:5, Informative)

    by sakdoctor (1087155) on Thursday February 04, 2010 @10:20AM (#31021852) Homepage

    What?

    AES acceleration will be useful for VPNs, serving SSL websites, VoIP, full disk encryption ... and so on.

  • by sznupi (719324) on Thursday February 04, 2010 @10:30AM (#31021978) Homepage

    Ah, so you don't realize that the cheapest Celeron nowadays is a dualcore 2.5 GHz, essentially a Core 2 Duo with 1 MiB of L2 (irrelevant, encoder will fit and the video is a stream of data) and 800 MHz FSB (irrelevant, mostly limited by the speed of computation, not by sustained transfer of the video stream). It would be done probably in around half of the time you were at work.

    If you were at home with a Celeron you could also do day-to-day stuff (yes, it would take even longer - but is that really that important in the case of a rare batch job which in every scenario is too long to be a "smooth" workflow?)

  • by Big Smirk (692056) on Thursday February 04, 2010 @10:51AM (#31022246)

    Real-time games are a bad example because in general the trouble with threads is you have to sync them up. The entire program becomes give feedback, gather input, calculate stuff, give feedback. You generally need to make sure the 'calculate stuff' part starts and stops with some predictability.

    Some games seem to run their AI in separate threads. This seems to be a reasonable compromise. So when the game does 'gather input' it asks the AI subsection where it wants to go at that instant.

    However, judging by the stability of games like Fallout3, it's unclear whether the programmers know how to deal with threads or whether the underlying OS is ready for intense real-time updates.

  • Re:on-board AES? (Score:5, Informative)

    by 0123456 (636235) on Thursday February 04, 2010 @11:05AM (#31022394)

    Why put AES on-board?

    They're not: they're putting extra instructions on-board which help implement AES more efficiently. They may also allow you to implement other algorithms more efficiently, though I haven't looked at them in enough detail to be sure.

    I thought AES was relatively fast as encryption algorithms go.

    That still doesn't make it fast at an absolute level. Particularly when you're doing full-disk encryption with user account encryption on top and IPSEC on all your network connections.

  • by petermgreen (876956) <plugwash@@@p10link...net> on Thursday February 04, 2010 @11:08AM (#31022428) Homepage

    Around here, the programmers never met a thread they didn't like. Add a requirement like - "display dialog box to confirm shutdown" and suddenly the thread count in the application jumps by 4...

    Lemme guess: these programs are also buggy, crash-prone pieces of shit?

    Having more than one thread doing UI stuff has always struck me as more trouble than it's worth (you need loads of extra locks and a lot of thinking about what does and doesn't constitute a consistent state). Indeed, some common GUI libraries (Swing, for example) aren't built to support multiple threads accessing their components for just this reason.
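
    The usual alternative is to let exactly one thread own the UI state and have every other thread post requests to it. Below is a minimal sketch of that pattern, assuming a plain dictionary as a stand-in for real widgets. The names `ui_events`, `widget_state`, `worker`, and `ui_loop` are made up for illustration.

```python
import queue
import threading

ui_events = queue.Queue()      # thread-safe mailbox for the UI thread
widget_state = {"label": ""}   # touched only by the thread running ui_loop


def worker(n):
    # Workers never modify widget_state; they post a request instead.
    ui_events.put(("set_label", f"job {n} done"))


def ui_loop(expected):
    """Drain `expected` requests and apply them one at a time."""
    handled = []
    for _ in range(expected):
        action, value = ui_events.get()
        if action == "set_label":
            widget_state["label"] = value
        handled.append(value)
    return handled


workers = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
for t in workers:
    t.start()
handled = ui_loop(expected=3)
for t in workers:
    t.join()
print(handled, widget_state)
```

    Because only one thread ever writes `widget_state`, no locks are needed around it; the queue is the single synchronisation point.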

  • by TheThiefMaster (992038) on Thursday February 04, 2010 @11:45AM (#31022902)

    1/2-word?
    I'm pretty sure that there are instructions for atomic compare and swap of pointer-sized values, at least.

  • by master_p (608214) on Thursday February 04, 2010 @11:46AM (#31022920)

    Isn't the xchg instruction atomic for all sizes (8/16/32/64 bits)?

  • by Tim C (15259) on Thursday February 04, 2010 @12:09PM (#31023224)

    Most of the things that you do on a computer will run happily on a 1GHz CPU and still not bring usage over 50% more than occasionally

    Speak for yourself.

  • Re:Who cares... (Score:2, Informative)

    by Microlith (54737) on Thursday February 04, 2010 @12:52PM (#31023774)

    Sure, but neither the Oracle nor IBM chips will be available for less than several grand, and never in consumer-level equipment (I can't exactly order one off Newegg). And there's no telling how long it will be until the AMD chip trickles down from Opteron class to Phenom class, while the Core i9 will probably appear in stores in short order.

    I suspect that AMD will drop the 6-core version as an X6 pretty soon, but it will likely be outperformed (possibly significantly) by the Gulftown.

  • Re:Transistor count (Score:2, Informative)

    by wtfbill (1408123) on Thursday February 04, 2010 @01:13PM (#31024010)

    No, I bet his caffeine content is fine. The 68K transistors would refer to the 68000 procs from Motorola which were 16- or 32-bit depending on configuration. Some of them could be switched at boot time by holding one of the pins high or low (I forget which...where are those old data sheets I have on those?) Of course the 65xx series and the 6800 series were 8-bit, however, they didn't have close to 68K transistors. But GP is right on, 68K transistors for a 32-bit architecture.

  • by TheRaven64 (641858) on Thursday February 04, 2010 @03:01PM (#31025392) Journal

    Subversion repository [gna.org]. Note that it's designed specifically to do stuff in the background for libobjc2. It only implements a tiny subset of the libdispatch functionality, and not as efficiently (one thread per workqueue, for example). It's not intended to replace libdispatch, just to let me use some of the libdispatch APIs in code that has to be portable. The 'toy' in the name is not self-deprecation, it's an accurate assessment.

    Oh, and you get better results if you search for 'toydispatch' not 'linux toydispatch' (it's nothing to do with Linux, although it should run there).
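
    For readers unfamiliar with the workqueue model, the "one thread per workqueue" design can be sketched as follows. This is not toydispatch or libdispatch code. `SerialWorkQueue` is a hypothetical class, and its `dispatch_async` method only borrows the name of the libdispatch call to show the shape of the idea.

```python
import queue
import threading


class SerialWorkQueue:
    """Hypothetical work queue: one background thread runs tasks in FIFO order."""

    _STOP = object()  # sentinel that tells the worker thread to exit

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is self._STOP:
                break
            task()

    def dispatch_async(self, fn):
        """Queue fn and return immediately; it runs later on the worker thread."""
        self._tasks.put(fn)

    def shutdown(self):
        """Run everything already queued, then stop the worker thread."""
        self._tasks.put(self._STOP)
        self._thread.join()


results = []
wq = SerialWorkQueue()
for i in range(5):
    wq.dispatch_async(lambda i=i: results.append(i * i))
wq.shutdown()
print(results)  # [0, 1, 4, 9, 16]
```

    Tasks on one queue never run concurrently with each other, so they can share state without locks. A fuller implementation would multiplex many queues over a pool of threads rather than dedicating a thread to each.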

  • Re:on-board AES? (Score:4, Informative)

    by wirelessbuzzers (552513) on Thursday February 04, 2010 @04:18PM (#31026270)

    Why put AES on-board?

    They're not: they're putting extra instructions on-board which help implement AES more efficiently. They may also allow you to implement other algorithms more efficiently, though I haven't looked at them in enough detail to be sure.

    The instructions perform a single round of AES (which has 10-14 rounds depending on key size), either encrypting or decrypting. Certain other algorithms such as Lex, Camellia, Fugue and Grostl use AES S-boxes in their core, and can probably benefit from these instructions. However, they will not achieve nearly as much of a speedup as AES.

    The AES instructions themselves will approximately double the speed of sequential AES computations. This is very unimpressive; VIA's AES instructions are much faster. They will also make it resistant to cache-timing attacks without losing speed, which is unimpressive because you can already do this on Penryn and Nehalem. The low speed results from the AES instructions having latency 6; if you can use a parallel mode (GCM, OCB, PMAC, or CBC-decrypt, for example) then the performance should be 10-12x the fastest current libraries. Hopefully, this will cause people to stop using CBC mode, but perhaps I'm too optimistic.
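
    The gap between sequential and parallel modes follows from a simple pipeline argument: in CBC encryption each block depends on the previous ciphertext block, so every round instruction has to wait out the full latency, whereas independent blocks can be interleaved. The toy model below uses the latency of 6 quoted above. The issue interval of one instruction per cycle is purely an assumption, not a measured figure, and the model ignores everything except the round instructions.

```python
# Rounds per key size in bits (10, 12, or 14, as noted above).
AES_ROUNDS = {128: 10, 192: 12, 256: 14}


def cycles_per_block(key_bits, interleaved_blocks, latency=6, issue_interval=1):
    """Toy model: cycles of round instructions per 16-byte block.

    latency=6 is the figure quoted in the comment; issue_interval=1
    (one independent round instruction issued per cycle) is an assumption.
    """
    rounds = AES_ROUNDS[key_bits]
    # One round across the whole batch takes whichever is longer: waiting
    # for the result latency, or issuing one instruction per block.
    cycles_per_round = max(latency, interleaved_blocks * issue_interval)
    return rounds * cycles_per_round / interleaved_blocks


sequential = cycles_per_block(128, interleaved_blocks=1)  # CBC-encrypt style
parallel = cycles_per_block(128, interleaved_blocks=6)    # six independent blocks
print(sequential, parallel, sequential / parallel)  # 60.0 10.0 6.0
```

    Under that assumption interleaving is worth about 6x, which combined with the roughly 2x sequential gain lands in the same range as the 10-12x estimate. A slower issue rate would shrink the gain proportionally.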

    Intel also added an instruction called PCLMULQDQ which does polynomial multiplication over F_2. If it's fast (I can't find timing numbers, but hopefully it's something like latency 2 and throughput 1) then it will be very useful for cryptography in general, speeding up certain operations by an order of magnitude or more. This is more exciting to me than the AES stuff, because it might enable faster, simpler elliptic-curve crypto and similarly simpler message authentication codes. Unfortunately, these operations are still slow on other processors, so cryptographers will be hesitant to use them until similar instructions become standard. If the guy you're communicating with has to do 10x the work so that you can do half the work... well, I guess it's still a win if you're the server.
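
    Polynomial multiplication over F_2 is ordinary long multiplication with XOR in place of addition, so no carries propagate. A bit-at-a-time software sketch shows what the instruction computes. The hardware version works on two 64-bit operands and returns a 128-bit product, while `clmul` below is a made-up helper that accepts Python integers of any size.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply: treat a and b as polynomials over F_2."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # "Add" a shifted copy of a; addition over F_2 is XOR.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


print(bin(clmul(0b11, 0b11)))    # (x + 1)^2 = x^2 + 1            -> 0b101
print(bin(clmul(0b101, 0b110)))  # (x^2 + 1)(x^2 + x)             -> 0b11110
```

    The loop needs one step per bit of `b`, which is why doing this in software is slow compared with a single dedicated instruction.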

    I thought AES was relatively fast as encryption algorithms go.

    That still doesn't make it fast at an absolute level. Particularly when you're doing full-disk encryption with user account encryption on top and IPSEC on all your network connections.

    AES is fast for a block cipher, but modern stream ciphers such as Salsa20/12, Rabbit, HC and SOSEMANUK are about 3-4x faster. (In other words, they are still faster than AES in a sequential mode on Westmere.) AES is still competitive, though, if you can use OCB mode to encrypt and integrity-protect the data at the same time.

    The fastest previous Intel processor with cutting-edge libraries in the most favorable mode could probably encrypt or decrypt 500MB/s/core at 3-3.5GHz. This is fast enough for most purposes, but in real life with ordinary libraries you'd probably get a third of that. So this will significantly improve disk and network encryption if they use a favorable cipher mode.
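
    Those throughput figures are easier to compare across clock speeds when converted to cycles per byte. A quick back-of-the-envelope conversion, assuming decimal megabytes (the helper name is made up):

```python
MB = 1_000_000  # assumption: decimal megabytes


def cycles_per_byte(clock_hz, bytes_per_second):
    """Clock cycles one core spends on each byte at a given throughput."""
    return clock_hz / bytes_per_second


best_case = 500 * MB     # favorable mode, cutting-edge library (figure from above)
typical = best_case / 3  # "a third of that" with ordinary libraries

for ghz in (3.0, 3.5):
    clock = ghz * 1e9
    print(ghz, cycles_per_byte(clock, best_case), cycles_per_byte(clock, typical))
```

    So the best case works out to about 6-7 cycles per byte, and the ordinary-library case to about 18-21.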

    Cred: I am a cryptographer, and I wrote what is currently the fastest sequential AES library for Penryn and Nehalem processors. But the calculations above are back-of-the-envelope, so don't depend on them.

