Intel Upgrades Hardware

A Co-processor No More, Intel's Xeon Phi Will Be Its Own CPU As Well

Posted by timothy
from the career-advancement dept.
An anonymous reader writes "The Xeon Phi co-processor requires a Xeon CPU to operate... for now. The next generation of Xeon Phi, codenamed Knights Landing and due in 2015, will be its own CPU and accelerator. This will free up a lot of space in the server but more important, it eliminates the buses between CPU memory and co-processor memory, which will translate to much faster performance even before we get to chip improvements. ITworld has a look."


  • Moore's law is not coming back from the grave, or is it ?
  • I thought that we already had GPUs embedded in CPUs. How does embedding a CPU inside a GPU make it so different, or such a breakthrough?

    • by Overzeetop (214511) on Tuesday November 26, 2013 @10:00AM (#45525585) Journal

      Patents already cover most implementations of GPUs within CPUs. But the field is wide open if you start embedding CPUs in GPUs. It's like "on the internet," but with uprocessors.

    • by Sockatume (732728)

      There's no other component. (Pedantry: calling it a GPU is a misnomer as nobody really uses them for real-time graphics. You won't be playing Crysis 3 with one of these. It just happens that this kind of hardware came out of graphics silicon design.)

    • A GPU is important enough at least at a basic level, but when you speak about servers... do we need them? I mean, I know Microsoft's next servers will have the optional GUI implemented, and they will "try" to get away from the GUI and will recommend working without it. I question the need for the GPU.
      • Re: (Score:2, Interesting)

        by Anonymous Coward

        It's not just for drawing graphics. It can be used as a general computation platform.

        As an example, ImageMagick supports OpenCL nowadays. So if you have a webpage where images can be uploaded and you do some processing on them (cropping, scaling for thumbnails, etc.), you can get absolutely amazing performance on a server with a GPU.

      • by Anonymous Coward

        It's hard to believe how shortsighted some of you people are... The GPU in the server has nothing to do with graphics (despite it being in the name); it is used for general highly parallel computations. While traditional server software doesn't currently support it, it is theoretically possible to accelerate traditional server applications such as databases using the GPU, and this has actually been demonstrated with PoC software.

        Captcha: leverage

  • by Sockatume (732728) on Tuesday November 26, 2013 @10:02AM (#45525595)

    Knights Landing will be available as both an accelerator card and a standalone CPU with some sort of large high-speed memory pool on the die.

    • by Anonymous Coward

      For a Phi, the selling point is ease of programming. The memory model of the accelerator card is a pain in the ass, making development more difficult. This is on top of the fact that the administration of those cards is pretty limited and annoying. MPSS is crap for everyone, and one of the critical differences here is that the standalone accelerator might not require Intel to be the Linux distribution curator anymore (they frankly suck pretty hard at it).

      Intel having a standalone variant pretty much obviat

      • by Sockatume (732728)

        So you think the accelerator card version is just a stopgap for customers looking to upgrade (rather than replace) their systems, and it'll go away in time?

  • Fully Baked? (Score:5, Informative)

    by DragonDru (984185) on Tuesday November 26, 2013 @10:18AM (#45525689)
    Good. The current generation Phi cards are a pain to administer. With luck the new generation will be more fully baked.
    - very hot card, no fans
    - depends on software to down throttle the cards (mine have hit 104C)
    - stripped down OS running on the cards, poor user facing directions for the usage

    Anyway, enough from me.
    • Re:Fully Baked? (Score:5, Informative)

      by Junta (36770) on Tuesday November 26, 2013 @10:25AM (#45525749)

      I won't disagree about the awkwardness of MPSS, but the 'very hot card, no fans' is because it's meant only to be installed into systems that cooperate with them and have cooling designs where the hosting system takes care of it. For a lot of systems that Phi go into, a fan is actually a liability because those systems already have cooling solutions and a fan actually fights with the designed airflow.

      Of course, that's why nVidia offers up two Tesla variants of every model, one with and one without fan, to cater to both worlds.

      • by Anonymous Coward

        Xeon Phi also have variants with and without fans:

        http://newsroom.intel.com/servlet/JiveServlet/showImage/38-5572-2661/Xeon_Phi_Family.jpg

    • by kry73n (2742191)

      maybe they will finally also remove those texturing units from the Phi

  • by Anonymous Coward on Tuesday November 26, 2013 @10:24AM (#45525743)

    The 80486 was the first Intel processor with integrated coprocessor, coming at about €1000 (only know the DM price). There was a considerably cheaper version, the 80486SX "without" coprocessor (actually, the coprocessor was usually just disabled, possibly because of yield problems, and still took current).

    One could buy an 80487 coprocessor that provided the missing floating point performance. Customers puzzled how the processor/coprocessor combination could be competitive without the on-chip communication of the 80486. The answer was that it did not even try. The "coprocessor" contained a CPU as well and simply switched off the "main" processor completely. It was basically a full 80486 with different pinout, pricing, and marketing.

    It was probably phased out once the yields became good enough.

    • by kheldan (1460303)

      It was basically a full 80486 with different pinout, pricing, and marketing.

      Intel also made an 80386/80387 "RapidCAD" chipset, that I managed to get a hold of at one point, and discovered that the 80387 was just a dud (which, according to Wikipedia, was there just to supply the FERR signal, to keep everything compatible with a real '387); the coprocessor was on-die with the '386 core, just like a '486.

  • These processors are like an Intel version of Sun Niagara, but with wider vector units. Actually, from an architectural perspective Xeon Phi (Larrabee) is pretty basic. It's an array of 4-way SMT in-order dual-issue x86 processors, with 512-bit vector units. I think one of the major reasons Xeon Phi doesn’t compete well with GPUs on performance is the legacy x86 ISA translation engine taking up so much die area. Anyhow, so if you have a highly parallel algorithm, then Xeon Phi will be a boon f

    • These processors are like an Intel version of Sun Niagara, but with wider vector.

      I thought the Niagara was a crazy-wide barrel processor of sorts: it switches to a new thread every cycle, with a grand total of 8 threads (per core). The idea being that if you've filled up all 8 threads, then each instruction can wait 8 cycles for a bit of memory entirely for free, because it takes 8 cycles before that thread's turn comes around again.

      The idea (not entirely realised sadly) was that for highly parallel workloads you get much higher aggregate thro
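The latency-hiding argument above can be sketched as a toy simulation. This is only an illustrative model, not a description of the real Niagara pipeline: it assumes a strict round-robin schedule (one thread slot per cycle, wasted if that thread is still waiting) and that every instruction is a load with a fixed latency. The function name `barrel_utilization` and its parameters are made up for this sketch.

```python
def barrel_utilization(num_threads: int, memory_latency: int,
                       cycles: int = 10_000) -> float:
    """Fraction of cycles in which the core issues an instruction.

    Toy assumptions (not taken from any real chip):
    - the core visits threads in strict round-robin order, one slot per cycle;
    - every instruction is a load whose result arrives `memory_latency`
      cycles after issue, and a thread cannot issue again until then;
    - a slot whose thread is still waiting is simply wasted.
    """
    ready_at = [0] * num_threads  # cycle at which each thread may issue again
    issued = 0
    for cycle in range(cycles):
        thread = cycle % num_threads
        if ready_at[thread] <= cycle:
            issued += 1
            ready_at[thread] = cycle + memory_latency
    return issued / cycles


if __name__ == "__main__":
    for threads in (1, 4, 8):
        print(threads, "threads:", barrel_utilization(threads, memory_latency=8))
```

Under these assumptions, 8 threads with an 8-cycle latency keep the core busy every cycle, while 4 threads leave half the slots empty and a single thread leaves seven in eight empty.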

      • by DrYak (748999)

        I don't know about Niagara, but according to the docs about warps and half-warps, that's how Nvidia GPUs run CUDA.

        They keep cycling through 2 or 4 threads to hide memory latency.
        (Except that each thread itself runs on a wide SIMD unit instead of a normal CPU. So the final size of parallel execution [=threads] is the number of warps in parallel x the size of the SIMD.)
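The arithmetic in the parenthetical can be written out as a short sketch. The helper name `threads_in_flight` is hypothetical, and the figure of 4 warps is just the upper number mentioned in the comment; the SIMD width of 32 is the warp size CUDA uses on Nvidia GPUs.

```python
def threads_in_flight(warps_in_parallel: int, simd_width: int) -> int:
    """Total CUDA-style threads executing together:
    number of interleaved warps times lanes per warp."""
    if warps_in_parallel < 0 or simd_width < 0:
        raise ValueError("counts must be non-negative")
    return warps_in_parallel * simd_width


if __name__ == "__main__":
    # Illustrative numbers only: 4 interleaved warps, 32 lanes each.
    print(threads_in_flight(warps_in_parallel=4, simd_width=32))
```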

  • I could see using this, whereas I couldn't see myself using the card version. If the cost premium is reasonable this could be awesome for image processing. I have an image algorithm I use CUDA for and moving the data around consumes almost as much time as processing the data. If I had this in my servers I would have flexibility and much greater performance with this solution. --Robert
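A rough way to quantify the point above about data movement: if transfer and compute run back to back with no overlap, removing the transfer gives a speedup of (transfer + compute) / compute. This is a simplified sketch; the function name is hypothetical, and it assumes the compute time itself stays unchanged on the new hardware, which may not hold in practice.

```python
def speedup_without_transfer(transfer_s: float, compute_s: float) -> float:
    """Speedup from eliminating host<->device copies.

    Assumes transfer and compute are strictly serialized (no overlap or
    pipelining) and that compute time is unaffected by the change.
    """
    if transfer_s < 0 or compute_s <= 0:
        raise ValueError("transfer_s must be >= 0 and compute_s must be > 0")
    return (transfer_s + compute_s) / compute_s


if __name__ == "__main__":
    # Illustrative timings only: copies take about as long as the kernel.
    print(speedup_without_transfer(transfer_s=1.0, compute_s=1.0))
```

When moving the data takes almost as long as processing it, as described above, this simple model puts the upper bound on the gain at close to 2x.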
  • by etash (1907284) on Tuesday November 26, 2013 @11:59AM (#45526673)
    Wouldn't a Xeon Phi version embedded in the CPU lack the GDDR4/5 memory that the PCI Express card version has, with its 200-300GB/s of throughput, and be forced to access the main computer RAM at just about 40-50GB/s?
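To put the bandwidth figures in the question above side by side, here is a small sketch for a kernel that is limited purely by memory bandwidth. The function name and the 8 GB working set are illustrative assumptions, and 250 GB/s and 45 GB/s are simply the midpoints of the ranges quoted in the comment, not measured values.

```python
def stream_time_s(bytes_moved: float, bandwidth_gb_per_s: float) -> float:
    """Time to stream `bytes_moved` bytes at the given bandwidth (GB = 1e9 bytes).

    Models a purely bandwidth-bound kernel: no caching, no compute cost.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return bytes_moved / (bandwidth_gb_per_s * 1e9)


if __name__ == "__main__":
    working_set = 8e9  # hypothetical 8 GB working set
    on_card = stream_time_s(working_set, 250.0)  # midpoint of 200-300 GB/s
    main_ram = stream_time_s(working_set, 45.0)  # midpoint of 40-50 GB/s
    print(f"card memory: {on_card:.3f} s, main RAM: {main_ram:.3f} s, "
          f"ratio: {main_ram / on_card:.1f}x")
```

With those midpoints, a bandwidth-bound pass over the data would take roughly 5.6 times longer from main RAM, which is why the on-die memory pool mentioned earlier in the thread matters for this question.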
