Next-Gen Intel Chip Brings Big Gains For Floating-Point Apps 176

Posted by timothy on Monday March 18, 2013 @04:50PM from the code-slower dept.

An anonymous reader writes "Tom's Hardware has published a lengthy article and a set of benchmarks on the new 'Haswell' CPUs from Intel. It's just a performance preview, but it isn't just more of the same. While it's got the expected 10-15% speedup at the same clock speed for integer applications, floating point applications are almost twice as fast, which might be important for digital imaging applications and scientific computing." The serious performance increase has a few caveats: you have to use either AVX2 or FMA3, and then only in code that takes advantage of vectorization. Floating point operations using AVX or plain old SSE3 see more modest increases in performance (in line with integer performance increases).

Re:Would that improve hashing speeds in, say, Bitc (Score:5, Informative)

by slashmydots ( 2189826 ) writes: on Monday March 18, 2013 @05:05PM (#43207459)

Slightly, but haven't you been keeping up on the latest hardware? My pair of Sapphire 5830 graphics cards would top out at about 435 MH/s at a total system wattage of around 520W. The new Jalapeno chips from Butterfly Labs will do 4500 MH/s using 2 watts total system power. For comparison, my i5-2400 performed 14 MH/s at 95W or so. So the Jalapeno is about 321x faster and draws about 47.5x less power, so combined, I believe that's roughly 15,268x more hashes per watt.
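For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the ratios from the figures quoted in the comment. The function and variable names are made up for illustration, and the wattages are the poster's own rough numbers (note that the GPU and Jalapeno figures are whole-system power while the CPU figure looks like a TDP), so treat the result as ballpark only.

```python
# Figures as quoted in the comment (MH/s and watts). All names here are
# illustrative; they do not come from any mining tool.
def mhash_per_watt(mhash_per_s, watts):
    """Return throughput per watt, in MH/s per W."""
    return mhash_per_s / watts

cpu = mhash_per_watt(14, 95)      # i5-2400
gpu = mhash_per_watt(435, 520)    # pair of 5830s, total system wattage
asic = mhash_per_watt(4500, 2)    # Jalapeno, as claimed by the poster

speed_ratio = 4500 / 14           # raw hashing speed, ASIC vs CPU
power_ratio = 95 / 2              # power draw, CPU vs ASIC
efficiency_ratio = asic / cpu     # same as speed_ratio * power_ratio

print(f"speed ratio:      {speed_ratio:.1f}x")
print(f"power ratio:      {power_ratio:.1f}x")
print(f"hashes per watt:  {efficiency_ratio:.0f}x (vs CPU)")
print(f"hashes per watt:  {asic / gpu:.0f}x (vs GPU pair)")
```

The combined figure is just the product of the two ratios, so multiplying "faster" by "less power" gives hashes per watt rather than a separate kind of efficiency.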

Re:Let's see... (Score:5, Informative)

by 0100010001010011 ( 652467 ) writes: on Monday March 18, 2013 @05:06PM (#43207461)

It's a joke. The Intel P5 Pentium FPU had a bug where 4195835/3145727 = 1.333739068902037589. The correct answer is 1.333820449136241002.
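A quick way to see how far off the flawed result was: the sketch below recomputes the quotient in software with Python's decimal module at 19 significant digits and compares it with the flawed value quoted above. The variable names are illustrative, and the flawed value is simply copied from the comment rather than reproduced on real hardware.

```python
from decimal import Decimal, getcontext

# 19 significant digits, matching the precision of the values quoted above.
getcontext().prec = 19

correct = Decimal(4195835) / Decimal(3145727)

# Flawed quotient as quoted in the comment (not computed here).
flawed = Decimal("1.333739068902037589")

error = correct - flawed
print("correct quotient:", correct)
print("absolute error:  ", error)
print("relative error:  ", error / correct)
```

The two values already disagree in the fourth decimal place, which is why the bug was noticeable in ordinary calculations and not just in the last few bits.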

Less rounding of floating point numbers (Score:5, Informative)

by raymorris ( 2726007 ) writes: on Monday March 18, 2013 @05:16PM (#43207539) Journal

While it's got the expected 10-15% speedup at the same clock speed for integer applications, floating point applications are almost twice as fast HTH
Integer and floating point are separately implemented in the hardware, so an improvement to one often doesn't apply to the other. You can add integers by counting on your fingers. To do that with floating point, you have to cut your fingers into fractions of fingers - a very different process.
See: http://en.wikipedia.org/wiki/FMA3 [wikipedia.org]
It's common to have an accumulator like this:
X = X + (Y * Z)
To compute that in floating point, the processor normally does:
A = ROUND(Y*Z)
X = ROUND(X+A)
Each ROUND() is necessary because the processor only has 64 bits in which to store the endless digits after the decimal point. FMA can fuse the multiply and the add, getting rid of one rounding step, and the intermediate variable:
X = ROUND( X + (Y*Z) )
That makes it faster. Since integers don't get rounded to the available precision, the optimization doesn't apply to integers. The above processor would do Y*Z, then +X, then round, then X=. A CPU designer can make that faster by including an "add and multiply" circuit, an "add and round" circuit, or a "round and assign" circuit. Any set of operations can be done in two clock cycles, if the maker decides to include a hardware circuit for it.
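To make the one-rounding versus two-roundings difference concrete, here is a small sketch in Python. It does not call a hardware FMA instruction; the fused version is emulated by doing the multiply and add exactly with fractions.Fraction and rounding to a double once at the end. The function names and the test values are assumptions picked for illustration, chosen so that the intermediate rounding throws away the entire answer.

```python
from fractions import Fraction

def unfused_multiply_add(x, y, z):
    """X = ROUND(X + ROUND(Y*Z)): two roundings, as with separate mul and add."""
    a = y * z        # product is rounded to a double here
    return x + a     # sum is rounded again

def fused_multiply_add(x, y, z):
    """X = ROUND(X + Y*Z): emulated by computing exactly, then rounding once."""
    exact = Fraction(x) + Fraction(y) * Fraction(z)  # no rounding yet
    return float(exact)                              # single rounding

# y*z is exactly 1 - 2**-54, which does not fit in a double and rounds to 1.0.
y = 1.0 + 2.0**-27
z = 1.0 - 2.0**-27
x = -1.0

unfused = unfused_multiply_add(x, y, z)
fused = fused_multiply_add(x, y, z)

print("unfused:", unfused)
print("fused:  ", fused)
```

With these inputs the unfused version loses the small term when the product is rounded, while the fused version keeps it. So FMA changes results slightly as well as saving an instruction, which is worth knowing when comparing output between FMA and non-FMA builds.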

FMA4 (Score:4, Informative)

by ssam ( 2723487 ) writes: on Monday March 18, 2013 @06:31PM (#43208271)

Pah. AMD has had FMA4 since 2011.

Re:Hope it's going in the new Mac Pro (Score:5, Informative)

by washu_k ( 1628007 ) writes: on Monday March 18, 2013 @07:29PM (#43208879)

The Core i7s are consumer-grade processors and are slower than the Xeons the Mac Pros use
This is completely incorrect. The current Mac Pros use Nehalem-based Xeons, which are two generations behind the current Ivy Bridge i7s. Xeons may differ in core count, cache and/or ECC support, but their execution units are the same as their desktop equivalents. The base Mac Pro CPU is equivalent to an i7-960 with ECC support. The current Ivy Bridge i7s are a fair bit faster.

Re:Hope it's going in the new Mac Pro (Score:5, Informative)

by KonoWatakushi ( 910213 ) writes: on Monday March 18, 2013 @07:59PM (#43209131)

ECC memory is only marginally slower. Considering error rates and modern memory sizes, it is far past time that it became a standard feature. The extra cost would be totally insignificant if it were standard, and not used as an excuse to gouge people on Xeons.
