NVIDIA Jetson TX1 Performance Shines For GPU Computing (phoronix.com) 22
An anonymous reader writes: Following last week's announcement of the Jetson TX1 development board, NVIDIA is now allowing independent reports of performance for their $599 USD 64-bit ARM development board. Linux results published by Phoronix show very strong performance for the Jetson TX1 when looking at the Cortex-A57 speed relative to the Tegra K1 and older Tegra SoCs along with other ARM hardware like Calxeda and Raspberry Pi. The Jetson TX1 was generally multiple times faster than ARM hardware a few years old. The graphics performance was twice as fast as the year-old Jetson TK1 thanks to the Maxwell GPU. Compared to x86 hardware, performance in CPU-bound tasks is comparable to an AMD Sempron/Phenom, but when utilizing GPGPU computing it is faster than Intel Skylake and Xeon processors. The Jetson TX1 had a peak power consumption of 16 Watts and an average power use of under 10 Watts.
Tegra X1 used to be the fastest ARM/GPU SoC (Score:1, Troll)
Tegra X1 used to be the fastest ARM/GPU SoC - but now the A9X in the new iPad Pro leaves it in the dust.
Meanwhile, the Cortex-A57 is probably one of ARM's worst cores to date. It really needs to be implemented on FinFETs to avoid overheating and throttling, and it performs poorly compared to custom ARM cores.
Re: (Score:1)
Tegra X1 used to be the fastest ARM/GPU SoC - but now the A9X in the new iPad Pro leaves it in the dust.
Meanwhile, the Cortex-A57 is probably one of ARM's worst cores to date. It really needs to be implemented on FinFETs to avoid overheating and throttling, and it performs poorly compared to custom ARM cores.
Yup, the A9X is faster than a chip several years old. Such breathtaking engineering prowess!
Re: (Score:1, Troll)
What are you talking about? Tegra X1 came out a few months ago.
Heterogeneous Memory FTW (Score:1, Interesting)
Re: (Score:2)
It's too bad that the TX1 dev board is 3 times the price of the TK1 dev board (Jetson TK1 was $192), even though the chip itself is not much more expensive, or is around the same price. The evidence is that TX1 consumer products are reasonably priced: the TX1-based Shield ATV is $199 including a controller, while the TK1-based Shield Tablet was $299, and $100+ for the battery and display makes a lot of sense.
I was really hoping for a successor to the Jetson TK1 that used the X1, but this isn't really a successor - despi
Re: (Score:2)
One of the main problems with this on the desktop is bandwidth. A true heterogeneous memory architecture shared between CPU and GPU, but with less bandwidth than current on-device memory (around 250GB/s), would be a waste: far more applications would be hurt by the lower bandwidth than would benefit from the shared memory.
Re: (Score:1)
Meanwhile, this has already been done for a few years with AMD APUs...
Re: (Score:2)
And even AMD were well over a decade late to the party, compared to the Silicon Graphics O2, which used UMA.
Peculiar omission (Score:5, Interesting)
I'm rather surprised there was no AMD APU in the GPGPU comparisons. That is, after all, rather the whole point of the APUs. And due to the super low latency of the AMD ones, they tended to do rather well compared to other chips. The HSA stuff seems to go rather beyond the GPGPU stuff in terms of its range of applications.
Re: (Score:2)
Whoa...
You wouldn't happen to have a link for that by chance please?
Re: (Score:2)
Re: (Score:2)
Sweet. Thanks!
Architectural Similarities (Score:2)
Re:Architectural Similarities (Score:5, Informative)
The USP of AMD's APUs used to be having the GPU and the CPU on the same die.
No, it's much more than that. It's not just on the same die, it's on the same side of the MMU as the CPU and the same side of the cache. This means you can pass data back and forth between the two units with a latency measured in nanoseconds, because you can simply hand over a pointer in the same memory space. I believe HSA also specifies things like atomics which are consistent across the CPU and GPU, as well as synchronisation primitives.
In other words HSA is much more than just bolting a CPU and a GPU onto the same bus on a die.
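To put rough numbers on the pointer-handoff argument above, here is a minimal sketch of a toy latency model. It is not taken from the HSA specification or from any benchmark: the function names and all three constants (bus latency, bus bandwidth, handoff latency) are hypothetical placeholders chosen only to show the shape of the comparison.

```python
# Toy latency model. Every constant below is an ASSUMED placeholder,
# not a measurement of any real CPU, GPU, or bus.

ASSUMED_BUS_LATENCY_S = 10e-6       # hypothetical fixed cost of starting a copy
ASSUMED_BUS_BANDWIDTH = 12e9        # hypothetical bus throughput in bytes/second
ASSUMED_HANDOFF_LATENCY_S = 100e-9  # hypothetical cost of passing a pointer


def copy_time_s(payload_bytes: int, latency_s: float, bandwidth: float) -> float:
    """Hand data over by copying it across a bus: fixed latency plus transfer time."""
    return latency_s + payload_bytes / bandwidth


def pointer_handoff_time_s(latency_s: float) -> float:
    """Hand data over by passing a pointer in a shared address space.

    The cost does not depend on the payload size, because nothing is copied.
    """
    return latency_s


def compare(payload_bytes: int) -> tuple[float, float, float]:
    """Return (copy time, handoff time, copy/handoff ratio) for one payload size."""
    copy = copy_time_s(payload_bytes, ASSUMED_BUS_LATENCY_S, ASSUMED_BUS_BANDWIDTH)
    handoff = pointer_handoff_time_s(ASSUMED_HANDOFF_LATENCY_S)
    return copy, handoff, copy / handoff


if __name__ == "__main__":
    for size in (64, 4096, 1 << 20):
        copy, handoff, ratio = compare(size)
        print(f"{size:>8} bytes: copy {copy:.2e} s, handoff {handoff:.2e} s, ratio {ratio:.0f}x")
```

Under these assumed figures the copy path never beats the handoff, even for tiny payloads, because the fixed latency dominates. Plug in your own measured values before drawing any conclusion about real hardware.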
Re: (Score:2)
Some HSA features have been available since 2012, but coherent memory has only been available since 2014 with Kaveri
Yeah, Kaveri is where it got interesting, and where the LibreOffice benchmark with AMD trouncing everything else happened. Coherent memory turned it from a massive PITA into something much easier to program than normal GPGPU stuff. Lessons I learned from the supercomputer folks: latency is a killer.
None of the chips with HSA features use less than 12W.
That's comparable to the one in TFA: aver
Error on Page 7 ? (Score:2)
Looks like all the graphs on Page 7 have an incorrect TK1 label instead of the correct TX1?
http://www.phoronix.com/scan.p... [phoronix.com]
With nVidia getting serious about low-power devices, the next few years are going to be very interesting as AMD, ARM, Broadcom, and Intel all duke it out.
I can't wait till OpenCL is supported as well.