ARM's New Processors Are Designed To Power the Machine-Learning Machines (theverge.com) 27
An anonymous reader shares an article: Official today, the ARM Cortex-A75 is the new flagship-tier mobile processor design, with a claimed 22 percent improvement in performance over the incumbent A73. It's joined by the new Cortex-A55, which has the highest power efficiency of any mid-range CPU ARM's ever designed, and the Mali-G72 graphics processor, which also comes with a 25 percent improvement in efficiency relative to its predecessor G71. The efficiency improvements are evolutionary and predictable, but the revolutionary aspects of this new lineup relate to artificial intelligence: this is the first set of processing components designed specifically to tackle the challenges of onboard AI and machine learning. Plus, last year's updates to improve performance in the power-hungry tasks of augmented and virtual reality are being extended and elaborated. [...] ARM won't just be powering machine learning with its new chips, it'll benefit from ML too. The new designs benefit from an improved branch predictor that uses neural network algorithms to improve data prefetching and overall performance.
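For readers wondering what a branch predictor built on "neural network algorithms" might look like, here is a minimal sketch of a perceptron-style predictor, a technique known from the computer architecture literature. It is a toy and not ARM's design, which the summary does not describe. The class name, table size, history length, and training threshold are all hypothetical choices for illustration.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor (illustrative only, not ARM's design)."""

    def __init__(self, history_len=8, table_size=64, threshold=16):
        # All three parameters are hypothetical defaults chosen for the demo.
        self.history_len = history_len
        self.table_size = table_size
        self.threshold = threshold
        # One weight vector per table entry; index 0 is the bias weight.
        self.weights = [[0] * (history_len + 1) for _ in range(table_size)]
        # Global branch history, oldest first: +1 = taken, -1 = not taken.
        self.history = [-1] * history_len

    def _output(self, pc):
        # Dot product of the selected weights with the history, plus bias.
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        # A non-negative output means "predict taken".
        return self._output(pc) >= 0

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction, or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w = self.weights[pc % self.table_size]
            w[0] += t
            for i, hi in enumerate(self.history, start=1):
                w[i] += t * hi
        # Shift the actual outcome into the global history.
        self.history = self.history[1:] + [t]


if __name__ == "__main__":
    predictor = PerceptronPredictor()
    pc = 0x40  # hypothetical branch address
    # Warm up on a strictly alternating taken / not-taken branch.
    for i in range(200):
        predictor.update(pc, i % 2 == 0)
    hits = 0
    for i in range(200, 250):
        taken = i % 2 == 0
        hits += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    print(f"correct predictions after warm-up: {hits}/50")
```

The weights learn which bits of recent history correlate with the branch outcome, so a simple alternating pattern is picked up after a short warm-up.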
Re: (Score:2)
Purchase Cyrix processors only.
I loved my Cyrix 6x86 CPU back in the late 1990s. It ran Linux flawlessly for my file server.
https://en.wikipedia.org/wiki/Cyrix_6x86 [wikipedia.org]
Re: (Score:3)
Those where great processors for the money that you paid for them. I believe it used Pentium pro instructions instead of the pure Pentium. Ran Linux perfectly too. I was using one in my main workstation.
Instruction sets (Score:2)
Those where great processors for the money that you paid for them. I believe it used Pentium pro instructions instead of the pure Pentium.
According to Wikipedia [wikipedia.org]:
- the Cyrix MII was Pentium Pro / Pentium II compatible, as you mention.
- before that: the Cyrix 6x86MX was Pentium MMX compatible.
- even before: the previous Cyrix 6x86 & 6x86L were more or less, but not entirely, Pentium compatible. (They officially identified themselves as "486"; I remember that bit.)
Also:
- their FPU was less optimized, because most of the typical software workload was integer back then. (Also rings a bell)
For a linux server they would very likely have be
Re: ARM Sucks (Score:3)
"Were" you thick git
Re: (Score:2)
Need Memory Improvements Too (Score:5, Interesting)
IMHO what is mostly needed is faster memory. Modern ML often involves working with multi-gigabyte domain models stored in DRAM, where access latency hasn't changed much in the last 10 years.
GDDR (Score:2)
But then everything moved onto GPUs.
Modern GDDR retains the capability to clear buffers by itself [wikipedia.org].
But indeed, the bitmasking capabilities of older WRAM and SGRAM have been made redundant by the much more general-purpose capabilities offered by the GPU, coupled with the much more complex modern interface (i.e., it's OpenGL running your Linux Compiz, Apple Quartz, and whatever the Windows equivalent was).
latency thermal wall (Score:5, Informative)
You should write advertising copy.
Faster has many dimensions, yet you fixate on just one. It turns out, however, that slapping you down was a royal PITA: all of the vendors involved in HBM{1,2,3} pony up sweet-shit-all concerning latency (wanted: an edible, colour-coded haymark).
Finally I found this comment by one Tuna-Fish from 2010:
I'm not the only frustrated person.
* AMD's upcoming Fiji GPU will feature new memory interface [extremetech.com] — Joel Hruska, 30 April 2015
The gist of the fragments I managed to find is that HBM latency is roughly on par with the concurrent GDDR generation, and this is, in most controllers, actually worse than the concurrent DDR generation, hence the industry-wide tight-lip syndrome.
Only that's not the whole story, because HBM has more channels than GDDR and allows more pages to be open concurrently. For a sufficiently parallel workload, HBM latency as a function of bandwidth can be excellent compared to the alternatives.
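The point about parallel workloads can be put in numbers with Little's law: sustained throughput equals the number of requests in flight divided by the latency of each. The sketch below is a back-of-the-envelope model only. The function names are made up for illustration, and the request size, latency, and bandwidth figures are hypothetical, not measured HBM, GDDR, or DDR values.

```python
import math


def sustained_bandwidth_gbs(outstanding_requests, bytes_per_request, latency_ns):
    """Little's law: throughput = concurrency / latency.

    One byte per nanosecond is 1e9 bytes per second, so the result is in GB/s.
    """
    return outstanding_requests * bytes_per_request / latency_ns


def requests_needed(target_gbs, bytes_per_request, latency_ns):
    """Number of requests that must be in flight to sustain target_gbs."""
    return math.ceil(target_gbs * latency_ns / bytes_per_request)


if __name__ == "__main__":
    # Hypothetical figures: 64-byte requests, 100 ns access latency.
    print(sustained_bandwidth_gbs(10, 64, 100))  # few requests in flight
    print(requests_needed(256, 64, 100))         # concurrency for 256 GB/s
```

With the latency held fixed, the only way to raise bandwidth is to keep more requests in flight, which is what extra channels and more concurrently open pages make possible.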
And certainly the thermal density is yards superior. Which is itself interesting, because you hardly ever see plots pitting latency against J/bit-ns. Awesome! A brand shiny new thermal wall. Physical distance, aka latency, actually functions as an implicit thermal spreader, and this goes away when the engineers get too pie-eyed over rail-gun-drone–accelerated rolling drive-thru nirvana (recommended: a Kevlar fish net on a titanium pole, and a Quick eye).
A Study of Application Performance with Non-Volatile Main Memory [ucsd.edu] — Yiying Zhang (2015)
The fastest of the prospective non-volatile technologies (which are thermally desirable due to lack of refresh) is NRAM.
Fast NRAM to be released 2019-epsilon by Nantero/Fujitsu [nextbigfuture.com] — August 2016
It actually has the endurance to be used as an on-chip SRAM replacement with eDRAM access times, but I don't know whether joint fabrication with CMOS is viable (in particular, at the high end). Note that ultimate durability is as yet unknown, because their 10^14-cycle test bench is taking a while to return 0/1.
[*] I wou
Re: (Score:2)
Um... yea.. Thanks for that.
Are you high?
No they aren't (Score:5, Informative)
For those really interested, AnandTech has the actual computer engineering of the whole thing: http://www.anandtech.com/show/... [anandtech.com]
Machine-Learning Machines? (Score:1)
If the Machines are Learning Machines, who is Learning the Machine-Learning Machine Machines? ;)
Bunch of marketing cock (Score:3)
Nothing in there points in any way to machine learning. There's just a fancy branch predictor in there, the design of which may have been informed by something related to neural networks, but that's true of all CPUs of the current generation (kind of like integrated memory controllers were 10 years ago).
But that's just as well, given how AI is a marketing scam anyway.
no (Score:1)