ARM Goes 64-Bit With Its New ARMv8 Chip Architecture
angry tapir writes "In less than a decade, a microprocessor core could be no bigger than a red blood cell, the CTO of ARM has predicted. ARM has already helped develop a prototype, implantable device for monitoring eye-pressure in glaucoma patients that measures just 1 cubic millimeter, CTO Mike Muller said at ARM's TechCon conference. At the conference the company also introduced its first 64-bit chip. The ARMv8 adds 64-bit addressing capabilities, an improvement over the current ARMv7-A architecture, which is capable of up to 40-bit addressing. The architecture puts ARM into more direct competition with Intel and its 64-bit Xeon processors."
Architecture (Score:5, Informative)
Here's a better description of the new Architecture:
ARMv8 Architecture PDF [arm.com]
Re: (Score:3)
Full double-precision support in the vector unit is a big win. Current ARM chips suck when you have to do anything with double-precision floating-point values.
No Thumb-3, so you're stuck with 32-bit instructions in 64-bit mode, which don't give as good i-cache usage. That's a shame, but I guess you can always run 32-bit Thumb-2 apps on your 64-bit kernel. There's no blx to 32-bit mode, you're stuck in 64-bit mode for an entire process (which makes sense).
Weakly ordered
Re: (Score:2)
They also seem to have doubled the number of registers to 32 (+32 SIMD registers) in the 64-bit mode, like AMD did with x86-64. I don't know if it'll provide much performance increase though since they already had a decent amount, unlike x86.
Re: (Score:3)
I suspect the improvements to the SIMD registers are going to make more of a difference. 16 integer registers is usually enough to avoid needing to spill to the stack. It really depends on how they're split between callee- and caller-save, but that's a decision for the ABI, rather than the ISA. A few more argument registers would probably help Objective-C, where you have two used for self and _cmd, so arguments are more likely to spill to the stack. A few more caller-save registers could reduce the numb
Re: (Score:3)
Re: (Score:2)
A Marvell Kirkwood core at 1.2 GHz gets about half as many BogoMIPS as an Intel Atom N280.
Re: (Score:2)
You are comparing bogomips of an ARM vs X86? Now that's the stupidest 'benchmark' you could EVER do.
http://en.wikipedia.org/wiki/BogoMips [wikipedia.org]
"It is not usable for performance comparison between different CPUs."
Re: (Score:2)
Processor : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS : 1192.75
Features : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant : 0x2
CPU part : 0x131
CPU revision : 1
Hardware : Marvell GuruPlug Reference Board
Re: (Score:3)
New ARMs do have an FPU. Making efficient use of it, though, requires an ABI change. Ubuntu still uses armel rather than armhf. armhf is not in the official Debian archive yet either, but you can already try the candidate in a chroot. Not surprisingly, floating-point benchmarks get a massive improvement.
BS (Score:4, Insightful)
> "The architecture puts ARM into more direct competition with Intel and its 64-bit Xeon processors."
Who is writing and editing this BS? It is not in any way putting ARM in competition with Xeon CPUs. It is becoming a serious contender for low end CPUs: Atom, Pentium, Athlon, and it is getting more interesting for streaming and massive threading applications (like the SPARC T).
Re: (Score:2, Funny)
In other news, Toyota is now offering the Prius in a two-door variant. This design puts the Prius into more direct competition with Ferrari and its two-door sports cars.
Re: (Score:2)
Nvidia project denver.
Re: (Score:2)
Who is writing
Well, if you look at submitter's name link you'll see "http://www.techworld.com.au/", which just happens to be where the summary links to.
and editing this BS?
Why that would be one of the crack Slashdot "editing" staff, who are more than happy to link to subby's techrag clickbait (probably collecting a fee for Geek.net).
Re: (Score:2)
I'm hoping ARM chips are performance-competitive with x86_64 chips within a decade just because AMD is having problems, and giving Intel an effective m
Re: (Score:2)
I won't argue that 40 isn't a lot of bits. But frequently bits are used for different purposes than part of addressing memory locations. Especially in the microcontroller world.
Re: (Score:2)
ARM isn't a microcontroller. A microcontroller is something with 1K RAM and 16K flash, and a set of pins useful for talking to external devices, like a serial port, digital outputs with PWM and integrated AD converters.
ARM is a low power CPU and if they're smart they'll do like x86 and require the unused bits to be all set to 1 or 0 so that they can't be repurposed.
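To make the "unused bits all set to 1 or 0" rule concrete, here is a rough sketch of the x86-64 style canonical-address check. The function name `is_canonical` and the default of 48 implemented bits are my own choices for illustration; nothing here is taken from the ARMv8 documentation.

```python
def is_canonical(addr: int, implemented_bits: int = 48) -> bool:
    """Return True if all bits above the implemented range copy the top implemented bit.

    Assumption: 64-bit pointers of which only `implemented_bits` are translated,
    as on current x86-64 parts. The helper itself is hypothetical.
    """
    if not 0 <= addr < 2**64:
        raise ValueError("addr must fit in 64 bits")
    # Keep the top implemented bit plus every unused bit above it.
    upper = addr >> (implemented_bits - 1)
    all_ones = (1 << (64 - implemented_bits + 1)) - 1
    # Canonical means those bits are either all zero or all one.
    return upper == 0 or upper == all_ones


if __name__ == "__main__":
    for a in (0x00007FFFFFFFFFFF, 0xFFFF800000000000, 0xFF00000000000000):
        print(hex(a), is_canonical(a))
```

The last address in the loop is the kind of "tag stuffed into the top byte" pointer that such a rule forbids, which is the whole point: software can't quietly repurpose those bits and then break when the hardware grows more address lines.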
Re: (Score:2)
Depends on the project. There are ARM-based microcontrollers out there with 128k flash and maybe 256k of RAM. And with onchip peripherals, havi
Re: (Score:2)
Core-to-core performance? Obviously, Xeon grinds ARM into the dirt. Watt-to-watt performance? Xeon gets thrashed.
Re: (Score:2)
Think of it like this, ARM is small and lean on power. You could pack dozens of cores onto a die giving it the power to compete with the Xeon. Blade servers can be shrunk down and more can fit into a single U of rack space since ARM does not dissipate tens of watts. We might see something along the lines of servers that are nothing more than a mini cluster in a box that appear as one whole system. A prepackaged beowulf cluster if you will.
There was an interesting video I saw a while back of a researcher who
Re: (Score:2)
Whoops, it is forced air cooled and built by Sandia National Labs: http://www.youtube.com/watch?v=UPyn9krjIRc [youtube.com]
Still it does demonstrate that you can really tightly pack ARM systems into a box to raise computing power.
Really needed? (Score:2, Informative)
Is 64-bit really needed in mobile devices? It increases the number of wires and data transfer, which means less power efficiency.
Re: (Score:2)
Mobile devices will soon need to pass the 4 GB-per-process barrier, so yes, it's needed.
Oblig. WAY (Score:2)
"I got me 64 gigabytes of RAM;
I don't feed trolls and I don't ream SPAM;"
-- Weird Al
Hmm, will have to change the refrain, it's not all about the Pentiums anymore, baby.
Re: (Score:2)
Re:Really needed? (Score:5, Insightful)
I'd expect it within 5 years, which seems to be the rough time-frame in which ARM expects the first of these CPUs to be built. This is just the architecture announcement. They need to get it out there so people can begin building tools, etc. There's barely enough time to get all that work done before this becomes a serious handicap for ARM, so that's my definition of soon.
Re: (Score:2)
Re: (Score:2)
Seriously though, I would probably use SQLite or FirebirdSQL Embedded if I were to use a database on that tight of a hardware spec. I've been somewhat eagerly watching Raspberry Pi to see where that leads.
Re: (Score:2)
4GB is all it takes to break the barrier (Score:3)
These chips need a bunch of address space to access peripherals. When you are at 2GB it starts to get a little tight, depending on how big the windows are for your I/O space (64M per peripheral is not an uncommon size, even if it is just for the registers for a serial or I2C port). Once you get 4GB then you really are stuck and have to use extended addressing and play highmem games in the kernel.
Re: (Score:2)
Re: (Score:2)
yes, that has been my experience :(
When you do address decoding off a bus you can break a large range into equal-sized power-of-two blocks pretty easily, and have a very gate-conservative circuit to turn those bits in the middle of your address into enables for the peripherals on the internal bus. I would rather they slice it up into smaller pages and just give some of the more complicated parts multiple pages (tie the enable lines together with an OR). But I'm just a software guy, I might be over simplifyi
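A software-guy model of that decoder might look like the sketch below. It assumes a 32-bit bus and the 64M windows mentioned upthread; `decode`, `enable_lines` and `peripheral_selected` are made-up names, and real hardware does this with a handful of gates rather than a list.

```python
ADDRESS_BITS = 32   # assumed bus width
WINDOW_BITS = 26    # 2**26 bytes = 64 MB per window, the size quoted upthread
NUM_WINDOWS = 1 << (ADDRESS_BITS - WINDOW_BITS)  # 64 equal windows


def decode(addr: int) -> tuple[int, int]:
    """Split an address into (window index, offset inside that window)."""
    if not 0 <= addr < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit on the bus")
    window = addr >> WINDOW_BITS                # high bits pick the window
    offset = addr & ((1 << WINDOW_BITS) - 1)    # low bits go to the peripheral
    return window, offset


def enable_lines(addr: int) -> list[bool]:
    """One-hot enable signals: exactly one window is selected per address."""
    window, _ = decode(addr)
    return [i == window for i in range(NUM_WINDOWS)]


def peripheral_selected(addr: int, windows: set[int]) -> bool:
    """A bigger peripheral owns several windows: OR its enable lines together."""
    lines = enable_lines(addr)
    return any(lines[i] for i in windows)


if __name__ == "__main__":
    print(decode(0x04000010))                        # window 1, offset 0x10
    print(peripheral_selected(0x08000000, {2, 3}))   # lands in window 2
```

Shrinking `WINDOW_BITS` gives more, smaller pages, and the OR in `peripheral_selected` is how a complicated part would claim several of them.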
Re: (Score:2)
Re: (Score:2)
Doesn't really solve this particular problem. IOMMU helps solve the issue of needing to map virtual memory to linear device memory, for things like framebuffers for scanout and textures. But you still have a 3/1 split and device drivers want to access a lot of address space while at the same time the userspace wants to access an increasingly larger amount of RAM.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
The registers are 32 bit, though, which means paged addressing. No one wants to write apps in that environment.
Re: (Score:3)
But with 32 bit registers, that's paged. No one wants to write paged applications.
Re: (Score:2)
But are the address registers limited to 32 bits? It's sizeof(void*) that you need, not sizeof(int). Also I'm not sure what you mean by 'paged' applications? Paging is an OS/MMU function - nothing to do really with the address space of the CPU. If you mean the addressable space from a specific process you might be onto something, but again a lot of that is an OS limitation, not a CPU one. You can access 36bits of memory on 32-bit x86 from a single process if you do the right magic for example.
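The sizeof(void*) versus sizeof(int) distinction is easy to check on whatever machine you happen to be on. This is only a sketch using Python's standard ctypes module to ask the local C ABI for its type widths; `type_widths` and `survives_int_cast` are names I made up, and the results depend on your platform.

```python
import ctypes


def type_widths() -> dict[str, int]:
    """Width in bits of a few C types under the local ABI."""
    return {
        "void*": ctypes.sizeof(ctypes.c_void_p) * 8,
        "int": ctypes.sizeof(ctypes.c_int) * 8,
        "long": ctypes.sizeof(ctypes.c_long) * 8,
        "size_t": ctypes.sizeof(ctypes.c_size_t) * 8,
    }


def survives_int_cast(address: int) -> bool:
    """True if an address is unchanged after being squeezed through a C unsigned int.

    ctypes integer types do no overflow checking, so the value is silently
    truncated, much like a careless pointer-to-int cast in C.
    """
    return ctypes.c_uint(address).value == address


if __name__ == "__main__":
    print(type_widths())
    print(survives_int_cast(0x1000))
    print(survives_int_cast(1 << type_widths()["int"]))
```

On a typical 64-bit desktop the pointer comes out wider than int, which is exactly the case where code that stores pointers in ints falls over.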
Re: (Score:2)
Re: (Score:2)
Yeah, it's that magic I'm talking about. That's absolutely horrific to program with.
And if you can't mix your void * with your ints, that's also horrible to program with.
Re: (Score:3)
Is 64-bit really needed in mobile devices? It increases the number of wires and data transfer, which means less power efficiency.
Hey no one will ever need more than 2^64 bytes of RAM!!!
Re: (Score:2)
I'm going to guess that some day, mobile devices will have 16 gigs of ram. Battery tech will have advanced enough to let such a device run for 8 hours. So yes 64 bit will be needed because
Re: (Score:3)
Mobile devices are going to be the most common platform for games soon, including 3d games, and there you can definitely use more than 4GB for a process.
Re: (Score:2)
Re: (Score:2)
Probably Christmas 2013. 2014 at the latest. By then no 'gamer' system sold in the previous 2 years will have had less than 8 GB of RAM.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I have a few games that break 3GB regularly, but they are tweaked for low memory usage, so they have smaller textures/etc, but they have lots of objects. If they ever went with high res textures and high poly count models, 64bit may be needed.
Re: (Score:2)
I don't think so. Games of any moderate graphical complexity burn too much energy, and battery technology is advancing too slowly. For phones to take share in the gaming field away from consoles or PCs, they would need better ergonomics (connectable controllers), better power (probably plugging into an outlet while playing), and output to the TV.
In
Re: (Score:2)
I was including Nintendo/Sony handhelds. They surely aren't going to eliminate the other market, just be more common. I'm just predicting that handheld gaming will be 51% or more of gaming. It may already be true, but it surely will be true soon if not.
Re: (Score:2)
Re: (Score:2)
Mobile? Well, my current laptop is using 64-bit processes and none of them has even 1GB of address space mapped, so it will be a little while. That said, ARM won't release any core designs with this ISA until at least next year, and they probably won't make it into shipping products for another year.
Mobile isn't the only place ARM is aiming though. Low-power servers are a growing market and the 40-bit LPAE in the A15 is likely to look a little bit cramped in the next few years. Servers often want to
Re: (Score:2)
Actual implementations (Score:3, Interesting)
It is worth pointing out that current x86-64 implementations are limited to addressing "only" 48 bits [wikipedia.org] so it's not like ARM was way behind the curve with their 40-bit address space (that's 1 TB).
Re: (Score:2)
Re: (Score:2)
2^40 = 1099511627776 bytes
1099511627776 / 1024 = 1073741824 KB
1073741824 KB / 1024 = 1048576 MB
1048576 MB / 1024 = 1024 GB
1024 GB / 1024 = 1 TB
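The same division can be run for the other address widths that come up in this thread. A quick sketch follows; `human_size` is a throwaway helper of my own, and it uses the same 1024-based KB/MB/GB/TB convention as the sums above.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]  # each step is a factor of 1024


def human_size(num_bytes: int) -> str:
    """Divide by 1024 until the value drops below 1024 or the units run out.

    Uses integer division, which is exact for the powers of two used here.
    """
    value = num_bytes
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value} {unit}"
        value //= 1024
    raise AssertionError("unreachable")


if __name__ == "__main__":
    # 32-bit, PAE-style 36-bit, LPAE 40-bit, x86-64 48-bit, full 64-bit
    for bits in (32, 36, 40, 48, 64):
        print(f"2^{bits} bytes = {human_size(2**bits)}")
```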
Re: (Score:2)
Author seems confused... (Score:2)
- ARM press release [arm.com]
Re: (Score:2)
It also supports 64 bit addressing. So by whichever definition you prefer, it's a 64-bit processor. Unless of course you demand full 64 bit address space as your bar for true 64-bitness, in which case no one sells such a processor yet.
Re: (Score:2)
When I said no one offered such a processor, I think I pretty obviously meant the physical address limitations.
Not just Intel (Score:5, Informative)
The architecture puts ARM into more direct competition with Intel and its 64-bit Xeon processors.
Gee, what about AMD and the AMD64 architecture that they developed? You know, the one that Intel eventually had to adopt (license?) when their 64-bit Itanium didn't quite live up to their expectations of being the next architecture that everyone moved to?
Oh, and ARM Holdings don't make chips. They design architectures and implementations that others license and put into actual chips. The summary wasn't so clear on that, and it's a point that lots of people often overlook.
Re: (Score:3)
Re: (Score:2)
Benchmarks of their chips often seem to put them at rough parity with Intel, when you look at price vs performance.
Re: (Score:2)
Re: (Score:2)
Direct Competition? (Score:4, Insightful)
Maybe I've just got a certain prejudice, but I don't see any direct comparison, let alone competition, between ARM processors and Xeon processors, no matter how wide their addressing is. ARM processors run some really sophisticated stuff ... in my smartphone. A Xeon processor allows my CAD workstation to handle 3D models with thousands of components, or run an ANSYS simulation that solves the equivalent of 10 million simultaneous equations.
Re: (Score:2)
Not for workstations, I agree (I have one as well at work). But once you start piling up CPUs in racks by the hundreds the raw power matters less than the power per watt. Power and heat dissipation become the limiting factor in how much processing power you can cram into a given facility, or the total size, construction and installation cost for a bespoke facility. And while the Xeon is fast, it and its support circuitry are rather power-hungry.
Re: (Score:2)
Re: (Score:2)
"So they compete with the Atom mega-processing racks Intel's been pushing, but not with Xeon."
Well, no. That's the purview of the current ARMv7 architecture (or rather, Atom is designed to compete with ARMv7, not the other way around). This seems aimed at a much higher point on the power/performance envelope.
Re: (Score:2)
Arm still has quite a long way to go un
Re: (Score:3)
The same was said about x86 when comparing it to the highend alpha/mips/sparc/ppc of the time.
Never underestimate competition coming from below...
Re: (Score:2)
Nvidia project denver:
http://pressroom.nvidia.com/easyir/customrel.do?easyirid=A0D622CE9F579F09&version=live&releasejsp=release_157&xhtml=true&prid=705184 [nvidia.com]
Re: (Score:2)
Competition to the Xeon line isn't going to happen in the workstation; it's going to happen in the data center. As things stand now, the major limiting factor in terms of how much performance you can squeeze into a data center is the ability to power and cool your cores, rather than the number of processors and associated infrastructure that you can physically squeeze into a given space. This has more or less been the limitation since the introduction of the 1U form factor, and has been getting even worse with
Re: (Score:2)
I usually compare ARM processors to PC CPUs from the late 90s and early oughts, but I think your attempt to conflate them with the MC68x00 is even funnier.
Re: (Score:2)
AFAIK, they're not. It is rather a spiritual (as in inspired by) successor to the 6502 CPU.
Re: (Score:3)
Your statement that ARM is based on the Motorola 68000 is incorrect. The ISAs of the two architectures are completely different. ARM has fixed 32-bit instructions, for instance, while the 68000's instructions are built from 16-bit words. ARM processes the entire 32-bit word, while the 68000 processes 8-, 16- or 32-bit operands, etc.
Were you confused by the Dragonball series of microcontrollers, that was used in the PalmPilot? Early versions had a 68000 core and later versions had an ARM core.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Not quite. The ALU was 16 bits, so instructions dealing with 32 bit operands were generally slower.
Re: (Score:2)
I wonder which one is lower power, the ARM or the Xeon.
Of course, you'd have to measure the power for the entire server board, not just the CPU, and you'd have to measure the power based on the same workload.
Re:Direct Competition? (Score:5, Informative)
ARM is *NOT* based on the 68000 design, it was an original CPU design by Acorn computers of Cambridge, England (ARM originally stood for Acorn Risc Machine) for their desktop computers in the late 1980s and during the 90s. ARM bears absolutely no resemblance to 68000.
Sophie Wilson and Steve Furber, the designers of the ARM, were inspired by the simple architecture of the 6502, but the ARM neither resembles the 6502 nor is based on it.
Re: (Score:2)
*sigh* I hate being wrong, but I love being corrected. It's the only way to learn.
Re: (Score:2)
Actually you are wrong: the ARM has nothing to do with the Motorola 68K. It was a development of its own, specifically designed for the Acorn RISC computers. After Acorn went down, ARM went independent and tried to survive in the niches Intel left over. Those niches have now become mainstream.
Re: (Score:2)
Do you realise that you are talking out of your arse, and that ARM was developed as a totally independent architecture from scratch - the Acorn Risc Machine? And that the old 68k is in no way RISC?
Re: (Score:2)
...the sneaky way MIPS eliminates the pipeline bubble during branching.
Hah! I knew someone must do this. None of this speculative execution or go both ways garbage, just keep executing instructions until the branch completes. If you're looping and don't have enough instructions, throw a couple of nops in there. All that headache goes away. Do it with arithmetic too: R1 = X, R1 = Y, R2 = R1, R3 = R1 gives R2 = X and R3 = Y. It becomes a pain to program in assembly, but compilers don't care and the CPU becomes almost trivial. If you have something embarrassingly parallel you ca
Re: (Score:2)
Re: (Score:3)
ARM multicore problems (Score:3)
Re: (Score:2)
ARM has an even more fundamental problem than that in its current implementation. There's effectively no equivalent of 'IBM compatible' right now. If you look at the different devices which are using Tegra 2 chips (not just the same family, but the same actual chips), they're all using different GPIO pins. The end result is that we're using customized kernels for each device, which is obviously impractical. There's also no standardized way to load a bootloader - everyone's just using closed source bootload
Re: (Score:2)
An interview with AMD and Intel a long while back talked about how they want to move away from implicit cache coherency because of the scaling with core issues that you're talking about.
One of the ideas at the time was for the programmer/OS to have a way to signal threads for when a memory location changes, this way it would be more like a multi-cast instead of a broadcast.
Also, cores would be connected more like a network where each core/node is connected only to its immediate neighbors, so keeping relate
Standardized boot process (Score:3)
Re: (Score:2)
Boot process is not a feature of the CPU architecture. Boot process is a feature of the motherboard that the CPU is on or the SOC that the CPU lives inside of.
I have an x86 machine that will not boot anything PC-like (a rather old Garmin handheld with an embedded 80386). The lack of a BIOS is more of a reflection that ARM is typically in embedded systems, not that you can't make a standard BIOS for one.
Re: (Score:2)
I don't think that the OP was suggesting that they can't make one. He was saying that they don't have one, yet.
Re: (Score:2)
Re: (Score:2)
Why bother going through 128 and 256 bit?
Just make the leap up to 1024 bit! It's inevitable eventually...
Law of diminishing returns (Score:2)
Doubling the size of the registers requires a LOT of work internally to a CPU and is not done lightly - that's why 32-bit held on for so long in the consumer world. Also there are 2 (main) types of bit measurement - address bus size and data bus size. An increase to 128 or more for the data bus size may be useful for some applications and that has already been done in some areas - e.g. graphics cards - but increasing the address bus size to 128 bits will bring no conceivable benefits as we're still a long way o
Re: (Score:2)
Note that register size and address size need not be the same. There were many processors out there with 16-bit memory addresses but only 8-bit registers and data bus. Likewise most 16-bit systems had some mechanism for accessing more than 2^16 memory addresses. Even 32-bit systems often had mechanisms for accessing more than 2^32 memory address though they were little used.
OTOH 64-bit CPUs often don't bother with support for full 64-bit addresses (though they are often designed so they can be allowed in fu
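For a concrete case of registers narrower than addresses, the 8086's real-mode segment:offset scheme is probably the best-known example. Picking it is my choice, since the comment above doesn't name a specific mechanism. A rough sketch, with `real_mode_address` as a hypothetical helper name:

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Combine two 16-bit register values into a 20-bit 8086 physical address.

    physical = segment * 16 + offset, wrapped to 20 bits as on the original
    8086, which has only 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_address(0x1234, 0x0010)))
    # A different segment:offset pair can name the same byte.
    print(hex(real_mode_address(0x1235, 0x0000)))
```

Two 16-bit values reach 1 MB of memory, at the cost of aliasing and the kind of paged programming model people upthread are complaining about.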
Re: (Score:2)
Re: (Score:2)
And as an added bonus the heat given off will pretty much ensure you don't have kids.
Re: (Score:2)
Anyway, ARMv8 has 64-bit registers, a 64-bit logical address space, a 48-bit physical address space, and 32-bit wide instructions.