ARM Unveils One-chip SMP Multiprocessor Core

An anonymous reader writes "ARM Ltd. will unveil a unique multi-processor core technology, capable of up to 4-way cache coherent symmetric multi-processing (SMP) running Linux, this week at the Embedded Processor Forum in San Jose, Calif. The "synthesizable multiprocessor" core -- a first for ARM -- is the result of a partnership with NEC Electronics announced last October, and is based on ARM's ARMv6 architecture. ARM says its new "MPCore" multiprocessor core can be configured to contain between one and four processors delivering up to 2600 Dhrystone MIPS of aggregate performance, based on clock rates between 335 and 550 MHz."
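A quick back-of-envelope check of those figures (my own arithmetic, not a number from ARM): 2600 aggregate DMIPS spread over four cores at 550 MHz works out to roughly 1.2 Dhrystone MIPS per MHz per core.

```c
/* Back-of-envelope check of the quoted MPCore figures (my arithmetic).
 * Assumes the 2600 aggregate DMIPS applies to the 4-core, 550 MHz
 * configuration; the per-core ratio below is inferred, not ARM's number. */
#include <stdio.h>

int main(void)
{
    const double aggregate_dmips = 2600.0;
    const int    cores           = 4;
    const double clock_mhz       = 550.0;

    double per_core = aggregate_dmips / cores;   /* ~650 DMIPS per core */
    double per_mhz  = per_core / clock_mhz;      /* ~1.18 DMIPS/MHz     */

    printf("%.0f DMIPS per core, %.2f DMIPS/MHz\n", per_core, per_mhz);
    return 0;
}
```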
  • ARM servers (Score:5, Interesting)

    by MrIrwin ( 761231 ) on Monday May 17, 2004 @08:39AM (#9172278) Journal
    I had thought of ARM processors being the future for client devices and embedded systems.

    Looks like here we are pointing at server technology.

    How long before we have a 64/32/16-bit variable word size Thumb-like architecture?

    • ARM servers (Score:5, Interesting)

      by simpl3x ( 238301 ) on Monday May 17, 2004 @08:59AM (#9172395)
      Cobalt servers were originally based on ARM processors, and were for the most part really nifty. Most palmtop and cell devices also use the processors, so my question is, why don't we see more reasonable personal computers (or blade servers) based upon this architecture? People don't use the processing capacity available to them, and tuning of storage and networking often gives a better return per dollar. Something along the profile of the Psion Netbook or old (or new depending upon your perspective) Apple Newton (also ARM) would be very cool and useful. Give it some cellular/WiFi tech...
      • Re:ARM servers (Score:3, Insightful)

        by MathFox ( 686808 )

        why don't we see more reasonable personal computers (or blade servers) based upon this architecture?

        I was an Acorn Archimedes user for more than 10 years (the workstation that the ARM was originally designed for) and they were great systems. Affordable, decent speed and good operating system.

        Alas, they were not "PC-compatible" and at a certain time the Intel/AMD clones with Linux became much more attractive.

        Somthing along the profile of the Psion Netbook or old (or new depending upon your perspective

      • Re:ARM servers (Score:2, Informative)

        Cobalt servers were based on MIPS, and then migrated to AMD-K6 processors.

        Not that they wouldn't have worked just fine with ARM, but as far as I can tell the idea never even came up.
      • Re:ARM servers (Score:4, Informative)

        by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Monday May 17, 2004 @10:29AM (#9173156) Homepage Journal
        First try a google for Cobalt server ARM [google.com] and then try another one for Cobalt server MIPS [google.com] and see how you do. Cobalt Qube and Raq up to 2 were MIPS architecture machines, not ARM.

        ARM has been used in many PDAs as you say, and in Acorn/Archimedes computers. It's also in the Game Boy Advance (ARM7 I believe) and will likely be the foundation of the Dual Screen (ARM9 and ARM7 both will be in the box, if leaked specs can be believed). ARM also begat StrongARM, and Intel purchased (some level of) rights to the StrongARM II architecture, which they call XScale.

        • Not quite. Intel purchased the StrongARM rights when DEC was dismembered. XScale is a purely Intel product.

          If you need evidence, then consider the issues of time to wake from idle and average power consumption in idle. StrongARM did a fantastic job of managing them, given the clock speeds at which it ran. The earlier XScale chips...well, they just did not. Bulverde (gen 3 XScale) is finally starting to get a handle on those problems. XScale was designed to scale to high clock speeds, but not to handl
        • Re:ARM servers (Score:2, Informative)

          by pantherace ( 165052 )
          Actually StrongARM owes nothing to ARM (the company), as it was made by DEC when they realized that lower power was possible by turning down the voltage etc. on Alphas. Instead of either creating a new instruction set or using the Alpha's instruction set (the first pure 64-bit arch, which was needed in servers, but not really in ultra-low-power stuff at the time), they decided to use ARM.

          In a court case between DEC & Intel which was settled, DEC sold its fabs (I think they had one or two left)

        • As soon as you stated that, I thought, RTFA... But there wasn't one! So, I just said duh!

          Yep! MIPS... But, Acorn, Now those are pretty nifty also.
      • First Cobalt Qube (2700) was based on MIPS, not ARM. Perhaps you're speaking of an in-house prototype that never saw the light of day?
      • Cobalt servers were originally based on ARM processors

        You are probably thinking about the NetWinder.

      • Aside from the fact that Cobalts were actually based on MIPS parts, as several other posters have noted...

        The generally accepted reason for the failure of ARM to move up-scale so far is that the main company producing high-end ARMs these days has been Intel. Oddly, they seem to have issues with creating a competitor to their flagship CPUs, so they keep leaving the FPU off the part. In 2004, a CPU sans FPU is a pretty unlikely desktop box.

        I'm very excited about the ARM Ltd part for this reason. Not only

    • Re:ARM servers (Score:4, Insightful)

      by swordboy ( 472941 ) on Monday May 17, 2004 @09:08AM (#9172431) Journal
      I think the one thing that we're all waiting for is the introduction of on-chip system memory. Currently, the cache of a high-performance processor consumes more than half of the chip area because the penalty for a cache miss is so large. For decades now, memory frequency scaling has lagged that of the microprocessor. Although there have been some great strides recently, latency is still rearing its ugly head. External DRAM is too electrically distant to remain at the heart of any high-performance system.

      Once we get processor and memory combined, we'll see performance increasing by several orders of magnitude. Processor architecture will matter even less, since emulation of *any* architecture will become trivial in terms of available processing speed. Your Thumb-like prediction will most certainly pan out to some degree.
      • Because you have a minimum of one transistor per memory bit. So a GB of system memory requires 8 billion transistors (plus ECC). Not quite there yet on chip sizes. Latest P4 is around 180M.
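        For scale, a sketch of that arithmetic (one transistor per DRAM bit, ECC and peripheral circuitry ignored):

```c
/* Sketch of the transistor count for 1 GB of DRAM at one transistor per
 * bit, ignoring ECC, row/column decoders and sense amplifiers. */
#include <stdio.h>

int main(void)
{
    const double bits = 1024.0 * 1024.0 * 1024.0 * 8.0;   /* 1 GiB of data */
    printf("~%.1f billion transistors\n", bits / 1e9);    /* ~8.6 billion  */
    return 0;
}
```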

        • Re:ARM servers (Score:3, Interesting)

          by MrIrwin ( 761231 )
          Of course much of the memory we require is due to inefficient software.

          Look at embedded systems and you will see fresh new well-thought-out solutions which have much lower memory requirements.

          180M transistors means we could have e.g. 100Mb flash, 40Mb RAM and an ARM on the same chip.

          That could do an awful lot in some apps!

        • Actually the latest flash memory architecture can store up to 3 bits per cell (and 1 cell=1 transistor).
      • Re:ARM servers (Score:5, Informative)

        by Christopher Thomas ( 11717 ) on Monday May 17, 2004 @09:52AM (#9172779)
        For decades now, memory frequency scaling has lagged that of the microprocessor. Although there have been some great strides recently, latency is still rearing its ugly head. External DRAM is too electrically distant to remain at the heart of any high-performance system.

        Once we get processor and memory combined, we'll see performance increasing by several orders of magnitude.


        This idea has been around for what is almost certainly longer than either of us have been alive. It turns out that there are problems.

        The main problem is that no matter how much memory a system has, we find ways to use it. In the time I've been using computers, memory size has gone up four orders of _magnitude_, and I'm sure the greybeards listening will top that. The processor sitting in your machine right now has more on-die memory (the cache) than, say, an early XT had, but the tasks you're running have a memory footprint too large to fit. This is the price for being able to _do_ more than you could do on that old XT.

        Another problem is with the structure of memory itself. You've heard of "fast, cheap, good - pick two"? Memory is "large, fast, densely-packed - pick _one_". The reason why integrated logic/DRAM processes tend to do one or the other badly is that DRAM and logic have to optimize transistor characteristics for exactly opposite things (high "on" current for logic, low leakage current for DRAM). Among other things, this means that DRAM is either slow or very power-hungry. SRAM is bulky no matter what you do - it's the cost of playing, when you have six transistors instead of one. Any kind of large RAM array is slow no matter what you do - you have to propagate signals across a huge structure instead of a smaller one.

        The solution to date has been a hierarchical cache system, where small, fast, on-die memory is accessed whenever possible, and when that overflows, larger, moderately fast, on-die memory, and when that fails, DRAM. This works amazingly well, giving you almost all of the benefits of fully on-die memory for problems that fit in cache. Problems that don't fit in cache won't fit in on-die memory, so going with an on-die implementation doesn't help for them.
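        A small illustration of that point (my sketch, not from the original post): the same array summed in two orders. The first loop walks memory in cache-line order and stays in the fast levels of the hierarchy; the second strides across rows and misses on nearly every access once the array outgrows the caches.

```c
/* Sketch (mine): the same 2-D array summed two ways. Row order touches
 * each cache line once and benefits from the cache hierarchy; column
 * order with a large stride defeats it once the array exceeds cache size. */
#include <stdio.h>

#define N 2048

static double a[N][N];             /* 32 MB, far larger than on-die cache */

int main(void)
{
    double by_rows = 0.0, by_cols = 0.0;

    for (int i = 0; i < N; i++)    /* cache-friendly: sequential lines   */
        for (int j = 0; j < N; j++)
            by_rows += a[i][j];

    for (int j = 0; j < N; j++)    /* cache-hostile: 16 KB between reads */
        for (int i = 0; i < N; i++)
            by_cols += a[i][j];

    printf("%f %f\n", by_rows, by_cols);
    return 0;
}
```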

        Progress in improving memory response times is made in two ways. The first is to use a better cache indexing algorithm that is less susceptible to pathological situations. In the simpler indexing schemes, you can end up with situations where a short repeating access pattern can hammer on the same small set of cache blocks, causing cache misses even when there's plenty of space elsewhere. Higher associativity and tricks like victim caches reduce this problem. Techniques like a "preferred" block in a set reduce the time penalty for high associativity, and techniques like content-addressable memory reduce the power penalty. This is still a field of active research - build a better cache, and you get closer to a system that _acts_ as if it has all memory on-die.
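        As a concrete example of the pathological case (hypothetical cache parameters, not from the post): stepping through memory at a stride equal to the cache's way size maps every access to the same set, so a low-associativity cache thrashes even though most of it sits empty.

```c
/* Sketch of a pathological access pattern for a set-associative cache.
 * The stride is assumed equal to the cache way size (cache size divided
 * by associativity), so all accesses compete for one set; higher
 * associativity or a victim cache is what softens this case. */
#include <stdio.h>
#include <stdlib.h>

#define WAY_SIZE  (64 * 1024)   /* assumed; depends on the actual CPU     */
#define ACCESSES  16            /* more lines than a typical set can hold */

int main(void)
{
    char *buf = calloc(ACCESSES, WAY_SIZE);
    if (buf == NULL)
        return 1;

    long sum = 0;
    for (int rep = 0; rep < 100000; rep++)
        for (int i = 0; i < ACCESSES; i++)
            sum += buf[(size_t)i * WAY_SIZE];   /* same cache set every time */

    printf("%ld\n", sum);
    free(buf);
    return 0;
}
```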

        The second way of improving memory subsystem performance is to use memory speculation. This involves either figuring out (or even guessing) what memory locations are going to be needed and preemptively fetching their contents, or taking a guess at the value that will be returned by a memory fetch before the real result comes in. In both cases, you're masking most of the latency of the memory access, while paying a price for failed speculations (either in higher memory _bandwidth_ required, or in power for speculated threads that have to be squashed). Build a better address and data speculation engine, and you'll again approach performance of an impossible all-on-die-and-fast system.
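        A tiny sketch of the simplest form of address speculation, software prefetching (assumes GCC/Clang's __builtin_prefetch; hardware prefetchers make the equivalent guess automatically):

```c
/* Sketch of software prefetching: request data a few iterations ahead so
 * the memory latency overlaps with useful work. A wrong or useless
 * prefetch costs only bandwidth, which is the trade-off described above. */
#include <stddef.h>

double sum_array(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 0, 1);   /* read, low reuse */
        sum += a[i];
    }
    return sum;
}
```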

        In summary, it turns out that putting all of the memory of a general-purpose system on-die isn't practical now and won't be as long as requirements for memory keep increasing. However, caches already give you performance approaching this for problems that are small enough to _fit_ in on-die memory, and cache technology is constantly being improved. This is where effort should be (and is) going.
        • Memory is "large, fast, densely-packed - pick _one_".

          Got ahead of myself.

          What I'm trying to say is that any large SRAM array will be slower than a small SRAM array, and neither will have very high capacity. A DRAM array has high capacity, but is horribly slow. So-called "single transistor SRAM" is actually DRAM with a cache tacked on.
        • An excellent treatise, thank you. To respond to your challenge, the first computer I programmed was the DEC PDP-1 in 1964; it had 4096 18-bit words of memory, for 3 times 2 to the 12 power 6-bit characters. My home PC today has 1.5 gigabytes of RAM, or 1.5 times 2 to the 30 power. The quotient is more than 5 orders of magnitude.

          My home PC also costs almost two orders of magnitude less than a PDP-1 did, even ignoring inflation.
          John Sauter, greybeard (J_Sauter@Empire.Net)

        • Re:ARM servers (Score:3, Interesting)

          by theCat ( 36907 )
          Nicely done. You write textbooks, I hope?

          Though not a _personal_ computer, I like many at the time ran programs in time-share environments at a university. When I started college in 1977 I got an account on the PDP-11/45 system, which came with some personal storage, access to BASIC, and all of 8K of core. Before that I had never touched a computer system. When I started serious projects I applied for more core, and got 16K.

          Later as a graduate student I programmed Apple][ systems in hybrid BASIC/assembly
      • Re:ARM servers (Score:3, Insightful)

        by addaon ( 41825 )
        One of the technologies you'll start seeing for high-performance embedded systems (and can find now, in a few places), is core pinouts designed as the mirror image of a standard DRAM memory pinout. With this setup, a CPU can be put on one side of a four (sometimes five) layer circuit board, normally, and a DRAM chip (single chip, so about 1Gb max for most usage; no double channel) can be put directly opposite it, with vias connecting the two. The electrical connection of the signalling wires between the two
      • My system has 1GB of main memory. That's simply too much to fit on any current die sizes - last I heard about the biggest you could pull off with an actual complex IC covering it (not just putting down some pads, painting it with liquid crystal and covering it with glass to make a reflective mono LCD, for example) was about 21x21mm. You're not putting 1GB of DDR on the die with my CPU any time soon, and by the time you can, 1GB will be a piffling amount. Meanwhile, there ARE chips with system memory on-core
      • I think the one thing that we're all waiting for is the introduction of on-chip system memory.

        Sorry, I prefer cheap commodity computing. Such a scheme would be VERY expensive. An expensive crutch for programmers who got a 'D-' in computer architecture because of an utter failure to comprehend the memory hierarchy. An all-RAM-on-die scheme would be an enormous waste of money, as vast, contiguous regions would sit idle. No one does this at any scale of computing because caching strategies get us the vast
        • Apparently the parent hasn't heard of chips like the Atmel AVR, or PIC chips. The concept of on-chip memory (both program and RAM) is HUGELY popular among embedded microcontrollers. These can be incredibly inexpensive as well.
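          On those parts the program, data and peripherals all live in the chip's own address space; a generic sketch (the addresses and register name are made up, not a specific AVR or PIC):

```c
/* Generic microcontroller sketch (hypothetical addresses, not a real part):
 * variables live in on-chip SRAM and peripherals are memory-mapped
 * registers, so nothing here ever touches an external memory bus. */
#include <stdint.h>

#define GPIO_OUT (*(volatile uint32_t *)0x40001000u)   /* assumed address */

static uint8_t rx_buffer[256];     /* sits in on-chip SRAM */

void toggle_led(void)
{
    GPIO_OUT ^= (1u << 5);         /* on-chip register access, no DRAM anywhere */
    rx_buffer[0] = (uint8_t)(GPIO_OUT & 0xFFu);
}
```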
    • by Tune ( 17738 )

      If I recall correctly, chips prior to ARM6 had register 15 (ARM's PC) designed with the upper six bits reserved for status. Having a program address space of only 2^26 = 64 MB was a major obstacle, even for (successors of) Acorn's RiscPC, a desktop model. With that resolved in the ARM6 series, it is still unable to look beyond the 4GB boundary. In the 4-way SMP server market this is likely to become a major pain.

      So either they found a nice way to add yet more MIPS per megaherz (or per watt) to serve a highe
      • ARM6 != ARMv6 (Score:4, Informative)

        by hattig ( 47930 ) on Monday May 17, 2004 @09:56AM (#9172813) Journal
        One is a ~1990 era version of the ARMv3 architecture (IIRC).
        The other is ARM's latest version of the ARM architecture.

        26-bit addressing limitations were removed ~14 years ago. I don't even think any of the more recent versions of the ARM architecture support it.
        • 26-bit addressing... I don't even think any of the more recent versions of the ARM architecture support it.
          This is correct. Currently available ARM cores do not support 26-bit addressing.
    • Yeah, my first thought was, "where do I find an under-$100 ATX motherboard that'll take one of those?"
  • by System.out.println() ( 755533 ) on Monday May 17, 2004 @08:40AM (#9172287) Journal
    ..... .....

    What do you want, a cookie?

    Seriously though, this would be great to run Linux on... Like a new Zaurus perhaps :)
  • Interesting (Score:5, Interesting)

    by INeededALogin ( 771371 ) on Monday May 17, 2004 @08:40AM (#9172289) Journal
    The MPCore multiprocessor enables system designers to view the core as a single "uniprocessor", simplifying development and reducing time-to-market, according to ARM.

    The opposite of HyperThreading? 4 CPUs to one instead of 1 CPU to 2?

    The only thing that I can guess they mean by simplifying is that a developer would not have to design a multi-threaded application to take advantage of the other threads.
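    Either way, the extra cores only pay off if the software actually hands them work; a minimal POSIX threads sketch (my example) of what "taking advantage" looks like:

```c
/* Minimal POSIX threads sketch: on an SMP (Linux schedules across all
 * cores), the extra processors only help if the program actually has
 * several runnable threads or processes. */
#include <pthread.h>
#include <stdio.h>

#define NCORES 4

static void *worker(void *arg)
{
    printf("worker %ld running\n", (long)arg);   /* real work goes here */
    return NULL;
}

int main(void)
{
    pthread_t tid[NCORES];

    for (long i = 0; i < NCORES; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (int i = 0; i < NCORES; i++)
        pthread_join(tid[i], NULL);
    return 0;
}
```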
    • Re:Interesting (Score:5, Informative)

      by Tune ( 17738 ) on Monday May 17, 2004 @08:55AM (#9172364)
      It appears to be similar to other dual core technologies except developers need to worry less about threads accessing the same data. This is accomplished by cache snooping, which is a dated, but very fast way to avoid (L0) cache inconsistencies. That should take care of a major hurdle wrt. keeping SMP threads busy, especially if the clock speeds are relatively low.

      Notice that SMP has been a dream to the ARM team from its early Acorn/Archimedes days on. It seems they finally got it working...
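      The snooping keeps the per-CPU caches consistent without software doing anything, but it isn't free: a line written by two different cores bounces between their caches. A sketch of the layout trick that avoids the worst of it (my example, 64-byte cache lines assumed):

```c
/* Sketch of avoiding false sharing on a coherent (snooping) SMP: each
 * thread's counter is padded out to its own cache line, so the snoop
 * traffic from one core's writes never invalidates the other's line. */
#include <pthread.h>
#include <stdio.h>

#define LINE 64                     /* assumed cache line size */

struct counter {
    volatile long value;
    char pad[LINE - sizeof(long)];  /* keep each counter on its own line */
};

static struct counter counters[2];

static void *bump(void *arg)
{
    struct counter *c = arg;
    for (long i = 0; i < 10000000L; i++)
        c->value++;
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, bump, &counters[0]);
    pthread_create(&b, NULL, bump, &counters[1]);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("%ld %ld\n", counters[0].value, counters[1].value);
    return 0;
}
```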
      • Ah, brings back memories of the Hydra processor board for the Risc PC (which was never actually available, was it?) - I always felt Shiva would have been a more appropriate name, though, for obvious reasons...
        • But Shiva was already (I think - I could be mistaken) being used by a firm making remote access platforms (read: big boxes of modems).

          Being Cantabrigian, they probably preferred the Greek metaphor, anyway ;)

          • It's not quite as neat, though. Hydra had many heads, Shiva had many ARMs. The videogaming geek in me is desperate to suggest Goro as an alternate name if Shiva was already gone. :)
      • (Didn't read the article.) Are they cache snooping, which would be the obvious thing, or just using a shared cache and ignoring the problem that way?
        • by Tune ( 17738 )
          Reading the article is not required; just skimming it reveals a diagram with 4 CPU's, each with its own cache connected by arrows to a large blob called "Snoop Control Unit".
    • By "developer" they probable mean a hardware develeoper, not a software developer. I can definitely see how one chip would be easier to deal with than four.
  • by Anonymous Coward on Monday May 17, 2004 @08:47AM (#9172321)
    In case you were wondering what that is all about...

    Synthesis of a core is analogous to compiling your software - except that for an FPGA it is processing a hardware description language like VHDL or Verilog to create the 'code' used to load the FPGA.

    This is a big plus for people wanting to put a wicked fast processing unit in the core along with whatever custom IO goodies they can come up with.

    Too bad it's not open source, as there are other wicked fast processor cores available. For example Xilinx can license you to put a PowerPC in its FPGA cores.

    • I'm not sure how to tell you this, but you're virtually totally wrong on every point.

      Synthesisable to silicon, for ASICs mostly, though people like Philips turn them into microcontrollers and Intel makes a few microprocessors. The idea mostly is you can put an LCD controller, SIM card reader, DSP, etc. all on one lump of silicon with an ARM processor and put it in your mobile phone.

      And you don't licence a PowerPC core to put in an FPGA, you get a PowerPC chip actually inside the FPGA (Virtex-II Pro), any IP-Cor
      • Doh, that's what you get for modifying posts too much. You can put it on an FPGA, but you wouldn't want to outside of development; if you look at the picture in the article, there's $2,500 worth of FPGA there, and for the whole unit you're probably looking at $10,000, and it's a tad big. Put it on the intended final target, a silicon chip, and you've got something which will fit in the tiny space behind the battery in your mobile phone.
    • by eclectro ( 227083 ) on Monday May 17, 2004 @09:58AM (#9172833)
      Too bad it's not open source, as there are other wicked fast processor cores available. For example Xilinx can license you to put a PowerPC in its FPGA cores.

      There is this [cnn.com].

      You can find the code easily. There are a couple of other clones, but I have not heard much about them. Another one is BlackARM developed in Sweden a couple of years ago.

      I think these projects would be OK as long as they are instruction-compatible, but not an internal clone, in which case ARM would pull out their lawyer dogs.

      But there are a couple of other open source cores available, which IMHO would be smarter to use because you could do more with them without the fear of legal reprisal from ARM.

      If you are designing an embedded system, you might be able to get by using such a core. The thing ARM has going for it is that commercial support and toolkits are available, which can be handy if you have a complex application that needs a lot of debugging. And there is a lot of third-party support that you are not going to find with your homegrown core.

      That being said, you could save a fair amount of money using an open core. But if you need to get something important out the door quickly (like a toy for Christmas) you go with the commercial solution. Unless you have the necessary in-house resources to troubleshoot problems.

      Just my .02
  • Wave of the future. (Score:5, Interesting)

    by Willeh ( 768540 ) <rwillem@xs4all.nl> on Monday May 17, 2004 @08:50AM (#9172337)
    IMO this new "multiple CPUs per chip" approach is the way forward, and the huge power savings are an added bonus. One question springs to mind though: how much performance can you gain by using this technique? I mean, sooner or later you will hit the limits of, say, the memory bus or the graphics bus or whatever (speaking in layman's terms obviously), especially in environments where power consumption is an issue, and huge memory banks take a lot of power to keep them refreshed. Still, I welcome the development; SMP-type deals can make a computing experience easier to cope with during intensive use like compiling and other CPU-intensive tasks.
    • by pe1rxq ( 141710 )
      There are a lot of things right now where the CPU is the bottleneck. In making a system better it is wise to start with the weakest link and then with the second weakest, etc...
      Also you don't have to refresh static RAM; it's more expensive but might pay off in terms of energy.

      Jeroen
      • It might depend on the type of SRAM but IIRC, SRAM is much faster, much more expensive and takes more power.
        • SRAM is also much bigger. DRAM bits [necelam.com] are just capacitors with calculated (and designed-in) characteristics coupled with a single gate. SRAM bits are flip-flop [play-hookey.com]s. Note that they are each (at least of the three or four types on that page) made of eight gates. That's a lot of real estate for a single bit, I'm guessing two to four times as much (depending on the size of the capacitor...)
        • IIRC, SRAM is much faster, much more expensive and takes more power.

          In terms of power, I would think it depends entirely on the duty cycle. In terms of switching power, SRAM has a higher switching cost due to having more transistors. On the other hand, DRAM leaks power constantly AND has to have data restored on every read while SRAM has very low leakage.
  • by MrRuslan ( 767128 ) on Monday May 17, 2004 @08:56AM (#9172374)
    But what are some uses for this? If I'm not mistaken this is a 32-bit architecture, so it has its limits when it comes to scaling, and it's not powerful enough for one of those supercomputers, so what is the target market?
    • Not all problems are solved more easily by throwing more bits at them... With more bits the number of instructions you can execute is still the same.

      Jeroen
    • by Anonymous Coward
      If somebody was smart, they'd sell a mini-PC with this as the core. 4 (or just 2) CPUs + decent I/O subsystem = Awesome response times = Average consumer will swear it's faster than those Puntium64 thingies.
    • First, the desktop users will not exceed the limits of 32 bit computing for quite some time now, unless you are trying to implement operating systems where everything is mapped into a flat address space, like every location of every storage device. I don't know too many people with more than 1GB of memory in a system they ordinarily use, and a 32 bit system can successfully address 4GB. Of course not all of that can be system memory, but the point is, we're a ways away from needing 64 bit on the desktop.

      S

      • I don't know too many people with more than 1GB of memory in a system they ordinarily use, and a 32 bit system can successfully address 4GB. Of course not all of that can be system memory, but the point is, we're a ways away from needing 64 bit on the desktop.

        That's only 3 years away at Moore 1.

        (I hereby define "Moore" to be the scale upon which the growth of computation power can be measured. Moore 1 represents a doubling every 18 months, Moore 2 a doubling every 9 months, Moore 0.5 a doubling every 3
        • I'm sorry, but your comment did not compile, as your define trailed your usage :)

          In three years, this design will only be used in embedded devices. It's only interesting for desktop use right now, and then only if you get four 550MHz cores. Otherwise a single higher-speed processor is going to beat its pants off.

  • by TheLoneCabbage ( 323135 ) on Monday May 17, 2004 @09:06AM (#9172423) Homepage
    Exactly what I was looking for! Finally a computer capable of letting me balance my checkbook, use a word processor, watch a video, and browse the web!

    Is any one else getting the impression that our entire industry is driven by penis envy [theregister.co.uk]?

    "It's bigger, it's faster, stronger! More Power!" About the only flaw in my theory is the continuing trend of decreasing computer sizes. But I can atribute that to the fact that it lets people put them in their pockets.

    BTW: If you actually use your CPU(s), this doesn't apply to you. Your penis is bigger.

    • I use my computer to decompress a complex audio compression format at pretty high bit rates while browsing the web, downloading files and having an IM conversation all the time. That's quite a bit of math to do. Guess what, it's still sluggish for some tasks. If I'm using photoshop, filters run slower if I'm playing music. I want music playing. I also want a fast computer.

      That's what SMP gives you. A single CPU can easily do anything I want, but by partitioning it, my Vorbis player doesn't slow my AutoCAD
  • MMP ARM server (Score:4, Insightful)

    by Gadzinka ( 256729 ) <rrw@hell.pl> on Monday May 17, 2004 @09:18AM (#9172491) Journal
    Just the other day I was thinking about "Massively Multiprocessor" ARM computer. It came to me after reading about cluster of VIA low-power computers.

    So, ARMs are even lower power, they are designed quite correctly from the ground up[1] and the only thing that's missing is an FPU. But a computer with 100 ARM CPUs would run faster than any ix86 today and probably would consume less power than the latest P4/K7/K8.

    Give me a 64-proc (×4 cores per proc, so 256 cores in total) Linux machine anytime ;)

    Robert

    [1] Anyone who knows the internals of today's ix86 processors from any vendor knows what a mess it is to use today's technology with an ancient ISA like ix86.
  • That's nice but, (Score:5, Interesting)

    by dbretton ( 242493 ) on Monday May 17, 2004 @09:24AM (#9172527) Homepage
    Let's talk some real numbers.

    How will it fare against, say a Xeon with HT or 2 Opterons?
    How will it stack up in price?

    • by Anonymous Coward
      If you're comparing it to Xeons and Opterons, you're not even in the same market.
    • How will it fare against, say a Xeon with HT or 2 Opterons?

      It won't be able to heat up your house during winter like the Xeon with HT or 2 Opterons can.

      This may not be so important during the summer months though.
    • Notice the "embedded" tag.

      This kind of product is designed for PDAs, where most products are going to 300-400 MHz processors.
  • by ebunga ( 95613 ) * on Monday May 17, 2004 @09:46AM (#9172704)
    PMC-Sierra's MIPS-based RM9000x2GL's [pmc-sierra.com] are really neat. It's been out for some months now. I'd love to see a machine with several dozen of these.
  • by Anonymous Coward on Monday May 17, 2004 @10:00AM (#9172849)
    This is one of the reasons why Linux will eventually win in the handheld/cell phone space. Unlike WinCE, Symbian and PalmOS, Linux already supports SMP. Linux is light years ahead of WinCE, Symbian and PalmOS on all key core technology features such as SMP. I know for a fact that Linux is being used to validate these features on future ARM processors. So, companies that based their products on Linux won't have to worry about the OS running on the new processors. The proprietary OSes will be playing catchup forever. I will not be surprised if Microsoft has to redesign WinCE from scratch yet again to accommodate SMP.

  • Why? (Score:4, Informative)

    by Anonymous Coward on Monday May 17, 2004 @10:00AM (#9172855)
    Low power. Die size. Cost.

    You don't use an Opteron in the same situation as an ARM core. It's a synthesisable mini processor used for controlling real-time systems. It can be embedded in chips with custom VLSI logic to provide a platform for an operating system. It's not meant for competing with Opterons or any of the other such stupid ideas.


    Why 4 cores?


    Not all customers need 4 cores, some only need 1 (washing machines) or maybe 2. The system is therefore scalable to die size/power/cost requirements. Note it's configurable, it does not have to have 4 cores. If I were a customer of ARM I could choose how much die space to devote to the core and how much power I really needed.

    4 cores, instead of one bigger more complex one, is easier to engineer and get right. Look at modern graphics architectures, it's the same principle (though one can argue about cache coherency).

    Multiple cores would make dynamic power management much easier to handle, I imagine. An entire core could shut down when its process(es) are not busy. A properly designed embedded system could benefit enormously from this power saving, and the hardware design is made relatively easy rather than trying to cut voltage on one large core.

    Embedded systems using ARM cores often need to meet real-time needs. One advantage of a multicore system would be to place a critical software component on a single core and, with correct use of memory, guarantee a fixed throughput rate of data. Of course I can use thread priorities but this makes things harder IMO. Maybe that's what they refer to by easier programming.
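    On Linux the placement part can be done explicitly with processor affinity; a sketch using the glibc/Linux sched_setaffinity call:

```c
/* Sketch: pin the critical component to core 0 so the scheduler never
 * migrates it, leaving the other cores for everything else.
 * Uses the Linux/glibc CPU affinity interface. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                             /* allow core 0 only */

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* ... time-critical processing loop runs here, undisturbed ... */
    return 0;
}
```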


    To me, this looks like a clean idea, which although not revolutionary in terms of an idea, does provide significant advantages for embedded device designers by being synthesisable.


    Wroceng
    (no association with ARM at all but I forgot my password temporarily)

  • I would imagine a wristwatch that can do voice processing and movie rendering.

    This would seem to go hand in hand with the current thinking on on-the-fly OCR/language translation. I watched a show last night about a camera and PDA gizmo that could translate a road sign for you. I think that one did it via a server-based imaging system. But if you do all that internally the possibilities are endless, and hopefully not trivial, like SMP pong or really fancy ringtones.

    low electrical power + high CPU power
  • Nice! (Score:2, Informative)

    I've been an ARM fan for many, many years, so it's great to see this development. I've always thought this kind of thing should happen with ARM chips, and that the ARM should be well suited for this kind of application.

    ARM cores have a great advantage of having an incredibly low transistor count. As a result the simpler ARM chips tend to have incredibly good production yields. I don't know if that's true for the more complex ARM variants like XScale. This multi-core processor should also be an order of
  • Are you sure it's the 1st time ARM has produced a synthesizable core? (despite what the article says)

    A little over a month ago I sat through a presentation by one of the guys near the top of ARM's research division...

    It was a general overview of ARM's business model (it's an IP company) and products followed by some other material. During the presentation some cores were marked as synthesizable, others were marked as the opposite (I forget the specific term that was used).

    To the best of my knowledge al
  • I feel that experience with ARM-based embedded systems will be a good item on an EE student's CV. I wonder what's the most cost-effective platform I should get if I want to play with it?
  • by Anonymous Coward on Monday May 17, 2004 @11:56AM (#9173958)
    One thing I've always wanted is a comparison of the general efficiencies of different processors. That is, if you made different types of processors the same clock speed, gave them equivalent caches, and ran a benchmark entirely out of cache, how would they all compare?

    X86s are supposedly awfully inefficient architectures, so would they come out on bottom? Where would various ARM, xScale, 68k, and PPC processors end up?

    Although x86 CPUs have scaled up to some amazing clock frequencies, it seems like their growth has slowed. Intel seems to have implicitly acknowledged this since they're dropping the P4 line for an updated P3 architecture. AMD did the same thing with the Athlon64s, which have slower clock speeds but are faster in the end.

    If an ARM at, say, 600 MHz turned out to be as fast as a P3 at 1 GHz, then I would say the ARM could leave the embedded market and become competition in the desktop market. If such systems were significantly cheaper, cooler, smaller, and less power hungry than similar x86 systems, I think they could seriously compete.
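    A crude way to run that comparison yourself (my sketch, nothing like a proper benchmark): time an identical cache-resident integer loop on each machine and divide by the clock rate to get a rough work-per-cycle figure.

```c
/* Crude cross-architecture comparison sketch (not a real benchmark):
 * a fixed, cache-resident integer workload is timed, and dividing the
 * rate by the machine's clock in MHz gives a rough per-clock figure. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    const unsigned long iters = 100000000UL;
    volatile unsigned int x = 12345;             /* volatile: keep the loop */

    clock_t start = clock();
    for (unsigned long i = 0; i < iters; i++)
        x = x * 1664525u + 1013904223u;          /* simple integer work */
    clock_t stop = clock();

    double secs = (double)(stop - start) / CLOCKS_PER_SEC;
    printf("%.0f iterations/s (divide by clock MHz for per-cycle work)\n",
           iters / secs);
    return 0;
}
```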
