Emulation (Games) Hardware

Variable Instruction Computing: What Is Old Is New Again (hackaday.com) 52

szczys writes: Higher performance, lower power. One of the challenges with hitting both of those benchmarks is the need to adhere to established instruction sets like x86. One interesting development is the use of Variable Instruction Sets at the silicon level. The basic concept of translating established instructions to something more efficient for the specific architecture isn't new; this is what yielded the first low-power x86 processors at the beginning of the century. But those relied on translation at the software level. A company called Soft Machines is paving the way for variable instructions in hardware. Think of it as an emulator for ARM, x86, and other architectures that runs on silicon for fast execution while sipping very little power.
This discussion has been archived. No new comments can be posted.

  • by JoeMerchant ( 803320 ) on Friday February 19, 2016 @08:11PM (#51545601)

    Can they really get better performance per watt on general computing using a flexible substrate? Seems like whatever design they set up the flexible (FPGAish?) circuit to do, could be faster and lower power if it were put into a fixed silicon (or similar) implementation. Maybe if your workload devolves down to very simple needs for long periods of time, this might take advantage of that.

    • by Aighearach ( 97333 ) on Saturday February 20, 2016 @12:17AM (#51546757) Homepage

      The problem with your analysis is that the words "flexible" and "fixed" don't have technical meaning here. Those aren't differing physical characteristics to choose between, those are just human-level descriptions of how the programming will be organized.

      An FPGA intentionally has a whole bunch of extra circuits supporting each logical unit; those are expected to take a lot of extra power because they are additional functionality. An FPGA doesn't use more power per physical transistor, it just has a whole bunch of transistors and other logic devices for each programmable unit. When you then implement the circuit as an ASIC, it uses less power because it uses fewer logic devices, not because there is some other qualitative difference.

      With something like this, any extra logic devices would be specifically designed to manage the other logic devices for low power use. That is a very reasonable thing to try to do. Whether their implementation is successful and useful in the market is a whole different issue, of course.

      Transmeta was successful from an engineering perspective; their products used less power than their competitors. Problem was, they were only a few months ahead, and required too many changes in devices. All other companies had to do was be richer, and more able to secure access to new fab technologies.

      One big difference here is that this will potentially change thread management for programmers in a way that many people will like. It might very well be able to fragment the industry and corner a significant chunk of interest.

  • by BitZtream ( 692029 ) on Friday February 19, 2016 @08:13PM (#51545607)

    So instead of the current situation where we have Intel/AMD processors doing something under the hood, using microcode as the language that translates the x86 environment into whatever is actually on the silicon ... you're going to add ARM to it, and maybe some other ones?

    That's cool and all, but it's not really all that useful, and Intel can pretty much already do that on any CPU it wants with a microcode update. ARM may not run as efficiently on the core that Intel uses, but it can be done from a technical point of view.

    It's not worth it. That's why no one does it.

    You'll effectively do nothing well.

    Intel was an ARM licensee (probably still is), they know ARM as well as anyone outside of ARM itself ... and they made entirely new silicon to run it (well technically they bought it if I recall correctly) ... and it even had its own microcode ... But what they never did was share a single core between both ARM and x86 CPUs that could change modes with a microcode update. No reason they couldn't other than it's not efficient.

    • Too bad both Intel and AMD keep their microcode closed. There's so much fun we could have if it were documented and non-Tivoized.

      Just the first use: shuffle opcode numbers, make your compiler emit those and recompile your software per-installation. Any exploits that use machine code are instantly thwarted.

      And that's just a start...
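The opcode-shuffling idea in the comment above can be sketched with a toy model. This is only an illustration under loud assumptions: the four-instruction stack ISA, the names `make_install_key`, `assemble`, and `run`, and the seed-based key are all hypothetical, and nothing here reflects how real Intel or AMD microcode works.

```python
import random

# Hypothetical toy ISA: mnemonic -> canonical opcode number.
CANONICAL = {"PUSH": 0, "ADD": 1, "MUL": 2, "HALT": 3}


def make_install_key(seed):
    """Build a per-installation map: canonical opcode -> scrambled opcode byte."""
    rng = random.Random(seed)
    scrambled = list(range(256))
    rng.shuffle(scrambled)
    return {op: scrambled[op] for op in CANONICAL.values()}


def assemble(program, key):
    """Emit bytes using the scrambled opcode numbers (the 'recompile' step)."""
    out = []
    for mnemonic, *operands in program:
        out.append(key[CANONICAL[mnemonic]])
        out.extend(operands)  # operands are left unscrambled in this sketch
    return bytes(out)


def run(code, key):
    """Toy stack-machine 'CPU' whose decoder has been loaded with the same key."""
    decode = {scrambled: op for op, scrambled in key.items()}
    stack, pc = [], 0
    while pc < len(code):
        op = decode.get(code[pc])
        if op is None:
            raise ValueError("illegal opcode at offset %d" % pc)
        pc += 1
        if op == CANONICAL["PUSH"]:
            stack.append(code[pc])
            pc += 1
        elif op == CANONICAL["ADD"]:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == CANONICAL["MUL"]:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:  # HALT
            return stack[-1]
    raise ValueError("ran off the end of the code")


program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("HALT",)]
key = make_install_key(seed=1234)
result = run(assemble(program, key), key)  # (2 + 3) * 4
```

Code assembled for one key and fed to a decoder loaded with a different key would most likely trap on an illegal opcode in this model, which is the effect the comment is after. As the replies note, it does nothing against an attacker who can read code memory.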

      • by Dwedit ( 232252 )

        That wouldn't stop pure return-oriented programming or, if anyone knew you were doing something like that, exploits that can read code memory.

        • If we're already recompiling per-install, it would be easy to randomize a lot about the code, making return-oriented-programming moot (or at least massively harder if you have too little entropy). Shell/perl/etc can be "compiled" into a scrambled form. We can randomize kernel syscall numbers even today. But all of that is worth comparatively little if the biggest risk, machine code, is easily exploited.

          If you can read code memory then this technique can be defeated -- but needing to have two separate exp

    • Intel was an ARM licensee (probably still is),

      Naw, they sold off XScale and the license presumably went with it

      they know ARM as good as anyone outside of ARM itself

      Naw, they knew how to make the ARM of the day fast, but not power-efficient. Everyone else's ARM sipped power per MIPS compared to XScale under Intel. Have not followed it under Marvell so I don't know if it ever turned out, but they still make it so it probably did.

      But what they never did was share a single core between both ARM and x86 CPUs that could change modes with a microcode update. No reason they couldn't other than it's not efficient.

      it's just a waste of time. the demand is not there. why mess around with it?

  • nikto is the last word. as written, it's a jelly doughnut.
    • I scrolled down and saw what you're referring to:

      "Gort, klaatu nikto barada." -- The Day the Earth Stood Still

      Slashdot, turn in your stash of fake nerd cards. You're not even at poser level anymore.

    • nikto is the last word. as written, it's a jelly doughnut.

      "Gort, I am a jelly doughnut"?

      • "Gort, Klaatu needs a jelly doughnut to be brought back to life. Oh, and please don't destroy the earth."

  • by wbr1 ( 2538558 ) on Friday February 19, 2016 @08:31PM (#51545709)
    Transmeta with a hardware morphing layer?
    • Transmeta with a hardware morphing layer?

      Maybe, maybe not. An article about them on SemiAccurate [semiaccurate.com] says "SM can run what it calls personalities in software but they are not implemented in the expected way. Personalities are software and are loaded at boot time, but they are both light and low-level. They don’t emulate code, they just translate it to the native ISA, a 32-bit add is a 32-bit add on both native and emulated hardware, but probably have differing opcodes." and "Personalities are not purely software though, there are hardware hooks

      • by Pulzar ( 81031 )

        a 32-bit add is a 32-bit add on both native and emulated hardware, but probably have differing opcodes

        That breaks down very quickly when you get to any memory operations, as well as all the various flavours of SIMDs...

        It really doesn't make much sense that you can be more power efficient in your implementation of the behaviour and ordering of an exclusive store-release transaction using generic ops compared to hardware that was explicitly built and optimized for that instruction.

        Yeah, maybe your integer and

      • a 32-bit add is a 32-bit add on both native and emulated hardware

        Hate to tell you this, but no...

        On x86 a 32-bit add also updates a flags register that is commonly leveraged. A full emulation of this register would be quite expensive on architectures that don't automatically track all of the same things.
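To make the flags point concrete, here is a rough sketch of the extra work a translator has to do for a single x86 32-bit ADD when the host does not track the same status bits. The function name `add32_with_flags` is made up for illustration; the six flags are the standard x86 status flags affected by ADD.

```python
MASK32 = 0xFFFFFFFF


def add32_with_flags(a, b):
    """Model an x86 32-bit ADD: return (result, status flags) for a + b."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        # Carry: unsigned overflow out of bit 31.
        "CF": int(full > MASK32),
        # Zero: result is exactly zero.
        "ZF": int(result == 0),
        # Sign: copy of bit 31 of the result.
        "SF": (result >> 31) & 1,
        # Overflow: operands share a sign that differs from the result's sign.
        "OF": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
        # Auxiliary carry: carry out of bit 3 (used by BCD instructions).
        "AF": ((a ^ b ^ result) >> 4) & 1,
        # Parity: set when the low byte has an even number of 1 bits.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
    }
    return result, flags


wrapped, wrapped_flags = add32_with_flags(0xFFFFFFFF, 1)
signed_ovf, signed_ovf_flags = add32_with_flags(0x7FFFFFFF, 1)
```

One add becomes roughly half a dozen extra operations if done naively, which is why software translators usually compute flags lazily, only when a later instruction actually reads them.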

  • Translating instructions to micro-ops to run on a VLIW-ish backend? I think every high performance architecture does that now (arm and x86)

    Share processing resources between cores? AMD tried to share the FP pipeline (flex FP?) between cores starting with their Bulldozer architecture, but it looks like they are going to abandon that with their Zen architecture after getting beat up about single-thread perf...
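For readers who have not seen micro-op translation, a very simplified sketch of "cracking" one instruction into micro-ops follows. The tuple instruction format, the `MicroOp` type, the `tmp0`/`tmp1` temporaries, and the `crack` function are all hypothetical; real decoders are far more involved and their micro-op formats are not public.

```python
from collections import namedtuple

# Hypothetical micro-op: an operation with one destination and two sources.
MicroOp = namedtuple("MicroOp", "kind dst src1 src2")


def is_mem(operand):
    """Treat operands written like '[0x10]' as memory references."""
    return isinstance(operand, str) and operand.startswith("[")


def crack(instr):
    """Split one CISC-style (op, dst, src) instruction into simple micro-ops."""
    op, dst, src = instr
    uops = []
    if is_mem(src):
        # A memory source is first loaded into a temporary register.
        uops.append(MicroOp("load", "tmp0", src, None))
        src = "tmp0"
    if is_mem(dst):
        # Read-modify-write on memory becomes load, ALU op, store.
        uops.append(MicroOp("load", "tmp1", dst, None))
        uops.append(MicroOp(op, "tmp1", "tmp1", src))
        uops.append(MicroOp("store", dst, "tmp1", None))
    else:
        uops.append(MicroOp(op, dst, dst, src))
    return uops


reg_form = crack(("add", "eax", "ebx"))
mem_form = crack(("add", "[0x10]", "eax"))
```

A register-to-register add stays a single micro-op, while the memory-destination form expands to three, which is the kind of expansion the comment is describing.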

    • Translating instructions to micro-ops to run on a VLIW-ish backend? I think every high performance architecture does that now (arm and x86)

      x86 - and z/Architecture, with the z13 chip - but do any ARM processors (or other RISC processors) do that?

  • The 6502, for example, makes for a beautiful stack machine...

    As the 6502 only had a single stack, limited in size to 256 bytes, and hard coded to reside at memory address range 0x0100-0x01ff, I might tend to disagree with that assessment.

    • by jeremyp ( 130771 )

      You forget that the 6502 could do indirect addressing through any of the zero page locations giving you a potential 128 stack pointers for your stack machine. Also, the zero page had a special address mode so that loads, stores and increments/decrements could be done with two byte instructions instead of three byte instructions, reducing the fetch execute time by one cycle.

      However, I think the main reason stack machines were often implemented on 6502 has more to do with its relative lack of registers. There

      • by mark-t ( 151149 )
        While zp access takes fewer bytes, and would indeed reduce the fetch time, indirect addressing instructions on the 6502 took 5 to 6 cycles to execute and were among the slowest instructions on the chip (only the 7-cycle instructions, such as ASL with absolute,X addressing, took longer).
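The zero-page trick discussed in this subthread can be modelled in a few lines. This is a behavioural sketch only, not cycle-accurate: the class `Mem6502` and its method names are invented for illustration, and each `push` or `pop` stands in for an indirect access through a zero-page pointer with Y = 0 plus separate instructions to adjust the pointer.

```python
class Mem6502:
    """64 KiB of RAM with 16-bit stack pointers kept in zero page."""

    def __init__(self):
        self.ram = bytearray(0x10000)

    def read_ptr(self, zp):
        # Little-endian 16-bit pointer stored at zero-page address zp.
        return self.ram[zp] | (self.ram[(zp + 1) & 0xFF] << 8)

    def write_ptr(self, zp, addr):
        self.ram[zp] = addr & 0xFF
        self.ram[(zp + 1) & 0xFF] = (addr >> 8) & 0xFF

    def push(self, zp, value):
        addr = self.read_ptr(zp)
        self.ram[addr] = value & 0xFF  # like STA (zp),Y with Y = 0
        self.write_ptr(zp, (addr + 1) & 0xFFFF)

    def pop(self, zp):
        addr = (self.read_ptr(zp) - 1) & 0xFFFF
        self.write_ptr(zp, addr)
        return self.ram[addr]  # like LDA (zp),Y with Y = 0


mem = Mem6502()
DATA_SP, RETURN_SP = 0x10, 0x12      # two of the 128 possible pointer slots
mem.write_ptr(DATA_SP, 0x2000)       # stacks can live anywhere in RAM
mem.write_ptr(RETURN_SP, 0x3000)
mem.push(DATA_SP, 7)
mem.push(RETURN_SP, 99)
mem.push(DATA_SP, 8)
popped = (mem.pop(DATA_SP), mem.pop(DATA_SP), mem.pop(RETURN_SP))
```

The 256 zero-page bytes hold at most 128 two-byte pointers, which matches the figure given above. The cost raised in the reply still applies: every access through such a pointer is one of the slower addressing modes, and the pointer adjustment is extra work on top.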
  • by Anonymous Coward

    Make a cpu with just a few instructions and do complex stuff by repeating simple things many times, fast....oh hang on...

  • I can only wonder... if the Crusoe and Efficeon patents are being licensed from Intellectual Ventures (who ended up owning them), or if we are going to see another East Texas lawsuit over this.

  • There is a reason the idea fizzled: If you have very special code, it may be able to compete speed-wise, otherwise it will be slower. As compilers optimize better these days, it will be even worse today. And the "low power" is a red herring: If you want that (at slow speed), compile to ARM code, not to x86.

    My guess is somebody is looking for funding from clueless people.

  • Transmeta? (Score:4, Informative)

    by istartedi ( 132515 ) on Saturday February 20, 2016 @04:59AM (#51547527) Journal

    This sounds like Transmeta. Remember that, Slashdot old-timers? The company had trouble, and was eventually bought by private equity. I'm too lazy to find out if this is a re-emergence by the rights holders, or if they're going to get sued by the guys who bought Transmeta's IP. IIRC, it was an Israeli company that took it off the US exchange. After that I lost track of it.
