Security Hardware

ARM TrustZone Hacked By Abusing Power Management

"This is brilliant and terrifying in equal measure," writes the Morning Paper. Long-time Slashdot reader phantomfive writes: Many CPUs these days have DVFS (Dynamic Voltage and Frequency Scaling), which allows the CPU's clockspeed and voltage to vary dynamically depending on whether the CPU is idling or not. By turning the voltage up and down with one thread, researchers were able to flip bits in another thread. By flipping bits when the second thread was verifying the TrustZone key, the researchers were granted permission. If number 'A' is a product of two large prime numbers, you can flip a few bits in 'A' to get a number that is a product of many smaller numbers, and more easily factorable.
"As the first work to show the security ramifications of energy management mechanisms," the researchers reported at Usenix, "we urge the community to re-examine these security-oblivious designs."
This discussion has been archived. No new comments can be posted.
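The factoring remark in the summary can be tried out on a toy number. The sketch below is only an illustration of the arithmetic, not the researchers' attack: the two Mersenne primes, the `BOUND` limit, and the helper names `flip_bit`, `small_factors`, and `survey_flips` are all assumptions made up for this example.

```python
from math import prod


def flip_bit(n: int, i: int) -> int:
    """Return n with bit i inverted."""
    return n ^ (1 << i)


def small_factors(n: int, bound: int) -> tuple[list[int], int]:
    """Divide out every prime factor below `bound` by trial division.

    Returns (factors_found, cofactor); a cofactor of 1 means n was
    factored completely using only small primes.
    """
    factors = []
    d = 2
    while d < bound and d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    # If the loop stopped because d*d > n, what is left is 1 or a prime.
    if 1 < n < bound:
        factors.append(n)
        n = 1
    return factors, n


# Toy modulus: two known Mersenne primes stand in for the "large primes".
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
BOUND = 10_000  # hypothetical trial-division limit


def survey_flips(n: int, bound: int) -> dict[int, tuple[list[int], int]]:
    """Map each bit position to the small factors of n with that bit flipped."""
    return {i: small_factors(flip_bit(n, i), bound) for i in range(n.bit_length())}


if __name__ == "__main__":
    print("original:", small_factors(N, BOUND))
    for i, (factors, cofactor) in survey_flips(N, BOUND).items():
        print(f"bit {i:2d}: small factors {factors}, cofactor {cofactor}")
```

Trial division finds nothing below `BOUND` in the original `N`, because both of its prime factors are far larger. Flipping bit 0 alone already makes the number even, so at least one factor falls out immediately.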

  • Every time (Score:5, Funny)

    by DontBeAMoran ( 4843879 ) on Sunday September 24, 2017 @09:32AM (#55254013)

    Every time I hear about security, viruses and hacks, it's done via "opcodes", "registers" and "bits". Isn't it time we design more secure processors without these flaws?

    • It would all be more secure if there were a backdoor engineered into the design so the government could have unfettered access to our data. You know, to make sure it's secure.

    • Normally, at this level, the hack starts to cross the line from the digital to the analog. While most of us coders just worry about 0 and 1, on the processor we are looking at values on either side of a threshold, where wires are so close together that a power change could cause a little static arc that, in theory, can change a bit.
      However, these hacks normally need to be timed perfectly and with intimate knowledge of what is going on at that time. Such a hack would most likely cause a program to fail, or some bad data t

    • by elrous0 ( 869638 ) on Sunday September 24, 2017 @10:30AM (#55254171)

      Better yet, don't name your product with a name that can later become ironic, like "TrustZone." Try naming your product "ShitStorm" or "ClusterFuck" instead. That way, when it gets hacked or turns out to be buggy as hell, you can say "What did you expect? We told you so upfront."

      • by glitch! ( 57276 )

        Better yet, don't name your product with a name that can later become ironic, like "TrustZone." Try naming your product "ShitStorm" or "ClusterFuck" instead.

        "With a name like that, you know it HAS to be good!" Like that old Saturday Night Live skit where they come up with bad names for the jelly. "Fruckers! You know it must be good!" Followed by "Monkey Pus!", "Painful Rectal Itch!", and "Death Camp! Look for the barbed wire on the label!" Then "10,000 Nuns and Orphans! What's wrong with that? They were eaten by rats!"

    • by gtall ( 79522 )

      I think if we could just have pink unicorns, we could ask their magical advice on how to design new processors.

  • Easy fix (Score:4, Insightful)

    by Anonymous Coward on Sunday September 24, 2017 @09:39AM (#55254025)
    Don't allow non operating system code to muck with the system clock. Problem solved. Why would this functionality ever be exposed? This is something that non-OS code should NEVER be able to do.
    • Even if the program isn't given direct access to change the speed and voltage, it can trigger those changes indirectly.
    • Non-OS code doesn't need that capability itself; it can get those actions performed by proxy.

    • by gweihir ( 88907 )

      And then somebody hacks the OS and can compromise the Trust Zone anyways. No, what we need to do is secure the OS, because this is just one more case where anybody that owns the OS owns everything.

    • Problem not solved, because breaking TrustZone means breaking the machine BENEATH the OS level.

  • by Ayano ( 4882157 ) on Sunday September 24, 2017 @09:57AM (#55254063)
    These Goldilocks voltages will vary by small margins, too small to be accurately predicted for an actual attack.

    TFA tries to make the argument that this physical hack can be done remotely despite the highly controlled conditions by relying on the power and energy management utilities...

    Now, I've got news as an embedded developer: that sh*t isn't accurate enough for anything this sensitive.
    • It would probably work for a *very* targeted attack. A specific rev of a specific device running a specific OS.

      Useful for spooks, not much for anyone else. There were a bunch of these kinds of hacks in the NSA leaks - a MITM attack given a specific version of Apache, OpenSSL, and a specific version of a particular web browser.

      • The voltage variations in question are driven by the random defects in the silicon and in the fabrication process, not so much the CPU design or the OS (or even firmware) running on the chip.

        • by Anonymous Coward

          The voltage variations in question are driven by the random defects in the silicon and in the fabrication process, not so much the CPU design or the OS (or even firmware) running on the chip.

          RTFM, idiot:

          Thus any frequency or voltage change initiated by untrusted code inadvertently affects the trusted code execution.

        • by Anonymous Coward

          But it's the same kind of vulnerability where you take advantage of a race condition and multithreading. In software, you set up some handlers to catch the segmentation faults, and whatever. But just keep trying again and again until you get lucky.

      • The claim that you cannot manipulate the keys was made, and clearly that's not the case. The team at Columbia University (Adrian Tang, Simha Sethumadhavan, and Salvatore Stolfo) deserves credit for showing that it was not always the case...

        I wonder how many side attacks the PLA have...

    • by Anne Thwacks ( 531696 ) on Sunday September 24, 2017 @01:00PM (#55254739)
      This is the same as, or very similar to, an Intel bug described about a month ago:

      The issue in both cases is either:

      a) The device can be set to operate under conditions that are known to cause it to be unreliable (be out of spec)

      b) The device fails to operate reliably when operated within spec

      If it is (a) then perhaps the manufacturer should test devices more thoroughly - and then blow fuses to limit operation within spec. If it is (b) the manufacturer should test the devices more thoroughly.

      You may know that (e.g.) Intel sell processors "locked" to prevent overclocking. This prevents (a). It obviously fails to prevent (b). Either the manufacturer chose not to lock the device (or the buyer chose not to buy locked ones) and the user was "free" to use the devices out of spec, or the article describes devices where the tests were inadequate.

      In reality, device performance is not consistent within a batch, and devices are sorted for performance - hence processors with different speed and power options. This has been true since the beginning of TTL. As devices have higher part count (see Moore's law) they have a higher probability of failure - since there are more failure modes, there is a much higher time-to-test. Time to test maps directly to device cost. Because time-to-test adds to cost, semiconductor devices are not tested 100%*: some parameters are, and others are only sampled to ensure that the tests are identifying the duds. The problem here is that the parameters tested by sampling may not be as reliably characterised as they are believed to be. If you assume that (for example) all static ram cells in the chip have essentially the same logic levels and speeds within a certain margin, and that margin has a wider spread between devices under circumstances that have not been identified, then testing some sample registers won't tell you that others are not reliable on chips with this unknown and unidentified problem.
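The sampling argument above can be given rough numbers. This is a minimal sketch under a strong simplifying assumption that real test flows do not satisfy: the tested cells are a uniform random sample, and a fixed number of cells are marginal. The function name `miss_probability` and all the counts are hypothetical.

```python
from math import comb


def miss_probability(total_cells: int, marginal_cells: int, sampled: int) -> float:
    """Chance that a uniform random sample contains none of the marginal cells."""
    if not (0 <= marginal_cells <= total_cells and 0 <= sampled <= total_cells):
        raise ValueError("counts must satisfy 0 <= marginal, sampled <= total")
    # Ways to pick the sample from good cells only, over all possible samples.
    return comb(total_cells - marginal_cells, sampled) / comb(total_cells, sampled)


if __name__ == "__main__":
    # Hypothetical chip: one million cells, ten marginal ones, 1000 sampled.
    print(miss_probability(1_000_000, 10, 1000))
```

With those made-up counts the sample misses every marginal cell about 99% of the time, which is the point being made: a sampled test says little about rare outliers.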

      Complexity does not scale linearly with transistor count - it is partly that, but it also scales with number of modules, module complexity, and number of interfaces between modules (the hardware equivalent of APIs, not API instances). A more complex CPU has more of all three of these factors. Any way you look at it, a more complex chip will be more likely to fail in modes that are hard to identify.

      About 15 years ago, I was part of a team that identified a problem in a CPU of fairly low complexity caused by data leakage between pipeline stages in a processor used in safety critical applications (AFAIK, no one actually died as a result of these failures). These failure modes are very hard to find. This one took about a man-year of very expensive engineers using very expensive equipment.

      I predict that Moore's law will eventually hit the Thwacks Barrier: Processor complexity will reach the stage where a processor cannot be adequately tested within a timescale that makes it worth producing.

      I therefore hereby formally pronounce that testability will be the barrier that ends Moore's law.

      *Some /. users who are old enough to afford lawns may recall the National Semiconductor mil-spec scandal: devices were sold as 100% tested when in fact they were only sampled, because the failure rate was "very low". No aircraft carriers or space rockets were actually lost, but crimes were found to have been committed anyway.

      • This is fascinating. I'm curious if the bulk of the testing techniques are things that could eventually be automated. If AI could bite off some of the burden, perhaps the chips could still be tested in a feasible time frame.
    • You can run any number of tries though, until you manage to change a bit.
      I don't know, but you can probably also use any number of tries of getting a corrupted trustzone key?
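To put a number on "any number of tries": if each attempt is treated as independent with some per-try success chance `p` (an assumption, and the values of `p` below are invented, not taken from the paper), the number of attempts needed for a given confidence follows directly. A minimal sketch:

```python
import math


def success_probability(p: float, tries: int) -> float:
    """Chance that at least one of `tries` independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** tries


def tries_needed(p: float, confidence: float) -> int:
    """Smallest number of independent attempts giving at least `confidence`."""
    if not (0.0 < p < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p and confidence must be strictly between 0 and 1")
    n = max(1, math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p)))
    # Correct for floating-point rounding at the boundary, in either direction.
    while success_probability(p, n) < confidence:
        n += 1
    while n > 1 and success_probability(p, n - 1) >= confidence:
        n -= 1
    return n


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):  # hypothetical per-try success rates
        print(p, tries_needed(p, 0.99))
```

Under these assumptions even a 1% per-try success rate reaches 99% overall after 459 attempts, although each failed attempt may cost a crash or reboot, as other comments point out.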
  • by Anonymous Coward

    If the power management can change the state of the processing engine then the power management methodology is broken. There should be no way to flip bits or change any of the processing state by manipulating the power state. That it is possible shows a serious flaw in the design.

  • by Dwedit ( 232252 ) on Sunday September 24, 2017 @12:01PM (#55254503) Homepage

    This looks just like Rowhammer all over again. Flipping bits by messing with something nearby.

    • It's flipping bits by gaining root access, profiling the system, crashing it many times in the process, then messing with something nearby.

      • It's flipping bits by gaining root access, profiling the system, crashing it many times in the process, then messing with something nearby.

        True, but that doesn't mean it's not bad.

        The whole point of TrustZone and similar technologies is to provide a place for computations that you wish to remain secure even in the event of complete compromise of the main operating system. Note that I'm not claiming that the attack is practical, it may or may not be sufficiently automatable to carry out remotely, on a large number of devices. That's for future research to determine. But it does make me nervous (my main project for the last four years is an An

  • I'm actually more terrified by the notion that activities in one core can cause bits to flip for another, completely randomly. We have a _lot_ of important stuff riding on the correct calculations happening in all those CPUs, worldwide, and the idea that you can pretty trivially cause random results is not a happy one.

  • by Anonymous Coward

    Making things secure is much harder than breaking into things. Given that, this one is an easy fix. The hypervisor controlling security can make sure the security states are stable (across DVFS variations) before granting access. The security software can monitor voltage variations beyond the allowed range and lock down the system or user program (red alert).
    By the way, voltage can be varied from outside without power management commands, bypassing the PM control software. So the best solution is voltage monitors (most chips have thi

  • You need software access to the registers that control the core voltage regulators.
    So you first need to gain root access.
    They changed the DVFS tables to make the SoC run outside its operating range.

    They had to profile the DVFS operating points for the specific device they used to find the right values to use. The profiling causes device reboots or freezes. Not something you can do without being noticed.

    Step 1: probe DVFS tables, profile system to find points where it causes bit flips without rebooting or

  • that you can't have computer security yet? That it is not possible? That what a man can make, a man can take apart?
  • Apps cannot be granted permission to control DVFS, which is necessary to induce the faults, but they can manipulate it because Android responds to the application's load/behavior.

    However, the application has no specific knowledge of the overall system load and therefore it cannot consistently induce faults. The scenario in a lab is probably far, far easier than real life---it eliminates the effect of other apps, network state changes, etc on the power state.

    Very clever proof of concept, but it will take a H
