AMD Hardware Hacking Upgrades Hardware

Hidden Cores On Phenom CPUs Can Be Unlocked 251

Posted by CmdrTaco
from the what-could-possibly-go-wrong dept.
An anonymous reader writes "One of the major ways a semiconductor manufacturer manages to make the most of its chips is through binning. Chips able to cope with high clock speeds with all cores running end up as premium product lines, while others end up as models rated at lower speed grades, or with fewer cores. In the case of AMD's Phenom CPUs, dual and triple core models are quad cores with some disabled, while some newer quad core CPUs are actually six core models with two disabled. To this end both ASUS and MSI have announced that they have modified versions of AMD 890FX- and 890GX-based motherboards to unlock these hidden cores. Much like overclocking, there is no guarantee that you will gain anything by unlocking the hidden cores — everything depends on just why your CPU ended up in a certain product line."
This discussion has been archived. No new comments can be posted.

  • already done, wtf? (Score:3, Informative)

    by drinkypoo (153816) <martin.espinoza@gmail.com> on Tuesday April 20, 2010 @12:08PM (#31911630) Homepage Journal

    Various boards have already permitted unlocking the cores. I'm looking right now for the proper BIOS to do it with my Gigabyte GA-MA770T-UD3P [pcper.com], with which many people have reported success (see thread).

  • Engineering margin (Score:2, Informative)

    by Anonymous Coward on Tuesday April 20, 2010 @12:16PM (#31911748)

    It could also be that one of the disabled cores was disabled because of safety margins: it might not be 100% reliable under all circumstances (using the official spec's voltage and being able to operate in a wide range of temperatures, including some constructor-branded machines which place priority on silence rather than on temperature, and some badly hacked-together beige boxes with a lousy PSU and thermal management). Thus they got disabled to avoid a barrage of recalls from Dell or from small shops (machines which could easily reach 65-70C under load).

    But the same core, given a modest increase in voltage and a very aggressive cooling solution, like water cooling or oil immersion cooling (the kind with which the CPU never rises above 35-38C even when running BOINC 24/7 in the background), could still function reliably.

    Just like overclocking: it won't work across the full spec (operating range) but might work in the specific controlled environment of an enthusiast.

    Of course, if the core was disabled because it's fried, no matter how much liquid nitrogen you pour on it, it won't work.

  • by mea37 (1201159) on Tuesday April 20, 2010 @12:17PM (#31911754)

    Yeah, maybe. Then again, GP has a point and you're being an asshat.

    TFS makes a comparison to overclocking. It points out that there is no guarantee of a benefit - but doesn't point out that there is a risk. In the case of overclocking, the risk is that you will overheat a chip that was rated at a particular clock speed for good reason. Of course you can combat this risk by improving the cooling system. You can combat the risk because you know exactly what the risk is.

    Now in the case of "hidden cores", what's the risk? Do you even know? Do you know what kind of flaw would lead them to legitimately disable a core? Is that one core unable to tolerate the same clock speed as the others? Is it functionally broken such that it will return incorrect results for some operations? How would you tell the difference between that and a chip that was perfectly fine but sold in a degraded state to balance out supply and demand?

    You could shell out for a special motherboard just to test your chip, and if no flaw in the normally-disabled core causes any damage to the rest of the chip (or do you have some basis on which to rule that possibility out?) you at least won't lose anything. Or could the defect be intermittent, such that your tests might miss it?

    And if your computer is for hobbying and you enjoy working with a potentially-unstable system, good for you. A lot of people think that's a fine trade-off for what they're going to do with their systems. None of which invalidates GP's question - which is "what exactly might a disabled-by-default core do if you turn it on when it really was disabled for a reason?"

  • by electrosoccertux (874415) on Tuesday April 20, 2010 @12:25PM (#31911858)

    You're showing a complete lack of understanding as to how processors are rated and sold. AMD determines they need to meet a certain quota for each model of CPU. If it works out and all of the CPUs in their 1 million unit run work flawlessly, they will maximize their profit by disabling cores on some of them and selling them for less money, to serve that market without flooding it with their top performing part.

    True, but there's also a good possibility that your part wasn't binned to fulfill an order. Chips go through a severe set of stress tests that often exceed what will be encountered in practical use. During these tests, it may be revealed that a core doesn't function properly or well enough (it gives bad results) to qualify. All chips go through that, and that's why there are many redundant structures on a chip (to improve yields). (Sony PS3 has 7 SPUs when they build 8 on a chip, Xbox360's got 3 PowerPC cores even though it has 4, Intel disables cache lines and/or functional units, etc. etc. etc.)

    So the question is, are those cores disabled because AMD had extra parts and an outstanding order they could fulfill? Or are there actually potential issues that may only be revealed under certain loads? For the most part, it just means a game crashes a bit more often than usual (since mission critical servers never do weird things like this - the money saved isn't worth the potential for extra downtime), or maybe a file gets corrupted. Or worse, your disk gets corrupted.

    Plus, AMD's historically been supply-bound and unable to fulfill demand for their product, so there's a potential that instead of getting a binned part, it's actually one that failed their test patterns.

    And yes, you see the same behavior with flash chips - NAND flash traditionally ships with bad blocks, and the majority of those can probably be erased and used quite safely (having accidentally destroyed the bad block information before due to buggy software...), but you never can tell why it was marked bad in the first place.

    I bought a Ph2 720BE and unlocked it to a quad. Stress tested with 12 hours of Prime95, no failures. When the core is bad, you usually can't even boot into Windows; never have I heard of one that could withstand gaming for more than 5 seconds. If something in it is broken, you know it.

    So I paid $120 back when the Ph2 965 cost $240, and unlocked and overclocked the 720BE I bought to a quad at 3.5ghz. 4 cores for the price of 3. Love it.

  • by Cowclops (630818) on Tuesday April 20, 2010 @12:28PM (#31911892)

    Eh... there's really no such risk with regular overclocking. The biggest threat to your CPU is increasing the voltage - which would strictly be overvolting, not overclocking. If you turn up your clock speed high enough that it "could" cause damage at load... odds are you've turned it up so high that it won't make it all the way through bootup. And the solution to that is simply to revert to stock speed, or split the difference between stock and the speed it won't run at until you find a working speed. The chance of permanent damage to a CPU without changing the core voltage is essentially zilch.

    The big difference between overclocking and unlocking hidden cores is that you can make small incremental overclock adjustments, say from 2.6ghz to 3ghz or 3.2ghz or 3.5ghz or whatever, until you find that it's unstable, and just back off a bit. You can't incrementally unlock one core: it's unlocked or it isn't. And if it was disabled due to being flawed, it should stay that way or else your computer is just gonna blue screen right in the middle of some important work/gaming session.
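
    A minimal sketch of the step-up-and-back-off routine described above. The `is_stable` callback is a hypothetical stand-in for whatever stress run you trust, and the 3400 MHz limit of the simulated chip is a made-up number, not a measurement.

```python
def find_highest_stable_clock(start_mhz, max_mhz, step_mhz, is_stable):
    """Raise the clock in small steps and return the last speed that passed.

    is_stable is a hypothetical callback standing in for a real stress run.
    Returns None if even the starting speed fails.
    """
    best = None
    clock = start_mhz
    while clock <= max_mhz:
        if not is_stable(clock):
            break  # first failure: keep the last speed that passed
        best = clock
        clock += step_mhz
    return best


# Simulated chip that happens to be stable up to 3400 MHz (made-up number).
def simulated_check(mhz):
    return mhz <= 3400


print(find_highest_stable_clock(2600, 4000, 100, simulated_check))
```

    There is no equivalent loop for a hidden core: the only "settings" are enabled and disabled, so there is nothing to back off to short of turning the core off again.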

  • by UnknowingFool (672806) on Tuesday April 20, 2010 @12:30PM (#31911936)
    The problem is you don't know if a particular core was disabled for legitimate flaws or for marketing. From AMD's standpoint, they probably don't want to disable cores unless they have no other choice and really need to fill orders, because they can sell the fully functional chip for a lot more money.
  • by xiando (770382) on Tuesday April 20, 2010 @12:45PM (#31912186) Homepage Journal
    My "AMD Phenom(tm) II X3 720 Processor" does not work with the fourth core enabled. This is to be expected: an X3 is sometimes sold as that because the fourth core is just broken, and sometimes it just has a working fourth core that was disabled.
  • by Anonymous Coward on Tuesday April 20, 2010 @12:49PM (#31912244)

    I'm not sure how this is news... I've unlocked 7 or 8 AMD cores over the past couple years, as well as having a couple that wouldn't.

    Anyway, there are some of both scenarios - slightly damaged CPUs and order-filling CPUs being sold. You can visit any one of at least a dozen forums to see if the model / serial / day-of-the-week of your CPU is generally unlockable.

    BTW, ASUS and MSI are far from the only boards with ACC. I personally prefer the Gigabyte MA785x lines.

    Note: I'm neither a teenager nor terribly poor, just exceedingly frugal.

  • by garyrich (30652) on Tuesday April 20, 2010 @01:04PM (#31912448) Homepage Journal

    "The manufacturer will *always* bin the partially flawed parts as their low end units first."

    True, but the aftermarket CPU is not the low end, not at any price point. You would put the real X2 and X3 chips in the low end consumer boxes, where the mobo doesn't support unlocking and the consumer doesn't know/care. You sell the perfectly good ones to newegg, fry's, etc. Happy geeks that unlock cores or overclock successfully are more likely to recommend to others and buy next time. AMD and Intel understand this very well.

    Why do you think AMD has a "black label" line in the first place?

  • No, not so much (Score:5, Informative)

    by Sycraft-fu (314770) on Tuesday April 20, 2010 @01:11PM (#31912528)

    If a core is just flat out non-functional then yes, you are right, a system wouldn't boot. However that it works mostly doesn't mean there isn't a problem. There could be a single instruction that has a flaw, so everything is fine unless that instruction gets executed but when that happens you get a crash or worse, data corruption.

    If you think Prime95 is an accurate test, you are kidding yourself. Prime95 tests the FPU mainly, and is good for heat testing. It is not a full CPU test. So maybe the FPU works great, but one of the other units doesn't.

    So no, you don't know that nothing is broken. You assume nothing is broken. Maybe that's fine, however then no bitching if you get data corruption or the like because there was a problem that you didn't know about.

  • by marcansoft (727665) <hector@@@marcansoft...com> on Tuesday April 20, 2010 @01:18PM (#31912634) Homepage

    memtest86 is a diagnostic test for RAM. Prime95 isn't a diagnostic test for anything. Both are reasonable CPU burn-in tests, but they don't test all (or even most) features of the CPU. I'm not even aware of memtest86 using more than one core. Sure, if you run them for a while you can be reasonably sure that the critical parts of a core are working properly, but there's a very real possibility that its problem is a more obscure one that only shows under certain circumstances. For example, some specific app might corrupt data, while everything else works fine.

    In order to properly test a CPU core, you at least need a full suite of tests for that architecture, including OS/kernel-level tests, and even those are likely to miss things particular to the specific manufacturer's implementation of the architecture.

  • by Anonymous Coward on Tuesday April 20, 2010 @01:21PM (#31912682)
    I basically had the same thing happen to me. A few months ago I upgraded to an X3 720 for $120 on Newegg. After testing that it works, I tried unlocking the 4th core. It was Prime95 stable and I tried all the games/apps and they all worked. Of course I got a newer processor, after all the problems AMD had were gone, so it was more than likely market demand that made them disable the core (I hear the X3 720 is fairly popular). The only setback is that unlocking disables the temp. monitoring, so all cores now read 0C. But that isn't bad for a Black Edition quad core for $120.
  • by pyrr (1170465) on Tuesday April 20, 2010 @01:31PM (#31912876)
    This story requires illustrations and a publishing company.
  • by marcansoft (727665) <hector@@@marcansoft...com> on Tuesday April 20, 2010 @02:21PM (#31913614) Homepage

    You don't think the diagnostic puts any sort of stress test on anything other than the memory?

    The diagnostic doesn't put any sort of uniform stress on anything other than memory. Ever wondered why it does a ton of passes on a ton of different modes with a ton of patterns on RAM? That's testing for as many possible RAM failure modes as it can. No attempt is made to test the CPU. You're stressing some parts of the CPU, but you're neglecting the vast majority (e.g. floating point and SIMD).

    Really? You don't think a test that is notorious for pushing the CPU to high load and high temperatures is a diagnostic for anything?

    If anything, it might be a diagnostic for your cooling system. Sure, it helps ensure that nothing is blatantly wrong with the CPU, and it does a better job at testing the CPU than memtest86, but it isn't even remotely a comprehensive test of CPU functionality.

    This isn't overclocking we're talking about here. When you overclock, you stress the entire CPU more as a whole. When tests like memtest86 and Prime95 start failing, you know that your CPU is definitely unstable. Then you back off and you hope the untested parts of the CPU will do OK with whatever safety margin you gave it.

    When you enable a core, it might have some broken parts, or it might not. Those parts can be flaky, or they can be borked, period. Unless you run software that has a chance of testing those parts, you will never find out. E.g. if the hardware for a specific floating point instruction is borked, memtest86 will be useless, and Prime95 will be useless unless it happens to use that specific instruction. If the transistor in charge of forbidding kernel memory access from user mode is borked, you won't find out until an unstable application takes down your entire system by scribbling all over the kernel.

    Unless you are suggesting that there are absolutely no diagnostic tests that are available to consumers to test stuff like this

    I am absolutely sure there is no test that will match what Intel and AMD do - because they know exactly how their CPUs work and what to test for. I do know that you can do a whole lot better than memtest86 or Prime95. I haven't checked whether someone actually has attempted to produce a comprehensive architecture test of this sort.

    Your mistake is attempting to extrapolate from tools used for testing overclocking (which typically results in overall instability) as a means to test for disabled and possibly subtly broken hardware. Any failures from a defective core are likely to show up only with workloads that exercise the defective bits, and the rest of the CPU will work fine.
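
    To make that last point concrete, here is a toy model, not a real CPU test. The four "instructions", the defect in `xor`, and the operand list are all invented for illustration; the only point is that a test which never exercises the broken part cannot fail.

```python
import operator

# Toy model of a core: each "instruction" name maps to a function.
GOOD_CORE = {
    "add": operator.add,
    "mul": operator.mul,
    "sub": operator.sub,
    "xor": operator.xor,
}

# Same core with one hypothetical defect: xor is always off by one.
FLAWED_CORE = dict(GOOD_CORE, xor=lambda a, b: (a ^ b) + 1)


def run_test(core, instructions, operands):
    """Return True if every listed instruction matches the reference core."""
    for name in instructions:
        for a, b in operands:
            if core[name](a, b) != GOOD_CORE[name](a, b):
                return False
    return True


OPERANDS = [(2, 3), (10, 7), (255, 4)]

# A "burn-in" that only exercises add and mul never touches the defect.
narrow = run_test(FLAWED_CORE, ["add", "mul"], OPERANDS)
# A test that covers every instruction in the toy core does.
full = run_test(FLAWED_CORE, list(GOOD_CORE), OPERANDS)
print(narrow, full)
```

    The narrow test passes on the flawed core and the full one fails, which is the gap between "Prime95 stable" and "nothing is broken".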

  • by Ancient_Hacker (751168) on Tuesday April 20, 2010 @02:34PM (#31913772)

    There are bazillions of combinatorial tests that your average stresser program does not do, and cannot foresee that it needs doing.

    There's a whole lot more than the basic instruction set that needs to be tested for.

    For instance all the superscalar stuff -- pipeline loading, serializing, register interlocks, register renaming, stack register lookahead, jump prediction, cache prediction, cache-snooping, cross-core interlocks-- all things that require a certain complex SET of carefully primed and timed instructions to test.

    Not to mention the extra MMX, SSE, SSE2, SSE3, and later instructions.

    Your basic CPU heater program is not going to test for these, at least not intentionally, and not often.
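
    A rough back-of-the-envelope sketch of why "bazillions" is fair. The figure of 200 instructions is a made-up round number, and real interactions also depend on operands, timing and machine state, so this only counts ordered instruction sequences.

```python
def sequence_count(num_instructions, length):
    """Number of distinct ordered instruction sequences of a given length."""
    return num_instructions ** length


# 200 is an assumed, round instruction count chosen for illustration.
for length in (1, 2, 3, 4):
    print(length, sequence_count(200, length))
```

    Even at four instructions deep the count is already in the billions, before operands or cross-core effects enter the picture.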
     

  • Re:No, not so much (Score:4, Informative)

    by IorDMUX (870522) <mark.zimmerman3NO@SPAMgmail.com> on Tuesday April 20, 2010 @02:53PM (#31913988) Homepage

    If the production run says we're only going to have 3 out of 4 possible cores (with cache), they're not going to bother testing the fourth core (and its cache) if the first three test successfully. Worse, if they're calling for 2 cores out of 4, they test and get two good ones and DO NOT TEST the third and fourth core.

    You don't seem to realize how the economics of this really work out. Nobody will set up a production run beforehand and say "this line only needs to produce 3 usable cores". Nobody will do this because no fabrication process has 100% yield... in fact, most cutting-edge runs have far less.

    Let's say your fabrication process produces 1M chips per run, and we have the capacity to do 2 runs at once. You set up a '2-core' run, and find that 95% (950k) of the results have two working cores. Well, that seems great, now you can sell this 95% at your bargain bin price. However, your '4-core' run may have had a yield of only 40% (400k)... now, you have to spend more time and money producing more 4-cores to meet demand (let's say... 750k each), while you are selling perfectly good 4-cores at a 2-core price.

    Instead, all of the chips will be fabricated and tested at a 4-core 'level'. If 1 core fails, put it in the 3-core bin. If 2 cores fail, put it in the 2-core bin. If your yield was better than you expected, then you can bump some 4-cores down to the lower bins to meet demand. If your yield was poor, you are drawing from a much larger sample of chips (2M, so 40% yield --> 800k 4-cores), so you don't have to produce more to fill the demand!
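
    The arithmetic above, as a small sketch. The run size, the 40% yield and the 750k demand are the example figures from this comment; the function names and the "reject" bin for chips with fewer than two good cores are assumptions added for illustration.

```python
def bin_for(working_cores):
    """Map the number of cores that passed testing to a product bin."""
    if working_cores >= 4:
        return "4-core"
    if working_cores == 3:
        return "3-core"
    if working_cores == 2:
        return "2-core"
    return "reject"  # assumed: the comment does not cover this case


def four_core_supply(total_chips, four_core_yield_pct):
    """Chips with all four cores working, given a yield in whole percent."""
    return total_chips * four_core_yield_pct // 100


CHIPS_PER_RUN = 1_000_000
DEMAND = 750_000

# Separate runs: only one of the two runs is even tested as a 4-core part.
separate = four_core_supply(CHIPS_PER_RUN, 40)
# Everything built and tested as a 4-core: both runs feed the top bin.
pooled = four_core_supply(2 * CHIPS_PER_RUN, 40)

print(separate, separate >= DEMAND)
print(pooled, pooled >= DEMAND)
```

    With separate runs the 4-core supply falls short of demand; with pooled testing it covers demand, and any surplus can be bumped down to the lower bins.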

  • by nabsltd (1313397) on Tuesday April 20, 2010 @03:04PM (#31914106)

    Granted, it does directly or indirectly stress the fpu, cache, maybe task switching and interrupt handling. However, there are many more things that can go wrong.

    Off the top of my head, I can think of a lot of things that specifically need tested that one program probably won't do. For example, you need to verify both 32-bit and 64-bit operations. Prime95 is specifically compiled for one or the other, so would stress less of the "other" version.

    There are also a lot of SIMD [wikipedia.org] instructions that need tested. Some are obscure enough that only a few apps would use them.

    Then, there's all the instructions that support virtualization. I have found that bad hardware running a hypervisor will fail much more frequently than if it is running a "normal" OS (YMMV).

    But, unlike Memtest86+ [memtest.org] for RAM, there doesn't appear to be any program that specifically tests all CPU subsystems (registers, cache, instruction execution, etc.).

  • by derrida (918536) on Tuesday April 20, 2010 @03:42PM (#31914542) Homepage

    To be fair, I don't know of a better way to test, and I'd love to see a discussion of better utilities. If I tried this I'd probably run mprime and keep an eye out for MCEs in the system logs, but don't delude yourself into thinking that core is error free because you ran prime95.

    There are quite a few tools, mainly found in the overclocking communities. OCCT [guru3d.com], Linx [xtremesystems.org] and Intel Overburn [intel.com] just to name some.

  • by TheRaven64 (641858) on Tuesday April 20, 2010 @03:43PM (#31914548) Journal

    Prime95 does not execute every instruction

    Even if it did, a lot of CPU errata in the past have related to interactions between instructions. One story the Intel guys tell is of a particular condition flag being set accidentally on 486 CPUs after a sequence of other instructions. Apparently, game developers discovered this and started using it for optimisation. The first Pentiums, when they were run in simulation, crashed these games, so the final silicon had to do the same (wrong) thing or Intel would get the blame for breaking everyone's games.

    The test suites that CPU manufacturers use are exhaustive and cover a lot of combinations of instructions. You may be able to run 90% of your programs without any issue on a flawed core, only to have one particular program crash strangely. Irritating if it's a userspace program, disastrous if it's your operating system's block device driver...

  • by yoyhed (651244) on Tuesday April 20, 2010 @04:05PM (#31914808)
    Exactly, especially if you follow the advice of the article (or any article on the subject, as this has been known for quite some time before Slashdot picked it up) and choose newer versions (steppings) of the processor. It makes sense that over time AMD would get better yields as they improve the manufacturing process - but they still have many different market segments to fill. Choose a newer chip and you have a lot better chance at having a fully-functional one.

    However, nowadays the monetary benefit is somewhat diminished - you can pick up AMD's top of the line Phenom II X4 965 for $180, the X4 925 for $140, the X3 720 for $100, or an X2 550 for $80. Hell, the entire Athlon II line (just Phenom IIs without L3 cache) ranges from $65 to $105. Is the difference really worth potentially days of tinkering and testing? Depends on how much you like tinkering. :-)
