AMD Hardware

Smarter Thread Scheduling Improves AMD Bulldozer Performance

Posted by Soulskill
from the almost-up-to-par dept.
crookedvulture writes "The initial reviews of the first Bulldozer-based FX processors have revealed the chips to be notably slower than their Intel counterparts. Part of the reason is the module-based nature of AMD's new architecture, which requires more intelligent thread scheduling to extract optimum performance. This article takes a closer look at how tweaking Windows 7's thread scheduling can improve Bulldozer's performance by 10-20%. As with Intel's Hyper-Threading tech, Bulldozer performs better when resource sharing is kept to a minimum and workloads are spread across multiple modules rather than the multiple cores within them."
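To make the summary's point concrete, here is a minimal sketch of module-aware placement. It assumes a hypothetical four-module, eight-core chip on which cores 2k and 2k+1 share a module; that numbering and the function name `spread_across_modules` are illustrative assumptions, not taken from the article or from the Windows scheduler. Actually pinning threads would need an OS-specific affinity call, which is left out here.

```python
def spread_across_modules(n_threads, n_modules=4, cores_per_module=2):
    """Pick a core id for each thread, using one core per module first.

    Assumption (hypothetical layout): core ids are numbered so that
    cores 2k and 2k+1 sit in the same module.
    """
    order = []
    # First pass takes slot 0 of every module, the next pass takes slot 1, etc.
    for slot in range(cores_per_module):
        for module in range(n_modules):
            order.append(module * cores_per_module + slot)
    # Wrap around if there are more threads than cores.
    return [order[i % len(order)] for i in range(n_threads)]


if __name__ == "__main__":
    print(spread_across_modules(4))
    print(spread_across_modules(6))
```

With four threads, every thread lands on its own module; sharing within a module only starts with the fifth thread.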

  • Re:So basically... (Score:4, Interesting)

    by fuzzyfuzzyfungus (1223518) on Friday October 28, 2011 @01:05PM (#37871478) Journal

    So basically they suck. I shouldn't need to tweak my os thread scheduler just so a cpu can suck less. AMD needs to fix their shit instead of lame excuses.

    I've got some very bad news for you: While I have no particular knowledge of, or interest in, today's architecture pissing match, the days when the OS was allowed to ignore architectural details and expect things to just work optimally are good and over (if they ever existed in the first place).

    Dynamic processor clocks? Why should I have to deal with some performance governor shit when Intel can just make a CPU that either uses almost no power at 3GHz or runs like a bat out of hell at 800MHz? Oh, because they actually can't. Sorry. Multiple cores? WTF? Why do they expect me to program in parallel for 2 3GHz cores instead of just giving me a 6GHz core? Oh, because they actually can't. Sorry. NUMA? Memory access times already blow! Now you want to make them unpredictable? Well, we can either repeal the speed of light and restrict every system to a single memory controller or deal with nonuniform access times and cry into our 128GB of RAM... The list just goes on. Hyperthreading can provide anything from less than zero improvement, if it increases contention for resources that were already being fully used, to fairly substantial improvement, if the CPU was being starved at times under a single thread. Now the Bulldozer cores have implemented something between full multi-core (with 100% duplication of resources per core) and hyperthreading (with virtually zero additional resources for the HT 'core'). Shockingly, performance depends on whether the two semi-independent cores are stepping on one another's shared toes or not...

    Even if, in this specific instance, AMD happens to have fucked up and made the wrong architectural choice, that doesn't change the fact that you can't escape architectural oddities unless you are willing to stay quite far from the forefront of performance, or deal with some sort of hardware/firmware abstraction layer that ends up being at least as complex as the OS-level hackery would have been, but more likely to be vendor specific and have its cost spread across far fewer units. It certainly isn't the case that all architectural deviations are good: some are ghastly hacks best forgotten, and some are perfectly OK ideas dragged down by products that overall aren't much good. But the path of progress has been liberally sprinkled with oddities that have to be accounted for somewhere in the overall stack.
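The "shared toes" point in the comment above can be illustrated with a toy model. This is only a sketch: the `sharing_penalty` value is a made-up parameter, not a measurement, and the layout (cores 2k and 2k+1 in one module) and the function name `estimated_throughput` are assumptions for illustration.

```python
from collections import Counter


def estimated_throughput(placement, cores_per_module=2, sharing_penalty=0.2):
    """Toy estimate of total throughput for a list of busy core ids.

    A thread alone in its module counts as 1.0. Threads that share a
    module each count as 1.0 - sharing_penalty. The penalty is a
    hypothetical number chosen for illustration only.
    """
    # Count how many busy threads fall into each module.
    per_module = Counter(core // cores_per_module for core in placement)
    return sum(
        n * (1.0 if n == 1 else 1.0 - sharing_penalty)
        for n in per_module.values()
    )


if __name__ == "__main__":
    packed = [0, 1, 2, 3]   # four threads squeezed into two modules
    spread = [0, 2, 4, 6]   # four threads, one per module
    print("packed:", estimated_throughput(packed))
    print("spread:", estimated_throughput(spread))
```

Under this model the spread placement always scores at least as well as the packed one, and the gap depends entirely on the penalty you assume.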

  • by unity100 (970058) on Friday October 28, 2011 @01:10PM (#37871612) Homepage Journal
    applications like Photoshop CS5 or TrueCrypt, plus some more:

    http://www.overclock.net/amd-cpus/1141562-practical-bulldozer-apps.html [overclock.net]

    also, if you set your cpuid to genuineintel in some of the benchmark programs, you will get surprising results:

    try changing cpuid=genuineintel for +47% INCREASE IN SCORES.

    changing cpuid to GenuineIntel nets 47.4% increase in performance:
    http://www.osnews.com/story/22683/Intel_Forced_to_Remove_quot_Cripple_AMD_quot_Function_from_Compiler_

    PCMark/Futuremark rigged bentmark to favor intel:
    http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=135382#p139712 http://arstechnica.com/hardware/reviews/2008/07/atom-nano-review.ars/6

    intel cheating at 3DMark vantage via driver: http://techreport.com/articles.x/17732/2

    relying on bentmarks to "measure performance" is a fool's errand. don't go there.

  • Re:So basically... (Score:3, Interesting)

    by Anonymous Coward on Friday October 28, 2011 @01:19PM (#37871728)

    You did when it was initially launched. Windows 2000's scheduler does not cope well with hyperthreading /at all/ by default. You saw similar things when dual core CPUs were launched. Now hyper threading and multicore are standard and OSs are aware of these cases.

    It's already been pointed out that windows 8's scheduler is bulldozer aware and performs much better than windows 7. I would not be surprised to see a patch from Microsoft that specifically addresses scheduler performance improvements for bulldozer CPUs. We've seen similar things in the past.

    By the way, I'm seeing this unusual phrase "Esoteric Tweaking" showing up a lot out of nowhere. It smells of astroturf. Could Intel be afraid?

    Could it be that the Bulldozer architecture, with its uneven FPU-to-integer-core ratio, is the key to significant future scaling above and beyond what 1:1 can offer?

  • by Kjella (173770) on Friday October 28, 2011 @01:38PM (#37871960) Homepage

    Well, it doesn't seem to apply when you get up to supercomputing levels at least. I checked the TOP500 list [top500.org] and it's 76% Intel, 13% AMD. As for Bulldozer, it has serious performance/watt issues even though the performance/price ratio isn't all that bad for a server. On the desktop, Intel hasn't even bothered to make a response except to quietly add a 2700K to their pricing table, with the 2600K left untouched. On the business side (where, after all, margins fund future R&D), Sandy Bridge's 216 mm² die is much smaller than Bulldozer's 315 mm². Intel can produce almost 50% more chips in the same die area, and in practice the yields probably favor Intel even more because the risk of critical defects goes up with size. Honestly, I don't think Intel has felt less challenged since the AMD K5 days...
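The "almost 50% more" figure in the comment above follows from the two die sizes it quotes. A quick check, as a simplification that ignores wafer edge loss and yield (the function name `die_area_ratio` is just for illustration):

```python
def die_area_ratio(larger_mm2, smaller_mm2):
    """How many of the smaller dies fit in the area of one larger die.

    Simplification: pure area ratio, ignoring wafer edge loss and yield.
    """
    return larger_mm2 / smaller_mm2


if __name__ == "__main__":
    # Die sizes as quoted in the comment: Bulldozer 315 mm^2, Sandy Bridge 216 mm^2.
    ratio = die_area_ratio(315, 216)
    print(f"ratio: {ratio:.3f}")
    print(f"extra dies per unit area: {(ratio - 1) * 100:.1f}%")
```

The ratio comes out to about 1.46, so roughly 46% more Sandy Bridge dies per unit area before yield is considered.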

  • Re:So basically... (Score:4, Interesting)

    by Kjella (173770) on Friday October 28, 2011 @01:54PM (#37872148) Homepage

    There shouldn't need to be any OS level tweaks because Windows already knows how to schedule for hyper-threading optimally. If BD reported its true core count properly then no OS level changes would be needed.

    Except that hyperthreading quite obviously has one fast thread and one slow thread filling the gaps. In AMD's solution both cores in a module are equal, but they share some resources. To use a car analogy, the Intel solution is a one-lane road with pullouts, where the hyperthread sneaks from one pullout to the next while there's no traffic, while the AMD solution is a two-lane road with one-lane chokepoints. Both sorta allow cars to travel simultaneously, but I don't think the optimization would be the same.
