Not All Cores Are Created Equal 183

joabj writes "Virginia Tech researchers have found that the performance of programs running on multicore processors can vary from server to server, and even from core to core. Factors such as which core handles interrupts, or which cache holds the needed data, can change from run to run. Such resources tend to be allocated arbitrarily now. As a result, program execution times can vary by up to 10 percent. The good news is that the VT researchers are working on a library that will recognize inefficient behavior and rearrange things in a more timely fashion." Here is the paper, Asymmetric Interactions in Symmetric Multicore Systems: Analysis, Enhancements and Evaluation (PDF).
This discussion has been archived. No new comments can be posted.

  • by Eto_Demerzel79 ( 1011949 ) on Monday December 22, 2008 @10:03PM (#26207755)
    ...programs not designed for multi-core systems don't use them efficiently.
  • Re:unsurprising. (Score:3, Insightful)

    by ElectricTurtle ( 1171201 ) on Monday December 22, 2008 @10:16PM (#26207841)
    Mod parent to 5, seriously, it's so true. There have been more than a few times, after working support for a decade, when I've had to say 'that should be impossible', yet the symptom nonetheless exists.
  • Linux and Windows (Score:4, Insightful)

    by WarJolt ( 990309 ) on Monday December 22, 2008 @10:29PM (#26207923)

    I don't know whether Linux or Windows has an automatic mechanism to schedule task priority based on processor caches, but the study didn't even mention Windows. Seeing that scheduling and cache management are OS problems, this seems kind of important.

    The other thing that seems odd is that they were using a 2.6.18 kernel, and in 2.6.23 the Completely Fair Scheduler was added, which could potentially change their results. It doesn't seem logical to base a cutting-edge study on stuff that was released years ago.

  • Re:This isn't news (Score:5, Insightful)

    by Clover_Kicker ( 20761 ) on Monday December 22, 2008 @10:58PM (#26208095)

    80s mainframe tech is NEW and EXCITING to a depressing number of tech people, look at how excited everyone got when someone remembered and re-implemented virtualization.

  • not a surprise (Score:5, Insightful)

    by Eil ( 82413 ) on Monday December 22, 2008 @11:00PM (#26208119) Homepage Journal

    Here's an exercise: Take 2 brand-new systems with identical configurations and start them at the same time doing some job that takes a few hours and utilizes most of the hardware to some significant degree. Say, compiling some huge piece of code like KDE or OpenOffice. System administrators who do exactly this will tell you that you'll almost never see the two machines complete the job at precisely the same time. Even though the CPU, memory, hard drive, motherboard, and everything else is the same, the system as a whole is so complex that minute differences in timing somewhere compound into larger ones. Sometimes you can even reboot them and repeat the experiment and the results will have reversed. It shouldn't come as a surprise that adding more complexity (in the form of processor cores) would enhance the effect.

  • Well known problem (Score:4, Insightful)

    by sjames ( 1099 ) on Tuesday December 23, 2008 @12:01AM (#26208411) Homepage Journal

    The problem is a complex one. Every possible scheduling decision has pluses and minuses. For example, keeping a process on the same core for each timeslice maximizes cache hits, but can lose if it means the process has to wait TOO long for its next slice. Likewise, if a process must wait for something, should it yield to another process or busy wait? Should interrupts be balanced over CPUs, or should one CPU handle them all?

    A lot of work has gone into those questions in the Linux scheduler. For all of that, the scheduler only knows so much about a given app, and if it takes TOO long to 'think' about it, it negates the benefits of a better decision.

    For special cases where you're quite sure you know more than the scheduler about your app, you can use the isolcpus kernel parameter to reserve CPUs to run only the apps you explicitly assign to them.

    You can also decide which CPU any given IRQ can be handled by (but not which core within a CPU, as far as I know) with /proc/irq/*/smp_affinity.

    Unless your system is dedicated to a single application and you understand it quite well, the most likely result of screwing with all of that is overall loss of performance.

  • by timeOday ( 582209 ) on Tuesday December 23, 2008 @12:07AM (#26208439)
    No, the programs are not the problem. The programmer should not have to worry about manually assigning processes to cores or switching a process from one core to another - in fact, there's no way the programmer could do that, since it would require knowing what the system load is, what other programs are running, and physical details (such as cache behavior) of processors not even invented yet. This is all the job of the OS.
  • by Anonymous Coward on Tuesday December 23, 2008 @12:22AM (#26208531)

    Summary (I didn't RTFA) says that the performance of a program can vary depending on which core it is executing on. No mention of multi-threading or using multiple cores at once. The article is not about programs using cores efficiently; it is about the unpredictability of and differences between seemingly identical cores, and how the OS can detect and correct those problems.

  • Re:unsurprising. (Score:2, Insightful)

    by paulgrant ( 592593 ) on Tuesday December 23, 2008 @03:27AM (#26209451)

    Damn it, get one!
    At least a name, for christ's sake!

  • Re:unsurprising. (Score:4, Insightful)

    by sowth ( 748135 ) on Tuesday December 23, 2008 @04:38AM (#26209653) Journal

    They probably put in the if(1) lines because they were testing various aspects of the program, or maybe some like to turn off various aspects of the program, but don't want to be arsed to write the proper code to select options. I commonly do that in POVray (3d raytracing) scripts when testing, so I don't have to wait for long renders--fog, radiosity, lots of light and such take orders of magnitude more time.

    As for the AI adding crap, it is probably more trying random code than truly thinking about how the code should work. This leads to the useful code intertwined with lots of crap code. Unfortunately, there are programmers who write like this too... (cue funny mod)

    As for the code not working on other FPGAs, maybe the researcher should not use real chips to check the iterations. Better would be a simulated chip which conforms to the spec exactly and which, where quirks and such are encountered, dies or sends a signal back to the AI program. Testing after the fact on real chips, to verify the AI didn't exploit bugs in the simulator, would be the more proper procedure.

    Maybe I have too much of a background in theory, but I am not completely sure why the FPGAs would be so different. Is it race conditions? Or is the FPGA being used in some analog way? Or does the circuit depend on the exact timing of some input, so that the speed / capacitance of each component makes a huge difference? Or was the poster talking about FPGAs with different specs?

    Crazy things happen when you enter the real world. I remember back when I was in electronics assembly. One would first assume all the solder would wick onto the metal, but the boards would always have tonnes of solder bridges, and we had to carefully examine every component and correct them. Friggin' microprocessors had countless tiny legs too!

  • Reporter bias (Score:2, Insightful)

    by symbolset ( 646467 ) on Tuesday December 23, 2008 @04:49AM (#26209683) Journal

    Often, an issue presents that isn't reproducible in the presence of a tech support person who knows what he's doing.

    Sometimes it's a user error they don't want to admit, and so they won't reproduce it in front of somebody who knows they should not have done that.

    Sometimes it's just a glitch. Regardless, the best thing to do is smile and say "The bug must be afraid of me" and close the ticket.

  • Re:Close (Score:1, Insightful)

    by Anonymous Coward on Tuesday December 23, 2008 @08:14AM (#26210413)

    You have to think about it too much.
        Console.Write("{0} is cool, but in parallel", thing);
        # serious business goes here

    The problem isn't the parallel loop in itself, it is the secondary effects the loop has. And you cannot think about those effects too much. In your example, even ignoring the serious business, what does Console.Write() do? Does it write to a buffer -> is that buffer thread-safe? Does it do port I/O without locking -> forget about threading. Does it do port I/O with locking -> then all single-threaded applications incur unnecessary overhead. If it writes to a buffer, does it build the entire string first and flush it atomically, or are the items fed element-wise?

    In regular imperative languages (C, Basic and their descendants), there exist almost no "pure" loops without side effects. This means that the theoretical performance gain from going multi-threaded is outweighed by the complexity of isolating orthogonal processes. Add to this the fact that memory-sharing between threads leads to considerable delays, because all cores must synchronize their caches, and you have the reasons why most programs are not (yet) parallel.

    It isn't because the language constructs for parallel programming are ugly (Java's semantic approach to threading is quite nice IMHO), it is because imperative languages are sequential by definition. If you want easy parallelism, then don't use an imperative language.

  • Re:unsurprising. (Score:1, Insightful)

    by Anonymous Coward on Tuesday December 23, 2008 @09:45AM (#26210837)

    Yet the probability of your random number being generated is the EXACT SAME as the probability of his random number being generated.

  • Re:unsurprising. (Score:2, Insightful)

    by Ant P. ( 974313 ) on Tuesday December 23, 2008 @03:54PM (#26215281) Homepage

    If overclocking is the cause of so many of these problems, why hasn't Intel or AMD got a mechanism to tell the OS that the hardware's being run out of spec? The blame for these crashes should be directed where it belongs - with the -funroll-loops ricers.
