Windows and Linux Not Well Prepared For Multicore Chips

Mike Chapman points out this InfoWorld article, according to which you shouldn't immediately expect much in the way of performance gains in Windows 7 (or Linux) from the eight-core chips coming from Intel this year. "For systems going beyond quad-core chips, the performance may actually drop. Why? Windows and Linux aren't designed for PCs beyond quad-core chips, and programmers are to blame for that. Developers still write programs for single-core chips and need the tools necessary to break up tasks over multiple cores. Problem? The development tools aren't available and research is only starting."
  • by mysidia ( 191772 ) on Sunday March 22, 2009 @03:39PM (#27290139)

    Multiple virtual machines on the same piece of metal, with a workstation hypervisor, and intelligent balancing of apps between backends.

    Multiple OSes sharing the same cores. Multiple apps running on the different OSes, and working together.

    Which can also be used to provide fault tolerance... if one of the worker apps fails, or even one of the OSes fails, your processing capability is reduced, but a worker app in a different OS takes over; with checkpointing procedures and shared state, the apps don't even lose data.

    You should even be able to shut down a virtual OS for Windows updates without impact, if the apps are designed properly...

  • by mcrbids ( 148650 ) on Sunday March 22, 2009 @03:41PM (#27290175) Journal

    Languages like PHP/Perl, as a rule, are not designed for threading - at ALL. This makes multi-core performance a non-starter. Sure, you can run more INSTANCES of the language with multiple cores, but you can't get any single instance of a script to run any faster than what a single core can do.

    I have, so, so, SOOOO many times wished I could split a PHP script into threads, but it's just not there. The closest you can get is with (heavy, slow, painful) forking and multiprocess communication through sockets or (worse) shared memory.

    Truth be told, there's a whole rash of security issues from race conditions that we'll soon have crawling out of nearly every pore as the development community slowly digests multi-threaded applications (for real!) in the newly commoditized multi-CPU environment.
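The "heavy, slow, painful" fork-and-IPC workaround the parent describes can be sketched as follows. This is Python rather than PHP, purely as an illustration (PHP's pcntl_fork would play the same role), and `fork_worker` is an invented helper, not part of any real API:

```python
# Illustrative sketch of the fork-plus-IPC pattern described above:
# run a task in a forked child process and collect its result over a pipe.
import os
import pickle

def fork_worker(task, arg):
    """Run task(arg) in a forked child; return its result via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                          # child: compute, send, exit
        os.close(read_fd)
        with os.fdopen(write_fd, "wb") as out:
            pickle.dump(task(arg), out)
        os._exit(0)
    os.close(write_fd)                    # parent: read the child's result
    with os.fdopen(read_fd, "rb") as inp:
        result = pickle.load(inp)
    os.waitpid(pid, 0)
    return result

if __name__ == "__main__":
    print(fork_worker(sum, range(100)))   # computed in the child: 4950
```

All the process creation, serialization, and pipe plumbing above is exactly the overhead that makes this approach "heavy" compared to real in-process threads.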

  • BeOS (Score:5, Interesting)

    by Snowblindeye ( 1085701 ) on Sunday March 22, 2009 @03:42PM (#27290191)

    Too bad BeOS died. One of the axioms the developers had was "the machine is a multiprocessor machine", and everything was built to support that.

    Seems like they were 15 years ahead of their time. But, on the other hand, too late to establish another OS in a saturated market. Pity, really.

  • by thrillseeker ( 518224 ) on Sunday March 22, 2009 @03:43PM (#27290209)
    Did you ever follow the Occam language? It had parallelism as an intrinsic part of the language, but it never went anywhere.
  • by phantomfive ( 622387 ) on Sunday March 22, 2009 @03:59PM (#27290423) Journal
    From the article:

    The onus may ultimately lie with developers to bridge the gap between hardware and software to write better parallel programs... They should open up data sheets and study chip architectures to understand how their code can perform better, he said.

    Here's the problem: most programs spend 99% of their time waiting. Most of that is waiting for user input. Part of it is waiting for disk access (as mentioned in the AnandTech story, the best thing you can do to speed up your computer is get a faster hard drive/SSD). A minuscule part of it is spent in the processor. If you don't believe me, pull out a profiler and run it on one of your programs; it will show you where things can be easily sped up.

    Now, given that the performance of most programs is not processor bound, what is there to gain by parallelizing your program? If the performance gain were really that significant, I would already be writing my programs with threads, even with the tools we have now. The fact of the matter is that in most cases there is really no point to writing your program in a parallel manner. This is something a lot of the proponents of Haskell don't seem to understand: even if their program is easily parallelizable, the performance gain is not likely to be noticeable. Speeding up hard drives will make more of a difference to performance in most cases than adding cores.

    I for one am certainly not going to be reading chip data sheets unless there's some real performance benefit to be found. If there's enough benefit, I may even write parts in assembly, I can handle any ugliness. But only if there's a benefit from doing so.

  • Re:The Core? (Score:3, Interesting)

    by ( 1195047 ) <.ten.yargelap. .ta. .sidarap.pilihp.> on Sunday March 22, 2009 @04:00PM (#27290441) Homepage Journal
    Hey, at least we aren't dealing with the lovely world of Cyrix anymore... those were truly fun times with respect to compiler optimizations (or lack thereof, as it turned out). That and the, um, heat "issues."
  • Re:Adapt (Score:5, Interesting)

    by Dolda2000 ( 759023 ) <> on Sunday March 22, 2009 @04:05PM (#27290487) Homepage

    No, it's not about adaptation. The whole approach currently taken is completely, outright on-its-head wrong.

    To begin with, I don't believe the article about the systems being badly prepared. I can't speak for Windows, but I know for sure that Linux is capable of far heavier SMP operation than 4 CPUs.

    But more importantly, many programming tasks simply aren't meaningful to break up into units of such granularity as OS-level threads. Many programs would benefit from being able to run just some small operations (like iterations of a loop) in parallel, but just the synchronization work required to wake even a pool thread to do such a thing would greatly exceed the benefit of it.

    People just think about this the wrong way. Let me re-present the problem for you: CPU manufacturers have been finding it harder to scale the clock frequencies of CPUs higher, and therefore they start adding more functional units to CPUs to do more work per cycle instead. Since the normal OoO parallelization mechanisms don't scale well enough (probably for the same reasons people couldn't get data-flow architectures working at large scales back in the 80's), they add more cores instead.

    The problem this gives rise to, as I stated above, is that the unit of parallelism gained by adding CPUs is too large for the very small units of work that exist to be divided among them. What is needed, I would argue, is a way to parallelize instructions in the instruction set itself. HP's/Intel's EPIC idea (which is now Itanium) wasn't stupid, but it has a hard limitation on how far it scales (currently four instructions simultaneously).

    I don't have a final solution quite yet (though I am working on it as a thought project), but the problem we need to solve is getting a new instruction set which is inherently capable of parallel operation, not adding more cores and pushing the responsibility for multi-threading onto the programmers. This is the kind of thing the compiler could do just fine (even the compilers that exist currently -- GCC's SSA representation of programs, for example, is excellent for these kinds of things), by isolating parts of the code in which there are no dependencies in the data flow, and which could therefore run in parallel; but compilers need support in the instruction set to be able to specify such things.
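The synchronization-overhead argument above is easy to demonstrate: dispatching single loop iterations to pool threads costs far more in coordination than the work itself. A rough sketch, in Python purely for illustration (timings are machine-dependent and only indicative):

```python
# Compare doing tiny units of work inline vs. dispatching each unit to a
# thread pool; per-dispatch synchronization dominates at this granularity.
import time
from concurrent.futures import ThreadPoolExecutor

def inline_sum(n):
    """Run every tiny iteration inline, with no synchronization at all."""
    return sum(i * i for i in range(n))

def pooled_sum(pool, n):
    """Hand each single iteration to a pool thread: worst-case granularity."""
    return sum(pool.map(lambda i: i * i, range(n)))

if __name__ == "__main__":
    n = 10_000
    t0 = time.perf_counter()
    inline_sum(n)
    inline_t = time.perf_counter() - t0

    with ThreadPoolExecutor(max_workers=4) as pool:
        t0 = time.perf_counter()
        pooled_sum(pool, n)
        pooled_t = time.perf_counter() - t0

    # On typical hardware the pooled version is dramatically slower.
    print(f"inline: {inline_t:.6f}s  per-iteration dispatch: {pooled_t:.6f}s")
```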

  • by dansmith01 ( 1128937 ) on Sunday March 22, 2009 @04:24PM (#27290695)
    Perl has excellent support for building threaded applications; see the threads module. I code multi-threaded apps in Perl all the time and they utilize my quad-core very efficiently. In fact, my biggest hassle with multithreading is keeping the CPU cooled! There's also a threads::shared module for handling locks, etc. I'd be hard pressed to imagine better language support for threading. Hardware, operating systems, and a lot of languages support threading. Granted, it isn't always easy/possible/worth it, but as things currently stand, the only bottleneck is programmers who are too lazy to design their algorithms for parallel execution.
  • by Anonymous Coward on Sunday March 22, 2009 @04:29PM (#27290749)

    You need to establish/prove purity to the compiler so it can actually make use of it.

    Lisp, Scala and Erlang don't have that property.

    Haskell does.

    Haskell and other pure languages are where the future of parallelism might lie.

  • Re:BeOS (Score:1, Interesting)

    by Anonymous Coward on Sunday March 22, 2009 @04:35PM (#27290805)
    Haiku and Syllable are pretty much trying to continue the model that BeOS used, where threading is heavily used and threads communicate via high-level message passing.
  • Re:Adapt (Score:5, Interesting)

    by Yaa 101 ( 664725 ) on Sunday March 22, 2009 @04:36PM (#27290817) Journal

    The final solution is for the processor to measure and decide which parts of which program must be run in parallel and which are better off left alone.
    What else do we have computers for?

  • Re:Adapt (Score:5, Interesting)

    by Dolda2000 ( 759023 ) <> on Sunday March 22, 2009 @04:43PM (#27290889) Homepage

    As I mentioned briefly in my post, there was research into dataflow architectures in the 70's and 80's, and it turned out to be exceedingly difficult to do such things efficiently in hardware. It may very well be that they still are the final solution, but until such time as they become viable, I think doing the same thing in the compiler, as I proposed, is more than enough. That's still the computer doing it for you.

  • by Anonymous Coward on Sunday March 22, 2009 @04:58PM (#27291073)

    A computer that is used in an efficient way will at any time either do nothing (and hopefully switch to standby/hibernate after a couple of minutes) or do several things in parallel. While I read Slashdot, my computer is mostly downloading mail, uploading files to a web server, defragging the disk, encoding a video, doing a background backup, etc. Or if it isn't, it can fold proteins. Modern browsers will also soon be multithreaded (some already are), so every tab, plugin, etc. can run on its own core.

    Apps that lack multithreading can also be a blessing: less overhead, and they are restricted to one core, so no matter how badly an app behaves, there will always be a core that isn't affected by the CPU hog, and the machine stays responsive. Responsiveness is much more important than raw computing speed.

  • by FlyingGuy ( 989135 ) <flyingguy@gmai l . c om> on Sunday March 22, 2009 @05:10PM (#27291211)

    it is the answer to the question that no one asked...

    In a real-world application, as others have mentioned, pretty much all of a program's time is spent in an idle loop waiting for something to happen, and in almost all circumstances that something is input from the user in whatever form: mouse, keyboard, etc.

    So let's say it is something like Final Cut. To be sure, when someone kicks off a render, that is an operation that can be spun off on its own thread or its own process, freeing up the main process loop to respond to other things the user might be doing. But that is where the rubber really hits the road: user input. The user could do something that affects the work that was just spun off, either as a separate thread or process on the same core or any number of other cores, so you have to keep track of what the user is doing in the context of things that have been farmed out to other cores/processes/threads.

    Enter the OS. Take your pick, since it really does not matter which OS we are talking about; they all do the same basic things, perhaps differently. How does an OS designer make sure that any of, say, 16 cores (dual 8-core processors) are well and fairly utilized? Should the design dedicate a core to each of the main functions of the OS (say drive access, the comms stack for your protocol of choice, video processing, etc.), or should it just run a scheduler like those of today, which farm out thread processing based on priority? Is there really any priority scheme for multiple cores that could each run hundreds of threads/processes? And what about memory? A truly 64-bit single-core machine can handle a very large amount of memory, and that single core controls and has access to all that RAM at its whim (DMA notwithstanding). But what do you do now that you have 16 cores all wanting to use that memory? Do we create a scheduler to arbitrate access among 16 demanding standalone processors, or do we simply give each core a finite memory space and then have to control the movement of data from each memory space to another, since a single thread (handling the main UI thread for a program) has to be aware of when something is finished on one core and then get access to that memory to present the results, either as data written to a file or written into video memory for display?

    I submit that the current paradigm of SMP is inadequate for these tasks and must be rethought to take advantage of this new hardware. I think a more efficient approach is for each detected core to be fired up with its own monitor stack, as a place to start, so that scheduling is based upon feedback from each core. The monitor program would ensure that the core it is responsible for is optimized for the kind of work that is presented. This concept, while complicated, could be implemented and serve as a basis for further development in this very complex space.

    In the world of "supercomputers" this has been dealt with, but with a very different methodology that I do not think lends itself to general computing. Deep Blue, Crays and the like aren't really relevant here, since those are mostly very custom designs built for a single purpose and optimized for things like chess, weather modeling, or nuclear weapons study, where the problems are already discretely chunked out with a known set of algorithms and processes. General-purpose computing, on the other hand, is like trying to herd cats from the OS point of view, since you never really know what is going to be demanded and how.

    OS designers and user-space software designers need to really break this down and think it all the way through before we get much further, or all this silicon is not going to be used well or efficiently.

  • by hazydave ( 96747 ) on Sunday March 22, 2009 @05:25PM (#27291353)

    Multithreading is a system-level thing, not a language level thing.

    Sure, there have been languages that make threading ubiquitous, but they've never caught on, and it's hardly necessary.

    You'll notice that networking, graphics, and many other programming necessities are not built into C/C++ either. They are higher-level functions, and thousands of programmers have no problem understanding C's role here. People have been writing multithreaded code in C/C++ for decades... I've personally done it from the 80s until now, under a dozen or so OSes.

    Don't use your chosen language as a crutch for sticking to the level of programming practiced when that language debuted. The whole point of C was not to define much of anything in C itself... in truth, the language proper doesn't even do I/O; that's handled via a library. So is threading, so are graphics, etc.
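The point that threading lives in a library rather than the language holds in most mainstream languages. A sketch in Python, whose threading module plays the role pthreads plays for C (`parallel_map` is an invented helper, shown only for illustration):

```python
# Threading as plain library calls: nothing in the syntax below is
# thread-specific, exactly as the comment above says of C.
import threading

def parallel_map(fn, items):
    """Apply fn to each item on its own thread; collect results in order."""
    results = [None] * len(items)

    def worker(idx, item):
        results[idx] = fn(item)

    threads = [threading.Thread(target=worker, args=(i, item))
               for i, item in enumerate(items)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```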

  • by hazydave ( 96747 ) on Sunday March 22, 2009 @05:30PM (#27291411)

    That's incorrect, at least in part. Modern MacOS is based on CMU's Mach, which has had lightweight threading support since long before Apple got into the picture. The OS was completely designed for multiple CPUs, down to the very core.

    If modern MacOS apps are not heavily multithreaded (I have no idea; I don't run proprietary hardware anymore, regardless of the OS), that's the fault of programmers not advancing past the days of MacOS 9... it has nothing whatsoever to do with the OS.

  • by caerwyn ( 38056 ) on Sunday March 22, 2009 @05:31PM (#27291419)

    I don't entirely agree with you here. A lot of current applications *do* suffer from CPU-induced latency after user interactions, and the problem is simple: they don't differentiate between the things that must get done before control is returned to the user, and the things that need to happen in response to the action but can be allowed to happen whenever resources are free. Even when the problem is resource-access latency, multithreading can be a win because that latency no longer contributes to the latency that the user perceives if it happens on a background thread.

    Something as simple as tossing function calls off on a background thread to deal with some of these tasks would do a great deal to improve latency from the user's perspective, and is really quite trivial to implement. Most programmers don't do it, though. Part of that is that in most situations there aren't ready-made solutions: you can't just say "run this function call on a background thread"; you've got to go through the pthread creation process, etc. (Apple's Cocoa framework is actually an exception to this with its NSOperation.)

    The situation is analogous to that of an interrupt task: Do absolutely as little as possible before returning; everything else should happen on some other thread.

    I agree with you regarding optimization, but it's been my experience that many applications *can* benefit from these sorts of simple multithreading techniques- the programmers just don't do them, either from lack of ability or lack of resources.
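A helper of the "run this function call on a background thread" kind wished for above might look like this. This is a sketch in Python; `in_background` and its `on_done` callback are invented names, not any framework's API:

```python
# Minimal fire-and-forget background-call helper: return control to the
# caller immediately, run the work on a worker thread, and hand the
# result to an optional completion callback.
import threading

def in_background(fn, *args, on_done=None):
    """Start fn(*args) on a daemon thread; return the Thread object."""
    def worker():
        result = fn(*args)
        if on_done is not None:
            on_done(result)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t
```

The caller regains control as soon as `start()` returns, which is the latency win described above; the completion callback is where the "whenever resources are free" work lands.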

  • Re:Adapt (Score:3, Interesting)

    by Trepidity ( 597 ) <> on Sunday March 22, 2009 @05:31PM (#27291421)

    The problem is still the efficiency, though. There are lots of ways to mark units of computation as "this could be done separately, but depends on Y" -- OpenMP provides a bunch of them, for example, and there have been proposals dating back to the 80s, probably earlier. The problem is figuring out how to implement that efficiently, so that the synchronization overhead doesn't dominate the parallelization gains. Does the system spawn new threads? Maintain a pool of worker threads and feed thunks to them? Some hybrid approach? How does it determine when it's worth the effort of doing anything for a particular bit of computation versus just doing it inline and saving the overhead? Etc.

    Basically Grand Central is yet another in the decades-long line of proposals for specifying parallelizable computations. What's still an open question is whether they've solved the harder part, a way to, as you say, "[route] computation packets to wherever they can go, and then [receive] the results", without that routing and receiving taking inordinate overhead.
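One common answer to the "worth the effort versus doing it inline" question raised above is a cost threshold: run cheap work inline and hand only expensive work to a worker pool. A sketch in Python; `maybe_parallel` is an invented name and the threshold is an arbitrary placeholder that a real system would measure:

```python
# Hybrid dispatch: inline for cheap thunks, worker pool for expensive ones.
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)
COST_THRESHOLD = 1000  # assumed cutoff; a real system would calibrate this

def maybe_parallel(thunk, estimated_cost):
    """Return a zero-argument callable that yields thunk()'s result.

    Cheap work runs inline (no synchronization overhead at all);
    expensive work is submitted to the shared pool, where the dispatch
    cost is amortized over a large computation.
    """
    if estimated_cost < COST_THRESHOLD:
        result = thunk()          # inline: dispatch overhead would dominate
        return lambda: result
    future = _pool.submit(thunk)  # expensive: pool dispatch pays for itself
    return future.result
```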

  • Re:Adapt (Score:4, Interesting)

    by Dolda2000 ( 759023 ) <> on Sunday March 22, 2009 @05:38PM (#27291481) Homepage

    All of what you say is certainly true, but I would still argue that EPIC's greatest problem is its hard parallelism limit. True, it's not as hard as I made it out to be, since an EPIC instruction bundle has its non-dependence flag, but you cannot, for instance, make an EPIC CPU break off and execute two subroutines in parallel. Its parallelism lies only in a very small spatial window of instructions.

    What I'd like to see, rather, is the CPU implementing a kind of "micro-thread" facility that would allow executing two larger codepaths simultaneously: larger than what EPIC could handle, but quite possibly still far smaller than what would be efficient to distribute over OS-level threads, with all the synchronization and scheduler overhead that would mean.

  • by SpuriousLogic ( 1183411 ) on Sunday March 22, 2009 @06:03PM (#27291743)
    I'm not sure I totally agree that Haskell is the future, although I do think that functional programming right now looks to be the most promising way to deal with multi-core. Scala has some very strong points that could see its adoption beat the others, specifically being able to run on the Java JVM and make use of existing Java libraries. You can use the functional aspects of Scala when you need to, but still use Java where you do not need parallelism.
  • Re:BeOS (Score:3, Interesting)

    by verbatim_verbose ( 411803 ) on Sunday March 22, 2009 @06:45PM (#27292185)

    It may have been an axiom, but really, what did BeOS do (or want to do) that Linux doesn't do now?

    The Linux OS has been scaled to thousands of CPUs. Sure, most applications don't benefit from multi-processors, but that'd be true in BeOS, too.

    I'd honestly like to know if there is some design paradigm that was lost with BeOS that isn't around today.

  • by Anonymous Coward on Sunday March 22, 2009 @07:05PM (#27292387)

    With four 30-minute pieces there will only be three problematic areas where the pieces are supposed to fit together, so this shouldn't cause any major issues.
    The first couple of frames in each block wouldn't be able to use any previous frames as reference frames, which is a drawback (it raises the bitrate for those frames), but if each block is large enough the negative effect should be negligible.

  • Re:Adapt (Score:4, Interesting)

    by AmiMoJo ( 196126 ) <> on Sunday March 22, 2009 @07:42PM (#27292719) Homepage Journal

    So, we can broadly say that there are three areas where we can parallelise.

    First you have the document level. Google Chrome is a good example of this: first we had the concept of multiple documents open in the same program; now we have the concept of a separate thread for each "document" (or tab, in this case). Games are also moving ahead in this area, using separate threads for graphics, AI, sound, physics and so on.

    Then you have the OS level. Say the user clicks to sort a table of data into a new order; the OS can take care of that. It's a standard part of the GUI system, and can be set off as a separate thread. Of course, some intelligence is required here, as it's only worth spawning another thread if the sort is going to take some appreciable amount of time.

    At the bottom you have the algorithm level, which is the hard one. So far this level has got a lot of attention, but the others relatively little. The first two are the low hanging fruit, which is where people should be concentrating.

  • by ( 760528 ) on Sunday March 22, 2009 @08:36PM (#27293133)

    It's really quite frustrating to see posts like this: posts that don't take into account what is needed and focus on what we are incapable of doing, even when they don't need to.

    So let's look at reality for a second. First, most modern OSes scale very, very far past 4 CPUs (not sure what Windows scales to, but Linux certainly has no limitation anywhere near current CPU reality). So the kernels are just dandy for multi-core CPUs; bring it on! 128 cores, we're ready for ya!

    The same is not true at the application level, and that is a fair comment. But don't confuse Linux and Windows with their apps, for crying out loud! From an application point of view we are capable of parallel coding, but it's non-trivial. It's also not something we need a lot of the time.

    For instance, we now buy servers (our cheapest models) with dual quad-core CPUs, and we tend to virtualise them up into several machines with 1 or 2 CPUs each. Whether you do this because you assume the OS will utilise one CPU and the apps will utilise another is, as one person told me, irrelevant. Suffice it to say, having 2 CPUs is usually quite nice.

    But what requires more than that in reality? Well, your desktop might; after all, there's a lot going on at once, right? In some cases that's true (there are quite a number of very heavy applications out there, and surprise, surprise, they can multitask *GASP*).

    Same at the server: not many things require that many CPUs, and even at the application level we've gotten good at spreading heavily loaded applications across multiple servers (we call it load balancing; was that too sarcastic?). Take mail (whether it's Exchange or Postfix or Sendmail or whatever), or web servers, etc. Those server applications that do require heavy grunt tend to already be coded with "parallel" in mind, even across multiple servers (think Oracle RAC).

    As for cache contention: well, it sounds like the hardware makers are finally fessing up to the fact that we have a problem, Houston!

  • Re:Adapt (Score:3, Interesting)

    by giorgist ( 1208992 ) on Sunday March 22, 2009 @09:20PM (#27293499)
    You haven't seen a building go up. You don't place a brick, render it, paint it, hang a picture frame and go to the next one.

    A multi-story building has a myriad of things happening at the same time. If only computers were as parallel.
    If you have 100 or 1000 people working on a building, each is an independent process that shares resources.

    It is simple: 8-core CPUs are a solution that arrived before the problem. A good 10-year-old computer can do most of today's office work.
  • by Tiger4 ( 840741 ) on Sunday March 22, 2009 @09:24PM (#27293523)

    2. I recode for a cluster. Why stop at a multi-core computer? If I can get a 2:1 to 10:1 speed up by writing better code, then why stop at a dual or quad core? The application might require a 100:1 speed up, and that means more computers. If I have a really nasty problem, chances are that 100 cores are required, not just 2 or 8. Multi-core processors are nice, because they reduce cluster size and cost, but a cluster will likely be required.

    I think I agree with you, BUT... don't fall into the old trap: If ten machines can do the job in 1 month, 1 machine can do the job in 10 months. But it doesn't necessarily follow that if one machine can do the job in 10 months, 10 machines can do the job in 1 month.
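The "old trap" above is essentially Amdahl's law: whatever fraction of the job is inherently serial bounds the speedup extra machines can buy. A quick sketch of the arithmetic, in Python for illustration:

```python
# Amdahl's law: with serial fraction s, n processors give a speedup of
# at most 1 / (s + (1 - s) / n), however many machines you add.
def amdahl_speedup(s, n):
    """Maximum speedup for serial fraction s on n processors."""
    return 1.0 / (s + (1.0 - s) / n)

if __name__ == "__main__":
    # Even with only 10% serial work, 100 machines give well under 10x:
    print(round(amdahl_speedup(0.10, 100), 2))   # -> 9.17
```

So ten machines finishing in one month tells you the job parallelized well; one machine taking ten months tells you nothing about its serial fraction, which is why the inverse claim doesn't follow.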

    Also, the problem with runtime interpreters is not that they don't generate assembly code. The problem is that it is harder to get at the underlying code that is really executing. That code could be optimized if you could see it. But seeing it is just more difficult.

  • Re:Adapt (Score:3, Interesting)

    by try_anything ( 880404 ) on Sunday March 22, 2009 @10:03PM (#27293785)

    Short answer: only one thing I mentioned involved disk I/O, RAM is cheap, and application frameworks typically limit the number of jobs being run at one time.

    If there's really a performance need to serialize tasks involving disk I/O, then go ahead and serialize them. Eclipse, the application framework I'm most familiar with, makes this straightforward: just define a scheduling policy that allows only one job to run at a time and apply that policy to all your disk I/O jobs. Other jobs will continue to be scheduled and run according to the default policy or whatever other policy you specify -- might as well get some work done while you're waiting for the I/O to complete.

  • Re:Adapt (Score:3, Interesting)

    by erroneus ( 253617 ) on Sunday March 22, 2009 @10:31PM (#27293957) Homepage

    Multi-core processing is one thing, but access to multiple chunks of memory and peripherals is also keeping computers slow. After playing with machines PXE-booted with NFS roots, I was astounded at how fast those machines performed. Then I realized that the kernel and everything else wasn't being delayed waiting on local hardware for disk I/O.

    It seems to me, when NAS and SAN are used, things perform a bit better. I wonder what would happen if such control and I/O systems were applied into the same box? Smart RAID controllers are a step in that direction, but they are still accessed as SCSI devices. What might the results be if the secondary storage systems were in a server within the box dedicated to optimized disk I/O? The same sort of thing is being done with GPU processing, but I wonder how much more removed the graphics systems could become?

    Devices need to become smarter and faster to really make things perform at their best speed.

  • by coryking ( 104614 ) * on Sunday March 22, 2009 @10:38PM (#27293987) Homepage Journal

    But most computing in the world is done using single-threaded processes which start somewhere and go ahead step by step, without much gain from multiple cores.

    The fact that all we do is sequential tasks on our computer means we are still pretty stupid when it comes to "computing". If you look outside your CPU, you'll see the rest of the computers on this planet are massively parallel and do tons and tons of very complex operations far quicker than the computer running on either one of our desks.

    Most of the computers on the planet are organic ones inside critters of all shapes and sizes. I don't see those guys running around with some context-switching, mega-fast CPU, do you?** All the critters I see are using parallel computers, with each "core" being a rather slow set of neurons.

    Basically, evolution of life on earth seems to suggest that the key to success is going parallel. Perhaps we should take the hint from nature.

    ** unless you count whatever the hell consciousness itself is... "thinking" seems to be single-threaded, but uses a bunch of interrupt hooks triggered by lord knows what running under the hood.

  • Re:Adapt (Score:4, Interesting)

    by david.given ( 6740 ) < minus author> on Sunday March 22, 2009 @10:46PM (#27294043) Homepage Journal

    I expect the future of CPUs will be heterogeneous multicore.

    You may be interested to know that, as far as I can tell from the rather fuzzy documentation, the MSM7201A processor used in the G1 smartphone has at least three dissimilar cores, and potentially five:

    • an ARM11 for the application stack
    • an ARM9 for the radio stack
    • a QDSP4000
    • possibly a QDSP5000, the spec is unclear as to whether you get both this and the QDSP4000
    • a PowerVR 3D accelerator unit, although the spec is again unclear as to whether this is actually in silicon and not just a particular firmware load for the DSP

    I gather that it's pretty hard to make them share address spaces, even the two ARMs; so SMP is probably not feasible. Message-passing via specific shared memory segments is the usual approach.

  • by tftp ( 111690 ) on Monday March 23, 2009 @12:22AM (#27294561) Homepage

    If you look outside your CPU, you'll see the rest of the computers on this planet are massively parallel

    You don't even need to look outside your computer: it has many microcontrollers, each with its own CPU, to do disk I/O, video, audio... even a keyboard has its own microcontroller. This is not far from a mouse being able to think about escape and run at the same time; most mechanical functions in critters are highly automated (a headless chicken is an example). I don't call it multithreading, because these functions are independently operated, just as I don't call a 386 computer dual-core because it has an independent ATA controller or an independent network card. The HDD gets written to and the network data is sent without using the main CPU, but these are independent functions performed by independent hardware. The IBM/360 had that already.

    Some people (very few) have the ability to do two dissimilar tasks at the same time. That would be a perfect analogy. But the rest of us, all critters included, are single-threaded, just as you mentioned yourself. Logically, any single thought can't be easily parallelized, but why couldn't we think two thoughts at the same time? I wonder why that is. This question is, IMO, very important, because a brain should technically be capable of that feat, and nevertheless it doesn't do it. I guess this could be because the brain has (or must have?) only one VM to run our consciousness (our persona) on. Since most thoughts [queries] are executed in the volatile context of the owner's persona [database], it could be that allowing two thoughts at the same time, on two copies of the persona, would result in independent modification of both personas, and how do you merge them back then? And if the brain doesn't copy the full database that defines a person for each trivial thought, then running two or more queries in parallel may produce unpredictable results. (Does a brain have semaphores, mutexes and spinlocks? I doubt it; if you are asked to "hold that thought" it takes considerable effort to separate and memorize the context, and often we fail.)

  • Re:Adapt (Score:2, Interesting)

    by KingMotley ( 944240 ) on Monday March 23, 2009 @01:17AM (#27294841) Journal

    I guess that would be highly dependent on your particular field. First, .NET has functional languages like F# and M. I also find that you dismiss the importance of profiling code in .NET simply because it doesn't generate machine code. I'm at a loss as to why you would think it is any less important. Determining the areas that are being stressed the hardest and deserve more of your attention is completely unrelated to whether the code generates ASM, ML, or IL.

    You say .NET has "minimal support for it", but I suspect that says more about your understanding of that support. BackgroundWorker is the easy way to run highly independent routines in many common scenarios. If that isn't enough or doesn't fit your need, then thread pools make highly parallelizable code a snap to implement. Example:

                    Dim eventhandles As New List(Of EventWaitHandle)
                    For Each site As String In sites
                            Dim ewh As New EventWaitHandle(False, EventResetMode.ManualReset)
                            eventhandles.Add(ewh)
                            Dim param As New ThreadData(ewh, "http://" & site)
                            Threading.ThreadPool.QueueUserWorkItem(AddressOf DoDownload, param)
                    Next

    Now you are free to write the "DoDownload" routine, which could download some data, validate it, and do some processing on it. With no further changes, the above code works well on anything from a single-processor machine to one with 64 or more cores (I haven't tested beyond 64). If more control is needed, you can set the number of worker threads that execute concurrently based on the number of processors in the machine with a single call, or you can implement your own thread pool, or create your own implementation by overriding specific functions of it. I also left the event-handle code in the example, although it isn't needed here. It shows exactly how easy it is to create and use some more advanced thread-synchronization primitives in .NET.

    Lastly, you could also spawn your own threads if you need/want even more control. It's incredibly easy. Example:
    Dim t1 As New Thread(AddressOf DoDownload)
    t1.Start(param)

    Not hard stuff, really. Of course, if you want to get into larger scale-outs, you may want to look into the Azure set of .NET features, which is supposedly designed specifically for large-scale cloud computing (I myself have no experience in that area).

  • by coryking ( 104614 ) * on Monday March 23, 2009 @01:55AM (#27294981) Homepage Journal

    Logically thinking, any single thought can't be easily parallelized, but why couldn't we think two thoughts at the same time?

    Yes, but there is increasing evidence (don't ask me to cite :-) that many of our thoughts are something that some background process has been "thinking about" long (i.e. seconds or minutes) before our actual conscious self does. There are many examples of this in Malcolm Gladwell's "Blink", though I don't feel much like citing them. Part of that book, I think, basically says that we should really trust the underlying parallel part of our brain and "go with our gut" more often than western society feels comfortable doing.

    Basically, yeah, our train of thought is single-threaded, but that doesn't mean it isn't just a byproduct of lower-level processes that have figured stuff out long before "we" become aware of it.

  • Re:This is incorrect (Score:3, Interesting)

    by dkf ( 304284 ) on Monday March 23, 2009 @08:33AM (#27296591) Homepage

    It's better to have specifically declared shared memory with inherently limited access. At the very least, analysis could catch unlocked accesses to known-shared memory.

    You're better off going to a message-passing model; they're theoretically much more tractable (there are several schemes that have had decades of work done and even spawned programming languages) and they scale up to multi-machine computing (e.g. cluster-scale) much more easily.

    Shared memory parallelism is just plain nasty. Occasionally useful, but always nasty. Use with care and good taste.
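    To make the contrast concrete, here is a minimal message-passing sketch in Go: workers only communicate over channels and never touch shared mutable state, so no locks are needed, and the same shape scales out to a cluster by swapping the channels for a network transport. (The squaring job is just a placeholder computation.)

```go
package main

import "fmt"

// worker receives jobs on one channel and sends results on another.
// No shared memory, no locks: all coordination is in the messages.
func worker(jobs <-chan int, results chan<- int) {
	for j := range jobs {
		results <- j * j // any pure computation on the message
	}
}

func main() {
	jobs := make(chan int)
	results := make(chan int)

	// A pool of four workers competing for messages.
	for w := 0; w < 4; w++ {
		go worker(jobs, results)
	}

	// A producer feeds the jobs, then closes the channel so workers exit.
	go func() {
		for i := 1; i <= 8; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	// Collect all eight results, in whatever order they arrive.
	sum := 0
	for i := 0; i < 8; i++ {
		sum += <-results
	}
	fmt.Println(sum) // prints 204 (the sum of 1²..8²)
}
```

    Note that the result-collection loop doesn't care which worker answered, or in what order; that indifference to ordering is exactly what makes the model tractable to reason about.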

  • Re:Adapt (Score:3, Interesting)

    by mr_mischief ( 456295 ) on Monday March 23, 2009 @11:15AM (#27298585) Journal

    So let the office workers keep the two-core machines. I'll take the 8-core machine since I'm not doing just word processing and spreadsheets.

    BTW, complex spreadsheets are actually an ideal application to break into parallel execution if there aren't too many dependencies among the functions. A slower but more power-efficient multi-core processor could update all the cells in many spreadsheets just as fast as a faster single-core one.
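    As a rough sketch (in Go, and assuming the cells are fully independent; real spreadsheets would first have to group dependent cells into topological levels and recalculate level by level), parallel recalculation is just a fan-out over the cells:

```go
package main

import (
	"fmt"
	"sync"
)

// recalc applies a cell formula f to every cell concurrently and
// waits for all of them. Assumes no cell reads another cell's result.
func recalc(cells []float64, f func(float64) float64) []float64 {
	out := make([]float64, len(cells))
	var wg sync.WaitGroup
	for i, v := range cells {
		wg.Add(1)
		go func(i int, v float64) { // each cell is an independent task
			defer wg.Done()
			out[i] = f(v)
		}(i, v)
	}
	wg.Wait() // the sheet is consistent only after every cell finishes
	return out
}

func main() {
	cells := []float64{1, 2, 3, 4}
	doubled := recalc(cells, func(v float64) float64 { return v * 2 })
	fmt.Println(doubled) // prints [2 4 6 8]
}
```

    On a machine with fewer cores than cells, the scheduler simply multiplexes the tasks, which is why the slower-but-wider chip can keep pace on this kind of workload.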

In seeking the unattainable, simplicity only gets in the way. -- Epigrams in Programming, ACM SIGPLAN Sept. 1982