Science Grid Genesis 166
Cranial Dome writes "According to this Cnet.com story, the Department of Energy (DOE) is working to interconnect the first two computers that will form the genesis of the DOE Science Grid, a virtual supercomputing system which will eventually encompass many more systems at several locations. The larger of the two machines is the DOE National Energy Research Scientific Computing Center's (NERSC) IBM SP RS/6000, a distributed-memory machine with 2,944 compute processors. This machine, together with a smaller 160-processor Intel system, will make up a combined 3,328-processor Unix system with 1.3 petabytes(!) of storage space. And this is only the beginning..."
Re:Dammit... (going OT, deal with it) (Score:1, Offtopic)
Do you realize just how many submissions these poor folks get every hour??
You may submit one... and just maybe it was your submission that stood out from the hordes of trolls and piqued their interest enough to look into the story.
So what if someone else's submission with similar verbiage was chosen over yours? It made it onto the site. Good enough for me.
As long as they aren't practicing nepotism in the submission-choosing process (which they probably are), it won't bother me.
A suggestion: write in your journal, post your rants and leave comments enabled. I am sure there are other
You never know, maybe the person whose story was picked is a subscriber.
oh well. </soapbox>
(Moderation totals: Insightful: 1, Troll: 2, Offtopic: 3)
Could you imagine... (Score:1)
CYIaBCoX... SUYA?!!! (Score:1, Funny)
(I think Stanislaw Lem wrote about that, IIRC, the story was "The First Sally, or The Trap of Gargantius".)
Re:CYIaBCoX... OW?!!! (Score:1, Offtopic)
1.3 petabytes? (Score:3, Funny)
Or... (Score:2)
Re:1.3 petabytes? (Score:2, Funny)
Ok, I don't know the size of the Internet. I'm just guessing...
Re:1.3 petabytes? (Score:2)
Re:1.3 petabytes? (Score:1)
Re:1.3 petabytes? (Score:1)
Used to be under conf.modules,
and then modules.conf
5 years from now (Score:3, Funny)
Re:5 years from now (Score:2)
Re:5 years from now (Score:3, Funny)
It's nice to see... (Score:1)
Re:It's nice to see... (Score:1)
An interesting next step might be to have a "Science Grid@home" program that people can run as a screen saver on their PCs, or something. Not for all projects, but a little extra programming might be justified for all those unused CPU cycles.
Re:It's nice to see... (Score:1)
Here's an idea: make a skeleton of an animated cartoon, and then let the actual ray tracing be done by a network of machines. Ray tracing lends itself nicely to this sort of thing; you can even have a team of computers work on a single frame simultaneously. It's not useful, but it sure as hell would be more fun than calculating Mersenne primes and processing SETI data.
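Something like this back-of-the-napkin sketch is all it takes (purely illustrative: made-up frame sizes, a local process pool standing in for the network of machines, and no actual ray tracer):

```python
# Toy illustration of why ray tracing parallelizes so nicely:
# every tile of every frame can be rendered independently,
# so you just hand tiles out to whatever machines are idle.
from multiprocessing import Pool

WIDTH, HEIGHT, TILE = 640, 480, 64   # made-up numbers

def render_tile(job):
    """Stand-in for a real ray tracer: returns a flat gray tile."""
    frame, x0, y0 = job
    w = min(TILE, WIDTH - x0)
    h = min(TILE, HEIGHT - y0)
    return (frame, x0, y0, [[128] * w for _ in range(h)])

def tile_jobs(num_frames):
    for frame in range(num_frames):
        for y0 in range(0, HEIGHT, TILE):
            for x0 in range(0, WIDTH, TILE):
                yield (frame, x0, y0)

if __name__ == "__main__":
    # A pool of local processes stands in for the "network of machines";
    # a real setup would ship the jobs over the wire instead.
    with Pool(processes=4) as pool:
        tiles = pool.map(render_tile, tile_jobs(num_frames=2))
    print(f"rendered {len(tiles)} tiles")
```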
1.3 petabytes (Score:1, Interesting)
A use for this type of power and storage is simulating nuclear detonations. It's possible we no longer have to actually detonate nukes for testing.
Re:1.3 petabytes (Score:1)
Everybody would start simulations, and the one with the worst damage loses...
Rrr... sorry
Re:1.3 petabytes (Score:1)
The long-running war was damaging their artwork and architecture, so instead they ran a simulation, and whenever a strike was successful, the computer said how many died, and then the government rounded up that many people and killed them (in some non-gory, painless ST kind of way).
Please God, take me back to the good old days with Bones using a remote control to make Spock walk because his brain had been stolen, or Kirk doing horse impressions while being ridden by a midget. Oh, yeah.
Re:1.3 petabytes (Score:1)
Maybe I was wrong about the war run by computers being TNG, but I was not seriously arguing that TOS was some sort of perfect masterpiece
Re:1.3 petabytes (Score:1)
hehehe, I can see it now, some world leader halfway around the world simulates a nuclear explosion...
"no fucking way you hit me with that last nuke, I was right behind you. This guy's using a fuckin bot!"
bah, it's payday, I'm in a weird mood
Re:1.3 petabytes (Score:1)
Re:1.3 petabytes (Score:1)
(Slightly OT) 1.3 Petabytes? (Score:3, Informative)
Using current prices, this amounts to roughly €150,000. Storing your entire life on a single computer isn't that impossible anymore, and these guys show that such a thing can be built.
Guess it's finally time to answer the question (Score:2)
Looks like interesting times for AI researchers. Does AI require as many transistors as the brain has neurons? Does it require the same amount of storage and information? Is there something else needed? Looks like we're soon to answer at least one of these.
Re:Guess it's finally time to answer the question (Score:1)
Re:(Slightly OT) 1.3 Petabytes? (Score:1)
Okay, so would that be €150,000 or €150 and 000 cents? I'm guessing you meant the former, but that's still completely "impossible" for most everybody.
Re:(Slightly OT) 1.3 Petabytes? (Score:2)
Note that the cost actually gets spread out over your whole life. If you started now, you'd need only about $5,000 a year to buy the necessary hard drives. And considering how fast prices per megabyte have been dropping, the original estimate of $120,000 is likely an upper bound, and the real cost would be a lot lower.
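A rough back-of-the-envelope along those lines; the starting price per gigabyte and the rate of decline below are guesses, not real market data, so plug in whatever numbers you believe:

```python
# Spread the purchase of 1.3 PB of disk over 50 years, assuming the
# price per gigabyte keeps halving every couple of years.  All the
# numbers here are guesses for illustration, not real market data.
TOTAL_GB = 1.3e6          # 1.3 petabytes, in (decimal) gigabytes
YEARS = 50
START_PRICE_PER_GB = 2.0  # dollars per GB today (assumed)
HALVING_TIME = 2.0        # years for the price per GB to halve (assumed)

total_cost = 0.0
for year in range(YEARS):
    price = START_PRICE_PER_GB * 0.5 ** (year / HALVING_TIME)
    total_cost += (TOTAL_GB / YEARS) * price

print(f"total spent over {YEARS} years: ${total_cost:,.0f}")
print(f"vs. buying it all up front:     ${TOTAL_GB * START_PRICE_PER_GB:,.0f}")
```

With those guesses the 50-year total lands in the low six figures, even though buying it all today would cost millions.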
Re:(Slightly OT) 1.3 Petabytes? (Score:1)
Suppose my sensory input is on the order of
100 KB/sec (that's pretty conservative, since the visual input alone is probably about that: say 10 frames/sec at 10 KB per frame).
If I live to 50 I will have been awake for roughly 10^9 seconds. Thus my total informational inflow is around 10^11 KB = 10^14 bytes, i.e. on the order of 100 terabytes.
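For anyone who wants to fiddle with the assumptions (same guesses as above: 100 KB/s while awake, 16 waking hours a day, 50 years):

```python
# Back-of-the-envelope for lifetime sensory input, using the same
# assumptions as above: 100 KB/s while awake, 16 waking hours a day,
# for 50 years.
KB_PER_SEC = 100
WAKING_HOURS_PER_DAY = 16
YEARS = 50

awake_seconds = YEARS * 365 * WAKING_HOURS_PER_DAY * 3600
total_bytes = KB_PER_SEC * 1000 * awake_seconds

print(f"awake for roughly {awake_seconds:.1e} seconds")
print(f"total inflow roughly {total_bytes / 1e12:.0f} terabytes")
```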
Look I have a paper right there.
Just mix in some inane ramblings about digital immortality and voila! Too bad I don't work at Microsoft.
Hmmm, This and the PS3 (Score:4, Insightful)
Seriously though, what type of security is the DOE building into this thing, which is essentially one large mainframe? It's understandable to be worried when the DOE handles things like nuclear secrets that sometimes slip into the hands of certain researchers as if they were picking them up at a drive-through.
I'm curious to see how the data will be encrypted/decrypted across such a vast system.
Re:Hmmm, This and the PS3 (Score:1)
Re:Hmmm, This and the PS3 (Score:1)
Re:Hmmm, This and the PS3 (Score:1)
However, the Globus [globus.org] toolkit was built on the GSSAPI, which would allow it to run on anything you want to write an interface to.
Re:Hmmm, This and the PS3 (Score:2)
Re:Hmmm, This and the PS3 (Score:1)
It is being sponsored by DARPA [darpa.mil].
Cool stuff. They built it with security in mind and it runs under:
Linux [kernel.org], OpenBSD [openbsd.org], FreeBSD [freebsd.org] and Solaris [sun.com].
Re:Hmmm, This and the PS3 (Score:1)
Also, anything the DOE would do on this network would be unclassified, and completely non-export controlled. Classified work is done on internal networks separated from the internet by an air gap.
...sounds familiar.... (Score:1)
I suppose it wouldn't have the same reach, since it isn't grounded in scientists and universities the way the original was. Wishful thinking, I suppose.
Whoo! (Score:3, Interesting)
It started off with something like four nodes. Look where it is today.
Re:Whoo! (Score:2)
Re:Whoo! (Score:2)
:)
1.3 Petabytes!! (Score:1)
Re:1.3 Petabytes!! (Score:1)
That's about enough to hold two and a half millennia of MP3's, or the Microsoft DirectX SDK.
Odd math (Score:1)
I just wonder... these must be those first-generation Pentiums with the faulty math, anyway.
Re:Odd math (Score:1)
SETI (Score:2, Funny)
Re:SETI (Score:1)
Re:SETI (Score:1)
And that's just for Bill's salary accounting... (Score:1, Troll)
Windows NT 2k2 (laugh thee not, M$ doth speak of such a beast)...
Oh, and don't forget that wonderful
-TG, more power = faster virii production, woohoo!
PS: In all seriousness (ack, there goes my Funny)... It would be cool if we could put this bad boy to work on some nasty stuff, like Superstring Theory, Proteins, and other Monstrously Huge Data Crunching Projects. But somehow I get the feeling this is going to be a toy for atom-smashers... never something practical or real-world.
Re:And that's just for Bill's salary accounting... (Score:2)
For Windows- 2008
Everyone else- 12,234
*schwing!*
Re:And that's just for Bill's salary accounting... (Score:2)
Another point is that Windows has only been released on a handful of architectures. To build systems like this, you need support for ungodly amounts of memory, and the best platform for Windows at this point is x86, which is limited without more hacks than are worth the time and money.
Even with Windows NT on Alpha, Windows didn't come close to tapping the full potential of the architecture. Back when Windows NT was the core product for MS servers, MS had a different agenda. Now that the Itaniums are coming, it's a good bet MS may want to try their hand at this market... but I don't think they'll get far.
Code cracking becomes boring (Score:1)
yeah but... (Score:1, Funny)
Fat lot of good all that super computing is going to do you if your frame rate sucks. You'll be fragged in minutes.
Misfit
Hm (Score:1)
The scheme of it all (Score:5, Informative)
Here, for the lazy, are some of the objectives:
Thus, the applications are enormous. Not that you couldn't do it distributed across desktops à la SETI, but here we're talking data integrity, and let's not forget that even SETI has a kick-ass centralised server setup or the whole thing wouldn't work anyway.
But especially interesting is the document filename:
DOE_Science_Grid_Collaboratory_Pilot_Proposal_03_14.nobudget.pdf
Now, who can get me the version WITH the budget? I want it. Hehe.
Brain Mapping? (Score:1)
Re:Brain Mapping? (Score:2)
Re:Brain Mapping? (Score:1)
Re:The scheme of it all (Score:2)
Anyone who has ever used a parallel machine quickly realizes that in most "interesting" problems, a great deal of inter-processor communication is involved. Even apparently "trivially" parallelizable tasks, such as a CG ray-tracing of a shot from a movie scene, often carry bottlenecks which limit their degree of parallelization. For instance, in the ray-tracing case, even though each ray can indeed be traced independently of the rest, each processor must store the 3D volumetric model it is rendering in memory. Eventually the size of the volumetric model exceeds the memory capacity of the processor, and rays must then be swapped among processors. The same limitations apply to any number of other tasks -- data mining (where one needs to search for correlations in a huge volume of data, too large to be stored on a single processor), simulation (where hyperbolic, or even more bandwidth consumptive, parabolic or elliptic PDEs are often solved), etc...
Achieving good load balance in parallel applications is a key challenge in computational science today. It's quite fair to say that on the current generation of IBM SP2s, which are the most common architecture in high-end computing, the parallel performance of most applications is poor at best. Slapping on an additional machine, with an even tighter bottleneck over the network between them, is not going to magically solve any problems. It will push the state of the art of a very LIMITED set of applications a bit further, but a lot more work at the hardware and algorithmic levels needs to be done before MOST applications can really benefit from machines of this scale.
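To put some completely made-up numbers on that memory wall (a toy voxel model and a 1 GB/node machine, chosen only for illustration):

```python
# Completely made-up numbers, just to illustrate the point: once the
# volumetric model no longer fits on one node, you stop replicating it
# and start shipping rays/data between processors instead.
NODE_MEMORY_GB = 1.0          # per-processor memory (assumed)
BYTES_PER_VOXEL = 4

for side in (256, 512, 1024, 2048):
    model_gb = side ** 3 * BYTES_PER_VOXEL / 2 ** 30
    if model_gb <= NODE_MEMORY_GB:
        print(f"{side}^3 model: {model_gb:6.2f} GB -> replicate on every node, rays stay local")
    else:
        nodes = -(-model_gb // NODE_MEMORY_GB)   # ceiling division
        print(f"{side}^3 model: {model_gb:6.2f} GB -> must split across >= {nodes:.0f} nodes, "
              f"rays get swapped over the network")
```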
Bob
Connectivity (Score:2)
Re:Connectivity (Score:1)
Get Linda Hamilton on the phone... (Score:2)
Quick, someone tell Linda Hamilton to head for the mountains! Her unborn child will be the only one to stop all of this madness!
Re:Get Linda Hamilton on the phone... (Score:2)
*grin*
.
.
.
*duck*
All gov boxes should be connected to this... (Score:1)
1) could reduce the future (taxpayer) costs for "supercomputer grade" applications.
2) could be applied to help solve socio-economic problems in addition to the 'hard' sciences
3) would get "bang for the taxpayer's buck" by utilizing the idle horsepower of publicly purchased computers
I do think, however, they should employ a commercially available distributed computing platform, such as that from www.ud.com
I don't feel that tax dollars need be spent on duplicate research in that area.
-
UK Perspective (Score:1)
fear the future! (Score:1)
sigh... oh well, i guess Evangelion is getting a little closer though.
Re:fear the future! (Score:1)
Like creating a time machine, going back to the Bible's time, and walking around telling everyone I'm Jesus. Because of that, there really was a Jesus...me!
Then my brain starts to hurt from all of this time-travel paradox thought, and I think about something else
Re:fear the future! (Score:1)
I Would Be Really Amused... (Score:2)
...if they used it to run a climate simulation and discovered that the Science Grid was responsible for global warming.
(insert your comments about how hot Company X's chips run below)
Re:I Would Be Really Amused... (Score:2)
*puts on fire protection*
A little more information (Score:3, Informative)
It's a little surprising that this got posted at all, since it's not exactly earth-shattering news, but I'll provide some additional information about grids in general.
There are a wide variety of systems like this that are either currently available or being developed. Among them are the Particle Physics Data Grid [ppdg.net], NEESGrid [neesgrid.org], and various European [eurogrid.org] and Asian [apgrid.org] counterparts.
The basic premise is to give you access to resources you don't have at your desktop. This is not to be confused with putting all these computers together, forking a process a billion times, and having it run all over the globe. It's more like saying: I have a process that requires 128 processors and 4 GB of RAM; go find somewhere to run it for me.
Most of these systems use Globus [globus.org], which is pretty much the de facto standard. There are other systems out there, such as Legion [virginia.edu] and Condor [wisc.edu], which serve slightly different purposes.
I've also seen some issues raised about security, so I'll address them quickly. Globus is built upon an API called the GSSAPI (Generic Security Services API); I believe it will soon (if not already) have an RFC published. This is a layer on top of various other security systems that may be local to the server running it. It can use Kerberos or PKI to do encryption across the network (don't flame me if that's wrong, I'm no security expert).
When I wish to start using the grid, I start up my proxy that takes care of all authentication for me. Then my proxy connects to the gatekeeper on the remote machine which authenticates me based on my private key and then authorizes me via a mapping (usually just a text file). The task is then executed by the gatekeeper via the mapping on the remote machine. Input and output can be redirected over a secure layer if you so desire.
My certificate is issued by an authority, in this case the Globus CA. The nice thing is that if you want to set up a grid of your own computers, you can get a cert from them too. Install Globus and it will tell you how.
Certificates also control access to data. This allows me, as user A, to run program B at site C, providing results to user D at site E, for a period of time F.
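If you're wondering what that "mapping (usually just a text file)" amounts to, here's a toy sketch of the idea. The DNs and accounts are invented, and the format is only an approximation of a real grid-mapfile; the actual gatekeeper code is far more involved.

```python
# Toy sketch of gatekeeper-style authorization: map the certificate's
# distinguished name (DN) to a local account and run the job as that
# user.  The entries below are made up; real Globus does much more.
from typing import Optional

GRIDMAP = {
    # "certificate subject (DN)": "local unix account"
    "/O=Grid/OU=Example/CN=Alice Researcher": "alice",
    "/O=Grid/OU=Example/CN=Bob Postdoc": "bsmith",
}

def authorize(dn: str) -> Optional[str]:
    """Return the local account for an authenticated DN, or None."""
    return GRIDMAP.get(dn)

def gatekeeper(dn: str, command: str) -> None:
    account = authorize(dn)
    if account is None:
        print(f"DENIED: {dn} is not in the grid-mapfile")
    else:
        # A real gatekeeper would fork a job manager under this
        # account; here we just pretend.
        print(f"running '{command}' as local user '{account}'")

gatekeeper("/O=Grid/OU=Example/CN=Alice Researcher", "/bin/hostname")
gatekeeper("/O=Grid/OU=Example/CN=Mallory", "/bin/hostname")
```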
It's all terribly neat and remarkably easy to install on your favorite Linux or Solaris box. It's also fairly easy to write programs that use the Grid, thanks to the various CoG Kits [cogkits.org] for Python, Java and Perl.
Latency (Score:2)
These grids are all great and wonderful as far as peak performance is concerned, but I'm wondering how the latency of long-haul networks affects performance for the range of applications that are not embarrassingly parallel.
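Here's a toy model of what I mean; all the numbers are invented, purely to show the shape of the problem for codes that have to synchronize every step:

```python
# Toy model: a job that does a fixed amount of compute per step and
# then has to synchronize across the link before the next step.
# All numbers are invented for illustration.
COMPUTE_PER_STEP = 0.050   # seconds of local work per step (assumed)
STEPS = 1000

for name, latency in [("same machine room", 0.0001),
                      ("cross-country WAN", 0.040)]:
    total = STEPS * (COMPUTE_PER_STEP + latency)
    overhead = 100.0 * latency / (COMPUTE_PER_STEP + latency)
    print(f"{name:18s}: {total:6.1f} s total, "
          f"{overhead:4.1f}% spent waiting on the network")
```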
Did we have the ..achieves consciousness Joke yet? (Score:1)
Obligatory Gibson reference. (Score:1)
DOE National Energy Research Scientific Computing Center's (NERSC) IBM SP RS/6000, a distributed-memory machine with 2,944 compute processors (WINTERMUTE), together with a smaller 160-processor Intel system (NEUROMANCER), will make up a combined 3,328-processor Unix system... And this is only the beginning... expect alien artificial intelligence to be contacted very soon.
Supercomputers - oink (Score:2, Flamebait)
This is the fallacy of "supercomputer centers" and "supercomputer networks". You don't want 1% of a supercomputer; you want a machine of your own.
There was a time when sharing big number-crunching machines made sense. Until the mid-1980s, there were commercial scientific computing service bureaus running big iron and selling CPU time. They're all gone, along with Control Data Corporation, Cray, and the commercial market for supercomputers.
If you really want a shared big engine cheap, cut a deal with a big hosting provider for off-hours time on the server farm. Set up a Beowulf cluster of a thousand rack-mounted 1U servers, crunching from midnight to 6AM every night. All you'd really need to do is negotiate a bulk buy of offpeak-only shell accounts. All the machines are identical and the cluster has lots of internal bandwidth, so you can get real coordinated work done, not just the low-bandwidth stuff like SETI and cryptanalysis.
Re:Supercomputers - oink (Score:2)
Tell that to IBM...
Set up a Beowulf cluster of a thousand rack-mounted 1U servers
Clusters have their own set of issues and problems.
This is the fallacy of "supercomputer centers" and "supercomputer networks". You don't want 1% of a supercomputer; you want a machine of your own.
But everyone can't have a machine of their own that processes huge parallel jobs. You have to buy one and share it among many users. So while you may only get 1% of a supercomputer's time, during that 1% you can use 10-100% of its power. Considering the type of jobs we're talking about, that's a hell of a lot better than having a regular desktop crunching 100% of the time. It could take months to complete a job that could be done in an hour on a supercomputer, and waiting months for each step of your research would really suck.
The fact remains, supercomputers are not dead. They're still widely in use and people are still buying them for good reason.
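To put the parent's point in (invented) numbers, ignoring queue time and parallel efficiency:

```python
# Invented numbers: a job that needs 3,000 CPU-hours of work, run
# either on one desktop CPU or on a 3,000-processor machine during
# "your" slice of time.  Parallel efficiency and queue time ignored.
JOB_CPU_HOURS = 3000
PROCESSORS = 3000

desktop_days = JOB_CPU_HOURS / 24
super_hours = JOB_CPU_HOURS / PROCESSORS

print(f"single desktop   : {desktop_days:.0f} days of wall-clock time")
print(f"{PROCESSORS}-proc machine: {super_hours:.0f} hour(s) of wall-clock time")
```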
Genesis Device? (Score:1)
NERSC? Designed and built by the dot-com effluvia of the 1990s (Eugenics Wars)
But the good news is, Nimoy will have a final resting place when he dies.
GRID Computing (Score:2)
GRID computing is the current sexy term in scientific computing, but it's something so vague that it can mean all things to all people. Which is perhaps why it's suddenly so popular: everyone can get their pet project funded.
To some people it means actual hardware: routers, fibre, supercomputers, that sort of thing. Certainly in the UK and Europe this group consists mostly of particle physicists; see the GridPP [gridpp.ac.uk] project homepage for details of what's going on there. Mostly the particle physicists seem to have ridiculous amounts of data on their hands (petabytes per day) that they have to ship around. Fun stuff!
To the astronomical community it means software, virtual observatories, data mining and intelligent agents. In the UK and Europe have a look at the AstroGrid [astrogrid.ac.uk] and the AVO [eso.org] projects. Although some of us are talking about hardware, the project I'm working on for instance, eSTAR [estar.org.uk], is putting robotically operated telescopes onto the GRID. However even here the main focus of the project is on the fun stuff we can do with the software, intelligent agents and data mining spring immediately to mind. In the US the NVO [us-vo.org] is the main focus of GRIDs for the astronomers there...
Al.Cool (Score:1)
Science Grid Genesis... (Score:1)
Does it have to be asked! (Score:1)
Re:Yahoo News (Score:1)
Re:Could you imagine....no, seriously (Score:1)
First, this is a sysadmin nightmare. Keeping things running smoothly in a mixed system of HP-UX, Sun, IRIX, NT, and 2000 gets tricky sometimes. Add about a dozen varieties of OSes and patch levels, and you'll need another cluster of supercomputers just to keep everything straight on the first cluster.
Also, the idea of clustering supercomputers isn't new. Cray has been pushing this idea for some time, selling nodes that can be stand-alone computers or part of a bigger cluster.
There is also a point after which keeping an old SGI isn't worth the cost of space, power and upkeep.
just some disjointed thoughts...
Re:Could you imagine....no, seriously (Score:3, Funny)
And that point comes precisely 4 days 7 hours and 29 minutes after unpacking and turning it on.
Re:Could you imagine....no, seriously (Score:1)
** Warning, mistake in above ** (Score:2)
It should read:
"This is approximately 1 quadrillion bytes, or 1,048,576 gigabytes."
under petabytes, not terabytes.
Slashdot regrets the error.
I could care less.
Re:petabytes (Score:2, Informative)
This gives us:
Re:petabytes (Score:1)
A standard megabyte is 1 million bytes; a 1,048,576-byte unit should be referred to as a mebibyte. See, for example, this link [nist.gov] for more details.
It is indeed the case that the error from using powers of two to approximate powers of ten gets very large with these big numbers. The solution is NOT to keep using decimal prefixes for binary values and expect everybody outside the computing field, who uses those same prefixes correctly, to adapt; the solution is to use the correct prefixes for the correct values. That way the confusion will disappear in the long term.
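The growing discrepancy, spelled out:

```python
# How far apart the decimal (SI) and binary (IEC) prefixes drift as
# the units get bigger.
prefixes = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]

for i, name in enumerate(prefixes, start=1):
    decimal = 1000 ** i
    binary = 1024 ** i
    error = 100.0 * (binary - decimal) / decimal
    print(f"{name:10s}: {decimal:>19,d} vs {binary:>19,d}  ({error:4.1f}% apart)")
```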
Tom
Re:petabytes (Score:2)
I also note that when you format a 100 MB disk, your OS (MacOS and Windows, probably *nix systems too, I haven't tested it) reports the volume size as about 72 MB. I propose that the cause of the "problem" is that disk manufacturers redefined the term to make their disks appear to have greater capacity.
Think about this: if you are a consumer, do you really care whether a megabyte equals 2^20 bytes or one million bytes? I propose that you do not; you simply care that everyone who uses the term "megabyte" means the same thing, so you can accurately compare apples to apples. AFAIK, one megabyte of RAM is still 2^20 bytes of RAM. Why shouldn't it be the same for non-volatile media?
How many consumers have called disk manufacturers or other help lines asking, "Where did my space go? The label says 100 MB but my computer says there are only 72. I want my other 28." If you adopt the "solution" you propose, you have to get the RAM industry and the OS providers to adopt it as well to be consistent.
Re:petabytes (Score:1)
Disk megabyte = 1,000,000 bytes
REAL Megabyte = 1,048,576 bytes
Difference = 48,576 bytes, or about 15 floppies worth of space per Mb
Wow, I've never seen floppies that are 3238 bytes! Either that or you have found a new form of mathematics, perhaps?
Re:petabytes (Score:2)
So, here is the math I should have done:
Re:petabytes (Score:1)
Re:petabytes (Score:2)
National Institute of Standards [nist.gov]
Re:petabytes (Score:1)
Re:petabytes (Score:1)
Re:Let Science Grid Listen for SETI. (Score:1)
Re:Call me crazy.... (Score:1)
Re:Call me crazy.... (Score:1)
It doesn't matter, really. When you're talking about clusters with this many processors, the software is infinitely more interesting than the hardware.
It's plenty easy to go out and buy a thousand chickens. But when you've harnessed them all together and trained them to cooperate in pulling your wagon, then you've created something new.
Re: (Score:2)