Need To Move 1.2 Exabytes Across the World Every Day? Just Effingo (theregister.com)
An anonymous reader shares a report: Google has revealed technical details of its in-house data transfer tool, called Effingo, and bragged that it uses the project to move an average of 1.2 exabytes every day. As explained in a paper [PDF] and video to be presented on Thursday at the SIGCOMM 2024 conference in Sydney, bandwidth constraints and the stubbornly steady speed of light mean that not even Google is immune to the need to replicate data so it is located close to where it is processed or served.
Indeed, the paper describes managed data transfer as "an unsung hero of large-scale, globally-distributed systems" because it "reduces the network latency from across-globe hundreds to in-continent dozens of milliseconds." The paper also points out that data transfer tools are not hard to find, and asks why a management layer like Effingo is needed. The answer is that the tools Google could find either optimized for transfer time or handled point-to-point data streams -- and weren't up to the job of handling the 1.2 exabytes Effingo moves on an average day, at 14 terabytes per second. To shift all those bits, Effingo "balances infrastructure efficiency and users' needs" and recognizes that "some users and some transfers are more important than the others: eg, disaster recovery for a serving database, compared to migrating data from a cluster with maintenance scheduled a week from now."
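As a quick sanity check on the figures in the summary, here is a minimal sketch that converts 1.2 exabytes per day into an average rate. It assumes decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes); the function name is made up for illustration and does not come from the paper.

```python
# Assumption: decimal (SI) units, which is what the summary appears to use.
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_tb_per_s(exabytes_per_day: float) -> float:
    """Convert a daily volume in exabytes to an average rate in TB/s."""
    return exabytes_per_day * EXABYTE / SECONDS_PER_DAY / TERABYTE


rate = average_rate_tb_per_s(1.2)
print(f"{rate:.1f} TB/s")               # about 13.9 TB/s, i.e. roughly 14
print(f"{rate * 8 / 1000:.2f} Pbit/s")  # the same rate in petabits per second
```

So the quoted 14 terabytes per second is the daily total averaged over 86,400 seconds, rounded up slightly.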
But what happens.. (Score:5, Funny)
..when it doesn't Effingo
Re: (Score:2, Funny)
..when it doesn't Effingo
Then you take an effing laxative.
Simple (Score:2)
Re: No pricing. (Score:2)
Technical details? (Score:1)
Re:Technical details? (Score:4, Funny)
Re: Technical details? (Score:1)
Re: (Score:2)
And what are you waiting for? There's a link to an academic paper and a video in the summary. Or was this a lame retelling of a 60 year old joke?
Re: Technical details? (Score:1)
Re: Technical details? (Score:2)
No need to post in "line noise" either, dummy.
Re: (Score:3)
If it were simply moving data once a day between two sites, it might be interesting to compare the cost of the two solutions. The private jet might just win.
Doing 28TB/s can't be cheap. And that's bytes. So call it around a quarter petabit per second.
But how much space does that much data storage take up? Maybe you could push 10PB/rack, so you're probably talking 100 racks. That's a large cargo plane, not a private jet. Assume you have the data center at the airport, and you create a custom setup desi
Re: (Score:2)
If it were simply moving data once a day between two sites,
It obviously isn't. I suspect it is a constant flow of data, propagating mostly in pseudo-real time, more like database or ZFS replication, so that in the end the data is closer to its many users.
So, by the time a plane or a station wagon would get there, the data would already be out of date anyway...
Re: (Score:2)
There are two items with shuffling data. One is bandwidth. We all know to never underestimate the bandwidth of a station wagon full of tapes (or these days, an EV CUV to keep up with the times.)
Then there is latency. Latency is critical for a lot of workloads, and simply batching data in bulk won't work for things like real-time monitoring, climate modeling, HFT, and many other time-sensitive tasks. Pushing petabytes as fast as they can go is an achievement.
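To put rough numbers on the bandwidth-versus-latency point, here is a back-of-the-envelope sketch. The payload (100 racks at 10 PB each) reuses the guess made elsewhere in this thread; the 24-hour door-to-door transit time and the function name are purely hypothetical assumptions, not measurements.

```python
PETABYTE = 10**15
TERABYTE = 10**12


def sneakernet(payload_bytes: float, transit_seconds: float) -> tuple[float, float]:
    """Return (effective throughput in bytes/s, delay before any data arrives in s)."""
    # With physical shipment nothing is usable until the whole load lands,
    # so the latency equals the full transit time.
    return payload_bytes / transit_seconds, transit_seconds


payload = 100 * 10 * PETABYTE  # 100 racks x 10 PB/rack, the thread's guess
transit = 24 * 60 * 60         # hypothetical: one day door to door
throughput, latency = sneakernet(payload, transit)

print(f"throughput: {throughput / TERABYTE:.1f} TB/s")  # about 11.6 TB/s
print(f"latency: {latency / 3600:.0f} hours")
```

Under those assumptions the plane's bandwidth is in the same ballpark as the network figure, but every byte is a day old on arrival, which is the latency problem described above.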
Area51.dat (Score:1)
Xi and Putie love this tool! It can transfer all kinds of secret gov't files from the USA in a snap!
Re: Area51.dat (Score:2)
Hilarious! Wow what a zinger! The same country that still uses film-based spy satellites somehow has the resources to tap exabyte internet... with washing machine CPUs!
Ah ha ha! What a knee-slapper!
Re: (Score:1)
SMIC already has process nodes under one nanometer going to be ready by 2025, even better than Intel's 18A. Russia is also borrowing that as well. Not sure where you got that.
Re: (Score:1)
Their shit works because they tune instead of throwing it out and starting over like American contractors.
And those are all stalkery advertising profiles (Score:3)
Everything you do, everything you say, everything you think, Google Knows.
Sure, f***in' go (Score:5, Funny)
But I can't say blacklist.
Re: (Score:2)
We can shorten that to "effinmo".
Coarse (Score:2)
Why don't you just call it Fuckinggo Google, we know what you meant.
Just effingdoit!
Re: Coarse (Score:2)
It reminds me of the backup software called Memeo, which is sort of fun in Spanish. (I pee myself)
Re: Coarse (Score:2)
Re: (Score:2)
A remnant of the days when Google valued whimsy.
Yah, a lot of programs have hidden gems in their name, like a friend that named his software programs after pR0n stars.
Re: (Score:2)
Well, there was the time I named my workstation after the floating island from Gulliver's Travels, Laputa. I couldn't figure out why some coworkers were scandalized.
I later changed its name to Elvis. That way, when I pinged it, it would say "elvis is alive". A couple of months later the ping program was changed so that it showed ping times rather than "is alive"; oh well...
Having fun names was a tradition. And then in the midst of all the Unix workstations named after animals or myths or whatnot, the IB
Name = marketing ploy to preteen boys Re:Coarse (Score:1)
You know every middle school boy will have this product's name indelibly drilled into their memory by the time they hit high school (or next Tuesday, whichever comes first).
Re: (Score:2)
Re: (Score:2)
Internal product names were often whatever the product was called at the local office where its developer happened to work, or something you could say at lunch to refer to it without talking shop in public. Likely this is something that was already mature and being dogfooded within the company 12-14 years prior. From what I remember, it was astounding how people outside Google had little to no idea whatsoever what was being developed. Public information was stale by at least 5-10 years.
Gasses up (Score:2)
the old big block Impala station wagon, fills the back with M2 drives and heads for the Transatlantic Tunnel [HURRAH!].
Re: (Score:2)
Do we tell him now that there's no tunnel, or wait and watch what happens?
Old joke (Score:2)
How do you put an elephant into a Safeway grocery bag?
A hint, you take the "s" out of "safe", and you take the "f" out of "way".
"Effingo", eh? (Score:2)