Where Facebook Stores 900 Million New Photos Per Day
1sockchuck writes: Facebook faces unique storage challenges. Its users upload 900 million new images daily, most of which are only viewed for a couple of days. The social network has built specialized cold storage facilities to manage these rarely-accessed photos. Data Center Frontier goes inside this facility, providing a closer look at Facebook's newest strategy: Using thousands of Blu-Ray disks to store images, complete with a robotic retrieval system (see video demo). Others are interested as well. Sony recently acquired a Blu-Ray storage startup founded by Open Compute chairman Frank Frankovsky, which hopes to drive enterprise adoption of optical data storage.
They could save space (Score:4, Funny)
If anyone ever asks to see the image again, they can just show one that is "close enough" and nobody would ever know the difference.
I personally have never posted a photo to Facebook, so I'd be OK with that.
Re: (Score:1)
It's a bit like memory that way. You have some short term memory, and those become long term memory, which you can never really recall exactly. In some ways, this might be the solution to the "problem" of the internet never forgetting.
Re: (Score:2)
Yeah computers have a system like that, it is called memory.
Re: (Score:2)
Sure: "Show me the picture of me and my wife on the beach 10 years ago"
Wife: "Who the hell is that in the picture with you? And when did that happen?"
Re:They could save space (Score:5, Interesting)
They could just delete most of the photos after they age a bit, analyzing it with some of their AI whiz-bang software....
More than a few of my [real world] friends use facebook as their archive for photos, eschewing local or cloud-based storage for their historical family photos. They would be unhappy if facebook were to randomly start deleting photos just because they've been on facebook for a period of time.
.
Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.
Re: (Score:1)
More than a few of my [real world] friends use facebook as their archive for photos, eschewing local or cloud-based storage for their historical family photos.
If they are storing their photos on facebook, they ARE storing them in the cloud.
Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.
Why would they be concerned? Ignorance is bliss.
Re: (Score:3)
.... If they are storing their photos on facebook, they ARE storing them in the cloud....
In a general sense, correct.
.
However, when I said "cloud-based storage" I meant the cloud service was a storage service, not a social media service. If I had meant facebook, I would have said cloud-based social media service.
Re: (Score:2)
If they are storing their photos on facebook, they ARE storing them in the cloud.
It depends on what you mean by "store". Dictionary.com [reference.com] provides this as a definition: "to accumulate or put away, for future use". (emphasis mine)
I don't think Facebook guarantees future retrieval, so it is probably not proper to classify it as storage.
Re: (Score:2)
FTFY. I can kinda understand posting stuff to Farcebook so others can view it, but using it as your primary storage medium? That's at least a dozen different kinds of wrong.
Re: (Score:1)
Long time passing...
Re: (Score:2)
Facebook seems to have your friends in mind, at least for now. They have a system where old photos are stored quite cheaply, because they simply fail to display the first time you try to view them. By giving up on storing them in a way that can serve a web page hit, Facebook can be quite cheap (though I hear they use powered-down HDDs, not optical - and Western Digital has a new line of HDDs just for this purpose).
Re: (Score:2)
More than a few of my [real world] friends use facebook as their archive for photos
hahahahahahahahahahaha
Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.
So what makes you think they would be unhappy if facebook started deleting their photos? Apparently they don't care :p
Replace them (Score:2)
After 3 months of no views, just replace them with a goatse image.
That way, you only need to store one image which replaces 99.999% of all pics uploaded. No need for complex storage solutions!
Another advantage would be that you can serve it really, really fast. No wait time!
Re:Replace them (Score:5, Funny)
After 3 months of no views, just replace them with a goatse image.
Dear God, there is more than one!?!
Re: (Score:2)
1. Besides hello.jpg, there's giver.jpg.
2. The goatse image (hello.jpg) comes from a set of 40 photos.
You can find more in the proper encyclopedia [encyclopediadramatica.se].
Delete? (Score:2)
Re:Delete? (Score:5, Insightful)
What happens when a user wants to delete an image permanently.
What gave you the idea that's a service Facebook offers?
Re: (Score:3)
The average person doesn't care about long term punishments when the short term gains are attractive. This is why I use Facebook. But I treat Facebook like a loud speaker, it's a great place to share my idiotic ideals but I try to avoid saying anything damaging/damning. (btw, this is what we call acceptance, I long ago welcomed our new Facebook overlords).
Re: (Score:2)
I'm glad to see someone besides me on /. isn't terrified of Facebook.
I use it and I think it's relatively harmless as long as you understand, as Rasperin says, it's a loud speaker. I expect everything I post on FB will be available to everyone, everywhere, forever. I long ago, many years before Facebook was a thing, figured out that if I never posted anything online I wouldn't want my sainted mother to see, I'd never have anything to worry about*. I speak my mind freely, but I would have no problem if my
Re:Delete? (Score:4, Informative)
I see you haven't read Facebook's terms of service.
There is no delete.
Re: (Score:2)
So long as there are no links to the image, it is effectively "deleted". Same as magnetic storage. You just null the index, you don't actually go back and wipe the data back to zeros. Technically the offending bits still reside on the disk, but it's close enough if there is no way to access the data short of using forensic tools.
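The "null the index" style of delete described above can be sketched in a few lines. This is a toy model, not Facebook's implementation: the `BlobStore` class, its append-only `log`, and the photo ID are all made up for illustration.

```python
class BlobStore:
    """Toy store: an index maps photo IDs to byte ranges in an append-only log."""

    def __init__(self):
        self.log = bytearray()  # raw bytes, never rewritten in place
        self.index = {}         # photo_id -> (offset, length)

    def put(self, photo_id, data):
        self.index[photo_id] = (len(self.log), len(data))
        self.log += data

    def get(self, photo_id):
        entry = self.index.get(photo_id)
        if entry is None:
            return None  # no index entry means the photo is unreachable
        offset, length = entry
        return bytes(self.log[offset:offset + length])

    def delete(self, photo_id):
        # Only the index entry is dropped; the bytes stay in the log.
        self.index.pop(photo_id, None)


store = BlobStore()
store.put("beach.jpg", b"JPEGDATA")
store.delete("beach.jpg")
print(store.get("beach.jpg"))    # None: normal reads can no longer find it
print(b"JPEGDATA" in store.log)  # True: a forensic scan of the log still can
```

The last two lines show the trade-off the comment points at: the photo is gone for every ordinary code path, yet the bytes remain until the medium is overwritten or destroyed.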
Re: Delete? (Score:3)
Another implementation would be to encrypt each item with a unique key and destroy the keys, rather than the underlying item, in a delete event, such that not even forensic tools would have a reasonable chance at recovery once the key-storage media has been re-written.
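A minimal sketch of that key-destruction idea, assuming one random 256-bit key per photo. The `ShreddableStore` class is hypothetical, and the SHAKE-256 keystream is only a stand-in so the example runs on the standard library; a real system would use a vetted cipher such as AES-GCM from a proper crypto library.

```python
import hashlib
import secrets


def _keystream(key, nonce, n):
    # Toy keystream derived with SHAKE-256. Illustration only, not production crypto.
    return hashlib.shake_256(key + nonce).digest(n)


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


class ShreddableStore:
    def __init__(self):
        self.blobs = {}  # photo_id -> (nonce, ciphertext); imagine write-once discs
        self.keys = {}   # photo_id -> 32-byte key; small, rewritable key store

    def put(self, photo_id, data):
        key = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        self.keys[photo_id] = key
        self.blobs[photo_id] = (nonce, _xor(data, _keystream(key, nonce, len(data))))

    def get(self, photo_id):
        key = self.keys.get(photo_id)
        if key is None:
            return None  # key destroyed (or never stored): nothing to decrypt with
        nonce, ciphertext = self.blobs[photo_id]
        return _xor(ciphertext, _keystream(key, nonce, len(ciphertext)))

    def delete(self, photo_id):
        # Destroy only the key; the ciphertext stays where it was written.
        self.keys.pop(photo_id, None)
```

After `delete`, the ciphertext is still present in `blobs`, which is the point: write-once media never has to be touched, only the much smaller key store.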
Re: (Score:2)
That is certainly a very secure way to do it - but of course they probably would have a backup of their index, and thus the keys. At some stage, you have to declare "good enough!" - and for 40+ years the removal of the index entry has been a "delete". We go to extra lengths for a "secure" delete, and they would have to take some extra steps here as well... but it is hard to speak intelligently without knowing the details :)
Re: (Score:2)
Another implementation would be to encrypt each item with a unique key and destroy the keys, rather than the underlying item, in a delete event, such that not even forensic tools would have a reasonable chance at recovery once the key-storage media has been re-written.
Then you'll need cold storage for all those keys you never use. Which, of course, can't be deleted unless they're encrypted with yet more keys, which will themselves need cold storage, so you have to....
Re: (Score:3)
No, keys are small enough to store without needing cold storage.
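A rough back-of-the-envelope check of that claim, assuming one 32-byte (256-bit) key per photo and the 900 million uploads per day from the summary. The key size is an assumption; the thread does not specify one.

```python
PHOTOS_PER_DAY = 900_000_000  # figure from the story summary
KEY_BYTES = 32                # assumption: one 256-bit key per photo

bytes_per_day = PHOTOS_PER_DAY * KEY_BYTES
gb_per_day = bytes_per_day / 1e9
tb_per_year = bytes_per_day * 365 / 1e12

print(f"{gb_per_day:.1f} GB of keys per day")    # 28.8 GB of keys per day
print(f"{tb_per_year:.1f} TB of keys per year")  # 10.5 TB of keys per year
```

Under those assumptions a year of keys fits on a handful of ordinary drives, which is tiny next to the photos themselves.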
Re: (Score:1)
There was an uproar about not being able to trace a user account just two days ago regarding a revenge porn case [bbc.co.uk] in Holland.
Now, how are they going to physically remove data from a cold storage solution? I highly doubt they'll be using R/W discs, as removing the data would require wiping the disc and rewriting 50 GB of data again.
Title is wrong (Score:3)
Should have read:
You won't believe this one weird trick [datacenterfrontier.com] Facebook uses to store data!
Other than that, fascinating look at how all that data is being stored and retrieved.
Re: (Score:2)
https://xkcd.com/908/ [xkcd.com]
There is a lot of caching
All I can say... (Score:2)
is that their monthly AWS fees must be ENORMOUS!
Re:All I can say...(question) (Score:2)
As in, when will FB conclude that it again needs to widen its revenue stream portfolio, and it therefore makes sense to offer its own version of AWS?
Any predictions on FBWS?
And there's the FB hardware development division, a business unit that so far has also remained in-house but has its own revenue potential. I think people tend to underestimate MZ's am
FB hardware may be lucrative... (Score:2)
It might be that using Blu-Ray autochangers may be a very useful thing to have, especially for something that can fill the gap between HDDs and LTO tapes for backups [1].
The pathetic thing is that this technology isn't new. We used to have 100, 200, even 400 disk CD and DVD carousels. By replacing the CD reader with a burner, and using 128 GB BDXL media, that means tens of terabytes of tamper-resistant (important with all the ransomware out there) WORM storage.
The trick is getting BD media into the teraby
Re: (Score:2)
My last spindle of 25 GB BD-Rs cost me maybe $0.60 each or so. I could drive down to Fry's right now and pick up a spindle for about $0.80 each. A 4x increase in storage density isn't worth a two-order-of-magnitude increase in price. I would be surprised if Farcebook didn't
computer output to laser disc (Score:1)
is it cold in here or what?
Going on for a while (Score:4, Informative)
I've noticed large latency for rarely used pictures in FB for over eight months now, and by large latency I mean visit the page, then come back the next day to see the next batch of > 5 year old pictures and wait another day for the final batch of ~10 years ago pictures.
Re: (Score:2)
Joking aside - it's always good practice to have electronic AND hard copies (optical disc, microfiche, paper) of all critical data, including copies off site. That way even if some hackers from somewherestan manage to totally trash the company's electronic systems, the data can still be recovered.
Re: (Score:2)
What critical data? Personal? Business?
At what point is it critical enough to go out of your way to store terabytes of data on CD/DVDs? Isn't an offline HD good enough?
I have done the following for a long time and I believe this is more than enough for most businesses
1. Backup to NAS (or equivalent)
2. Backup to offline disk (done monthly but could be done more often depending on business requirements)
3. Offsite Backup on the west coast (We are on the east coast)
At what point are you spending too much money
Re: (Score:2)
What I don't get is why FB doesn't just use tape. Tape drives are expensive, but the media itself is cheap -- LTO-4 cartridges are $15 apiece, and tape is a true archival grade media.
Plus, with tape, you copy it to that, yank the tapes out of the autochanger, and toss them in an unused corner of a room. Tapes take 0 watts in storage (other than what it takes for HVAC), so other than physical access concerns, they are easily stashed and will remain usable for quite a long time.
If any industry needs a kick
Re: (Score:2)
What I don't get is why FB doesn't just use tape
Because of the seek time. They still want the content available and the Blu-ray method yields a 10 second delay from what I read (I may have read that wrong).
Plus, with tape, you copy it to that, yank the tapes out of the autochanger, and toss them in an unused corner of a room. Tapes take 0 watts in storage (other than what it takes for HVAC)
They can't just toss it. That's the whole point of the article. They still need access on demand.
I think that the Blu-ray solution is cheap too. The article was making reference to how much colder that area was (because of the lower HVAC requirements I assume).
There are not 900 million unique pictures per day. (Score:1)
People upload the same memes all the time. Just hash and store the common images and you'll reduce the unique photos to one or two unique images per day. :)
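The "hash and store" idea is content-addressed deduplication. Here is a minimal sketch using SHA-256 from the standard library; the `DedupStore` class and the upload IDs are hypothetical.

```python
import hashlib


class DedupStore:
    def __init__(self):
        self.blobs = {}    # sha256 hex digest -> image bytes, stored once
        self.uploads = {}  # upload_id -> digest of the uploaded bytes

    def upload(self, upload_id, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # keep only the first copy
        self.uploads[upload_id] = digest
        return digest

    def fetch(self, upload_id):
        return self.blobs[self.uploads[upload_id]]


store = DedupStore()
store.upload("alice/1", b"same meme bytes")
store.upload("bob/7", b"same meme bytes")
store.upload("carol/3", b"a different photo")
print(len(store.uploads), len(store.blobs))  # 3 2
```

A cryptographic hash only catches byte-identical files. A meme that has been resized or re-encoded hashes differently, so catching those would need some form of perceptual hashing instead.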
Re: (Score:1)
Still, they could benefit from some proprietary algorithm that better compresses common facebook-photo features. E.g. a duck lip algorithm.
Re: (Score:1)
They resize them first, then compress. A 3-5 MB pic is stored at around 10% of the uploaded size.
Amazing (Score:4, Insightful)
Wow, they discovered HSM [wikipedia.org] only 40 years after it was introduced. Amazing.
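For readers unfamiliar with HSM (hierarchical storage management), the core of it is a policy that moves data to cheaper, slower tiers as it goes unread. A minimal sketch follows; the tier names and the 7-day and 90-day thresholds are invented for illustration and do not come from the article.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; the thread gives no real numbers.
HOT_DAYS = 7
WARM_DAYS = 90


def pick_tier(last_access, now):
    """Return the storage tier for an object based on time since last access."""
    age = now - last_access
    if age <= timedelta(days=HOT_DAYS):
        return "hot"   # e.g. cache or fast disk
    if age <= timedelta(days=WARM_DAYS):
        return "warm"  # e.g. ordinary spinning disk
    return "cold"      # e.g. powered-down disk, optical, or tape


now = datetime(2015, 7, 1)
print(pick_tier(datetime(2015, 6, 29), now))  # hot
print(pick_tier(datetime(2015, 5, 1), now))   # warm
print(pick_tier(datetime(2010, 1, 1), now))   # cold
```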
Re: (Score:3, Insightful)
Pointless arrogant comment.
Nobody claimed it was new or that they had reinvented anything. They just applied modern technology to a well-known strategy to solve a known problem. In the modern age of storage and data centers I have yet to see this (not to say that nobody has done it).
When someone shows you an electric car do you tell them cars have had 4 wheels since before 1903? I assume you do.
Facewhat? (Score:1)
blu ray? (Score:1)
Seems like there could be an easier solution to this: hard drives in racks. No robots, no optical drives, and no blu ray discs.
One 500 GB hard drive already has 10x the storage of a dual layer Blu-ray. In fact, a 10 pack of dual layer Blu-ray discs on amazon costs twice as much as a 500 GB 3.5" drive. Am I missing something?
shoulda read the article, bro (Score:2)
it would have answered your questions, and you wouldn't have looked like a tool, and i wouldn't have mocked you. the world would have been a better place! if only.
Re: (Score:2)
Electricity use - did you even watch the video? Of course not. Also, the data survives a drive failure.
What I wonder is why they think this is better than LTO6, which already exists as a COTS solution with robots etc. It's possible, maybe, that it takes less space. It is resilient to stray magnets in a way tapes maybe wouldn't be - but is that a common issue with LTO?
Re: (Score:2)
3 TB will fit on 120 25-GB BD-Rs. At 40 cents each [newegg.com], that's $48 in media costs. If you do like I do and reserve 20% for dvdisaster error-recovery data, you're still only looking at $60.
A 3 TB WD Green will set you back $95 [newegg.com]. (Want to spring for the NAS-rated Red drives instead? That'll be $119 [newegg.com]. Their absolute cheapest 3 TB hard drives are a couple of models from Seagate and Toshiba at $90 each.)
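The arithmetic in this comparison can be checked directly. The prices are the ones quoted above, kept in cents to avoid floating-point rounding; the `discs_needed` helper and the constant names are made up for this sketch.

```python
CAPACITY_GB = 3000  # 3 TB in decimal units
DISC_GB = 25        # single-layer BD-R
DISC_CENTS = 40     # price per disc quoted above
ECC_PERCENT = 20    # share of each disc reserved for error-recovery data
HDD_CENTS = 9500    # 3 TB WD Green price quoted above


def discs_needed(capacity_gb, usable_gb_per_disc):
    # Ceiling division: a partly filled disc still costs a whole disc.
    return -(-capacity_gb // usable_gb_per_disc)


plain = discs_needed(CAPACITY_GB, DISC_GB)
usable_gb = DISC_GB * (100 - ECC_PERCENT) // 100
with_ecc = discs_needed(CAPACITY_GB, usable_gb)

print(plain, plain * DISC_CENTS / 100)        # 120 48.0
print(with_ecc, with_ecc * DISC_CENTS / 100)  # 150 60.0
print(HDD_CENTS / 100)                        # 95.0
```

This covers media cost only. Drives, robotics, and the labor of burning 150 discs are not counted, which is part of what the rest of the thread argues about.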
Re: (Score:1)
You are right. There's no reason why you can't 'spin down' a rack of cheap server grade HDs to save power.
What happened to Bernoulli disks anyway?
This article gave them free load testing (Score:2)
First thing I did was open facebook and look to see what my oldest picture was. I don't have that many and it came up pretty quickly but I'm sure lots of other people had the same impulse.
Wait a minute... (Score:2)
Duh (Score:3)
In the cloud, obvs.
You don't think (Score:2)
The NSA built that huge data center in Utah for nothing?
Now if the NSA would just open an API to retrieve it....