Please create an account to participate in the Slashdot moderation system

Build Your Own 135TB RAID6 Storage Pod For $7,384 239

Posted by CmdrTaco on Thursday July 21, 2011 @10:05AM from the for-all-your...-movies-yeah-thats-it dept.

An anonymous reader writes "Backblaze, the cloud-based backup provider, has revealed how it continues to undercut its competitors: by building its own 135TB Storage Pods which cost just $7,384 in parts. Backblaze has provided almost all of the information that you need to make your own Storage Pod, including 45 3TB hard drives, three PCIe SATA II cards, and nine backplane multipliers, but without Backblaze's proprietary management software you'll probably have to use FreeNAS, or cobble together your own software solution... A couple of years ago they showed how to make their first-generation, 67TB Storage Pods"

This discussion has been archived. No new comments can be posted.

Build Your Own 135TB RAID6 Storage Pod For $7,384

Load All Comments

Search 239 Comments Log In/Create an Account

Comments Filter:

My God... (Score:2)

by AngryDeuce ( 2205124 ) writes:

It's full of stars!!
- Re: (Score:3)
  
  by ByOhTek ( 1181381 ) writes:
  
  It's full of slashvertisements!!
  - Re: (Score:2)
    
    by x6060 ( 672364 ) writes:
    
    It's not a slashvertisement if they tell you how to build it yourself.....
    - Re: (Score:3)
      
      by TheRaven64 ( 641858 ) writes:
      
      Except for the bit about how it would be even better if you paid for their proprietary management software...
      - Re:My God... (Score:5, Insightful)
        
        by x6060 ( 672364 ) writes: on Thursday July 21, 2011 @11:50AM (#36835308)
        
        Did you notice how they even gave you the alternatives to their software? Essentially they are saying "We developed this for our own internal use and if you would LIKE to pay for it its cool. If you dont then there are these other free alternatives." But then again just because some company is mentioned in the article it MUST be a slashvertisment.
        
        Parent Share
        twitter facebook
        
        Re: (Score:2)
        
        by TooMuchToDo ( 882796 ) writes:
        
        1) Build 135TB box 2) Install Openstack.org's Object Storage system (free! Amazon S3 API Compliant!) 3) Profit? Fuck profit! STORE ALL THE THINGS!
      - Re: (Score:2)
        
        by Archangel Michael ( 180766 ) writes:
        
        Who said it would be better? Not the article. The article said you'd have to do it yourself, and if you're THAT good, you might make something better. If you don't want to do it yourself (FreeNAS) you can opt for their software to manage it, which GASP HORROR, they charge for.
        Challenge laid down Open Source Community, make your own Management software and create a new FS that doesn't have the limitations of the EXT4 has without using LVN to get around those limitations, that is better than what these people
      - Re: (Score:2)
        
        by QuantumRiff ( 120817 ) writes:
        
        They don't sell the management software.. its internal. They only sell a backup service to end users, via a client.
    - - Re: (Score:3)
        
        by x6060 ( 672364 ) writes:
        
        You must be. They even GIVE you the free alternatives to their software. But in the same page they give you everything you need to do it yourself. You just have to add some hardware.
        
        Re: (Score:3)
        
        by x6060 ( 672364 ) writes:
        
        Thats all well and good that you can point out a few places that "Looks like marketing/advertising speak" and ignore the fact that they give you all the tools and knowledge to do it yourself and replicate the success of the "ultra-frugal Storage Pods". Hell they even give you some of the issues you will run into using one of the free alternatives AND EVEN GIVE YOU THE SOLUTIONS FOR THEM. OH! But there are a few places where they mention the product that they run off the very system they told you how to buil
  - Re:My God... (Score:5, Insightful)
    
    by Dillon2112 ( 197474 ) writes: on Thursday July 21, 2011 @11:24AM (#36835096) Homepage
    
    My problem with Backblaze is their marketing is very misleading...they pit these storage pods up against cloud storage and assert that they are "cheaper", as though a storage pod is anything like cloud storage. It isn't. Sure, there's the management software issue that's already been mentioned, but they do no analysis on redundancy, power usage, security, bandwidth usage, cooling, drive replacement due to failure, administrative costs, etc. It's insulting to anyone who can tell the difference, but there are suits out there who read their marketing pitch and decide that current cloud storage providers like Google and Amazon are a rip off because "Backblaze can do the same thing for a twentieth the price!" It's nuts.
    You can see this yourself in their pricing chart at the bottom of their blog post. They assert that Backblaze can store a petabyte for three years for either $56k or $94k (if you include "space and power"). And then they compare that to S3 costing roughly $2.5 million. In their old graphs, they left out the "space and power" part, and I'm sure people complained about the inaccuracies. But they're making the same mistake again this time: they're implicitly assuming the cost of replicating, say, S3, is dominated by the cost of the initial hardware. It isn't. They still haven't included the cost of geographically distributing the data across data centers, the cost of drive replacement to account for drive failure over 3 years, the cost of the bandwidth to access that data, and it is totally unclear if their cost for "power" includes cooling. And what about maintaining the data center's security? Is that included in "space"?
    On a side note, I'd be interested to see their analysis on mean time between data loss using their system as it is priced in their post.
    You could say the Backblaze is serving a different need, so it doesn't need to incur all those additional costs, and you might be right, but then why are they comparing it to S3 in the first place? It's just marketing fluff, and it is in an article people are lauding for its technical accuracy. Meh.
    
    Parent Share
    twitter facebook
    - Re: (Score:2)
      
      by Revotron ( 1115029 ) writes:
      
      Suits will be suits. Backblaze proudly boasts that they're a great offsite backup solution, but they will quickly tell you that they are not a "cloud storage" provider. Their only business is offsite replication. They don't hide the fact that if you upload 500GB of data and then delete it off your computer, it will be removed from their systems as well.
      
      You didn't read the article. You know how I know that? They explicitly state that they don't have any costs for replacement hard drives over 3 years,
    - Re:My God... (Score:4, Insightful)
      
      by Archangel Michael ( 180766 ) writes: on Thursday July 21, 2011 @12:27PM (#36835632) Journal
      
      First: ALL Marketing is misleading. That is what marketing does. Accentuate the positive, eliminate the negative. So complaining about that is just idiotic.
      Second: You could have a couple dozen Backblaze units, pay for a tech to monitor them 24/7/365 and replace all the drives twice over for what Amazon charges for the same thing. Sure that doesn't included cost for premises, and HighSpeed Internet to multiple locations. But still, that is aggregated with all the other clients.
      Third: what are you paying for in the "cloud", I mean besides ethereal concepts. Does Amazon tell you how they do things? You probably know less about Amazon (and the others) setup so you're comparing something you know something about (not everything) verses something you know almost nothing about, and the complain that they aren't doing it in a comparable way. You don't know.
      Fourth: Your basic assumption is that Backblaze has no contigency for drive replacement, which is false. Since these are "new" drives there might be insufficient data about failure rates and therefore the actual cost of replacement (never mind warranties) or having drives in both Hot and Cold Spare setups. I'm sure that Backblaze in their $5/MO service figures what it costs to store data, have spares, keep the Datacenter running and profitable. Even if they double the cost to $10, it still puts the others to shame.
      Have you compared the data loss rates for the last three years between Amazon and Backblaze? Can you even compare or is that data held secret (see point 1b). My point here, is that you're pulling shit out of your ass and thinking it doesn't stink. Even if it isn't directly comparable, it is at least in the realm of consideration, EVEN if everything you said is true. And at 10 times less in cost, that can buy a lot of redundancy. It is just a matter of perspective.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by TooMuchToDo ( 882796 ) writes:
        
        Amazon's S3 is based off of MogileFS (the concept, not the code): http://danga.com/mogilefs/ [danga.com]
        And if you want to run an S3 compliant system internally, you'll us openstack.org's object storage system:
        http://www.openstack.org/projects/storage/ [openstack.org]
        Ability to provide object storage services at multi-petabyte scale
        Free open source software, no licensing frees, ‘open-core,’ or ‘freemium’ model
        Written in python; easy to differentiate your offering with extensions and modifications
        Compatibility a
- Re: (Score:2)
  
  by Kjella ( 173770 ) writes:
  
  Pr0n stars? Because we all know what it's really full of...
Already approaching Petabytes? (Score:2)

by Hermanas ( 1665329 ) writes:

Wow, are we already approaching Petabyte clusters? I'm still getting used to Terabyte!
- Re: (Score:2)
  
  by Narnie ( 1349029 ) writes:
  
  If I ever have to admin a Petabyte cluster, I'd name it Petabear.
- Re: (Score:2)
  
  by corbettw ( 214229 ) writes:
  
  I know, it's crazy! Storage numbers are increasing faster than the national debt!
- - Re: (Score:2)
    
    by gman003 ( 1693318 ) writes:
    
    According to TFA's TFA, the company has a total capacity of 16 petabytes, using only 201 pods (many being the old 1.0 pods with 67TB storage).
Not enough (Score:3)

by bryan1945 ( 301828 ) writes: on Thursday July 21, 2011 @10:12AM (#36834306) Journal

For a true porn collector yet.

Share
twitter facebook
- - Re: (Score:2)
    
    by rbrausse ( 1319883 ) writes:
    
    fun fact: porn industry has problems with high definition [nytimes.com]
    The high-definition format is accentuating imperfections in the actors — from a little extra cellulite on a leg to wrinkles around the eyes. [..] "The biggest problem is razor burn," said Stormy Daniels, an actress, writer and director. "I'm not 100 percent sure why anyone would want to see their porn in HD."
This is a huge step forward (Score:3, Funny)

by mugurel ( 1424497 ) writes: on Thursday July 21, 2011 @10:17AM (#36834338)

for both internet security and privacy: each of us can now store his own local copy of the internet and surf offline!

Share
twitter facebook
- Re: (Score:2)
  
  by AmberBlackCat ( 829689 ) writes:
  
  That would actually be nice. If every site I ever went to was cached locally. Like having a browser cache with unlimited size. It would be miles better than archive.org, if you remember a site from years ago and wish you could go back. Even better if it prefetched links you never clicked on.
- Re: (Score:2)
  
  by demonbug ( 309515 ) writes:
  
  for both internet security and privacy: each of us can now store his own local copy of the internet and surf offline!
  Of course, with my 150GB/month bandwidth cap it is going to take ~70 years to fill it up...
- Re: (Score:2)
  
  by Pharmboy ( 216950 ) writes:
  
  wget -m -p http://*
  
  Just run that in your cron.daily scripts and you are good to go!
But you cant use it without getting too hot? (Score:2)

by drolli ( 522659 ) writes:

Or can somebody tell me if the cooling of the HDs is ok if they are stacked like in the picture?
- Re: (Score:2)
  
  by blackraven14250 ( 902843 ) writes:
  
  With those gigantic fans, and the track record they have, it's probably ok.
- Re: (Score:3)
  
  by Black.Shuck ( 704538 ) writes:
  
  It's probably fine [imgur.com].
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  First, it depends on airflow. That is pretty close to optimal in the design. Second, you can monitor disk temperature and even have an emergency slowdown or shut-off if they overheat. Monitoring and shut-down is easy to script, maybe half a day if you know what you are doing.
- Re:But you cant use it without getting too hot? (Score:4, Interesting)
  
  by demonbug ( 309515 ) writes: on Thursday July 21, 2011 @11:47AM (#36835272) Journal
  
  Or can somebody tell me if the cooling of the HDs is ok if they are stacked like in the picture?
  According to their blog post about it, they see a variation of ~5 degrees within unit (middle drives to outside drives) and about 2 degrees from the lowest unit in a rack to the highest. They also indicate that the drives stay within the spec operating temperature range with only two of the six fans in each chassis running.
  Keep in mind these are 5400 RPM drives, not the 10K+ drives you would expect in an application where performance is critical. These are designed for one thing - lots of storage, cheap. No real worries about access times, IOPS, or a lot of the other performance measures that a more flexible storage solution would need to be concerned with. These are for backup only - nice large chunks of data written and (hopefully) never looked at again.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by WuphonsReach ( 684551 ) writes:
  
  Or can somebody tell me if the cooling of the HDs is ok if they are stacked like in the picture?
  
  It doesn't take much airflow at all to keep drives down around 35-40C. Even a light breeze can be enough to drop drive temperatures 5-10C. They're only 5-10W devices (for 3.5" drives) which means they're easy to cool in comparison to the 100-200W video cards or the 95-150W CPUs.
Can't actually store 135TB of data (Score:5, Interesting)

by gman003 ( 1693318 ) writes: on Thursday July 21, 2011 @10:21AM (#36834390)

The article says it uses RAID 6 - 45 hard drives are in the pod, which are grouped into an arrays of 15 that use RAID 6 (the groups being combined by logical volumes), which gives you an actual data capacity of 39TB per group (3TB * (15 - 2) = 39TB), which then becomes 117TB usable space (39TB * 3 = 117TB). The 135TB figure is what it would be if you used RAID 1, or just used them as normal drives (45 * 3TB = 135TB).
And these are all "manufacturer's terabytes", which is probably 1,024,000,000,000 bytes per terabyte instead of 1,099,511,627,776 (2^40) bytes per terabyte like it should be. So it's a mere 108 terabytes, assuming you use the standard power-of-two terabyte ("tebibyte', if you prefer that stupid-sounding term).

Share
twitter facebook
- Re: (Score:3, Informative)
  
  by GameboyRMH ( 1153867 ) writes:
  
  A manufacturer's terabyte would be 1,000,000,000,000 bytes.
  - - Re: (Score:2)
      
      by Wildclaw ( 15718 ) writes:
      
      I haven't checked how Hitachi does it, but that's how Seagate and Western Digital do it.
      Bullshit. Neither of my new 2TB Western Digital disks come with 2048*10^9 storage space.
    - Re:Can't actually store 135TB of data (Score:5, Informative)
      
      by Kjella ( 173770 ) writes: on Thursday July 21, 2011 @10:50AM (#36834794) Homepage
      
      Hitachi:
      "Capacity - One GB is equal to one billion bytes and one TB equals 1,000GB (one trillion bytes) when referring to hard drive capacity."
      Western Digital:
      "As used for storage capacity, one megabyte (MB) = one million bytes, one gigabyte (GB) = one billion bytes, and one terabyte (TB) = one trillion bytes."
      Seagate (PDF product sheets):
      "When referring to hard drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one trillion bytes."
      So no, no and more no. Sometimes there really should be a "-1, Wrong" moderation...
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by gman003 ( 1693318 ) writes:
        
        Huh. That's odd - I distinctly remember seeing otherwise. Oh well - guess I was wrong.
        
        Re: (Score:3)
        
        by Solandri ( 704621 ) writes:
        
        Some marketer at Maxtor(?) started the transition from the 2^20 definition of MB to 10^6 for HDDs in the mid-1990s. The (at the time) smaller HDD manufacturers like Western Digital quickly followed suit. Seagate was one of the later ones. IBM (now Hitachi) was the last one to make the switch - they held out until about 2000.
- Re: (Score:2)
  
  by Inda ( 580031 ) writes:
  
  Tell me about it!!!
  £4,561.68 still sounds like a steal. In fact, I might just steal one and save even more!
  I actually spend more than that on food for the family per year. I wonder...
- Re: (Score:2)
  
  by complete loony ( 663508 ) writes:
  
  Data is also duplicated across different pods so you can lose one due to power supply issues and not care for a while. RAID across local groups of disks does seem a bit pointless when you already have a layer of redundancy across the whole rack.
- Re: (Score:2)
  
  by ari_j ( 90255 ) writes:
  
  Just a small quibble: RAID level 1 would give you a capacity of 3TB with an absurd amount of redundancy. Level 0 is that one that would give you 135TB striped across all 45 disks.
- Re: (Score:2)
  
  by QuantumRiff ( 120817 ) writes:
  
  There is "cloud storage" management software that would be awesome on these boxes (although they might benefit from a bit more CPU and ram, and some more Gigabit nics.. When I read this blog article yesterday, I immediately went back to openstack.. The examples for Openstack Storage don't even bother with raid, since the objects on the drive will be replicated to multiple other servers automatically. This could be very, very interesting..
  http://www.openstack.org/projects/storage/ [openstack.org]
- - Re:Can't actually store 135TB of data (Score:4, Informative)
    
    by gman003 ( 1693318 ) writes: on Thursday July 21, 2011 @10:40AM (#36834658)
    
    Common usage for the past 50 years has been that, in the context of computer memory capacity, 'tera-" is to be interpreted as 2^40 (with "giga-" being 2^30, and so on). You'll note that I included a sidenote on 'tebibytes" to appease revisionists like you.
    PS: It's rather ironic that someone accusing me of bastardizing SI prefixes can't even spell 'terabytes" properly. Unless you're somehow referring to Earth Bytes or something.
    
    Parent Share
    twitter facebook
    - - Re: (Score:2)
        
        by marcosdumay ( 620877 ) writes:
        
        Kelvin Mega what?
        On a side note, I know plenty of people that use the SI as if it was case insensitive. Other common bastardizations:
        Caling the unit "linear metter", and abreviating it as "ml" (in latin languages of course, english speaking people are more likely to use some custom unit for that)(again, propably case insensitive, so replace it by Ml, mL or ML if you like).
        Abreviating second as "sec", square metter as "sqm" or "quadm", "mqd", etc.
  - Re: (Score:3)
    
    by Just Some Guy ( 3352 ) writes:
    
    Stop bastardizing the SI prefixes. Terra is the prefix
    The irony: it is strong with this one.
- - Re: (Score:2)
    
    by gman003 ( 1693318 ) writes:
    
    Dammit, why do I keep getting those mixed up?
RAID-6 (Score:2)

by hjf ( 703092 ) writes:

RAID-6, really?
After 5+ years working with ZFS, personally, I wouldn't touch md/extX/xfs/btrfs/whatever with a 10 foot pole. Solaris pretty much sucks (OpenSolaris is dead and the open source spinoffs are a joke), but for a storage backend it's years ahead of Linux/BSD.
Sure, you can run ZFS on Linux (I did) and FreeBSD (I do), but for huge amounts of serious data? No thanks.
- Re: (Score:2)
  
  by TheRaven64 ( 641858 ) writes:
  
  Sure, you can run ZFS on Linux (I did) and FreeBSD (I do), but for huge amounts of serious data? No thanks.
  What do you count as a serious amount of data? And what makes the FreeBSD version inferior in your opinion (aside from being a slightly older version - I think -STABLE now has the latest OpenSolaris release)?
  Genuinely curious: I'm thinking of building a FreeBSD/ZFS NAS and I'd like to know if there's anything in particular that I need to look out for. Performance isn't really important, because most of the time I'll be accessing it over WiFi anyway, which is liekly to be far more of a bottleneck than any
Anything over 2TB should be ZFS... (Score:3)

by QuietLagoon ( 813062 ) writes: on Thursday July 21, 2011 @10:33AM (#36834544)

... if you really care about the data. ZFS has built-in so much more data integrity checks, and more extensive data integrity checks [oracle.com], than the vanilla RAID6 arrays.
.
Both FreeBSD [freebsd.org] and FreeNAS [freenas.org], in addition to OpenSolaris [opensolaris.org], support ZFS.

Share
twitter facebook
- Re:Anything over 2TB should be ZFS... (Score:4, Interesting)
  
  by brianwski ( 2401184 ) writes: on Thursday July 21, 2011 @01:43PM (#36836454) Homepage
  
  ... if you really care about the data.
  (Disclaimer: I work at Backblaze) - If you really care about data, you *MUST* have end-to-end application level data integrity checks (it isn't just the hard drives that lose data!).
  Let's make this perfectly clear: Backblaze checksums EVERYTHING on an end-to-end basis (mostly we use SHA-1). This is so important I cannot stress this highly enough, each and every file and portion of file we store has our own checksum on the end, and we use this all over the place. For example, we pass over the data every week or so reading it, recalculating the checksums, and if a single bit has been thrown we heal it up either from our own copies of the data or ask the client to re-transmit that file or part of that file.
  At the large amount of data we store, our checksums catch errors at EVERY level - RAM, hard drive, network transmission, everywhere. My guess is that consumers just do not notice when a single bit in one of their JPEG photos has been flipped -> one pixel gets every so slightly more red or something. Only one photo changes out of their collection of thousands. But at our crazy numbers of files stored we see it (and fix it) daily.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by rubycodez ( 864176 ) writes:
  
  Except OpenSolaris is dead, better to keep data on a filesystem that runs on living OS. FreeNAS is FreeBSD based, so we're down to one open source OS that supports ZFS.
file system (Score:3)

by roman_mir ( 125474 ) writes: on Thursday July 21, 2011 @10:34AM (#36834562) Homepage Journal

When you choose which file system to use, you should consider what the purpose of the storage is. If it's to run a database, you may want to rethink the decision to go with a journaling file system, because databases often their own journaling (like PostreSQL WAL), which actually means the performance will get reduced if you put a journaling file system underneath that. [postgresql.org] Just my 0.0003 grams of gold.

Share
twitter facebook
- Re: (Score:2)
  
  by QuantumRiff ( 120817 ) writes:
  
  They don't run databases on this storage. the ONLY way they access all this storage is via an HTTPS connection to the tomcat server running on the machine. They have some very, very interesting blog entries about how things scale when you go beyond a handful of servers.
  - Re: (Score:2)
    
    by roman_mir ( 125474 ) writes:
    
    That's not why I wrote the comment, I saw that the access is over http, I wrote it because this story is an ad for this company, but also it's talking about building a system like that for your own use, and if you do it for your own use, why would you do http only?
7K for software raid? and why a low end cpu? (Score:2)

by Joe_Dragon ( 2206452 ) writes:

Why not use a SAS card?
why have three PCIe cards that are only X1 when a x4 or better card with more ports has more PCI-e bandwidth and some even have there own RAID cpu on them.
Why use a low end I3 cpu in a 7K system? at least go to i5 even more so with software raid.
- Re:7K for software raid? and why a low end cpu? (Score:5, Informative)
  
  by drinkypoo ( 153816 ) writes: <drink@hyperlogos.org> on Thursday July 21, 2011 @10:45AM (#36834722) Homepage Journal
  
  Hardware RAID controllers are stupid in this context. The only place they make sense is in a workstation, where you want your CPU for doing work, and if the controller dies you restore from backups or just reinstall. Using software RAID means never having to try to get a rebuilder software to convert the RAID from one format to another because the old controller isn't available any more, or because you can't get one when you really need one to get that project data out so you can ship and bill.
  
  Parent Share
  twitter facebook
  - Re:7K for software raid? and why a low end cpu? (Score:4, Insightful)
    
    by pz ( 113803 ) writes: on Thursday July 21, 2011 @12:02PM (#36835412) Journal
    
    No. Hardware controllers are the right solution in this context. These pods are not designed for individual users, but for corporations that can afford stockpiles of spare parts, so replacing a board can be done easily. Using hardware controllers allows many more drives per box, and thus per CPU. A populated 6-CPU motherboard is going to be less reliable, dissipate more heat, require more memory, and likely be less reliable, than the special-purpose hardware approach that allows for a single CPU.
    Software RAID makes sense when you have a balance of storage bandwidth requirements to CPU capacity that is heavy on the CPU side. This box is designed for the opposite scenario, as the highly informative blog describes:
    http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ [backblaze.com]
    (Yes, I know, expecting someone to read the blog would mean that they would have to read the linked article and then click through to the original post, a veritable impossibility. Still, it is recommended reading, especially the part about their experience with failure rates and how they have *one* guy replacing failed drives *one* day per week.)
    
    Parent Share
    twitter facebook
- Re:7K for software raid? and why a low end cpu? (Score:4, Insightful)
  
  by gman003 ( 1693318 ) writes: on Thursday July 21, 2011 @10:47AM (#36834756)
  
  Because, for this project, raw storage capacity is much more important than performance. Besides, they claim their main bottleneck is the gigabit Ethernet interface - even software RAID, the PCIe x1, and the raw drive performance is less of a limiting factor.
  Yeah, in a situation where you need high I/O performance, this design would be less than ideal. But they don't - they're providing backup storage. They don't need heavy write performance, they don't need heavy read performance. They just need to put a lot of data on a disk and not break anything.
  PS: SAS doesn't really provide much better performance than SATA, and it's a lot more expensive. Same for hardware RAID - using those would easily octuple the cost of the entire system.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by nabsltd ( 1313397 ) writes:
    
    Besides, they claim their main bottleneck is the gigabit Ethernet interface - even software RAID, the PCIe x1, and the raw drive performance is less of a limiting factor.
    This is absolutely true. Even with a pair of bonded 1Gb Ethernet connections, it's not nearly enough to keep up with a PCIe x1 in the real world. I'm moving to a single 10Gb connection from each server to iSCSI SAN because of this.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Very simple: Best bang for the buck. Your approach just increases cost without any real benefit in the target usage scenario. For example, the i5 is just a waste of money and energy. Hardware RAID drives cost, but the only "advantage" is has is that it is easier to use for clueless people.
- They specified a RAID CPU (Score:2)
  
  by Quila ( 201335 ) writes:
  
  An Intel i3 540, more powerful than the CPU on most hardware RAID controllers.This thing will be doing very little other than handling the RAID sets.
  - But the raid cpu is on it's own where the system (Score:2)
    
    by Joe_Dragon ( 2206452 ) writes:
    
    But the raid cpu is on it's own where the system cpu has to do the video, networking, and the OS on top of doing the raid work.
Backblaze is speaking about scalability in SF (Score:4, Informative)

by Jim Ethanol ( 613572 ) writes: on Thursday July 21, 2011 @10:41AM (#36834664) Homepage

If you're in the SF Bay Area check out http://geeksessions.com/ [geeksessions.com] where Gleb Budman from Backblaze will be speaking about the Storage Pod and their approach to Network & Infrastructure scalability along with engineers from Zynga, Yahoo!, and Boundary. This event will also have a live stream on geeksessions.com.
Full Disclosure: This is my event.
50% discount to the event (about $8 bucks and free beer) for the Slashdot crowd here: http://gs22.eventbrite.com/?discount=slashdot [eventbrite.com]

Share
twitter facebook
- Re: (Score:3)
  
  by gpuk ( 712102 ) writes:
  
  Hi Jim
  I'm quite a few timezones East of you, meaning the live stream will start at 0300 local on Wednesday for me. I'm willing to tough it out and stay up to watch it if necessary but it would be much more civilised if I could watch a playback. Will it be available for download later or is it live only?
  It sucks I've only just learnt about geeksessions :( Some of your earlier events look awesome
Original blog post (Score:5, Informative)

by Baloroth ( 2370816 ) writes: on Thursday July 21, 2011 @10:44AM (#36834708)

Here [backblaze.com] is a link to Backblaze's actual blog entry for the new pods 135TB, and here [backblaze.com] is the original 67TB pods. The blog article is actually quite fascinating. Apparently they are employee owned, use entirely off-the-shelf parts (except for the case, looks like), and recommend Hitachi drives (Deskstar 5K3000 HDS5C3030ALA630) as having the lowest failure rate of any manufacturer (less than 1% they say).
I found it kinda amusing that ext4's 16TB volume limit was an "issue" for them. Not because its surprising, but because... well, its 16TB. The whole blog post is actually recommended reading for anyone looking to build their own data pods like this. It really does a good job showing their personal experience in the field and problems/not problems they have. For instance: apparently heat isn't an issue, as 2 fans are able to keep an entire pod within the recommended temperature (although they actually use 6). It'll be interesting to see what happens as some of their pods get older, as I suspect that their failure rate will get pretty high fairly soon (their oldest drives are currently 4 years old, I expect when they hit 5-6 years failures will start becoming much more common.) All in all, pretty cool. Oh, and it shows how much Amazon/ Dell price gouges, but that shouldn't really shock anyone. Except the amount. A petabyte for three years is $94,000 with Backblaze, and $2,466,000 with Amazon.
P.S. I suspect they use ext4 over ZFS because ZFS, despite the built in data checks, isn't mature enough for them yet. They mention they used to use JFS before switching to ext4, so I suspect they have done some pretty extensive checking on this.

Share
twitter facebook
- - Re: (Score:2)
    
    by Baloroth ( 2370816 ) writes:
    
    They mention that they have one guy dedicated to building new pods and replacing old drives. Out of ~9000 drives and ~200 pods, they replace ~10 drives per week, and with the RAID6 data redundancy the chance of losing data is absolutely minimal. RAID6 uses 2 drives for data parity, so I believe you would need 3 drives out of 45 to fail within a week to actually lose data. I suspect they would shut a pod down if 2 drives in it failed at the same time. Since the failure rate, including infant mortality, is on
  - Re: (Score:2)
    
    by Zemplar ( 764598 ) writes:
    
    That's bogus advice. ZFS supports hot-swap and you can replace drives as they fail, if you like, but one of the biggest benefits to ZFS is that ZFS corrects corruption other filesystems can't detect.
- - Re: (Score:2)
    
    by Baloroth ( 2370816 ) writes:
    
    True. And I'll be honest, I didn't really think of that. Also, I'm pretty sure Amazon's service is intended for use with way smaller amounts of data, in which case it becomes much more cost effective and reasonable. Still, when you could build and maintain servers with 25 times the amount of data for the same cost as Amazon, I'm inclined to say that Amazon is overcharging. It maybe that everyone else in the market does too, and I acknowledge that building that kind of infrastructure and software is no mean
... but you can't use it (Score:3)

by savanik ( 1090193 ) writes: on Thursday July 21, 2011 @10:51AM (#36834810)

With the latest bandwidth caps I'm seeing on my provider (AT&T U-verse), I can download data at a rate of 250 GB per month. So it'll take me 45 YEARS to fill up that 135 TB array. Something tells me they'll have better storage solutions by then.
In the meantime, I'm just waiting for Google to roll out the high-speed internet in my locale next year - maybe then I'll have a chance at filling up my current file server.

Share
twitter facebook
- Re: (Score:2)
  
  by GodfatherofSoul ( 174979 ) writes:
  
  Crazy enough, you can actually *buy* content instead of downloading it from Pirate Bay.
- Re: (Score:2)
  
  by pz ( 113803 ) writes:
  
  These pods are not intended for the individual user. Your ability to saturate a home pipe without filling up 135 TB isn't relevant.
Engineering competence does give an edge (Score:2)

by gweihir ( 88907 ) writes:

I did something a bit similar on a smaller scale about 9 years ago. (Linux software RAID, 12 disk in a cheap server). The trick is to make sure that you pay something like 70% of the total hardware cost for the disks. It is possible, it can be done reliable, but you have to know what you are doing. If you are not a competent and enterprising engineer, forget it (or become one). But the largest cost driver in storage is that people want to buy storage pre-configured and in a box that they do not need to unde
- Re: (Score:2)
  
  by Walker1337 ( 2400896 ) writes:
  
  But the largest cost driver in storage is that people want to buy storage pre-configured and in a box that they do not need to understand. This is not only very expensive, (when I researched this 9 years ago, disk part of total price was sometimes as low as 15%!), but gives you lower performance and lower reliability. And also less flexibility.
  You aint kidding. I have installed systems for people that cost hundreds of thousands of dollars and they cant even give me basic information in order to complete the install. How many disks to each head? No Idea. How big do you want your RAID groups? No idea. Excuse me sir this IP and Gateway are in different subnets can I have another? That last one has actually happened more than once.
One guy? (Score:2)

by EvilStein ( 414640 ) writes:

hmm.
What the hell else is Sean doing with his time? That's what the articles are really missing...
- Re: (Score:2)
  
  by h4rr4r ( 612664 ) writes:
  
  $2000 to connect 68 drives seems crazy cheap. A good raid controller can cost more than that.
- Re: (Score:3)
  
  by tomz16 ( 992375 ) writes:
  
  Nope, not at all... $2,000 is actually really cheap IMHO. Try to find a way to connect 68 drives cheaply (RAID cards and SATA multiplier backplanes are both pretty expensive). Don't forget that you also need a custom case, motherboard, ram, cpu, PS, and cooling for everything.
- Re: (Score:2)
  
  by gman003 ( 1693318 ) writes:
  
  Yes, 2TB drives are more cost-effective (price per terabyte) than the 3TB drives. But one of the major costs for Backblaze is power and space. They pay about $2,000 per month per rack in space rental, power and bandwidth, regardless of whether that rack is using 3TB drives of 300gb drives. So the difference in hardware costs is payed back by the increased density.
- Re: (Score:2)
  
  by hjf ( 703092 ) writes:
  
  The price also includes custom made cases, fans, the power supplies, and custom-made port multiplier SATA backplanes. The custom parts make it pretty expensive, I guess.
- Re: (Score:2)
  
  by b0bby ( 201198 ) writes:
  
  The 3TB drives are $6300 for 45 at newegg, and you'll need less cases/space/power for them - it's probably a wash in the end.
  - Re: (Score:2)
    
    by Savantissimo ( 893682 ) writes:
    
    The specific drives they recommend are $130 each (Hitachi Deskstar 5K3000 HDS5C3030ALA630 http://www.newegg.com/Product/Product.aspx?Item=N82E16822145490R [newegg.com] ), $5850 for 45, ~1% failure rate vs. the 5% they were getting from other drives.
- Re: (Score:3)
  
  by kiwimate ( 458274 ) writes:
  
  You might want to read the actual blog [backblaze.com] where they explain what they use in a bit of detail. This isn't my area of expertise either, but I do know that running 10 servers is very different from running 100 servers, which is also different from running 1000 servers. There are many questions that crop up that you really don't have to consider when you're down in the smaller arenas. (E.g. patch management - manually patching 10 servers is feasible and more cost effective than having an OTS solution; manually pa
- Re: (Score:2)
  
  by L4t3r4lu5 ( 1216702 ) writes:
  
  I wouldn't be surprised if the top of the case fit flush with the hard drive cases and was used as a heatsink. Alu top case, finned, with a bank of fans in push/pull configuration, and a hot/cold arrangement of ducting along the racks.
  
  That's how I'd do it, anyway.
  - Re: (Score:2)
    
    by cmiller173 ( 641510 ) writes:
    
    You would be surprised that there is a piece of foam between the top of the case and the drives if you RTFA!
- Re: (Score:2)
  
  by Anrego ( 830717 ) * writes:
  
  The multipliers make me more nervous!
  Seriously... my experience with sata multipliers has been that they should be avoided at all costs.
  - Re: (Score:2)
    
    by bill_mcgonigle ( 4333 ) * writes:
    
    Seriously... my experience with sata multipliers has been that they should be avoided at all costs.
    SAS multipliers with SATA drives is a better risk/cost balance, for the general case.
  - - Re: (Score:2)
      
      by Anrego ( 830717 ) * writes:
      
      Oh no doubt. I mean they are using these things reliably as you said, so I'm sure it works. Same can be said about the heat issues (though I guess that would be dependant on external cooling as well).
      Just saying that the mere mention of SATA multipliers makes me cringe and fear for my data/sanity :)
- Re:Feelin' HOT HOT HOT (Score:4, Informative)
  
  by hjf ( 703092 ) writes: on Thursday July 21, 2011 @10:41AM (#36834662) Homepage
  
  This is nothing new. You've never been in a datacenter before, kid. You can ask a grownup one day and he can take you there and you will feel the heat. And NOISE. No offense, but I think you're one of those gamer kids who builds rigs for max FPS, with esoteric water cooling and silent fans everywhere.
  Yeah, no, you don't need to pamper your hardware that much. Even laptop drives work way hot (60C+) for years with no issue.
  Most servers are built that way too. The Sun x4500 is extremely densely packed. And there are hundreds running just fine.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by Lorien_the_first_one ( 1178397 ) writes:
    
    Thank you for pointing that out about laptop drives. I have one at home burning it up at over 50C.
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
  - Re: (Score:2)
    
    by Kjella ( 173770 ) writes:
    
    Well that noise are the massive fans that keep the temperature of the equipment fairly close to ambient. If you quiet down the fans, the room temperature won't change much but power-hungry components will suddenly be way, way above room temperature. I had a really crappy cabinet crammed with back-to-back disks, didn't think much of it until they started dying... checked the SMART data, oh 75C for the top drive... that's 50C or so above the ambient temperature in the room. Better cabinet with more space, mor
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Thermal design is highly non-intuitive. So you experiment, measure and have monitoring and automated emergency-shutdown in place. You do not even net fan-monitoring with this setup. Just very simple disk-temperature monitoring will tell you when a fan is down. My guess would be that they can tolerate one fan failure for some time and do a forced shutdown if two go down.
  This is for experienced engineers. I have done things like this before, and I think I could design both hardware and software for these boxe
- - Re: (Score:2)
    
    by Amouth ( 879122 ) writes:
    
    and different hardware/raid/multiplier/power harness setup..
    basically the same just updated - and worth an note about.. i wish they sold or someone sold the setup sans drives (or just the bare case) - it looks fun to mess with but don't have a lot of free time now days.
    - Re: (Score:2)
      
      by nabsltd ( 1313397 ) writes:
      
      i wish they sold or someone sold the setup sans drives (or just the bare case)
      TFA says the case is available from Protocase [protocase.com] for $875 in single unit quantities.
      A "pod" is just a standard x86 PC in this custom 4U case. Sure, it has a few specific extras, but all are standard, off-the-shelf hardware that you can easily buy. Appendix A in the Backblaze blog post [backblaze.com] gives every detail you need.
      If you start with just 15 hard drives (for a total of 45TB), then the price would be about $3300. You probably only save about $500 by using an standard case, because a decent one with room for 15 o
      - Re: (Score:2)
        
        by Amouth ( 879122 ) writes:
        
        opps - read the thing and missed the one off price on the line item - thanks for pointing that out.
    - - Re: (Score:2)
        
        by Amouth ( 879122 ) writes:
        
        5k just for the case? or is that everything sans drives?
- Re: (Score:2)
  
  by cmiller173 ( 641510 ) writes:
  
  Didn't the summary say so and provide a link to the previous story. Of course, in addition to the drives getting bigger they changed a couple other things (MB, memory, CPU, SATA cards, SATA multipliers, wiring), but it is the same case so sure it's the same.
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  The redundancy is unit based, not component based. This makes alot of sense, it's what google does. You don't have to go for expensive proprietary parts, you just buy two commodity parts (or more).
- Re: (Score:2)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes:
  
  I think that you are looking for redundancy at too small a scale: Yes, per-box, there is very little redunancy. RAID-6 makes it not completely useless; but a PSU going out will take out half the box, which will render it pretty useless until the PSU comes back online, and if the mobo dies, game over.
  
  However, as the pictures suggested, they are running rather a lot of these boxes. Their (proprietary) software layer handles storing data across all the boxes and presenting it in some useful-to-the-backblaze
- Re: (Score:2)
  
  by pnutjam ( 523990 ) writes:
  
  you could use openfiler, but you would want to swap some of your disk space for network controllers.
- Re: (Score:2)
  
  by Savantissimo ( 893682 ) writes:
  
  Number 1 thing would be more / bigger network links. I think this has used all its PCIe slots, so you might have to cut the capacity by 1/3 to put in a bigger network card - say a 4x Gigabit Ethernet card (cheap) or a 10gig card (more expensive, plus need the 10gig port to hook it to). Or get a motherboard with more slots.
  More speculatively:
  More RAM might help if you can set it up to cache the right things. Faster drives would help the IOPS (lower latency) but the bandwidth bottleneck is going to be the net
  - Re: (Score:2)
    
    by Savantissimo ( 893682 ) writes:
    
    See: http://bigip-blogs-adc.oracle.com/brendan/entry/test [oracle.com] for more about ARC, L2ARC, using SSDs with ZFS. With 128GB RAM, 550GB of SSDs, and 18TB of disk, the speedup was 8.4x over just the RAM and disks, with 20x less latency. YMMV with different workloads.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

My God... (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:3)

Re:My God... (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:3)

Re:My God... (Score:5, Insightful)

Re: (Score:2)

Re:My God... (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

Already approaching Petabytes? (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Not enough (Score:3)

Re: (Score:2)

This is a huge step forward (Score:3, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

But you cant use it without getting too hot? (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re:But you cant use it without getting too hot? (Score:4, Interesting)

Re: (Score:2)

Can't actually store 135TB of data (Score:5, Interesting)

Re: (Score:3, Informative)

Re: (Score:2)

Re:Can't actually store 135TB of data (Score:5, Informative)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Can't actually store 135TB of data (Score:4, Informative)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

RAID-6 (Score:2)

Re: (Score:2)

Anything over 2TB should be ZFS... (Score:3)

Re:Anything over 2TB should be ZFS... (Score:4, Interesting)

Re: (Score:2)

file system (Score:3)

Re: (Score:2)

Re: (Score:2)

7K for software raid? and why a low end cpu? (Score:2)

Re:7K for software raid? and why a low end cpu? (Score:5, Informative)

Re:7K for software raid? and why a low end cpu? (Score:4, Insightful)

Re:7K for software raid? and why a low end cpu? (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

They specified a RAID CPU (Score:2)

But the raid cpu is on it's own where the system (Score:2)

Backblaze is speaking about scalability in SF (Score:4, Informative)

Re: (Score:3)

Original blog post (Score:5, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

... but you can't use it (Score:3)

Re: (Score:2)

Re: (Score:2)

Engineering competence does give an edge (Score:2)

Re: (Score:2)

One guy? (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)