WD's Monster 2TB Caviar Green Drive, Preview Test
MojoKid writes "Today Western Digital is announcing their WD20EADS drive, otherwise known as
the WD Caviar Green 2.0TB. With 32MB of onboard cache and special power management algorithms that balance spindle speed and transfer rates, the WD Caviar Green 2TB not only breaks the 2 terabyte barrier but also offers an extremely low-power
profile in its standard 3.5" SATA footprint. Early testing shows it keeps pace with similar capacity drives from Seagate and Samsung."
backups (Score:5, Interesting)
What the hell do you do to back up your 2TB drive?
That much storage in a single unit seems kind of dangerous.
Nice (Score:5, Interesting)
The cache on this drive is 8x larger than the capacity of my first hard drive.
Re:That was quick, but normal (Score:5, Interesting)
Hard drive capacity is no longer exponential. They have hit some limits that are pretty hard to overcome. They're still making progress but not nearly as fast as in years past. Additionally, drives larger than 640 GB or so seem to have some reliability problems. I just recently upgraded my RAID arrays and went with smaller 640 GB drives because they have proven more reliable even though it would have been cheaper for me to go with newer larger drives.
The OP was wrong about it being one year anyway.
I hate hard-drives. I wish SSD technology would improve. It's not just price, the current drives are unreliable as hell. I trust regular old mechanical spinning devices a lot more than the current SSD crap.
Re:Powers of 2 (Score:5, Interesting)
Yeah... never mind units that fit in with what's being measured
or are computationally convenient. What we really need are
arbitrary metrics created by bureaucrats on a power trip.
7 hours? (Score:4, Interesting)
It's worse than you think. Even if you have a place to back it up, the I/O rates on modern HDs aren't increasing nearly as fast as capacity. Reading at top speed, it would take almost 7 hours to pull all the data off this drive, even if you have someplace to put it. Similarly, if you're using it as part of a RAID set, it'll take that long to rebuild if you have a failure.
Pretty soon (capacity)/(read rate) on these drives will be a significant fraction of the MTBF; that will make for fun all around.
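The arithmetic behind that "almost 7 hours" figure is easy to reproduce. This sketch assumes a sustained sequential read rate of roughly 80 MB/s, which is a plausible guess for a 5400-RPM-class drive of this era rather than a number from the article:

```python
# Back-of-envelope: time to read an entire 2 TB drive sequentially.
# The 80 MB/s sustained rate is an assumption, not a measured spec.
capacity_bytes = 2 * 10**12       # 2 TB, decimal, as marketed
read_rate = 80 * 10**6            # ~80 MB/s sustained (assumed)

seconds = capacity_bytes / read_rate
hours = seconds / 3600
print(f"{hours:.1f} hours")       # ~6.9 hours
```

And a RAID rebuild has to do exactly this read over every surviving member, which is why rebuild times scale the same way.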
32MB On Disk Cache (Score:4, Interesting)
I was thinking about this the other day: does the 32MB on-disk cache really matter?
Think of it this way: the Linux kernel does disk caching with my free RAM (which I generally have more than 32MB of) according to some reasonable locality scheme (LRU or something).
If the HDD does the same caching according to nearly the same principles, won't the data on the disk cache nearly always be a subset of the disk cached in RAM? Meaning: doesn't the disk cache have no effect whatsoever?
I'm genuinely interested in an answer to this question, even if it is a little OT. Please burn a little karma for me :)
Re:Powers of 2 (Score:3, Interesting)
No, I'm quite sure the OP meant 1800 gigabytes, or about 15.46 terabits.
Established convention is that bytes are measured in binary (powers of 1024), and bits in decimal (powers of 1000). There's no need to introduce ridiculous-sounding terms like "gibibytes".
(Incidentally, I suspect there would be a lot less resistance to these newfangled units if they'd had the sense to pick names people could be expected to say with a straight face...)
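Under the convention described above (bytes binary, bits decimal), the parent's conversion actually checks out. A quick sketch:

```python
# 1800 "binary" gigabytes (i.e. GiB) converted to decimal terabits.
gigabytes = 1800
bytes_total = gigabytes * 1024**3   # binary convention for bytes
bits_total = bytes_total * 8
terabits = bits_total / 10**12      # decimal convention for bits
print(f"{terabits:.2f} terabits")   # ~15.46
```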
Re:Powers of 2 (Score:5, Interesting)
Your argument would carry more weight if the manufacturers were doing this for the benefit of humans. In fact, they mix units - using the 1024-standard units for cache.
RAM specifications use the 2^x numbering because the device is physically constructed as a square grid of cells with power-of-two numbers of rows and columns. There's a direct mapping between bits on the address bus and the cell that is selected.
In the early days it was convenient to say that 1024 was close enough to 1000, so RAM sizes were quoted in "KB". However, the error in this increases with each step up in size. By the time you get to the TB scale it's no longer a reasonable approximation.
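The growth of that error is easy to tabulate; this quick sketch is my own arithmetic, not figures from the comment:

```python
# Relative error of treating 1024^n as 1000^n, for each prefix step.
for n, prefix in enumerate(["KB", "MB", "GB", "TB"], start=1):
    error = (1024**n - 1000**n) / 1000**n
    print(f"{prefix}: {error:.1%}")
# KB: 2.4%   MB: 4.9%   GB: 7.4%   TB: 10.0%
```

By the terabyte scale the "close enough" shortcut is off by a full tenth, which is exactly why the rounding that was harmless at the KB scale now breeds lawsuits over "missing" capacity.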
Magnetic storage does not have this constraint. The sector size is (arbitrarily) set at 512 bytes and hard drives usually have an even number of read/write heads, but apart from that there are no powers of two. The number of cylinders on the drive, and the number of sectors per cylinder, are arbitrary.
Communication speeds (e.g. "9600 baud", "100 Mb/s") are also not specified in power-of-two sizes, because they are natively dealing with individual bits and not power-of-two-sized chunks.
Therefore, there is nothing wrong with saying that a drive is "1 TB with 32 MiB of cache". As long as the manufacturer uses the SI and kilobinary notation correctly, users should not complain. Save your anger for the marketroids at WD who come up with features like "IntelliSpeed" in order to sell you a 5400-RPM drive and make you think you're buying a 7200-RPM one.
tl;dr version: Just use the damn GiB/GB notation consistently and get over it.
Re:backups (Score:3, Interesting)
RAID 5 gets increasingly dangerous as drive size increases. Going strictly by published unrecoverable read error rates (1 per 10^14 bits read on recent Seagate desktop drives), the chance of data loss during a rebuild can be very high - 48%, assuming a five-drive array of 1.5 TB drives with this failure rate.
Of course, these figures don't mean that 10^-14 of all bits read will result in a failure. It also doesn't mean that an error will manifest as a flipped bit - instead, one or more sectors will be unreadable (512 bytes each).
The risks of large-capacity RAID5 arrays can be mitigated by using more reliable drives or a scheme that can tolerate more failures. WD's desktop drives have an error rate of 1 in 10^15; enterprise drives from all manufacturers are usually rated at 1 in 10^16. RAID6 can sustain the failure of any two drives: a RAID6 array of six 1.5 TB drives with a 10^-15 bit error rate has a failure rate during rebuild of under 1%.
Re:Powers of 2 (Score:2, Interesting)
Since when is "the number of possible configurations of a 10-bit sequence" relevant to anything? 1024 was picked as a rough approximation of 1000 in the early days of binary computers, because users couldn't make sense of the raw binary data and an accurate decimal conversion was too computationally intensive. It was a kludge from day 1.
And even in the "computer world", measurements don't magically work out to be powers of two. Clock speeds, network speeds, platter density: all of these are limited by the manufacturing technology and the laws of physics, not binary math, so calculations aren't fundamentally any "cleaner" in binary than they are in decimal. The only reason binary is used in the computer world is that computers do math in binary (hexadecimal being just a compact notation for it).
It makes sense for computers to do internal calculations in their own native base. But by the same token, it makes sense for humans to do math in their own "native" base, which is almost universally base 10. So when a computer presents data to a human, it should present it in the most human-readable format possible.
I agree with you on one thing: SI prefixes aren't right because they're "official". They're right because they let you do math in your head.
Re:backups (Score:2, Interesting)
Re:Powers of 2 (Score:2, Interesting)
WD Layoffs (Score:2, Interesting)
Re:Powers of 2 (Score:2, Interesting)
Everybody else will have a much easier time using 1024 when dealing with computers.
When is it ever a problem other than when people forget the difference? Do morons really need help calculating things? 1024 MB = 1 GB after all. Pro tip: Most computers can do calculations for you.
Memory is byte-addressable, after all.
Clusters for storage are obviously using powers of 2.
Characters are 1 byte. (2 for foreigny-type stuff.)
Asking scientists to dumb it down is retarded.
Asking them to lie to users is retarded as well.
It's only an issue when some moron messes up and needs something to blame their mistake on. What, did you think 0x23 meant twenty-three? Does 777 mean seven hundred seventy-seven? No one ever bitches about hex, octal, or other schemes. It's always decimal cry babies who fucked up a calculation because they were lazy / out of their realm.
You don't see any computer scientists bitching about how they divided by 1024 instead of 1000 and fucked up.
Re:2,000,000,000,000 (Score:3, Interesting)
And, if you get into any visual effects for your hours of uncompressed video, you may eventually discover the joys of the multilayer exr format. It's currently becoming very popular for rendering multipass CG in the effects world, and basically it allows you to render one file which contains separate images for matte passes, diffuse shading, base color, specular lighting, reflection, etc., etc. The new workflows available make this technique suddenly much more popular. And, a lot of studios will render out 32 bit per channel instead of the 10 bit per channel listed in the table you linked. So, multiply the data rates at the top end of that chart by a factor of at least fifteen when talking about CG. Then double it because it'll be done progressive instead of interlaced. Then double it again because you wouldn't render to 4:2:0 - you'd render to 4:4:4. Then quadruple it because you want to work at 4k instead of 2k HD. Then, if you want to work at the high end in 8k, quadruple it again. (Though, not many people are currently working in 8k, and those who are do so at 24p, not 60p. So, the final quadrupling is probably unfair.)
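Stacking those multipliers makes the point concrete. This is just the comment's own arithmetic spelled out; the factor labels are paraphrases, not figures from the linked table:

```python
# Cumulative data-rate multiplier from the factors listed above.
factors = {
    "32-bit multilayer EXR vs 10-bit single-pass": 15,  # "at least fifteen"
    "progressive vs interlaced": 2,
    "4:4:4 vs 4:2:0": 2,
    "4k vs 2k": 4,
}
total = 1
for name, factor in factors.items():
    total *= factor
print(f"combined multiplier: {total}x")
# 240x over the chart's top end, or 960x if you add the (unfair) 8k step
```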
Now, think about how many iterations you go through as the director says "Make this part faster. Make the wing flaps longer. Make Jar Jar die." Whatever. You wind up with umpteen versions of a sequence that you want to keep around for review and comparison.
So, yeah. There are plenty of fields where 2 TB is a tiny joke, rather than being enough for a lifetime of data. I just happen to be involved with one of them. :) Some studios passed the 100 TB online mark years ago. Hollywood will take all the storage the engineers can give them. And big GPUs. :)