Apple Kicks HDD Marketing Debate Into High Gear
quacking duck writes "With the release of Mac OS X 10.6 Snow Leopard, Apple has updated a support document describing how their new operating system reports capacities of hard drives and other media. It has sided with hard drive makers, who for years have advertised capacities as '1 GB = 1,000,000,000 bytes' instead of the traditional computer science definition, and in so doing has kicked the debate between marketing and computer science into high gear. Binary prefixes for binary units (e.g. GiB for 'gibibyte') have been promoted by the International Electrotechnical Commission and endorsed by IEEE and other standards organizations, but to date there's been limited acceptance (though manufacturers have wholeheartedly accepted the 'new' definitions for GB and TB). Is Apple's move the first major step in forcing computer science to adopt the more awkward binary prefixes, breaking decades of accepted (if technically inaccurate) usage of SI prefixes?"
It's been done for years already (Score:5, Interesting)
Is Apple's move the first major step in forcing computer science to adopt the more awkward binary prefixes, breaking decades of accepted (if technically inaccurate) usage of SI prefixes?
No, it's not any kind of first major step. HDD makers already went there years ago; it's established, and people know what it means. And even though I'm quite a nerd myself, I never think of 1 terabyte as 1,048,576 megabytes. Sure, it would be great if I remembered that, or as many decimals of pi as possible, but no one really cares. It's a lot easier to remember and think that 1 terabyte is 1,000,000 megabytes, even though that's not technically so because of the binary system. Even knowing better, I still think of it that way just for the ease of it.
And it's a Mac. What did you expect? It's as far from a nerdy computer as possible. Obviously they are going to use terms and units that non-geeky people understand.
Re:It's been done for years already (Score:5, Interesting)
Your example is a bad one because it's the default case. 1 terabyte to 1024 gigabytes is easy. But how quickly can you calculate that for 4TB? 15TB? 492TB? Or, for a better example, 405GB to MB? It's just a lot easier to think 405GB = 405,000MB than to start calculating it, and it's pretty close anyway.
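The gap between the two conventions is easy to see in a few lines (a quick illustrative sketch in Python, not from the original comment):

```python
# "405 GB in MB" under the decimal vs. binary conventions.
decimal_mb = 405 * 1000   # drive-maker / SI convention: 1 GB = 1000 MB
binary_mb = 405 * 1024    # traditional CS convention:   1 GB = 1024 MB

print(decimal_mb)         # 405000 -- trivial to do in your head
print(binary_mb)          # 414720 -- needs actual multiplication
print(f"{(binary_mb - decimal_mb) / decimal_mb:.1%}")  # 2.4% apart
```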
Should have been binary from the start... (Score:2, Interesting)
... the truth is, BASE 10 should never have been used for computers in the first place. The storage hardware manufacturers just wanted to lie to make their products look better than they are (as per usual in business).
Hardware manufacturers, being close to computer science, really should have known better. They could have kept the standard and simply published both BASE 2 and BASE 10 figures, just like how, where I live, we have English AND French words on packaging.
Wait 8 weeks (Score:3, Interesting)
The difference between 2^30 and 10^9 is about 7.4%. Disk drive capacity has been growing at least as fast as CPU power, doubling every 18 months, for as long as I can remember. This means it takes about 8 weeks for drive capacity to grow by 7.4%. So by the time the marketing literature has made it through the bureaucratic process of being reviewed for release, it will probably be correct!
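The arithmetic behind that estimate checks out (a Python back-of-the-envelope sketch, assuming the comment's 18-month doubling time):

```python
import math

# Gap between a binary gigabyte (2**30) and a decimal one (10**9),
# and how long an 18-month doubling trend needs to cover that gap.
gap = 2**30 / 10**9                        # ~1.074, i.e. about 7.4%
months = 18 * math.log(gap) / math.log(2)  # growth model: 2**(t / 18)
print(f"gap: {gap - 1:.1%}")               # 7.4%
print(f"time to close it: {months * 52 / 12:.1f} weeks")  # ~8 weeks
```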
Re:Okay, so technically, (Score:2, Interesting)
The prefixes are already different: there is no centibyte or millibyte, so it's not really a scientific measurement to begin with.
So there is no problem with using them in the original context (2^10...),
and no logical reason whatsoever for the terms (KiB, MiB, GiB) to have been created in the first place.
Silly names (Score:4, Interesting)
Binary prefixes for binary units (e.g. GiB for 'gibibyte') have been promoted by the International Electrotechnical Commission and endorsed by IEEE and other standards organizations, but to date there's been limited acceptance
Nobody's going to use an annoyingly cutesy word like "gibibyte", which seems just as silly now as it did ten years ago [slashdot.org]. Using the abbreviated prefixes might be a good idea, though.
Just for reference (since some people are freaking out about how much space they're "losing") here's the percentage difference between the SI and binary sizes:
Kilobyte: 2.3%
Megabyte: 4.6%
Gigabyte: 6.9%
Terabyte: 9.1%
Petabyte: 11.2%
Exabyte: 13.3%
So for the foreseeable future your hard drive will be about 10% smaller than advertised. Not a big deal, IMHO (it's not like you're paying for the missing bits), but still worth pointing out.
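For anyone who wants to double-check the list, the percentages fall out of a short loop (illustrative Python, not part of the original comment):

```python
# Shortfall of the decimal (SI) size relative to the binary size at
# each prefix level: 1 - 10**(3n) / 2**(10n), for n = 1 (kilo) to 6 (exa).
for n, name in enumerate(["Kilo", "Mega", "Giga", "Tera", "Peta", "Exa"], 1):
    shortfall = 1 - 10 ** (3 * n) / 2 ** (10 * n)
    print(f"{name}byte: {shortfall:.1%}")  # 2.3%, 4.6%, ..., 13.3%
```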
Re:It's been done for years already (Score:5, Interesting)
The problem isn't the definition; it's that OSes and hardware manufacturers have been using different definitions. If they all stuck to factors of 1000, there would be no problem. If they all stuck to 1024, there would be no problem. The problem is that both definitions are in use.
Personally I'd vote for 1000, since it's just easier for most people. That way they could easily know that 1001 1MB files do not fit on a 1GB USB stick, and all the world would be consistent.
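With everything in base 1000, that example really is trivial arithmetic (a sketch; real sticks also lose some space to filesystem overhead):

```python
# 1001 files of exactly 1 MB (decimal) vs. a "1 GB" (decimal) stick.
files = 1001 * 1_000_000   # total bytes to store
stick = 1_000_000_000      # advertised capacity
print(files - stick)       # 1000000 bytes over -- they don't fit
```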
Re:It's been done for years already (Score:3, Interesting)
So we've had a defined standard that was, arguably, not the easiest to understand. THEN hard drive manufacturers started their fraud. And THEN people started complaining. So what, and please think about this, would be the right decision here?
The "right" solution is that things dependent on the number of address lines (cache size, RAM size) are in units measured in powers of 2^10, and things not dependent on the number of address lines (network bandwidth, HDD/SSD size) are in units measured in powers of 10^3. Files are interesting in that the base unit is a 512-byte sector but they don't depend on address lines, so they should be measured like floppy disks, where 1kB is 1024 bytes, 1MB is 1000kB, and 1GB is 1000MB, etc. -- but this is confusing, so they'll probably just consistently use steps of 1000.
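The mixed floppy convention mentioned above is the oddest of the three; the classic "1.44 MB" disk illustrates it (an illustrative Python sketch; 1,474,560 bytes is the disk's actual formatted capacity):

```python
# Three readings of a "1.44 MB" (i.e. 1440 "kB") floppy disk:
si     = 1440 * 1000               # 1,440,000 bytes: everything decimal
binary = int(1.44 * 1024 * 1024)   # 1,509,949 bytes: everything binary
mixed  = 1440 * 1024               # 1,474,560 bytes: kB = 1024, MB = 1000 kB
print(si, binary, mixed)           # only `mixed` matches the real disk
```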
Naw, not even those who know the difference care (Score:3, Interesting)
It is kind of like the rated speed of a network card. Sure, I've got a gigabit ethernet card. But unlike, I assume, most non-nerds, I *know* it doesn't move a giga*byte* per second -- it moves a giga*bit* per second. So how many seconds does it take to move a giga*byte*? Well, I almost always convert GB to Gb by just multiplying by ten. Yeah, there are 8 bits in a byte and I should be using 8, but there is all kinds of error correction and other stuff shoved down the pipe too that I should be accounting for. So I figure 10 is good enough, and the math is easy. With WiFi, I'd probably use 11 or 12 bits per byte. Basically, I don't care about the *exact* number, I just want an estimate.
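That rule of thumb is easy to capture (a hypothetical helper; the 10-bits-per-byte fudge factor is the comment's estimate, not a protocol constant):

```python
def transfer_seconds(gigabytes, link_gbps, bits_per_byte=10):
    """Rough transfer time; framing/overhead is folded into bits_per_byte."""
    return gigabytes * bits_per_byte / link_gbps

print(transfer_seconds(1, 1))       # 10.0 s: 1 GB over gigabit ethernet
print(transfer_seconds(1, 1, 12))   # 12.0 s: with a WiFi-ish fudge factor
```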
Same with how big a file is. Unless I'm writing code and need to verify I'm writing out the *exact* number of bytes, I figure the numbers I see are either rounded to the hard drive's block size or account for other stuff. Heck, even Explorer gives you two file sizes on its property panel. And unless you add that cute little -h to df, most implementations will show you a number based on block size, and *that* number depends on an environment variable.
In short, there are multiple standards, and for most use cases we are just looking for estimates of file size or transfer speed. There are almost always hidden assumptions.
That all said, if I've got a file that contains the hex dump below, I better get back 6 bytes from my OS. ls -l shows the right number.
coryking@localhost ~ $ hexdump -C testing
00000000  74 65 73 74 2e 0a                                 |test..|
00000006
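The same six-byte check can be scripted (an illustrative Python sketch, not from the original comment):

```python
import os

# Write exactly the bytes from the hexdump, then ask the OS for the size.
with open("testing", "wb") as f:
    f.write(b"test.\n")              # 74 65 73 74 2e 0a

print(os.path.getsize("testing"))    # 6 -- no unit rounding at this scale
os.remove("testing")
```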
PS: Those weren't "junk" characters, slashcode! When are you going to get a better editor? Steal the one used by stackoverflow: you put `` around something and it interprets it as code.
PPS: Just learned there was a hexdump utility. Cool!
Re:Should have been binary from the start... (Score:2, Interesting)
But computer scientists aren't using a base 2 system. They are using a base 1024 system. A base 2 system would be built around the following numbers: 2^1, 2^2, 2^4, 2^8, 2^16 and so on. 2^10 is nothing more than a perversion.
Re:It's been done for years already (Score:3, Interesting)
> So we've had a defined standard that was, arguably, not the easiest to understand.
> THEN hard drive manufacturers started their fraud. And THEN people started complaining.
> So what, and please think about this, would be the right decision here?
As far back as I know, and this goes back before the 1970s, C.Sci boffins picked up a defined pseudo-standard (that 1024 was close enough to 1000 to use K, etc.) for concepts that required *only* direct binary addressability, like RAM and CPU registers/caches. Everything else used a base 10 definition of K right from the start -- that includes tape drive storage, hard drive storage, bandwidth rates, CPU frequencies, display frequencies, screen resolution, sampling rates and so on.
The idea that 1K = 1024 for "everything in a computer" is relatively new. The old guard knew exactly when it was appropriate to use, and did not use it for concepts outside that domain. It's only since the mid 1990s that geek kids fresh out of school have wanted to use it everywhere. Hell, go into a geek IRC channel (usually a bastion of relatively conservative C.Sci geeks) and ask how many Hertz are in a 1GHz processor, and a fair number will insist it's 1,073,741,824Hz, or that 10Mbps ethernet is 10,485,760bps. They'd be wrong, too.
Re:Let's see them be consistant. (Score:3, Interesting)
But only under the pie chart. Free+Total still adds up to 4GB (3.81GB + 199MB at the moment).
So yeah, someone probably added the new math in the wrong place.
Re:It's been done for years already (Score:3, Interesting)
Well...
Since we are talking about numeric values meant for human consumption, the selection of SI versus IEC units is arbitrary. Since most human beings tend to think in powers of 10, the SI units could be considered the more appropriate for the task.
Now technically, when it comes to media, the actual amount of storage available doesn't necessarily need to be a power of 2. Yes, the maximum capacity of a given random access medium is limited by the largest value that can be addressed, which is a power of two. However, the actual number of data words or address locations doesn't need to be. This is why we are able to have data structures of any length (e.g. char a[10];).
You merely agree with me (but it's not so obvious) (Score:2, Interesting)
I already stated that it "would be a pain to convert to and from base-2", but your reply, while seeming to agree with me, shows me where you are perhaps overlooking some things:
> every load and store operation
PCM wouldn't be used for the level-1 or probably even the level-2 caches, so we are actually talking about loads and stores of cache lines, which are many bytes long. Those loads/stores already required something like 100-200 clock cycles on the last system for which I did low-level optimization (though I admit that was quite a while ago). So it's not totally clear that it's impossible to build some kind of pipelined asynchronous base converter which could convert fast enough (at small enough geometries) to make it worthwhile.
Re:Tilting at the wrong questions (Score:1, Interesting)
If you want to talk about 1024^3, then it's Gibi. Gibi is 2^30 since it was created. It was never, ever meant to be anything but 2^30.
Perhaps, but most importantly, it was never, ever meant to be used by anybody with the slightest remaining shred of self-respect. "'Gibi'? You actually want me to use the prefix 'Gibi'? In company communications? That somebody might actually read?" There are three types of people who use the prefix "Gibi" (and any of the other "ibi"'s):
1. Members of the high-school computer club who feel that they aren't getting their fair share of wedgies, beatings, and general abuse.
2. The insufferable know-it-all coworker that even the other insufferable know-it-all coworkers can't suffer.
2a. People like the NASA tools who name Mars rocks "Scooby Doo" and "Yogi Bear" instead of "Rock #567-93" or "Rock Which Cost You-The-Taxpayer A Bajillion Dollars So We Could Name It 'Scooby Doo'".
3. NOBODY ELSE EVER.
The problem here is all too clear: using the "ibi" prefixes makes you and/or your company look ridiculous, and will get your kids beaten by other kids. Hence, these prefixes will never be adopted by any serious individual or company.
The secondary problem is that inventing new prefixes, regardless of how silly they sound, is the wrongest possible way of solving this non-issue. Let's take a look at how similar issues have been solved throughout the ages in a much more non-controversial and sensical manner:
Distance:
Joe Blow: "miles".
Pirate Joe Blowbeard: "nautical miles"
Joe McBlow of the Clan McBlow: "kilometers"
Astrophysicist Dr. Joseph Blow, PhD: "AUs"
Weight:
Joe Blow: "pounds"
Physicist Joe Blow: "pounds-force"
John Smith-upon-Avon: "kilograms"
Mass:
Joe Blow: "pounds"
Physicist Joe Blow: "pounds-mass"
Joe Blow, precious metal dealer: "troy ounces"
See the pattern? It wasn't the prefix changing in efforts to rationalize the needs of different users for conceptually-similar-but-not-the-same units, it was the units themselves.
So, if you haven't already beaten me to the punch, let me tell you how this tragedy of modern unitology should have been solved: Yep, define a new unit, the "k". Spell it however you want (like "Kay" maybe, but don't get cute with something like "Kai") if you're worried about conflicts with "K" (Kelvin) or "k" (prefix for one thousand). Problem solved:
1. It's already in common use, so there'd literally be no change required in day-to-day conversation: "Yeah boss, I don't know if this 128k SRAM chip in our product is going to be enough, I think we'll need to go to a 512k part."
2. The normal and non-silly SI prefixes still work the base-10 way they always did: "I just got a new 5 terabyte HDD! Yeah baby, that's 5,000,000,000,000 bytes! What, how many k is that? Um... well, I don't know why you'd care, but it's about 4,882,812,500 k, or approximately 4,882,812 kk, or 4,882 Mk! Yeah, I rule."
3. The "Gibi" and other "ibis" which have caused so many so much pain can be expunged from the SI's records permanently.
Let's solve problems instead of simply creating more while making ourselves look like fools for once, huh?
The SI prefixes have been around for nearly 5 decades [...] Why can't we, the C.S. people, accept that?
Because like the man said, "London calling, yeah, I was there, too / And you know what they said? Well, some of it was true.". CompSci is a very immature field, for every definition of "immature". I frankly believe that this "ibi" nonsense was just made up by the SI to give the field an official wedgie.
Re:It's been done for years already (Score:3, Interesting)
Gigabyte does mean 1,000,000,000 bytes. Giga means billion. It does not mean 1024 * 1024 * 1024 bytes. Mega means million, kilo means thousand.
I can't understand why people are actually arguing that doing it wrong is right.
There are even proper units for the 1024 units. Kibi-, mebi-, gibi-, and so on.
People keep harping on about these "proper" units, but the reality is that there's no way in hell you'll ever get anyone but obsessive geeks (the kind that develop OSS software) to adopt prefixes that sound like something you feed your cat. Seriously, those "proper prefixes" suck.