Changes in HDD Sector Usage After 30 Years 360
freitasm writes "A story on Geekzone tells us that IDEMA (the International Disk Drive Equipment and Materials Association) is planning to implement a new standard for HDD sector usage, replacing the old 512-byte sector with a new 4096-byte sector. The association says it will be more efficient. According to the article, Windows Vista will ship with this support already."
Ah, error correction. (Score:5, Insightful)
Re:Ah, error correction. (Score:5, Informative)
Unlike CD-ROMs, I don't believe you can actually read the sector meta-data without some sort of drive-manufacturer-specific tricks.
Re:Ah, error correction. (Score:5, Informative)
What are you calling meta-data ?
CDs also have "merging bits", and what is read as a byte is in fact coded on disc as 14 bits. You can't read C2 errors either; those lie beyond the 2352 bytes that really are all used as data on an audio CD. An audio sector is 1/75 of a second: 44100/75 * 2 (channels) * 2 (bytes per sample) = 2352 bytes, and it has correction codes in addition to that. You can, however, read subchannels (96 bytes / sector).
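The sector arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation quoted in the comment; the constant and function names are made up for illustration.

```python
# Audio CD parameters as quoted in the comment above.
SAMPLE_RATE_HZ = 44_100      # samples per second, per channel
SECTORS_PER_SECOND = 75      # one audio sector is 1/75 of a second
CHANNELS = 2                 # stereo
BYTES_PER_SAMPLE = 2         # 16-bit samples
SUBCHANNEL_BYTES = 96        # readable subchannel data per sector


def audio_sector_bytes() -> int:
    """Bytes of audio payload in one CD audio sector."""
    samples_per_sector = SAMPLE_RATE_HZ // SECTORS_PER_SECOND  # 588
    return samples_per_sector * CHANNELS * BYTES_PER_SAMPLE


print(audio_sector_bytes())                     # 2352
print(audio_sector_bytes() + SUBCHANNEL_BYTES)  # 2448
```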
When dealing with such low-level technologies, "reading bits on disk" doesn't mean much, as there really are no bits on the disc: just pits and lands (CD) or magnetic particles (HD) causing little electric variations on a sensor. No variation is interpreted as a 0 and a variation is interpreted as a 1, and you need variations even when writing only 0's, as a reference clock.
without some sort of drive-manufacturer-specific tricks.
Now of course, as you cannot swap HD platters between drives with different heads like you can with a CD, each manufacturer can (and will!) encode differently. It has been reported that hard disks with the same model reference wouldn't "interoperate" when exchanging the controller part because of differing firmware versions, whereas the format is standardized for CDs and DVDs.
they actually store near 600 bytes
(that would be 4800 bits) In that light, they're not storing bytes, just magnetizing particles. Bytes are quite high-level. There are probably more than ten thousand magnetic variations for a 512-byte sector. What you call bytes is already what you can read
Here's an interesting read [laesieworks.com] quickly found on Google just for you
Re:Ah, error correction. (Score:2)
I've had to change the controller on a few hard drives for clients who did some really stupid things to their drives, but didn't w
Re:Ah, error correction. (Score:3, Interesting)
Re:Ah, error correction. (Score:3, Informative)
Re:Ah, error correction. (Score:2, Informative)
Cluster size? (Score:3, Interesting)
Re:Cluster size? (Score:5, Informative)
problems for older hardware??? (Score:2)
All too often, an advantage such as a speed improvement is more than countered by added overhead junk.
now maybe I should RTFA...
It can't be. (Score:2)
If this were transparently implemented by the hardware, the OS would frequently try to write a single 512 byte sector. In order for this to work, the hard drive controller would have to read the existing sector then write it back with the 512 bytes changed. This is a big waste, as a read then a write costs at least a full platter rotation (1/120 second at 7200 RPM). Do this hundreds or thousands of time
Hrm, that kind of makes sense... (Score:2, Insightful)
As filesystems are slowly moving towards larger block sizes, now that the "wasted" space on drives due to unused space at the ends of blocks is not as noticeable, moving up the s
Re:Hrm, that kind of makes sense... (Score:3, Informative)
Re:Hrm, that kind of makes sense... (Score:3, Informative)
Re:Hrm, that kind of makes sense... (Score:3, Informative)
Re:Hrm, that kind of makes sense... (Score:2)
Good for small devices (Score:4, Interesting)
Also, squaring away each sector after processing is a round trip back to the filesystem which can be eliminated by reading a larger sector size in the first place.
Some semi-ATA disks already force a minimum 4096-byte sector size. It's not necessarily the best way to get the most usage out of your disks, but it is one way of speeding up the disk just a little bit more to reduce power consumption.
Re:Good for small devices (Score:2)
Things don't work that way. (Score:2)
Well sorry, but that's the way it is.
Hard drives generally have the ability to read/write multiple sectors with a single command (go read the ATA standards). And DMA is usually used [programmed I/O just plain sucks].
I don't see how changing the sector size is going to save power... Eit
Re:Things don't work that way. (Score:2)
There are losses of file space for file systems that contain lots of small files: the maximum space wasted by having only one byte on a block goes from 511 bytes to 4095 bytes, and that will affect available disk space in some systems that use lots of small files.
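The worst case described above is easy to reproduce. A minimal sketch, assuming a file system that allocates whole blocks and does no tail packing; `slack_bytes` is a hypothetical helper, not part of any real API.

```python
def slack_bytes(file_size: int, block_size: int) -> int:
    """Bytes left unused in the last block of a file of file_size bytes."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data block
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


# Worst case: a one-byte file wastes almost the whole block.
for block in (512, 4096):
    print(block, slack_bytes(1, block))

# Total waste for a (made-up) set of small files under each block size.
sizes = [1, 700, 4096, 5000]
print(sum(slack_bytes(s, 512) for s in sizes))
print(sum(slack_bytes(s, 4096) for s in sizes))
```

On average a file wastes about half a block, so the total cost scales with the number of files rather than with their size.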
Re:Good for small devices (Score:2)
In Vista already? (Score:5, Funny)
Why just one standard? (Score:2, Interesting)
It's all about Format Efficiency (Score:5, Informative)
During the transition from 512-byte to 1K, and ultimately 4K sectors, HDDs will be able to emulate 512-byte modes to the host (i.e. making a 1K or 4K native drive 'look' like a standard 512-byte drive). If the OS is using 4K clusters, this will come with no performance decrease. For any application performing random single-block writes, the HDD will suffer 1 rev per write (for a read-modify-write operation), but that's really only a condition that would be found during a test.
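To make the read-modify-write point concrete, here is a toy model of a 4K-native drive that presents 512-byte sectors to the host. It is a sketch only: `EmulatedDrive`, its fields, and the 7200 RPM figure are assumptions for illustration, and a real drive does this in firmware with caching that the model ignores.

```python
PHYSICAL = 4096   # native sector size in bytes
LOGICAL = 512     # sector size presented to the host


def revolution_ms(rpm: float) -> float:
    """Time for one platter revolution, in milliseconds."""
    return 60_000 / rpm


class EmulatedDrive:
    """Toy 4K-native drive exposing 512-byte logical sectors (hypothetical)."""

    def __init__(self, physical_sectors: int) -> None:
        self.media = [bytearray(PHYSICAL) for _ in range(physical_sectors)]
        self.rmw_count = 0  # physical sectors that needed a read-modify-write

    def write(self, lba: int, data: bytes) -> None:
        """Write whole logical sectors starting at logical block address lba."""
        if len(data) % LOGICAL:
            raise ValueError("data must be a whole number of logical sectors")
        pos = lba * LOGICAL
        end = pos + len(data)
        src = 0
        while pos < end:
            phys, off = divmod(pos, PHYSICAL)
            n = min(PHYSICAL - off, end - pos)
            if n < PHYSICAL:
                # Only part of the physical sector changes, so the old
                # contents must be read before the sector is rewritten.
                self.rmw_count += 1
            self.media[phys][off:off + n] = data[src:src + n]
            pos += n
            src += n


drive = EmulatedDrive(physical_sectors=4)
drive.write(0, bytes(PHYSICAL))   # aligned 4K write: no read-modify-write
aligned = drive.rmw_count
drive.write(3, bytes(LOGICAL))    # single 512-byte write: one read-modify-write
print(aligned, drive.rmw_count)
print(round(revolution_ms(7200), 2), "ms per revolution at 7200 RPM")
```

An OS that already writes aligned 4K clusters never takes the slow path in this model, which matches the claim that emulation costs nothing in that case.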
Seems good to me. (Score:3, Informative)
LBA addresses on sector boundaries, so for larger HDDs you need more bits (the current 28-bit LBA, which some older BIOSes support, means a maximum of 128GB: 2^28*512 = 2^28*2^9 = 2^37). Since 512-byte sectors were used for 30 years, I think it is easy to assume it will not last for 10 more years (getting to the LBA32 limit). So why not shave off 3 bits and also make it an even number of bits (12 against 9).
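The capacity figures follow directly from the address width and the sector size. A short sketch of that arithmetic; `max_capacity_bytes` is a hypothetical helper name.

```python
GIB = 1 << 30


def max_capacity_bytes(lba_bits: int, sector_bytes: int) -> int:
    """Largest drive addressable with lba_bits of sector address."""
    return (1 << lba_bits) * sector_bytes


print(max_capacity_bytes(28, 512) // GIB)    # 128
print(max_capacity_bytes(28, 4096) // GIB)   # 1024

# Going from 512-byte to 4096-byte sectors is worth log2(4096 / 512) bits.
print((4096 // 512).bit_length() - 1)        # 3
```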
Also there is something called "multiple block access", where you make only one request for up to 16 (on most HDDs) sectors. For 512-byte sectors you have 8K, but for 4K sectors that means 64K. Great for large files (I/O overhead and stuff).
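A quick way to see the effect on large files is to count requests. This sketch takes the 16-sector limit from the comment at face value (actual limits vary by drive), and the function names are invented for the example.

```python
SECTORS_PER_REQUEST = 16  # figure quoted above; an assumption, not a spec value


def bytes_per_request(sector_bytes: int) -> int:
    """Most data one multi-sector request can move."""
    return SECTORS_PER_REQUEST * sector_bytes


def requests_needed(file_bytes: int, sector_bytes: int) -> int:
    """Requests needed to transfer file_bytes, rounding up."""
    per_request = bytes_per_request(sector_bytes)
    return -(-file_bytes // per_request)  # ceiling division


one_mib = 1 << 20
for sector in (512, 4096):
    print(sector, bytes_per_request(sector), requests_needed(one_mib, sector))
```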
On the application side this should not affect anyone using 64-bit sizes (since only the OS would know of sector sizes); as for 32-bit sizes, it already is a problem (4G limit).
So this should not be a problem, because on a large partition you will not have too much wasted space (I have around 40MB wasted space on my OS drive for 5520MB of files, and I would even accept 200MB).
Boot sector virii (Score:5, Funny)
Re:Boot sector virii (Score:2)
Oh great, now my viruses can be bloatware, too. I guess with that much space, they can even install a GUI for the virus, or maybe "Clippy" to keep me distracted while he formats my hard drive.
Re:You got modded up funny but? (Score:3, Informative)
(Boot sector is one, so we start off odd right after boot sector. There are usually 2 FAT copies (even), so after FAT offset stays odd. For root directory size, there is usually no compelling reason to make it an even size, however usually Windows makes it an even size anyways, guaranteeing t
Re:You got modded up funny but? (Score:3, Interesting)
You're all complaining about tiny files... (Score:2, Insightful)
Re:You're all complaining about tiny files... (Score:3, Informative)
Which is good, you don't really want lots of small files anyway.
If you are using Windows, you can see how much space is wasted at the moment: just right-click on a directory, and it will tell you how much data is in the files, and how much disk space
Re:You're all complaining about tiny files... (Score:3, Interesting)
Of course one effect of the new sector size will be that old filesystem drivers, esp. those which come with old OSs, will likely not be able to use those disks. Which in effect means that if you want to use such a disk, you absolutely will have to upgrade your OS.
It's so that ECC can handle bigger bad spots (Score:4, Interesting)
Redmond thinks they're so smart... (Score:3, Funny)
Oh YEAH? Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!
File sizes (Score:3, Interesting)
System Pages, RAID, Tail Blocks, and Addressing (Score:5, Insightful)
First of all, most OSes these days use a memory page size of 4k. Having your IO system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.
Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).
Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).
Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did split its larger 32kb or 64kb blocks (which were chosen to keep block addresses small and large file streaming faster) into disk sectors and store tails in them.
NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).
To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet [wikipedia.org].
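The tail packing idea described above can be sketched as a small bin-packing exercise. This is not ReiserFS's actual algorithm, just a first-fit illustration under the assumption that tails may share blocks freely; both function names are hypothetical.

```python
def blocks_without_packing(sizes, block):
    """Blocks used when every file gets whole blocks to itself."""
    return sum(-(-size // block) for size in sizes)  # ceiling division


def blocks_with_packing(sizes, block):
    """Blocks used when partial last blocks (tails) share blocks, first-fit."""
    full_blocks = sum(size // block for size in sizes)
    tails = sorted((size % block for size in sizes if size % block),
                   reverse=True)
    free_space = []  # free bytes left in each shared tail block
    for tail in tails:
        for i, free in enumerate(free_space):
            if free >= tail:
                free_space[i] -= tail
                break
        else:
            free_space.append(block - tail)  # open a new shared block
    return full_blocks + len(free_space)


sizes = [100, 5000, 300, 4096, 9000]  # made-up file sizes in bytes
print(blocks_without_packing(sizes, 4096))  # 8
print(blocks_with_packing(sizes, 4096))     # 5
```

The more small files there are, the bigger the gap between the two counts, which is why tail packing matters more as the block size grows.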
Fourth, larger sectors mean smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers was the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.
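To put numbers on the fourth point, the sketch below counts how many bits a reference needs to name every allocation unit on a disk. It is a simplified model: real FAT variants reserve some values (and FAT32 uses only 28 of its 32 bits), and `bits_needed` is a hypothetical helper.

```python
def bits_needed(disk_bytes: int, unit_bytes: int) -> int:
    """Bits required to give every allocation unit a distinct number."""
    units = -(-disk_bytes // unit_bytes)  # ceiling division
    return max(1, (units - 1).bit_length())


one_tib = 1 << 40
print(bits_needed(one_tib, 512))    # 31
print(bits_needed(one_tib, 4096))   # 28
```

Each doubling of the unit size saves one bit, so the move from 512 to 4096 bytes buys three bits of headroom at any given reference width.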
Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.
I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).
Re:System Pages, RAID, Tail Blocks, and Addressing (Score:3, Insightful)
If this is an issue, you use the wrong application - one word file per phone number?
File systems became simpler over time. This is a GOOD THING AND THE ONLY WAY TO GO.
If you try to optimize too much, you end up with something like the IBM mainframe file systems from the 70s, which are still somewhat around.
Create a simple file, called a data set ? Sure, in TSO (what passes for a shell, more or less), you use the ALLOCATE command: http://publibz.b [ibm.com]
Configurable Sector Size (Score:3, Insightful)
Doesn't anyone remember that SCSI drives that support a changeable block size have been around basically forever? Of course with hard disks it was used mostly to account for additional error-correcting / parity bits, but magneto-optical media could also be written with 512 or 2k (if I remember correctly).
(first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k3
Size != storage (Score:2, Informative)
Tom
GRUB and LILO (Score:2)
I really don't know much about how drives store data. So this may be a really stupid question. But do larger sectors also mean the boot sector? Is this good news for boot loaders?
Old 512 byte sector? What about my 336 bytes? (Score:2)
Ah the joys of using a Harris 24 bit word/8 bit byte/112 word disk sector machine.
History of the 512-byte Sector Size (Score:3, Informative)
In 1963, when IBM was still firmly committed to variable length records on disks, DEC was shipping a block-replaceable personal storage device called the DECtape [wikipedia.org]. This consisted of a wide magnetic tape wrapped around a wheel small enough to fit in your pocket. Unlike the much larger IBM-compatible tape drives, DECtape drives could write a block in the middle of the tape without disturbing other blocks, so it was in effect a slow disk. To make block replacement possible all blocks had to be the same size, and on the PDP-6 DEC set the size to 128 36-bit words, or 4608 bits. This number (or 4096, a rounder number for 8-bit computers) carried over into later disks which also used fixed sector sizes. As time passed, there were occasional discussions about the proper sector size, but at least once the argument to keep it small won based on the desire to avoid wasting space within a sector, since the last sector of a file would on average be only half full.
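For anyone checking the numbers in that history, a tiny sketch of the arithmetic; the constant names are made up, the values are the ones quoted above.

```python
WORDS_PER_BLOCK = 128   # DECtape block size on the PDP-6, in words
BITS_PER_WORD = 36      # PDP-6 word width

dectape_block_bits = WORDS_PER_BLOCK * BITS_PER_WORD
rounded_block_bits = 4096  # the "rounder number for 8-bit computers"

print(dectape_block_bits)        # 4608
print(rounded_block_bits // 8)   # 512, the familiar sector size in bytes
```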
Re:4MB (Score:2, Funny)
Re:4MB (Score:4, Insightful)
Re:4MB (Score:2, Insightful)
But of course, 4MB is fscking l
Re:4MB (Score:4, Interesting)
Re:4MB (Score:3, Funny)
Wow, that's nice. Time to add a small cgi script to my webserver, and link it as an image:
Because of R-M-W (Score:3, Insightful)
Re:Because of R-M-W (Score:2, Informative)
That's why there is a system (Level 1, 2, and main memory) cache. Write-backs to the physical disk only occur when needed. That doesn't mean that 4MB would be a good sector size; it just means that write-backs are not the issue to consider here.
Re:Because of R-M-W (Score:3, Informative)
In his example, let's say it was a text editor. You change one letter in a document and save it; it must synchronously write the sector to disk, to the actual physical media. Otherwise, if the system crashes, you lose it, and most people don't like that in a text editor.
Write cache at the disk level can be very bad. Databases may have no way of knowing that write cache is enabled, and tell you that your transaction is committed when it's really not. Of c
Re:4MB (Score:2)
Depends on the size of your device and system cache, really. There's a functional difference between the size of a sector and the size of the space allocated for new files or file extensions. My advice? Let the hardware vendors decide what they want for the sector size, doesn't matter all that much. Then make sure any file extension or file initial size allocation is a healthy multiple of that. If you don't use it all, truncate on close. It's small allocations, not small sector si
Re:4MB (Score:4, Informative)
MOD PARENT UP ! (Score:2)
Allocation size is irrelevant as many advanced systems are supporting fragments (however still not implemented in ext2/ext3
And from a past discussion some people [slashdot.org] are thinking that the 512 bytes comes from the memory page size of the VAX.
Re:4MB (Score:2)
Re:4MB (Score:3, Informative)
Huh? Which modern architectures?
The only systems I run that still have 4k page sizes are x86 systems.
x86-32 = 4k
x86-64 = 4k
G4,G5 = 4k
alpha (64bit) = 8k
sparc (64bit) = 8k
ia64 = 16k
and at least on the ia64 platform the page size is configurable at compile time.
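If you want to check the page size on your own machine rather than argue from a list, Python exposes it through the standard `mmap` module. The helper `sectors_per_page` is a made-up name for illustration; the printed values depend on the platform, so no expected output is shown.

```python
import mmap


def sectors_per_page(page_bytes: int, sector_bytes: int) -> int:
    """How many disk sectors one memory page spans."""
    return page_bytes // sector_bytes


page = mmap.PAGESIZE  # memory page size reported by the running system
print("page size:", page)
print("512-byte sectors per page:", sectors_per_page(page, 512))
print("4096-byte sectors per page:", sectors_per_page(page, 4096))
```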
Re:4MB (Score:3, Informative)
Comment removed (Score:5, Informative)
Re:That's nice (Score:2, Informative)
This is a quick and dirty hack to check that the generated data is correct. I'm not going to spend weeks designing a data file format, and an API plus conversion tools to export the files to an Excel-compatible format, just because I've got an inefficient file system.
A new hard drive would be a better investment. Or alternatively just ignore the problem, since NTFS seems to handle these adequately.
And sometimes it's simply impossible to write a solution that will work like this. Some applicat
No, that's not 'sector' (Score:4, Informative)
Re:No, that's not 'sector' (Score:2)
Re:No, that's not 'sector' (Score:4, Interesting)
Nah, nothing that significant. The operating system does/should not "know" anything about how the data is physically stored by a device. The existing O/S storage abstractions will remain. (You may have trouble running a very old O/S but that would be just one of your problems)
Every modern O/S uses disk space as virtual memory by reading and writing chunks of RAM to the HDD when it runs out of physical RAM. The standard HDD sector size is changing to the most commonly used O/S size for memory "pages" (RAM chunks written to disk).
The larger size will (in theory) speed things up a tiny amount. The HDD will now read/write a 4K "page" to disk in one sector rather than eight, meaning the HDD will perform fewer administrative functions to swap RAM back and forth to the disk. Hardly anyone will notice this, but constant minor tweaking of HDD internals has evolved them very rapidly. E.g.: in 1990 I paid $200AU for a second-hand 20MB HDD (~0.2 SECOND seek time!).
LBA (Score:2)
You're talking about LBA, but that only applies to cylinders/heads. The OS does map to the sector (eg, file inode stored at sector 12345 from the partition beginning, which says that file begins on sector 23123 etc). If it didn't use sectors, it would need an extra 7 bits to store the location of everything within the filesystem.
The filesystem also communicates with the driver using sector numbers. It's
Re:LBA (Score:3, Interesting)
The Amiga used byte-offsets and lengths for all IO's. This did eventually cause problems when disk drives (which started at 10-20MB when the Amiga was designed) got to 4GB, but a minor extension allowing 64-bit offsets solved that. 64-bit offsets shouldn't overflow very soon....
For the device driver, it's no big deal to shift the offset if the sector size is a pow
Re:That's nice (Score:3, Informative)
Re:That's nice (Score:4, Informative)
See: http://www.microsoft.com/technet/prodtechnol/winx
If you notice, in most of the useful cases the cluster size is 4K. Making the hard disk match this seems like a good idea to me.
And EXT2 also uses a 4K block size.
Also remember it's for large disks; no FS that I know of supports a cluster (or block) size smaller than 4K for large disks.
Re:That's nice (Score:2)
Re:That's nice (Score:4, Informative)
Here [ntfs.com]'s a short description of how NTFS allocates space. On volumes larger than 2GB, the cluster size (the granularity the FS uses to allocate space) was 4k already unless you specified something else when formatting the drive. Also, Windows NT has supported disk sector sizes larger than 512 bytes for a long time; it's just that anything else has been rare.
How do you know how your data is actually stored ? (Score:4, Insightful)
Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).
Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems [ibm.com] )the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.
The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems)
The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.
Re:That's nice (Score:2, Informative)
Re:That's nice (Score:3, Insightful)
Some file systems can pack multiple tail fragments into one block.
Re:Quick Explain How! (Score:5, Interesting)
You have, say, 10 lockers up and 20 lockers across.
You can only put one thing in a locker, so you can't put your gym shorts in the same one as your shoes. But if you have lots of socks, you can pile them in, and take up two or three if necessary.
Space is wasted if you have a really big locker, but it's only holding a sock.
Now, you've got to record where all of this stuff is, or you will take forever to find that sock. So you set aside a locker to hold the clipboard with designations.
Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...
Re:Quick Explain How! (Score:5, Funny)
Re:Quick Explain How! (Score:2)
Re:Quick Explain How! (Score:2)
What're we going to get to top that tomorrow?
Re:Quick Explain How! (Score:5, Funny)
Best analogy is Spock's gym locker room
Spock has, say, 10 space lockers up and 20 space lockers across.
Spock can only put one thing in a locker, so Spock can't put his gym shorts in the same one as your shoes. But since Spock has lots of socks, he can pile them in, and take up two or three if necessary.
Space is wasted if Spock uses a really big locker, but it's only holding a sock.
Now, you've got to record where all of this stuff is, or you will take forever to find that sock. (I guess the tricorders are broken) So Spock sets aside a locker to hold the clipboard with designations.
Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...
Re:Quick Explain How! (Score:2)
That post was illogical.
Re:Quick Explain How! (Score:3, Funny)
Man that takes me back. Where's my toupee....
Wrong attribution on your sig (Score:2)
Nevertheless, Heineman was still one heck of an airplane designer.
As far as the 8" floppy, ISTR that they were intended to replace punched cards: 77 tracks with 26 sectors (hard coded) came out to be pretty close to a box of 2000 Hollerith cards (80 columns with 12 bits per column). 8" drives were available before the end of 1975, and the VAX came out in 1977(?). One of the uses for the floppies was loading the micropro
Re:Quick Explain How! (Score:2)
The 11/750 loaded its microcode from a little magnetic tape. It used to take (seemingly) ages to get going. I used to boot PDP 11/84's and 83's from TK50 tape. This was in traffic signal cabins out in the
Re:Vista (Score:2, Funny)
Re:What's the case for Linux? (Score:2)
Ask Hans Reiser maybe.
Re:What's the case for Linux? (Score:2, Informative)
The people who would need to write support for this are Jeff Garzik (libata) and James Bottomley (scsi). It's not that this would require a terribly complicated patch though.
Re:What's the case for Linux? (Score:2)
As the patch was done 'properly', a couple of tweaks of some constants and a recompile (if it isn't a run-time parameter already) should enable 4k sectors, 8k sectors, even 1Mb sectors, if you really want to go there.
Re:What's the case for Linux? (Score:3, Informative)
Re:What's the case for Linux? (Score:5, Funny)
"Hello. My name is Inigo Molnar. You changed the sectors. Prepare to die."
Re: Apple in the forground again (Score:5, Informative)
Re:file size (Score:2, Insightful)
Actually a 200 GB drive can still store 25 million files. How many fonts do you have?
FWIW the advantage is in the error correction. For a 1 bit sector size, you'd need 3 bits to store it with error correction. As the block becomes larger, the error correction becomes more powerful. That is where the advantage is.
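The "3 bits for 1 bit" figure matches a single-error-correcting Hamming code, and the same formula shows why bigger blocks are cheaper to protect. This is only an illustration of the scaling: real drives use stronger codes than Hamming, and `hamming_parity_bits` is a hypothetical helper.

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


for data_bits in (1, 512 * 8, 4096 * 8):
    parity = hamming_parity_bits(data_bits)
    overhead = 100 * parity / data_bits
    print(f"{data_bits} data bits: {parity} parity bits ({overhead:.3f}% overhead)")
```

One data bit needs 2 parity bits (3 stored bits in total), a 512-byte sector needs 13, and a 4096-byte sector needs only 16, so the relative overhead drops sharply as the block grows.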
Of course data can still be stored byte-wise on the disk - it is only th
Re:file size (Score:2)
Re:file size (Score:2)
Re:Finally!! (Score:2)
What was done with CP/M and 86-DOS, but not PC/MS-DOS, was distributing source code for the BIOS
Re:Finally!! (Score:2)
I had a CP/M system as well, and IIRC it stored contiguous files on every third or sixth sector because the CPU could not always keep up with the disk.
I remember writing some slow BASIC code around the same time on an Apple ][ which caused the floppy drive to stop and wait for the CPU.
Also there was something about batch files in CP/M. I think the files were structured so that the shell could go back to the disk for the next step in the script. Those were the days when memory was really scarce.
Re:30 years and now it's bumped up only 8x? (Score:3, Informative)
Re:30 years doing what? (Score:5, Funny)
That already exists. It's called a "child." Geeks might think they are hard to obtain, but in fact they tend to pop up unexpectedly quite often. They also have an audio interface, are touch-sensitive, run off of bio-mass fuel, and can even do the dishes after they have been around for a few years. They can be attached to a Playstation or an iPod too. When you first get them they are quite noisy and smelly with a few leaks, but that goes away after the break-in period. They don't come with a users manual though. Documentation is sparse. You have to get a third-party handbook.
Re:30 years doing what? (Score:5, Funny)
All in all, they are not really a good replacement for a hard disk.
Re:30 years doing what? (Score:2)
Re:There really isn't much data... (Score:2)
I think that's it. You will not lose space on the disk when the change happens because the filesystem block sizes are still 4k. I suppose you could use whatever compression system your FS offers to pack the data into less blocks, then you'd gain a t
Re:There really isn't much data... (Score:2)
you'd be surprised:
~69% in my case.
Re:There really isn't much data... (Score:2)
If this is 69% of the size of your hard drive, I'm impressed that you can read Slashdot on your computer. Considering these days most hard drives shipping are at the very least 40GB or so, does 440MB (versus the 110MB it would be if they were all 512 byte files instead) really make much of a difference?
Re:Block size issue (Score:2)
Not thought about stuff this low level for quite a while.. I wonder how many geeks are truly informed as to how the basics of their
Re:OK, I'll ask.... (Score:3, Funny)