Linux Not Quite Ready For New 4K-Sector Drives
Theovon writes "We've seen a few stories recently about the new Western Digital Green drives. According to WD, their new 4096-byte sector drives are problematic for Windows XP users but not Linux or most other OSes. Linux users should not be complacent about this, because not all the Linux tools like fdisk have caught up. The result is a reduction in write throughput by a factor of 3.3 across the board (a 230% overhead) when 4096-byte clusters are misaligned to 4096-byte physical sectors by one or more 512-byte logical sectors. The author does some benchmarks to demonstrate this. Also, from the comments on the article, it appears that even parted is not ready, since by default it aligns to 'cylinder' boundaries, which are not physical cylinder boundaries and are multiples of 63."
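For anyone who wants to check a partition start by hand, here is a minimal sketch of the arithmetic behind the summary. The function name and the sample start sectors are illustrative assumptions, not taken from the article; 63 is simply the traditional "one track in" start that the cylinder-aligned tools produce.

```python
LOGICAL_SECTOR = 512      # bytes per sector as the drive reports it
PHYSICAL_SECTOR = 4096    # bytes per sector on the platter

def is_aligned(start_lba, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """True if a partition starting at this logical sector begins on a physical sector boundary."""
    return (start_lba * logical) % physical == 0

# 63 is the classic cylinder-style start; 64 and 2048 are multiples of 8 sectors.
for start in (63, 64, 2048):
    print(start, "aligned" if is_aligned(start) else "misaligned")
```

Any start sector that is a multiple of 8 passes, because 8 x 512 = 4096.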
first misaligned post (Score:2, Funny)
damnit, obviously since this is not technically the 'first post', my web browser must be misaligned by a post
oh great.. i try to make a joke and ... (Score:2, Interesting)
Set 32 sectors per track (Score:5, Insightful)
The simple solution is to set your Sectors per Track to 32. This would make sure that everything is properly aligned (except the first partition, usually /boot, which is mis-aligned by one cylinder).
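A quick sketch of why the sectors-per-track value matters for cylinder-aligned tools. The 255-head geometry is an assumption (it is the fake geometry many partitioners use, not something stated above), and the function name is made up for illustration.

```python
def aligned_cylinders(heads, sectors_per_track, cylinders=16, sectors_per_4k=8):
    """List which of the first `cylinders` cylinder boundaries fall on a 4 KiB boundary."""
    sectors_per_cylinder = heads * sectors_per_track
    return [c for c in range(cylinders)
            if (c * sectors_per_cylinder) % sectors_per_4k == 0]

print(aligned_cylinders(255, 63))  # 16065 sectors per cylinder, an odd number
print(aligned_cylinders(255, 32))  # 8160 sectors per cylinder, a multiple of 8
```

With 63 sectors per track only every eighth cylinder boundary lands on a 4 KiB boundary; with 32 sectors per track every cylinder boundary does.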
Re: (Score:3, Insightful)
Sectors, blocks, clusters, cylinders... I hope that as we move to solid state drives, devs have the sense to exorcise these anachronisms from the kernel. We haven't been able to get rid of terminals in the 20 years since they've even existed.. this document [linux.org] is heart wrenching. Try reading it; it'll make you cry to see how deeply the now-irrelevant concept of a terminal runs in Linux.
Re:Set 32 sectors per track (Score:5, Insightful)
the now-irrelevant concept of a terminal
Speak for yourself sir, I for one like my rs-232 terminals to be handy for when ethernet is down and you can't ssh (and can't be assed hooking up keyboard and monitor). Seriously, anyone adept at the command line uses it far more than the gui to get things done, terminals will never disappear.
Re:Set 32 sectors per track (Score:4, Insightful)
I don't think he was dissing command line interfaces.
I think his complaint was that even newfangled RS-232 terminals had to jump through hoops to remain compatible with computers that were hooked up to typewriters and line printers. The protocols and underlying software have idiosyncrasies built into them that just don't make sense any more. Instead of throwing away the cruft to make something better, everybody's hacking onto the same old outdated shit. It's limiting progress, in a way.
Re: (Score:2, Offtopic)
Terminals are only irrelevant if you have a strictly Windows and DOS notion of computing.
The document you cited touches on this a little bit.
Re:Set 32 sectors per track (Score:5, Insightful)
terminals are a very necessary and relevant part of Linux. That's how most server administration is done. That's how sending commands to many network appliances is done. That's how setting up high end computers is done (e.g. set up a midrange Integrity or Superdome and you'll start with terminal on the serial port, whether cu in linux or hyperterminal in windows or a real terminal). Also how certain tasks are performed in GUI environments. It doesn't matter that the terminal is now mostly virtual, the cursor control and font attribute features make convenient applications possible. Even on the weekend here I am chatting via IRC to some tech friends with irssi in terminal under screen, and reading server status emails with mutt. the terminal, it's a 21st century tool.
Re:Set 32 sectors per track (Score:5, Insightful)
terminals have nothing to do with the command line!
i think the op is complaining about the fact that things like
baud, stopbits and whatnot are deeply embedded in the
linux kernel. these concepts are not necessary to
have a command line. c.f. plan 9.
Re: (Score:2)
Yes, thank you. That's exactly what I mean.
Re: (Score:2)
In Windows' defense, it has recognized thumb drives just fine for almost a decade; it is the manufacturers who bundle their own autostart and drivers and non-standard thumb drives so they can include backup software, encryption, or flat out spyware.
Re: (Score:3, Informative)
We both agreed that most of windows land involves emailing shit to yourself, and a lot of USB thumb drive use...
Explorer: \\ComputerName\c$\Documents and Settings\UserName\My Documents\
Permissions permitting, this is all you need to do. Or you just share folders.
(Of which I could fire off a good half-hour rant on how poorly windows handles mass storage devices. It's a USB THUMB DRIVE for gods sake. It's not a fucking printer! I want to plug it in, and transfer files to/from it. It doesn't need to be "installed", indexed, and have drivers downloaded for it. Just fucking open a file browser like any sane OS does. )
This is a 10 year old complaint.
I have a hard time working on windows, because I'm so much more efficient with a terminal. It's not that I can't use a gui - I'm just an order of magnitude faster using the terminal.
That and you're not using Windows properly.
Re: (Score:2)
Explorer: \\ComputerName\c$\Documents and Settings\UserName\My Documents\
Doesn't do anything I posted I did. I'm trying to put to words how ridiculously out-of-scope your "solution" is, but I really can't. You really need to learn what people do with terminals before posting on the internet, so you don't look like a complete dumbass. It doesn't help that I'm talking about transfers between machines which are not on the local network, and which don't run windows on either end.
And for the record, I AM using
Re:Set 32 sectors per track (Score:5, Informative)
The terminal is not irrelevant. If your Cisco router is ever compromised (it happens) or if IOS becomes corrupt (or if you have an IOS install with a nasty bug where the password does not save correctly, or when an IOS upgrade goes badly) or someone fudges the configuration up, the only way you can recover it is often through the serial port. Serial ports are also very handy for integrating video surveillance with point-of-sales systems that are not IP-aware (or worse, antiquated DVR appliances which can't do POS integration over IP), for some smart switches, *NIX boxes that have been rooted (I've rescued a Solaris box through a serial connection in an enterprise environment where reinstall was not possible due to poor timing - week of finals - and backups were sabotaged by a disgruntled graduate student and logins through IP and at the console were blocked), and so forth. However, I'd rather see RS-485 or RS-422 take RS-232's place, since RS-485 and RS-422 can work over much longer distances and you can hang multiple serial devices off of a single bus.
RS-232 might be absent from a lot of consumer motherboards, but it is far from dead and certainly not irrelevant, even now in 2010.
Re:Set 32 sectors per track (Score:5, Insightful)
Oh, in addition, now that Windows Server (Core) has a real GUI-less mode and Powershell and UNIX environment shells on Windows finally have usable interfaces, shell prompts are becoming even more relevant even in large Windows shops. So, even Microsoft has acknowledged that the UNIX-y way of doing things is key for automation and uptime in an enterprise environment. Now, most PCs won't boot with output to the serial port, but some enterprise server boards do have such options.
A GUI is great for basic tasks, but for repetitive tasks a command shell and scripting environment are key for efficiency, and reliable automation. VBS/Windows Scripting Host was an "acceptable" workaround for a while but in the past many Windows administrative tools required the box to not be headless, the workstation unlocked and the windows open for the GUI to be accessible for scripting - and even then it was iffy because not all GUI elements are accessible (especially third-party tools with custom controls).
Re: (Score:2, Funny)
I recommend you visit Microsoft [microsoft.com] and have a look at their "Windows" operating system. The concept of a terminal doesn't run nearly as deep in it as it does in Linux. The same goes for the concept of security. Overall, it is kind of a poorly reinvented UNIX, but I think you might just like it. There are quite a few applications available for it nowadays, and it is gaining more and more marketshare and public recognition.
Re: (Score:3, Funny)
I recommend you visit Microsoft [microsoft.com] and have a look at their "Windows" operating system. [...] Overall, it is kind of a poorly reinvented UNIX, but I think you might just like it.
I've seen some people use it, then they told me you had to pay for it. I was flabbergasted -- why would anyone pay for an operating system?
Re:Set 32 sectors per track (Score:4, Informative)
Essentially we are back to the old problems of the ST412 interface where we had to figure out the best interleave for the drives as well when we were formatting them. Most drives then did have a fairly conservative interleave, but a reformat of them could improve the throughput considerably. A reformat could be done so that the whole track could be read in 2 rotations instead of 3, and what that does to performance is fairly easy to understand. C800:5 was a commonly used BIOS address where the low level format routine did reside.
But from what I understand this problem is an offset problem when the head steps from track to track, and that's also an issue to be considered. And today it's not common knowledge/practice to low level format hard drives.
And why stick at 4k sectors? Depending on the system you may want to use a different sector size. If you run Oracle on some systems the block size is 8k, and in that case you may want to have 8k disk blocks too since it would be good for performance.
Anyway - sooner or later we will have flash drives instead, and then this isn't a problem.
Re:Set 32 sectors per track (Score:5, Insightful)
Anyway - sooner or later we will have flash drives instead, and then this isn't a problem.
Actually this problem is potentially much worse on SSD's. Erase blocks are huge, and read-modify-write really sucks on flash.
Re:Set 32 sectors per track (Score:4, Interesting)
Actually this problem is potentially much worse on SSD's. Erase blocks are huge, and read-modify-write really sucks on flash.
Couldn't this be addressed (at least in part) by a battery-backed write cache like better RAID controllers use? Set it up like SAN snapshots (so it just stores the diff between what's in the actual flash storage and what's been changed so far), and then write the changed blocks when it's most advantageous (e.g. when there's an entire block's worth of data, so it would all have to be erased by the flash storage anyway).
Maybe combine that with something like a disk defrag, except instead of storing frequently-sequentially-read data in physical sequence, store frequently-written data (regardless of if it's sequentially-read or not) in physical sequence.
Re:Set 32 sectors per track (Score:5, Informative)
Actually this problem is potentially much worse on SSD's. Erase blocks are huge, and read-modify-write really sucks on flash.
Couldn't this be addressed (at least in part) by a battery-backed write cache like better RAID controllers use? Set it up like SAN snapshots (so it just stores the diff between what's in the actual flash storage and what's been changed so far), and then write the changed blocks when it's most advantageous (e.g. when there's an entire block's worth of data, so it would all have to be erased by the flash storage anyway).
Maybe combine that with something like a disk defrag, except instead of storing frequently-sequentially-read data in physical sequence, store frequently-written data (regardless of if it's sequentially-read or not) in physical sequence.
That's exactly what most SSD controllers do!
Some now come with 32 to 64MB of cache, and some of the new Sandforce controller based SSDs also come with a little ultracapacitor that acts like a mini UPS. The cache is used as scratch space for reordering writes and defragging blocks.
There was a firmware patch recently for the OCZ Vertex series of SSDs that enabled background defrag. If you let the drive sit there for a few minutes, it would start getting faster until it returned to 'as new' speeds.
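To make the "hold the diff, write when a whole block is ready" idea concrete, here is a toy model. It is only a sketch: the class name, the 64-pages-per-erase-block figure, and the flush policy are all assumptions for illustration, and no real controller firmware is claimed to work this way.

```python
from collections import defaultdict

class CoalescingCache:
    """Toy write cache: hold dirty pages and flush an erase block once it is fully dirty."""

    def __init__(self, pages_per_erase_block=64):
        self.pages_per_block = pages_per_erase_block
        self.dirty = defaultdict(dict)   # erase block number -> {page index: data}
        self.flushed = []                # erase blocks written out, in order

    def write(self, page_no, data):
        block, index = divmod(page_no, self.pages_per_block)
        self.dirty[block][index] = data  # a rewrite of the same page just replaces the diff
        if len(self.dirty[block]) == self.pages_per_block:
            self.flush(block)

    def flush(self, block):
        # A real device would erase and program the block here; the toy only records it.
        self.dirty.pop(block)
        self.flushed.append(block)

cache = CoalescingCache(pages_per_erase_block=4)
for page in (0, 1, 2, 1, 3):
    cache.write(page, b"x")
print(cache.flushed)
```

Rewriting page 1 while it is still cached costs nothing extra, and the block is written once, when all four of its pages are dirty.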
Good thread on this. (Score:4, Informative)
Re: (Score:2)
http://www.osnews.com/thread?409281 [osnews.com]
One of the comments in that thread suggests switching to GPT if you aren't using Windows.
I haven't used Windows at home since ~2001.
Can you just wipe/reinstall using GPT? I thought the BIOS was involved with the type of partition table and that I had to be using the msdos partition type because of the BIOS. Can a geek with deeper knowledge of partitions and all things boot drop some knowledge?
Re: (Score:2, Informative)
The BIOS has no understanding of partition tables. It merely reads the first sector of the harddrive to 0x7C00 and then jumps to that location. The DOS partition table is used by convention for interoperability between operating systems. If you wanted to use a different partitioning scheme, there is no technical reason your operating system couldn't.
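Since the DOS partition table is just a convention stored inside that first sector, it is easy to read yourself. A minimal sketch, assuming the standard MBR layout (four 16-byte entries at byte offset 446, signature 0x55 0xAA at offset 510); the function name and the demo values are made up for illustration.

```python
import struct

def parse_mbr(sector):
    """Return (type, start_lba, sector_count) for each non-empty entry of a 512-byte MBR."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        # status(1) chs_start(3) type(1) chs_end(3) lba_start(4) sector_count(4)
        _status, _chs_s, ptype, _chs_e, start, count = struct.unpack("<B3sB3sII", raw)
        if ptype != 0:
            entries.append((ptype, start, count))
    return entries

# Build a fake MBR with one Linux (0x83) partition starting at sector 63.
# On a real machine you would read the first 512 bytes of the disk device as root instead.
demo = bytearray(512)
demo[446:462] = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x83, b"\0\0\0", 63, 1953520002)
demo[510:512] = b"\x55\xaa"

for ptype, start, count in parse_mbr(bytes(demo)):
    print(hex(ptype), start, "4K-aligned" if start % 8 == 0 else "misaligned")
```

The start and length fields are 32-bit sector numbers, which is also where the 2TB ceiling discussed elsewhere in the thread comes from.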
Re: (Score:3, Interesting)
GPT wraps itself in a MBR partition map. At the very least the GPT is supposed to include an MBR map that claims the whole disk as used by GPT to avoid issues with old disk tools and the like. And if you've got a partition scheme that's compatible with the MBR scheme they can both contain the same information, assuming your disk tool supports this, so that MBR-only environments can still find your partitions.
It's also possible to format with GPT and then use an MBR-only tool (fdisk) to go back and manipulat
Re:Good thread on this. (Score:4, Informative)
Unless your BIOS is trying to be too smart and peeking into your partitions instead of launching the MBR (sadly, some do), it won't matter. It's the MBR's job to boot your system after the BIOS hands off control to it, and on most Linux systems the bootloader is installed straight into the MBR.
Re: (Score:3, Informative)
Even if you are using Windows, Vista and up support GPT. It's handy for servers where you expect to have partitions larger than 2 TB.
But I guess if one were using a modern version of Windows, you wouldn't have the 4K alignment problems to begin with.
Re: (Score:2)
That's an interesting point - I assumed diskpart supporting it meant it could boot from it.
However, Vista and 7 (and the server editions) can boot from a GPT partition, but only the x64 bit versions. source [wikipedia.org]
Open Source to the rescue (Score:3, Insightful)
I am no kernel hacker but I can almost guarantee that some kernel hacker will provide a solution to this "short coming" fairly soon.
That's the beauty of Open Source.
I am aware though that "fairly soon" means many things to many people; which means that there could be a substantial delay before we get a working solution to this issue.
I am optimistic nevertheless.
Request to Western Digital: Provide all the information needed to develop a solution.
Re: (Score:2)
And that couldn't possibly happen with closed source?
Re: (Score:2)
I guess you only read but did not understand! Key words in my piece are: "Fairly soon."
Re: (Score:2)
Well, since Windows Vista and Windows 7 already support this, I'd say "fairly soon" is demonstrably false. Unless you're happy with "fairly soon" being "after everybody else has been doing it for several years."
Re: (Score:2)
Huh? What are you talking about? Was that intended as a reply to some other post?
The point I was making is that, since Windows has supported these drives for 3 years now, and Linux doesn't yet have the same level of support, obviously open source development isn't as fast as this thread claims.
What the hell does the usage of Windows XP have anything to do with that point?
Re: (Score:2)
I'm asserting that for all intents and purposes, Microsoft has not solved this problem. If they had then the drives wouldn't need to fake 512 sectors.
Re: (Score:2)
But they have solved it.
They haven't invented a time machine to solve the problem in an OS that came out years before the problem existed. Is that what you're complaining about? The lack of a time machine?
Re: (Score:2)
Solving it is worthless if they can't convince anybody to actually use the fix. I'm complaining about their complete inability to get people to upgrade, so we don't have to make workarounds for a nearly decade old operating system anymore.
Re: (Score:2)
Re:Open Source to the rescue (Score:4, Informative)
Exactly. Drives are pretending to have 512-byte sectors because Windows can't deal with 4k sectors, and then silently reducing performance when you believe them and use 512-byte sector sizes. Had the drives reported 4k sector sizes, they'd work great under Linux and not at all under Windows.
This isn't a Linux problem, it's a drive problem caused by Windows. The solution is to implement yet another workaround for stupid devices, and start aligning partitions to 4k by default.
Nitpick: SDHC card sectors are always 512 bytes, and most SD card sectors are 512 bytes too. Flash memory would benefit from larger sector sizes too, but they've probably stuck to 512 bytes for Windows compatibility.
Re:Open Source to the rescue (Score:5, Insightful)
On the contrary, this has (almost) nothing to do with Windows - it has everything to do with old OSes. The IDEMA didn't approve the 4K sector standard until 2006; it was only in the late 90's that the first meaningful research was begun by IBM on whether 512B sectors would be an issue.
As it turns out, yes, 512B sectors would be an issue, and drive manufacturers would be best served by moving to larger sectors (with some arguing over whether to go to 1K or 4K). So the IDEMA hashed this out over the first half of the decade, and finally in 2006 approved the 4K specification.
The point of all of this is that software written at the turn of the century was all done well before changing drive sector sizes was a serious discussion. WinXP was released in 2001, Mac OS X 10.0 was in 2001, and of course Linux 2.4 was also in 2001. None of those OSes know what to do with anything other than a 512B sector - the only reason Windows factors in to this equation is that WinXP just happens to be with us (no doubt trying to eat our brains) while the other two are dead. Anything circa 2005 or later such as WinVista, Linux 2.6, and Mac OS X 10.5 know full well what to do with a 4K drive.
But even that is beside the point. You don't just make major jumps like this, you have to do it in a transition so that you don't break old hardware and old software alike. Even if XP/Lin2.4/MacOSX knew what to do with 4K sectors, at some point you'd run in to hardware, 3rd party devices, etc that would not. A transition is necessary to let old hardware and software get flushed out of the ecosystem, and as such we're still years out from consumer drives offering native 4K access.
In short: drives are pretending to have 512-byte sectors because there's a lot of old stuff, including Windows XP that can't deal with 4K sectors.
Re:Open Source to the rescue (Score:4, Insightful)
I was referring to Windows XP - I should've mentioned it explicitly. Nonetheless,
There's exactly one old thing used in any significant quantity and liable to have to work with these drives, and it's Windows XP. Everything else either isn't in any significant use any more, or will never be seeing one of these drives.
One potential concern would be USB enclosures which have to work with older OSes / devices. To that, I would say it should be the enclosure's job to do the 512-byte sector emulator, not the drive's.
Windows XP has been updated over the years (via service packs and the like) to handle hardware that didn't exist at the time of its release, including things like SATA. They could do the same for 4k sector disks, but they aren't going to do it because they want people to move to Win7. Therefore, Microsoft is still to blame for neither 1) providing a solution for XP, nor 2) providing enough compelling reasons to migrate to Win7. Heck, Vista was a trainwreck and it doesn't really count, so (proper) Windows is actually 3 years late to the 4k party, as Windows 7 was only released in 2009. Effectively, they've spent those three years scaling down support for XP while providing no viable alternative, and now the rest of the world has to deal with a significant amount of people still using a 9-year-old OS.
In short: the fact that people are sticking to 9-year-old XP and making hardware companies break or slow down improvements means Microsoft did something terribly wrong.
Comment removed (Score:5, Insightful)
Re: (Score:3, Informative)
And...
Total bullshit.
Linux kernel code had flexible block device sector size since the days of 1.x series of kernels. The "problem" is (and always was) with some of the user-space utilities for some of the file systems available under Linux, file systems specifically designed for ... compatibility with DOS and
Re: (Score:2)
My Hard drive with 4k sector sizes works perfectly under Windows 7?
Re: (Score:2)
There are no hard drives with logical 4k sector sizes out there, as they all emulate 512-byte sectors for compatibility, so your point is moot. However, if they did exist, they would theoretically work with Windows 7. I was talking about Windows XP.
Re: (Score:2)
s/windows/windowsxp
Whatever you may think of Vista/7/Server2008, this isn't a problem for them. I agree that it's stupid for WD to make their drives lie for compatibility with what is effectively a legacy OS (by modernity, not use), but there's no call to tar newer releases for it.
Re: (Score:2)
Nitpick: SDHC card sectors are always 512 bytes, and most SD card sectors are 512 bytes too. Flash memory would benefit from larger sector sizes too, but they've probably stuck to 512 bytes for Windows compatibility.
This is no longer true. Most 2x NAND memory manufactured in the past year is 4KB block sizes with 8KB coming soon. That it pretends to be 512 bytes is a function of the SDIO MLC driver IC. Luckily for SD they come pre-partitioned so that the partitions are aligned properly.
Re: (Score:2)
That was exactly my point - I was referring to SD and other Flash memory external interface standards, not the underlying NAND chips. At the SD interface level, sectors are 512b most of the time (or always, for SDHC), but the physical NAND devices are 2K/4K, so the interface would benefit from using the native sector size.
Re: (Score:2)
Oh, haha. I misunderstood your meaning. Good show, then.
Re: (Score:2)
> I guess the beauty of Closed Source, then, is that the OS supports it out of the box
Or not. The sheep-like closed source user is never likely to notice the problem.
Although the real problem here seems to be that you can't consider a bit of hardware
as it presents itself. Now I wonder why a hard drive company feels the need to have its
hardware LIE to the OS?
Re: (Score:3, Interesting)
Now I wonder why a hard drive company feels the need to have its hardware LIE to the OS?
So the hardware is compatible with more software. For example, hard drives still report some number of cylinders, heads and sectors to the BIOS and the OS, but hard drives have been using ZBR [wikipedia.org] for 20 years now (IIRC) so the sector number is meaningless.
But, as it is now, if my old system needs a new hard drive, I do not need to find an old drive to be compatible with my system (as long as it is IDE or SCSI, I don't know of any adapter from the newer interfaces to ESDI or ST-506, but they probably exist).
They
Re: (Score:3, Insightful)
Re: (Score:2)
Don't forget that this "beauty of open source" doesn't even get around to thinking about fixing the issue until a competing OS has had support for it for 3 entire years.
Re: (Score:2)
Actually, the beauty of closed source is that the OS "supports" it out of the box, except that it's buggy as all get out, works very slowly for stuff that was much faster in 1995, and despite many users noticing and complaining about the problem both to the vendor and in various blogs and online forums, it doesn't get fixed for months or years while a qualified dev is finally diverted when the problem is so severe that even the US Army won't buy until it's fixed.
So don't do that... (Score:2, Informative)
Author claims a massive performance drop if things aren't aligned right. Ubuntu already does it with parted and fdisk can do it manually. So, no big problem; fdisk ought to be fixed to have sane defaults with a 4096 byte block size, sure. That can't be all that difficult.
The author also seems to think that only a 30% increase in times for misaligned writes should be expected. I'm not sure why. In a naive implementation I'd expect a 100% increase in time (each block now needs to be written twice). Linu
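To put numbers on the "written twice" intuition, here is a sketch that counts how many physical sectors a write spans and how many of those are only partly covered (each partly covered sector has to be read, patched, and rewritten by the drive). The function name is hypothetical, and the model ignores caching and anything else the firmware may do.

```python
def physical_sectors_touched(offset_bytes, length_bytes, physical=4096):
    """Return (physical sectors spanned, how many of them are only partly overwritten)."""
    end = offset_bytes + length_bytes
    first = offset_bytes // physical
    last = (end - 1) // physical
    partial = set()
    if offset_bytes % physical:
        partial.add(first)   # write starts in the middle of a physical sector
    if end % physical:
        partial.add(last)    # write ends in the middle of a physical sector
    return last - first + 1, len(partial)

print(physical_sectors_touched(64 * 512, 4096))  # partition starting at sector 64
print(physical_sectors_touched(63 * 512, 4096))  # partition starting at sector 63
```

An aligned 4K cluster maps onto exactly one physical sector with nothing to read back; the same cluster shifted by one 512-byte sector straddles two physical sectors, both of which need a read-modify-write.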
Re: (Score:3, Insightful)
fdisk doesn't need to be fixed, it needs to be deprecated. DOS partition tables are a ridiculously bad artifact of the past. We won't be using them for much longer anyway; they're limited to 2TB for 512-byte-sector drives (or 4K drives with 512-byte emulation).
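The 2TB figure falls straight out of the 32-bit sector fields in a DOS partition table entry. A small worked check (the function name is just for illustration):

```python
MAX_SECTORS = 2**32   # start and length in an MBR entry are 32-bit sector counts

def mbr_limit_bytes(sector_size):
    """Largest size addressable through a 32-bit sector count at this sector size."""
    return MAX_SECTORS * sector_size

for sector_size in (512, 4096):
    limit = mbr_limit_bytes(sector_size)
    print(sector_size, limit, limit // 2**40, "TiB")
```

With 512-byte logical sectors (native or emulated) the ceiling is 2 TiB; only a drive that exposed 4096-byte logical sectors would push it to 16 TiB.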
Check with your distribution (Score:5, Interesting)
I know that Fedora seems to have addressed this with parted 2.1.1 [fedoraproject.org] and util-linux-ng 2.1 [fedoraproject.org]. Both are scheduled for Fedora 13, but can be pulled into Fedora 12 by those getting the hardware early.
Oh slashdot.. (Score:5, Insightful)
Dear Slashdot,
I've been around for a while. Enough to understand, nay, love the fact that you are linux supporters and all that. But I remain an ardent supporter of truth and speaking in ways which are concise and lead the reader in the direction of truth. Nothing in this news story is inaccurate, but to make it a point to say that Windows XP is incompatible with no mention of Vista and 7 being perfectly compatible should be an embarrassment of journalistic integrity.
Windows XP may not work with the new WD Green drives, but Vista and on have been perfectly comfortable with 4096 byte sectors. A lay reader may read this story and not "Read between the lines" as I have learned to do here. Their take away may be that Microsoft operating systems are broken in some way (which they are in a lot of ways), but not this one!
slashdot is not journalism (Score:5, Insightful)
should be an embarrassment of journalistic integrity.
Slashvertisements, basic English grammar and spelling problems, completely wrong summaries and titles...
...and you a)think that Slashdot is "journalism" and b)it's had integrity to lose in the first place?
I like Slashdot, but gimme a break...it's a user-driven blog which directs readers to existing stories (now often lagging behind the major news wires) with good categorization and semi-sophisticated commenting system, utilized by a larger commenter population. Not much more, and definitely not journalism.
Re: (Score:2)
Yeah, I don't disagree, maybe I'm even well aware of that, but if I had admitted that in my original post I'd have no room to indict the continued problems. I think they should aspire to be better even if Slashdot is everything you say it is.
Re:slashdot is not journalism (Score:4, Interesting)
I'm with you, but on the other hand that doesn't mean they should just not give a shit about the quality of their end-product. We know from experience that they can edit and correct stories as corrections arise in the comments, but how often does that happen in practice? (Hardly ever.) Somewhere between a third and half of the stories posted here are either outright lies, or extremely misleading-- I may be exaggerating, but not by much-- and almost never are they corrected.
Look, any site that posts this article: http://tech.slashdot.org/article.pl?sid=09/02/16/2259257 [slashdot.org] without a single correction simply Does. Not. Give. A. Shit.
I don't think anybody's expecting the New York Times when they visit here, but some minimum level of competence would be nice. I don't fault anybody for complaining.
Re: (Score:2)
Nonsense. I guessed (correctly, it turns out) from the description that XP and older versions of Windows would have problems. But those of us who keep older computers around for Windows, with no interest in going beyond XP, aren't fretting.
You have an imagined stereotype that isn't true. Here's my stereotyping, Less than 0.1% of readers of slashdot would be "lay readers" in matters of computers so you needn't worry.
Re: (Score:2, Flamebait)
Dear Slashdot,
Nothing in this news story is inaccurate, but to make it a point to say that Windows XP is incompatible with no mention of Vista and 7 being perfectly compatible should be an embarrassment of journalistic integrity.
You are correct. It is a non-issue because nobody uses WindowsXP anymore and there are no legacy systems running it. Nor will there ever be a need to use a new drive in a WindowsXP system since there are no WindowsXP systems in use. You are correct.
Re: (Score:3, Insightful)
but to make it a point to say that Windows XP is incompatible with no mention of Vista and 7 being perfectly compatible should be an embarrassment of journalistic integrity.
.
.
Their take away may be that Microsoft operating systems are broken in some way (which they are in a lot of ways), but not this one!
It only takes about 3 brain cells to realize that Windows XP != All Microsoft Operating Systems. Even the average person has more than 3 brain cells.
For those people with less than 3 brain cells, S
Re: (Score:2)
When the summary explicitly mentions only XP, then obviously all others after that are fine.
Sure, if this was "Gramps weekly Gramps-thing" papyrus gazette, that for some reason had a story about these new drives, then yes, a clarification that Vista and so forth work ok (and an explanation about what the heck a hard drive is in the first place) would be in order.
Re: (Score:2)
now xp is 62% and Vista(20%) and Win7(10%). So about 33% of Windows OS's are 4k aware.
Drive lies and future fixes (Score:5, Interesting)
There is an excellent thread talking about how recent (2.6.31+) Linux kernels try to report the underlying hard drive architecture [gmane.org] (found via the OSNews comments [osnews.com]). Alas, it looks like some of these drives are not reporting this data correctly and thus automatic adjustment (at partitioning time) is not taking place. It looks like in the future, rather than relying only on reported capability, fdisk (and hopefully gparted) will default to 1MiB alignment if the topology can't be found [gmane.org] (unless your media is small).
Additionally, I gather that recent Fedoras will try to adjust things like LVM to match larger sectors too [storagemojo.com]. Hopefully whatever is laying out LVM will also be fixed too.
Coincidentally, it looks like Oracle have a very committed dev trying to make this stuff work by default...
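If you want to see what your own kernel thinks the drive looks like, the topology values are exposed under sysfs. A minimal sketch; the function name is made up, the device name "sda" is only an example, and the sysfs paths are the ones I believe 2.6.31+ kernels provide (a missing file simply comes back as None).

```python
from pathlib import Path

def read_topology(dev, sysfs="/sys/block"):
    """Read the block-layer topology hints for a device; missing values come back as None."""
    base = Path(sysfs) / dev

    def read_int(rel):
        path = base / rel
        return int(path.read_text().strip()) if path.exists() else None

    return {
        "logical_block_size": read_int("queue/logical_block_size"),
        "physical_block_size": read_int("queue/physical_block_size"),
        "minimum_io_size": read_int("queue/minimum_io_size"),
        "optimal_io_size": read_int("queue/optimal_io_size"),
        "alignment_offset": read_int("alignment_offset"),
    }

print(read_topology("sda"))
```

A drive that reports a physical_block_size of 512 while really using 4096-byte sectors is exactly the "lie" being discussed, and no tool can auto-align from that.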
I was worried about this... and am still unclear (Score:5, Informative)
I just got one of the 1TB, 64MB-cache WD drives that are known to be 4KB-sector based.
Here is how it shows up in dmesg:
[ 3.420488] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
and here's what hdparm -I says:
ATA device, with non-removable media
Model Number: WDC WD10EARS-00Y5B1
Serial Number: WD-WCAV55227529
Firmware Revision: 80.00A80
Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
Standards:
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 1953525168
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 953869 MBytes
device size with M = 1000*1000: 1000204 MBytes (1000 GB)
cache/buffer size = unknown
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 1
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_B
DragonFly's solution (Score:5, Interesting)
We're adjusting our disklabel64 utility and kernel support to set the partition base offset such that it is physically aligned instead of slice-aligned, and we are using 32K alignment. That should fix the problem without having to mess around with fdisk.
The DragonFly 64-bit disklabel structure uses 64-bit byte offsets instead of sector addressing to specify everything. It ensures things are at least sector aligned but we wanted to make disk images more portable across devices with potentially different sector sizes. The HAMMER fs uses byte-granular addressing for the same reason, 16K aligned.
-Matt
Re: (Score:2)
We're adjusting our disklabel64 utility and kernel support to set the partition base offset such that it is physically aligned instead of slice-aligned, and we are using 32K alignment. That should fix the problem without having to mess around with fdisk.
The DragonFly 64-bit disklabel structure uses 64-bit byte offsets instead of sector addressing to specify everything. It ensures things are at least sector aligned but we wanted to make disk images more portable across devices with potentially different sector sizes. The HAMMER fs uses byte-granular addressing for the same reason, 16K aligned.
-Matt
You should use 64K alignment at a minimum; almost all RAID and SAN volumes are 64K aligned, with IO sizes of 64K minimum.
You'll lose up to 50% performance on random IOs on any server class hardware.
As a comparison, Windows now aligns to the nearest 1MB.
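To make the alignment arithmetic concrete, here is a rough Python sketch of rounding a byte offset up to a chosen alignment and checking it against the granularities mentioned in this sub-thread (4K, 32K, 64K, 1MiB). The function names are invented for illustration and have nothing to do with the actual disklabel64 code.

```python
KIB = 1024
MIB = 1024 * KIB


def align_up(offset, alignment):
    """Round a byte offset up to the next multiple of `alignment`."""
    return -(-offset // alignment) * alignment


def is_aligned(offset, alignment):
    """True if `offset` is an exact multiple of `alignment`."""
    return offset % alignment == 0


# The traditional first-partition start: sector 63 with 512-byte sectors.
legacy_start = 63 * 512

for name, alignment in (("32K", 32 * KIB), ("64K", 64 * KIB), ("1MiB", MIB)):
    start = align_up(legacy_start, alignment)
    checks = {g: is_aligned(start, g * KIB) for g in (4, 32, 64)}
    print(f"{name}-aligned start at byte {start}: {checks}")
```

Any power-of-two alignment also satisfies every smaller power-of-two alignment, which is why a 1MiB default covers the 4K, 32K and 64K cases at once, while a 32K-aligned start is not guaranteed to be 64K aligned.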
Poorly researched article. (Score:5, Insightful)
The article represents one data point, for one particular way to install a drive, on one (un-named) version of Gentoo, on one particular model of a WD drive that had a bugzilla entry entered by the author all of 2 days ago. So this is supposed to be an indictment of all of Linux?
The author even mentions that Ubuntu has an option on parted that accomplishes the task properly. I'd be much more interested in an article that talks about how the default installer handles this task rather than concentrating on one particular expert tool that does so. It's still good to know that fdisk on his un-named Gentoo distribution does the wrong thing.. but this hardly means we should fire up the klaxon and declare "Linux not fully prepared for 4096 sector hard drives!". It's certainly interesting, but I'll withhold judgment until we actually know more about the implications of this across the entire spectrum of Linux distributions and the various 4096 sector HDs.
Re:Poorly researched article. (Score:4, Informative)
This is simply a matter of fdisk from that version of util-linux-ng (which is clearly named in the article) trusting the hardware vendor to specify correct block sizes. The vendor did not. Thus fdisk does not end up with 4k block sizes, as happens for many programs. And only(?) parted apparently contains a workaround that detects the correct block size.
It's not that you can't use parted on Gentoo, though; it is just that in the world of user choices that is Gentoo, not everyone will be using that program or that particular option.
Re: (Score:2)
But this is not a distro-specific issue, so you are wrong, too.
I never made any claims about this being a distro-specific issue, or not being a distro-specific issue. The only point I'm trying to get across is the article is extraordinarily narrow in what it's actually tested.
Re:Poorly researched article. (Score:4, Informative)
I wrote the linked article.
I completely agree that the article is narrowly focused. VERY narrow. My objective was to demonstrate a problem and point out that Linux has not FULLY adapted. I didn't say Linux devs were idiots or that it would never be ready. I was trying to express the idea that Linux [distros in general but perhaps not all] is not QUITE ready for these drives, because not all the tools have fully adapted. Some tools make no mention of any problems in their man pages. Some (like parted's defaults) are even misleading if you mistakenly think that "track aligned" is a good thing.
And I was trying to do that in the very limited number of words I had available for a title.
Also, WD claimed that Linux is unaffected. Some distros probably are, but this could lead people to believe that the statement is universally true, which it isn't. Thus, my over-all objective is to educate people to the fact that if they don't know what they're doing, they can get this wrong. There are lots of mistakes I've made where I wished that someone had mentioned some critical fact on a how-to (like, don't use dmraid/fakeraid for RAID1 because reads aren't load-balanced; use mdraid instead). I've filed plenty of bug reports on such issues.
Re: (Score:3, Insightful)
I can't over-emphasize the importance of titles in communication, especially with complex technical subjects where there's a lot of evidence presented to support a conclusion. Your title colors the rest of the article and creates expectations about what you're trying to say. When people read articles (especially on the web) they scan through them trying to find the important parts. That's been demonstrated through eye-tracking studies multiple times.
Your title was very broad, but the evidence to support i
Simple Solution (Score:2)
Don't partition the drive in XP - format the entire thing and don't split it apart. Get a secondary physical drive.
Re:Interesting (Score:4, Insightful)
$ time cp winxp.img /mnt/sdc # ALIGNED
real 5m9.360s
user 0m0.090s
sys 0m20.420s
$ time cp winxp.img /mnt/sdd # UNALIGNED
real 13m26.943s
user 0m0.110s
sys 0m19.350s
$ time cp -r Computer Architecture/ /mnt/sdc # ALIGNED
real 42m9.602s
user 0m0.680s
sys 1m59.070s
$ time cp -r Computer Architecture/ /mnt/sdd # UNALIGNED
real 138m54.610s
user 0m0.660s
sys 2m15.630s
The first two being a single file, the latter two being multiple files in a larger directory structure.
I would heartily disagree with you on the matter.
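For anyone who wants the ratios rather than raw wall-clock times, here is a quick Python sketch that parses the "real" values quoted in this comment and divides unaligned by aligned. The helper names are made up; the only data is the four timings already shown.

```python
import re


def parse_real(text):
    """Convert a `time` value such as '13m26.943s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(aligned, unaligned):
    """How many times longer the unaligned copy took than the aligned one."""
    return parse_real(unaligned) / parse_real(aligned)


# (aligned, unaligned) wall-clock times from the benchmark in this comment.
timings = {
    "single file": ("5m9.360s", "13m26.943s"),
    "directory tree": ("42m9.602s", "138m54.610s"),
}

for label, (aligned, unaligned) in timings.items():
    print(f"{label}: {slowdown(aligned, unaligned):.2f}x slower when misaligned")
```

The directory-tree case works out to roughly the factor of 3.3 quoted in the summary; the single large file is hit less hard, at around 2.6x.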
Re: (Score:2)
Re: (Score:3, Informative)
While a kernel tweak may help alleviate the issue, it is primarily an is
Re:Interesting (Score:5, Interesting)
Re:Interesting (Score:5, Insightful)
Forcing users to optimize isn't inherently wrong; it's just that they shouldn't need to do it for things which are somewhat standard, as a workaround for weird hardware designs. And yes, I realize that the 4096-byte sectors aren't being implemented arbitrarily.
Re: (Score:2)
Wait.. wait.. what the fuck?
I was under the impression that mis-aligned partitions cause small random IOs to take up TWO small random IOs, basically halving random IO performance. It should have a negligible effect on any streaming operation, as only the first and last blocks will have an overhead, the others overlap with useful IO.
Just how BAD is the Linux 'cp' program? Is it doing writes one block at a time, or something insane like that?
On Windows, if you repeated the copy same test, you wouldn't see a m
Re: (Score:2)
Re: (Score:2)
I hope you mean 70+ MB/s.
Re: (Score:2)
It is not a distro problem. On top of that, you run the same distro TFA does.
Evaluation setup and methodology:
* Gentoo ~amd64 system with 2.6.31-gentoo-r5 kernel
* fdisk version: fdisk (util-linux-ng 2.17)
* The drives are identical, but I did not try swapping configurations to make sure that one drive isn't fundamentally slower than the other.
* Core 2 Quad at 2.33GHz (Q9450), 8GiB of RAM
* MSI X48 Platinum motherboard -- Intel X48 Express + ICH9R
Re: (Score:2)
Re:I just bought one of these (Score:5, Informative)
/dev/sdd:
Model=WDC WD15EARS-00Z5B1, FwRev=80.00A80, SerialNo=
Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16
It looks to me that this should *really* be fixed by WD with a firmware update.
Solution: instead of plain fdisk, invoke it as fdisk -H 224 -S 56, as per Theodore Tso's blog [thunk.org].
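Why those particular numbers help can be checked with a few lines of arithmetic. This is only a sketch in Python: it assumes 512-byte logical sectors and the traditional layout where the first partition starts one track in and later partitions start on cylinder boundaries. The function names are invented for illustration.

```python
LOGICAL_SECTOR = 512   # bytes, as reported by the drive
PHYSICAL_SECTOR = 4096  # bytes, what the drive really uses internally


def cylinder_bytes(heads, sectors_per_track):
    """Size in bytes of one fake 'cylinder' for a given H/S geometry."""
    return heads * sectors_per_track * LOGICAL_SECTOR


def cylinders_aligned(heads, sectors_per_track, granule=PHYSICAL_SECTOR):
    """True if every cylinder boundary falls on a multiple of `granule`."""
    return cylinder_bytes(heads, sectors_per_track) % granule == 0


def first_track_aligned(sectors_per_track, granule=PHYSICAL_SECTOR):
    """True if a partition starting one track in is `granule`-aligned."""
    return (sectors_per_track * LOGICAL_SECTOR) % granule == 0


for heads, spt in ((255, 63), (224, 56)):
    print(
        f"-H {heads} -S {spt}: cylinder = {cylinder_bytes(heads, spt)} bytes, "
        f"cylinders 4K-aligned: {cylinders_aligned(heads, spt)}, "
        f"first track 4K-aligned: {first_track_aligned(spt)}"
    )
```

With 224 heads and 56 sectors per track, a cylinder is 224 * 56 = 12544 sectors, which is a multiple of 8, so cylinder-aligned partitions also land on 4K boundaries. The usual 255/63 geometry gives 16065 sectors per cylinder, which is not.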
a firmware update isn't realistic (Score:3, Informative)
I'll get to why in a second, but first:
RawCHS hasn't meant anything in a decade. The largest drive you can describe with CHS is 8GB.
Track size hasn't meant anything in even longer than that. When drives went to zone bit recording (ZBR), the number of sectors per track became variable. This happened in about 1989.
The sector size does mean something, but it is the actual sector size, not the sector "grouping" size. If the drive reported a sector size of 4K, then it would expect that the host understand that s
whoops, one thing about RawCHS (Score:4, Interesting)
I forgot, there is one thing RawCHS is still used for nowadays. There is no proper spec for how to know if an entry in an MBR (fdisk) partition table is a valid partition, so heuristics are applied to the entries to guess whether they are real or should be ignored as empty. One of the heuristics that some software uses is to ignore all partition entries that don't begin on a cylinder boundary. To count as valid, the partition has to start on a sector number that is a multiple of the number of sectors per track (S in CHS). And since all drives 8GB or greater present an S of 63, the first partition on an MBR disk has always started at sector 63, which makes it unaligned when the internal sector size is 4K (8 logical sectors per internal sector, and 63 is not a multiple of 8).
Windows before 2000 checks the CHS alignment of MBR entries and ignores any partition entries that don't start on a multiple of S. So all disks out there are misaligned. With Windows 2000 or later, you can start the partition on any boundary you want.
Western Digital has a jumper you can put on the drive that adds 1 to all access requests, making all those misaligned first partitions aligned. But it'll also make any aligned partitions misaligned. So the real answer is just to lay out your disk differently. I would recommend using GUID partitioning (GPT) instead of MBR anyway, because MBR doesn't work for >2TB drives. And GPT doesn't have any weird alignment requirements (and doesn't have any knowledge of CHS).
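The effect of the jumper, and the MBR size limit, are both simple arithmetic. Here is a hedged Python sketch, assuming 512-byte logical sectors over 4096-byte physical sectors and modelling the jumper as a plain +1 on every logical sector address, as described above. The names are hypothetical.

```python
LOGICAL_SECTOR = 512
PHYSICAL_SECTOR = 4096
SECTORS_PER_PHYSICAL = PHYSICAL_SECTOR // LOGICAL_SECTOR  # 8


def is_aligned(start_lba, jumper=False):
    """True if a partition starting at `start_lba` sits on a 4K boundary.

    With `jumper=True` the drive is assumed to add 1 to every address,
    so the partition really starts one logical sector later on the platter.
    """
    effective = start_lba + 1 if jumper else start_lba
    return effective % SECTORS_PER_PHYSICAL == 0


# MBR stores sector addresses and counts in 32 bits.
MBR_MAX_BYTES = 2**32 * LOGICAL_SECTOR

for lba in (63, 64, 2048):
    print(
        f"start sector {lba}: aligned without jumper = {is_aligned(lba)}, "
        f"with jumper = {is_aligned(lba, jumper=True)}"
    )
print(f"MBR limit with 512-byte sectors: {MBR_MAX_BYTES / 1024**4:.0f} TiB")
```

The jumper turns sector 63 into 64, which is a multiple of 8, but it also shifts a partition that was deliberately started at 64 or 2048 off the boundary, so it only makes sense on a disk laid out entirely the old way.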
Re: (Score:3, Informative)
The original reason for aligning to track boundaries (a track is a cylinder-head pair) is that the first four sectors of MS-DOS' IO.SYS (IBMBIO.SYS) had to be contiguous and on a single track.
Re:if vista/win7 really do support this correctly. (Score:5, Insightful)
The real problem is that it is lying about its sector size: it's reporting 512 bytes when it's using 4K. If it told Linux it was using 4K, everything would be fine and dandy.
Why does it lie about its sector size when it doesn't need to? Because if it didn't, the drives would not work on Windows XP at all, which would not bode well for sales.
Once drives with 4K sectors arrive, it's up to the individual maintainers of each affected tool (fdisk et al.) to update their code.
The kernel handles sector sizes, and could handle 4K sectors ages ago, but when the hardware reports something it tends to trust it, which it is now apparent it shouldn't. (512-byte sectors are implemented as an emulation layer of sorts on these drives, and enabled by default.)
Re:if vista/win7 really do support this correctly. (Score:5, Insightful)
Re:if vista/win7 really do support this correctly. (Score:4, Insightful)
I see it rather as an indictment against closed-source OSes, if XP turns out to be incompatible with these new drives and MS never releases a patch to add support. People will need to upgrade for no good reason to one of MS's new operating systems. People should not have to deal with a complete upheaval of their tested and true systems due to a small hardware change such as this.
I can imagine MS is quietly chuckling with glee to itself, if this issue becomes a deal-breaker for machines still running XP.
Re: (Score:2)
I wouldn't be too fond of the MS development model from what I hear from those who were on the inside:
http://www.nytimes.com/2010/02/04/opinion/04brass.html?pagewanted=all [nytimes.com]
Inside Microsoft, political infighting trumps common sense. If you really want to hold up a closed source development model as an example of "what works" take a look at Apple. They crank out far better products with a fraction of the resources.
Re:Partitions are obsolete (Score:4, Insightful)
You've got it exactly backwards: people shouldn't be partitioning disks into one huge partition. They should be able to split things up a bit to keep rapidly changing directories separate from mostly static ones and to manage the risk of filesystem corruption destroying important files.
Re: (Score:2, Insightful)
Splitting a disk into multiple pseudo-disks makes sense in many situations, but the clunky legacy partition tables are only good for inter-OS compatibility. Otherwise LVM beats partitions in every respect. Now if only we could get a LVM solution that works in multiple operating systems...
Re: (Score:2)
If you're worried about one big filesystem going south, then I suggest you start using a modern filesystem without those concerns.
We've got well past the point where that should be an issue on any modern system.
You're acting like it's still the 70s. It's not; we've learned a few things since then.
Re: (Score:2)
Nope. Read the original article. The first partition will be mis-aligned if you use the first available block rather than padding to the next 4K boundary.
You won't even know it unless you look at the partition table in expert mode.
Re: (Score:2)
What the GP is advocating is not using a partition table, instead of partitioning drives into one partition. This doesn't cause the issue mentioned in the article. This isn't a novel idea - there are some USB drives that do this already, and even Windows seems to cope fine.
partition table (Score:3, Funny)
Re: (Score:3, Insightful)
And then when you install a different distribution, you blow away your home directory. Sorry, bad idea. /home should be in a separate partition from the rest of the stuff.
Also, since I usually have several distributions installed at the same time, I have several partitions...but that's a less common problem.
A better solution would be to have a boot partition snuggled up against the MBR that automatically adapts so that the boot + MBR is an appropriate size, say 32 MB. (My current boot directory is 14MB,