Data Storage Open Source

OpenZFS Project Launches, Uniting ZFS Developers

Posted by Soulskill
from the putting-the-band-together dept.
Damek writes "The OpenZFS project launched today, the truly open source successor to the ZFS project. ZFS is an advanced filesystem in active development for over a decade. Recent development has continued in the open, and OpenZFS is the new formal name for this community of developers, users, and companies improving, using, and building on ZFS. Founded by members of the Linux, FreeBSD, Mac OS X, and illumos communities, including Matt Ahrens, one of the two original authors of ZFS, the OpenZFS community brings together over a hundred software developers from these platforms."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Tuesday September 17, 2013 @08:13PM (#44879423)

    If this gets us BP-rewrite, the holy grail of ZFS, I'll be a happy man.

    For those who don't know what it is: BP-rewrite is block pointer rewrite, a feature that has been promised for many years but has never arrived. It's a lot like cold fusion in that it's always X years away from us.

    BP-rewrite would allow implementation of the following features
    - Defrag
    - Shrinking vdevs
    - Removing vdevs from pools
    - Evacuating data from a vdev (say you wanted to destroy your old 10 disk vdev and add it back to the pool as a vdev with a different number of disks)

  • Still CDDL... (Score:5, Informative)

    by volkerdi (9854) on Tuesday September 17, 2013 @08:15PM (#44879437)

    Oh well. I'd somehow hoped "truly open source" meant BSD license, or LGPL.

  • Re:I'm addicted (Score:1, Informative)

    by Anonymous Coward on Tuesday September 17, 2013 @08:48PM (#44879685)

    Too stupid.

  • Re:Patents? (Score:5, Informative)

    by utkonos (2104836) on Tuesday September 17, 2013 @08:49PM (#44879693)
    FAQ much? There is no central source repository for OpenZFS. Each supported operating system has its own repository. [open-zfs.org] That page also has a link to the source tree for each of the supported projects under the umbrella.
  • Re:FINALLY. (Score:3, Informative)

    by Anonymous Coward on Tuesday September 17, 2013 @08:53PM (#44879721)

    Been using btrfs for several non-essential file systems. Working great so far, and have even done several successful bedup runs. Has worked great for minimizing disk usage on some Maven repositories with lots of duplicate files between Jenkins and Nexus. Maybe not tested enough for your server that you need to stay up all the time, but great for the home desktop (provided you're sane and are keeping backups, which you should be doing already anyway). The more testing it gets, the sooner it becomes "tested enough" for the needs-to-always-be-available server.

  • by Vesvvi (1501135) on Tuesday September 17, 2013 @08:59PM (#44879755)

    I don't have any practical experience with BTRFS, but I use ZFS heavily at work.

    The advantage of ZFS is that it's tested, and it just works. When I started with our first ZFS testbed, I abused that thing in scary ways trying to get it to fail: hotplugging RAID controller cards, etc. Nothing really scratched it. Over the years I've made additional bad decisions such as upgrading filesystem versions while in a degraded state, missing logs, etc, but nothing has ever caused me to lose data, ever.

    The one negative to ZFS (if you can call it that) is that it makes you aware of inevitable failures (scrubs catch them). I'll lose about 1 or 2 files per year (out of many many terabytes) just due to lousy luck, unless I store redundant high-level copies of data and/or metadata. Right now I use stripes over many sets of mirrored drives, but it's not enough when you read or write huge quantities of data. I've run the numbers and our losses are reasonable, but it's sobering to see the harsh reality that "good enough" efforts just aren't good enough for 100% at scale.

  • Re:FINALLY. (Score:4, Informative)

    by Virtucon (127420) on Tuesday September 17, 2013 @09:13PM (#44879827)

    licensing or patent issues [sys-con.com]?
    What you also forget is that Oracle was the leading proponent of BTRFS and yes it had to do with licensing and patents from Sun. Once they acquired Sun that all went out the window. If I were the CEO at Oracle I'd ask "Why two file systems that essentially do the same thing? One's mature and the other, not so much" That's why BTRFS still survives but now with less Oracle support. Wait, is that a bad thing?

  • by Bengie (1121981) on Tuesday September 17, 2013 @09:15PM (#44879837)
    Oracle released ZFS under a BSD compatible license. Anyone is allowed to do whatever to the open source code. Going forward, Oracle has not opened any code after v28, which is the last open source version to be compatible with Oracle ZFS.
  • Re:Still CDDL... (Score:3, Informative)

    by larry bagina (561269) on Tuesday September 17, 2013 @09:26PM (#44879905) Journal
    CDDL is basically LGPL on a per-file basis.
  • Re:Cool, but.. (Score:4, Informative)

    by smash (1351) on Tuesday September 17, 2013 @09:38PM (#44879961) Homepage Journal
    That. Those who don't understand ZFS are condemned to reinvent it, poorly.
  • by Anonymous Coward on Tuesday September 17, 2013 @09:45PM (#44880005)

    You don't understand. ZFS didn't lose that data -- ZFS detected that the underlying disk drives lost that data. You can run ZFS in highly redundant modes that allow it to reconstruct lost data, but it sounds like OP's redundancy is such that enough drives may lose bytes to cause lost files.
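
The detect-versus-reconstruct distinction above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how ZFS is implemented: the `ToyMirror` class, its in-memory "disks", and the choice of SHA-256 as the checksum are all hypothetical stand-ins picked for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum used to verify a block (SHA-256 is an assumption for this toy)."""
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Hypothetical two-way mirror: two in-memory 'disks' plus one checksum per block."""

    def __init__(self):
        self.disks = [{}, {}]   # block_id -> bytes, one dict per simulated disk
        self.sums = {}          # block_id -> checksum recorded at write time

    def write(self, block_id, data: bytes):
        self.sums[block_id] = checksum(data)
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id) -> bytes:
        expected = self.sums[block_id]
        for disk in self.disks:
            data = disk[block_id]
            if checksum(data) == expected:
                # Found a good copy: overwrite any copy that fails verification.
                for other in self.disks:
                    if checksum(other[block_id]) != expected:
                        other[block_id] = data
                return data
        # Corruption is detected, but there is no good copy left to rebuild from.
        raise OSError(f"block {block_id}: all copies fail checksum")
```

With one bad copy the read succeeds and the bad copy is repaired; with both copies bad the error is still detected, but the data cannot be reconstructed, which is the situation the parent describes.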

  • Re:ZFS for Windows? (Score:5, Informative)

    by tlambert (566799) on Tuesday September 17, 2013 @10:15PM (#44880189)

    It doesn't have to be POSIX compliant to have it ported to it and it doesn't require somebody to pay for licensing. With the features of ZFS one could argue that a port to at least Windows Server would be great and it would garner quite a following from those who've had to put up with the way NTFS views disk volumes and storage.

    Windows isn't a very friendly development platform for Open Source, starting with the licensing requirements for tools and distribution restrictions on binaries derived from those tools when using header files containing substantial code, or runtime libraries. Part of this is an intentional legal defense against WINE and CrossOver Office, and part of it is just scale management by limiting the support community requirements to "serious developers".

    In addition, a lot of the installable filesystem and similar code, as well as a lot of the necessary VM internals (memory mapped files and paging/swapping from filesystems), are not adequately explained. For example, they involve locking text regions with level 0 locks, which require a level 3 lock then a level 0 lock, and you have to do this to get the offsets on the physical media for the blocks in question. This used to not work on removable media in NT as of 4.0.1; not sure if it's supported yet, but it was the reason you couldn't install it on Jaz drives or even regular hard drives in removable carriers.

    Having developed a filesystem for Windows95 IFSMgr, and reverse engineered all this crap, and having done it again for NT3.51, I would not look forward to having to repeat the process for Windows 7 or Windows 8, which are the only useful versions to target by the time the code ends up functional.

    So unless someone wanted to seriously underwrite the effort (read: it'd have to be done by Oracle, or by a startup who had a monetization strategy that Microsoft wouldn't preempt, like they did when my team, at a previous employer, ported UFS + Soft Updates to Windows 95, and they announced Longhorn-which-never-happened, and then put together a lawsuit about "deep reverse engineering" which would have precluded using it as a bootable FS)... no thanks.

  • by raymorris (2726007) on Tuesday September 17, 2013 @10:26PM (#44880239)

    Using a small, fast SSD as a cache for large, slow disks can be awesome for some workloads, mostly servers with many concurrent users.

    To do that with ANY filesystem, bcache is now part of the mainline kernel. dm-cache does the same thing, and there is another one that Facebook uses.
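
As a rough illustration of why a small fast cache helps when many users keep hitting the same blocks, here is a minimal read-cache sketch. It assumes a plain least-recently-used eviction policy and a dict standing in for the slow disk; the `CachedDisk` name is hypothetical, and real block-layer caches such as bcache are considerably more involved than this.

```python
from collections import OrderedDict


class CachedDisk:
    """Hypothetical LRU read cache in front of a slow backing store (a dict here)."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing        # stands in for the large, slow disks
        self.capacity = capacity      # number of blocks the fast cache can hold
        self.cache = OrderedDict()    # block_id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            # Cache hit: mark the block as most recently used.
            self.cache.move_to_end(block_id)
            self.hits += 1
            return self.cache[block_id]
        # Cache miss: fetch from the slow store and remember the block.
        self.misses += 1
        data = self.backing[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            # Evict the least recently used block.
            self.cache.popitem(last=False)
        return data
```

The hit and miss counters make the workload dependence visible: a working set that fits in `capacity` is served almost entirely from the cache, while a scan over far more blocks than the cache holds gets little benefit.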

  • Re: Data integrity (Score:5, Informative)

    by MightyYar (622222) on Tuesday September 17, 2013 @10:51PM (#44880403)

    Not sure what you mean. You certainly can set up a mirrored pair (or triplet or quadruplet), but you can also set up what's referred to as raidz, where it stripes the redundancy across multiple disks. You can configure how much redundancy... 1, 2, or more disks if you like. You can also tell ZFS to keep multiple copies of blocks, and it will spread those copies out among the disks. You can set that policy per sub-volume (file system in zfs-speak), so that if you decide that some of your data deserves more redundancy, you can set up a folder that will keep 2 copies of everything, but leave all the other folders at 1 copy. It's super geeky. I've had it detect (and correct) corruption in a failing disk, detect corruption because of a flaky disk controller that would otherwise pretend to work fine, and detect corruption when a SATA cable came loose. Combined with the ECC RAM in the server, I feel more comfortable about the integrity of my data than I ever have. I've lost family photos before to random drive corruption, so I'm sensitive to this stuff :)

  • Re: Data integrity (Score:5, Informative)

    by saleenS281 (859657) on Tuesday September 17, 2013 @11:34PM (#44880625) Homepage
    One point to be extremely clear on however - when you set copies = 2 on a folder level, it does NOT guarantee those copies end up on different physical spindles. Early on there were many people who lost files because they skipped RAID thinking that copies=X would protect their data. It is NOT meant as a means to protect against hardware failures.
  • by saleenS281 (859657) on Tuesday September 17, 2013 @11:36PM (#44880643) Homepage
    Because a COW filesystem will become fragmented over time simply by the way it works. As you delete files, you're only freeing up small segments of contiguous blocks. Over time, this leads to fragmentation because writes are sometimes forced into non-optimal disk placement due to lack of free space. Granted - if you never fill the pool beyond 50%, it won't be a problem. For everyone else, it's a matter of when, not if it will become fragmented.
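
The effect described above can be seen in a toy block map. This is a sketch under simplifying assumptions, not a model of the real ZFS allocator: the pool is a flat list of booleans, and the hypothetical `allocate` helper simply claims the lowest free blocks first.

```python
def free_extents(used):
    """Return the lengths of contiguous free runs in a block map (True = in use)."""
    runs, current = [], 0
    for in_use in used:
        if in_use:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def allocate(used, size):
    """Claim `size` free blocks, lowest address first; return the fragment count."""
    if used.count(False) < size:
        raise RuntimeError("pool is full")
    fragments, previous = 0, None
    for i, in_use in enumerate(used):
        if size == 0:
            break
        if not in_use:
            used[i] = True
            size -= 1
            # A new fragment starts whenever this block is not adjacent to the last one.
            if previous is None or i != previous + 1:
                fragments += 1
            previous = i
    return fragments


# A full 16-block pool in which four scattered 2-block files have been deleted.
pool = [True] * 16
for start in (0, 4, 8, 12):
    pool[start:start + 2] = [False, False]
```

Here half the pool is free, but the largest contiguous run is only 2 blocks, so calling `allocate(pool, 6)` has to split a 6-block write into three pieces. With more free space there is a better chance of finding one run that is large enough.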
  • Re: Data integrity (Score:4, Informative)

    by kthreadd (1558445) on Wednesday September 18, 2013 @02:09AM (#44881215)

    That's what you have backups for.
