Rethinking the Nature of Files

Posted by timothy
from the files-are-inside-the-computer dept.
An anonymous reader writes "Two recent papers, one from Microsoft Research and one from University of Wisconsin (PDF), offer a refreshing take on rethinking 'what a file is.' This could have major implications for next-generation file system design, and will probably cause a stir among Slashdotters, given that it will affect the programmatic interface. The first paper has some hints as to what went wrong with the previous WinFS approach. Quoting the first paper: 'For over 40 years the notion of the file, as devised by pioneers in the field of computing, has proved robust and has remained unchallenged. Yet this concept is not a given, but serves as a boundary object between users and engineers. In the current landscape, this boundary is showing signs of slippage, and we propose the boundary object be reconstituted. New abstractions of file are needed, which reflect what users seek to do with their digital data, and which allow engineers to solve the networking, storage and data management problems that ensue when files move from the PC on to the networked world of today. We suggest that one aspect of this adaptation is to encompass metadata within a file abstraction; another has to do what such a shift would mean for enduring user actions such as "copy" and "delete" applicable to the deriving file types. We finish by arguing that there is an especial need to support the notion of "ownership" that adequately serves both users and engineers as they engage with the world of networked sociality.'"

  • by elrous0 (869638) * on Tuesday November 01, 2011 @09:23AM (#37906406)

    I'm sorry, but MS issuing a paper on the "issues of file ownership" and the cloud sends a little chill up my spine. Makes me think that engineering may not be the only impetus behind their paper. It also makes me wonder if someone isn't looking to take a little more "ownership" of what has traditionally been considered *my* data.

    It's bad enough I'm already forced into "buying" software and media that I can never resell. Now they want my fucking Word files too I guess.

  • by fuzzyfuzzyfungus (1223518) on Tuesday November 01, 2011 @09:29AM (#37906488) Journal
    Don't worry, user, of course you own those little files of yours.

    We just want to install some robust Technological Protection Measures to preserve your ownership of those files across all devices and platforms and legal systems aligned with international norms... Totally harmless, nothing to worry about.
  • by Hartree (191324) on Tuesday November 01, 2011 @09:31AM (#37906522)

    Microsoft: All your files^h^h^h^h^hdata are belong to us!

  • by petes_PoV (912422) on Tuesday November 01, 2011 @09:36AM (#37906580)

    A file is essentially just a collection of data - no more and no less. To try and add attributes to that makes little sense and seems as futile as trying to say that each collection of molecules should have a tag saying what it is, who it belongs to and what it's for. Sure, you can add abstractions and structure on top of the basic form, but when you do that you are adding a layer - not redefining the basic building block.

  • by hedwards (940851) on Tuesday November 01, 2011 @09:43AM (#37906658)

    To be honest, this sounds like MS is inventing something that Apple already invented. Apple has had forked files for how many years now? With one fork for the data and a resource fork for the icon and a few related pieces of information.

    Personally, I don't like it. It's non-standard and requires special steps to work with at times, and I don't really understand why it's needed in the first place. If it's really that big of a problem you can always zip up the metadata file and the data file and call it a day, but for most purposes I'd rather the data not get corrupted when the metadata does.

  • by CharlyFoxtrot (1607527) on Tuesday November 01, 2011 @09:45AM (#37906686)

    You should read the article; you are illustrating their point. They talk about how users associate ownership with having a file in a known physical location, and how, for people to feel comfortable with cloud storage, the notion of a file needs to be redefined so that people feel they have ownership over data that exists "out there".

    "[...] ownership is what we are thinking of, when ownership stands as proxy for what used to be knowledge of location and responsibility for that location. What was once a relationship between a user and a physical thing now needs to stand as a relationship between a user and a digital thing. Just what this ownership might be and how it might function in terms of what is specified in this new entity we are thinking of, one that somehow has the properties we have described above and which also allows this new characteristic, we have begun to outline but a beginning is all it is."

    Part of this is the ability to delete their data even when it has been put out there in the wild.

    "A boundary object needs to be developed that can bridge the abstraction of the user and the one of the engineer, who needs to worry about where this thing that keeps growing and changing, and where the locale of storage changes too, such that when a user says ‘delete’, the thing whatever it is and wherever the entities constitutive of it are, are indeed, done away with."

    This is a paper talking about your concerns and how to address them.

  • by elrous0 (869638) * on Tuesday November 01, 2011 @09:50AM (#37906764)

    A quote from the conclusion of the article:

    A boundary object needs to be developed that can bridge the abstraction of the user and the one of the engineer, who needs to worry about where this thing that keeps growing and changing, and where the locale of storage changes too, such that when a user says ‘delete’, the thing whatever it is and wherever the entities constitutive of it are, are indeed, done away with.

    I'm sorry, but that sounds a *lot* like DRMing every file to me, with a central service controlling every file (how else could you implement such a system?). The authors even admit as much a few sentences later:

    At first reading one might think they are alluding to digital rights management.

    Of course, they seem to deny that this is DRM. But that's sure what it sounds like to me. And DRM needs some sort of central service to work, which I'm sure MS will be happy to provide of course.

  • by imric (6240) on Tuesday November 01, 2011 @09:58AM (#37906872)

    Of COURSE they are. They are trying to find a different way to market it - since DRM has no user benefits and users actively dislike it, they 'need' to redefine the issue so users have no choice.

    This is marketing.

  • by gestalt_n_pepper (991155) on Tuesday November 01, 2011 @10:03AM (#37906942)

    Do NOT "improve" the file. I'd like to continue to be able to use my computer and other devices.

  • POSIX xattrs (Score:4, Insightful)

    by Salamander (33735) <jeff@p[ ]typ.us ['l.a' in gap]> on Tuesday November 01, 2011 @10:04AM (#37906962) Homepage Journal

    Look them up. They already allow you to attach arbitrary metadata to a file. Most modern filesystems and user-level utilities support them already. They're even used as the underpinnings for security mechanisms such as POSIX ACLs and SELinux. Sure, there are issues with performance when you have *lots* of xattrs on a file, and that's a fruitful area of research, but we sure don't need some brand-new Microsoft-invented thing to deal with metadata.
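
As a rough illustration of the xattr mechanism described in the comment above: a minimal sketch, assuming Linux and a filesystem that permits `user.*` extended attributes. The helper name `tag_file` and the attribute name `user.example.owner` are made up for illustration; only `os.setxattr` and `os.getxattr` are real standard-library calls, and they exist on Linux only.

```python
import os
import tempfile


def tag_file(path, name, value):
    """Attach `value` (bytes) to `path` as extended attribute `name`,
    then read it back.

    Returns the stored bytes, or None when xattrs are unavailable
    (non-Linux platform, missing file, or a filesystem without
    user xattr support).
    """
    if not hasattr(os, "setxattr"):
        return None  # os.setxattr/os.getxattr are Linux-only
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError:
        return None


with tempfile.NamedTemporaryFile() as f:
    f.write(b"file contents stay untouched\n")
    f.flush()
    # Hypothetical attribute name; the "user." namespace is the one
    # open to unprivileged processes.
    stored = tag_file(f.name, "user.example.owner", b"alice")
    print(stored)
```

The metadata travels with the inode rather than inside the file's byte stream, so the file contents are unchanged. Whether the attribute survives a copy to another filesystem or an upload depends on the tools involved, which is part of what the thread is arguing about.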

  • by StuartHankins (1020819) on Tuesday November 01, 2011 @10:53AM (#37907804)
    +1 Insightful. Allowing Microsoft to do this sort of thing would be a horrible mistake. They've shown too many times that they can't be trusted. Maybe the kids weren't aware when this stuff started, but I still remember the tricks Microsoft played... and is still playing. Boo on them forever in my book.

    Poetic justice would have Apple purchase Microsoft and break it into divisions.
  • by biodata (1981610) on Tuesday November 01, 2011 @12:09PM (#37908756)
    The cloud idea likes to project an illusion of it not mattering where the file is, but it is predicated on (more or less) limitless bandwidth with near-zero latency, and limitless local storage/cache. If the file you want is not on the local hard disk then it isn't. If your OS needs to fetch it behind the scenes then you need to wait until it arrives. Yes, you might think you don't want to know where the file is physically, but when it takes ten minutes to open a file that should take ten seconds, you will probably want to know why (oh, it's in another country and the network is busy because everyone is watching some new TV prog, I see now). Not knowing where the file is just means needing to ask all the time. Is it really better not to know than to know in the first place and make sure it is where you need it to be? Bandwidth will never be unlimited and latency will never be zero. We are routinely working on 10GB files now where I work, and you always need to know where they are, and to care, because however big the pipes are and however big the disk space and the RAM, the data streams grow even faster. The technologies underlying data capture devices obey their own version of Moore's law, frequently with higher multiplicities.
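
To put rough numbers on the bandwidth point in the comment above, here is a back-of-the-envelope sketch. The link speeds and the 0.1 s latency are illustrative assumptions, not figures from the thread; only the 10GB file size comes from the comment. The function name `fetch_seconds` is likewise made up.

```python
def fetch_seconds(size_bytes, bandwidth_bits_per_s, latency_s=0.0):
    """Idealized transfer time: one round of latency plus size / bandwidth.

    Ignores protocol overhead, congestion, and caching, so real
    transfers will be slower.
    """
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s


TEN_GB = 10 * 10**9  # the 10GB file size mentioned in the comment

# Assumed link speeds, for illustration only.
links = {
    "100 Mbit/s WAN": 100e6,
    "1 Gbit/s LAN": 1e9,
    "10 Gbit/s LAN": 10e9,
}

for name, bandwidth in links.items():
    seconds = fetch_seconds(TEN_GB, bandwidth, latency_s=0.1)
    print(f"{name}: {seconds:.1f} s")
```

Even under these idealized assumptions, each tenfold drop in link speed multiplies the wait by roughly ten, which is why the location of a large file stays visible to the user.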
