Data Storage Operating Systems

Ask Slashdot: Is It Time To Replace File Systems? (substack.com) 209

DidgetMaster writes: Hard drive costs now hover around $20 per terabyte (TB). Drives bigger than 20TB are now available. Fast SSDs are more expensive, but the average user can now afford these in TB capacities as well. Yet, we are still using antiquated file systems that were designed decades ago when the biggest drives were much less than a single gigabyte (GB). Their oversized file records and slow directory traversal search algorithms make finding files on volumes that can hold more than 100 million files a nightmare. Rather than flexible tagging systems that could make searches quick and easy, they have things like "extended attributes" that are painfully slow to search on. Indexing services can be built on top of them, but these are not an integral part of the file system so they can be bypassed and become out of sync with the file system itself.

It is time to replace file systems with something better. A local object store that can effectively manage hundreds of millions of files and find things in seconds based on file type and/or tags attached is possible. File systems are usually free and come with your operating system, so there seems to be little incentive for someone to build a new system from scratch, but just like we needed the internet to come along and change everything we need a better data storage manager.

See Didgets for an example of what is possible.
In a Substack article, Didgets developer Andy Lawrence argues his system solves many of the problems associated with the antiquated file systems still in use today. "With Didgets, each record is only 64 bytes which means a table with 200 million records is less than 13GB total, which is much more manageable," writes Lawrence. Didgets also has "a small field in its metadata record that tells whether the file is a photo or a document or a video or some other type," helping to dramatically speed up searches.
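
The size claim in the quote can be checked with a couple of lines of arithmetic. This is only a sanity check of the quoted figures, assuming fixed-width records and no index or allocation overhead; the names `RECORD_BYTES`, `RECORD_COUNT` and `table_size_bytes` are made up for illustration and do not come from Didgets.

```python
RECORD_BYTES = 64            # per-record size quoted in the summary
RECORD_COUNT = 200_000_000   # table size quoted in the summary


def table_size_bytes(records: int, record_bytes: int) -> int:
    """Total size of a table of fixed-width records, ignoring any overhead."""
    return records * record_bytes


total = table_size_bytes(RECORD_COUNT, RECORD_BYTES)
print(f"{total / 10**9:.1f} GB")   # decimal gigabytes
print(f"{total / 2**30:.2f} GiB")  # binary gibibytes
```

Either way the result stays under the "less than 13GB" figure in the quote.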

Do you think it's time to replace file systems with an alternative system, such as Didgets? Why or why not?
This discussion has been archived. No new comments can be posted.

  • Yes it is (Score:3, Interesting)

    by saloomy ( 2817221 ) on Wednesday February 23, 2022 @07:50PM (#62297239)
    But, with a database, not a Didgets. That way, the objects can be stored as anything, like... an email message, or an image, or an email attachment. You could reference every part of your information with it, and you could tag and manage permissions much more granularly. You could even version out the changes to documents until you ran out of space, and then purge as needed old versions and deleted info.
    • Re:Yes it is (Score:4, Interesting)

      by Ostracus ( 1354233 ) on Wednesday February 23, 2022 @08:22PM (#62297347) Journal

      Didn't Microsoft's WinFS [wikipedia.org] fail?

      • by gweihir ( 88907 )

        Didn't Microsoft's WinFS [wikipedia.org] fail?

        The WinFS idea failed several times because MS could not make it work well and they really tried. That is the reason why MS still uses the dog-slow NTFS, which suffers from a really bad VFS layer. Still a modern filesystem and still pretty good overall.

      • Yup, and about three hundred other lets-get-rid-of-filesystems initiatives have as well. Which indirectly answers the question, "No, because the last three hundred times we tried this over a period of several decades it failed every time so the next time we try it it's fairly likely to fail yet again".
    • by Junta ( 36770 )

      The problem with that concept is that it's too open ended to get much value out of it. You have a fairly narrow set of generally applicable ways to reference a chunk of data. Turns out that a filesystem pretty well covers the utterly common/generic scenarios. If you get more domain specific you can get value out of storage, but trying to cover arbitrarily many domain specific storage aspirations in a single generic design is not going to go well. Besides, the actual demand is pretty low too, there's not

    • Re:Yes it is (Score:5, Insightful)

      by NoNonAlphaCharsHere ( 2201864 ) on Wednesday February 23, 2022 @08:37PM (#62297391)
      Oh, NO! Please, for the love of $DEITY, NO! A thousand times, NO! On my life, and that of my children, NO!!!
      Did we learn nothing from the massive, read-only (at least when it didn't matter) single-point-of-failure that is/was Windows Registry??? Must a multi-bit (in a single byte) error cause complete and utter system degradation/destruction?
      MUST those who refuse to learn the lessons of history relearn them in screaming agony???
      • by davidwr ( 791652 )

        MUST those who refuse to learn the lessons of history relearn them in screaming agony???

        No, some earn the Darwin Award before the pain reaches their brain cells.

      • MUST those who refuse to learn the lessons of history relearn them in screaming agony???

        No, they learn the lessons of history while their users are screaming in agony.

      • I share your contempt for the Windows Registry. Didgets is nothing like it.
    • Typical filesystems are the most common case of hierarchical databases. What general class of database architecture do you have in mind?

      > the objects can be stored as anything, like... an email message, or an image, or an email attachment

      That's true of virtually any filesystem.

      > You could reference every part of your information with it

      I'm not sure what you mean by that.

      > you could tag and manage permissions much more granularly

      More granularly than per-file?
      You are proposing adding permissions on

    • by Kisai ( 213879 )

      No, why make it so much harder to repair and rescue?

      Every time, every damn time, someone comes up with "thing better than last thing", and all that comes of it is more data loss.

      Here's what I'd do, and please take this with a grain of salt, as it's off the cuff, and based on 30 years of experience fixing other people's messes:
      1. Store every file as a binary blob like so:
      128-bit UUID - Header (true filename, true directory structure, implied permissions, thumbnail, length of file) - File binary data
      2. Disk jo

  • Things evolve, but the headline is useless for news. He is suggesting modified ISAM. Also, if a filesystem needs to know what the file is, based on type, it is a database.
  • Keep it simple... (Score:5, Insightful)

    by franzrogar ( 3986783 ) on Wednesday February 23, 2022 @07:52PM (#62297253)

    The one who wrote the article has no f****ng idea what a filesystem is.

    He tried to make us believe that a "database" is a filesystem alternative... poor human.

    • It could be. Some databases don't even need a filesystem, and can take raw block devices and allocate table space directly on the blocks, like Oracle ASM.
      • by Joviex ( 976416 ) on Wednesday February 23, 2022 @08:16PM (#62297333)

        It could be. Some databases don't even need a filesystem, and can take raw block devices and allocate table space directly on the blocks, like Oracle ASM.

        A FILE SYSTEM with markers in the DATABASE as to the location of WHICH BLOCKS are WHICH FILE.

        AMAZING.

    • 100%

    • He tried to make us believe that a "database" is a filesystem alternative... poor human.

      I seem to recall reading that Microsoft was going to do this about 20 years ago.

      https://en.wikipedia.org/wiki/... [wikipedia.org]

    • AS/400 treats the filesystem as a database

    • by narcc ( 412956 )

      Yeah, this isn't about file systems as we understand them. This is about how people interact with file systems. The whole file/folder concept. You know, the single most useful thing the average user should learn about computer operation. Curiously, he doesn't abandon this, he just adds some useless features.

      I RTFA, and I can assure you that Andy "DidgetMaster" Lawrence is a moron and his system is idiotic. Even the name is stupid.

      This reminds me of one of those old late night infomercials. You

    • The one who wrote the article has no f****ng idea what a filesystem is.

      He tried to make us believe that a "database" is a filesystem alternative... poor human.

      It however is. Nothing "poor human" about it. And he's not the first to propose it. He's not even the first to have a working example of it. https://en.wikipedia.org/wiki/... [wikipedia.org]. WinFS failed due to performance issues they couldn't resolve but it worked and did address the searching and traversal problem very well. Oracle's DBFS exists too, as does Google's GFS (though the latter focuses more on its distribution capabilities rather than its database structure).

      The only true poor human is the one who believes

  • Hans Reiser (Score:5, Funny)

    by PPH ( 736903 ) on Wednesday February 23, 2022 @07:57PM (#62297273)

    ... isn't busy right now. We'll put him on the job ASAP.

  • by thesjaakspoiler ( 4782965 ) on Wednesday February 23, 2022 @07:58PM (#62297279)

    My kids store everything on the desktop so why would anything else be needed?
    The entire filesystem could just be reduced to a single folder called 'Desktop'.

    • My kids store everything on the desktop so why would anything else be needed?
      The entire filesystem could just be reduced to a single folder called 'Desktop'.

      I knew people who were following this approach back in the Windows 95 days.

      If I'd paid attention, I'd bet they were doing it in the Windows 3.1 days too.

      • If you were paying attention, you'd know that you couldn't store files on the desktop in Windows 3.1.

    • by AvitarX ( 172628 )

      Exactly.

      It doesn't matter how large my hard drive is. I just need to upgrade my monitor periodically.

  • What a maroon! (Score:5, Informative)

    by Entrope ( 68843 ) on Wednesday February 23, 2022 @08:01PM (#62297295) Homepage

    This guy reinvented the database, but is calling it a filesystem. The semantics and performance characteristics are totally different between the two. Serious users and developers will recognize this as, not a bad code smell, but a bad architecture smell.

    I can't believe we have a dozen or so comments already and nobody else has commented on this yet.

  • by sbszine ( 633428 ) on Wednesday February 23, 2022 @08:01PM (#62297297) Journal
    I'm using APFS (est. 2017) and it's really fast to search for files. Likewise Linux has a bunch of super fast filesystems. Sure, there are heaps of people using NTFS, but that's not because filesystem technology has stood still for decades.
    • And if you use a proper utility on windows, searches are great too. I use "Everything" on my windows computer and it is great for such, but there are a few others too.

      What I do not understand is why Microsoft, which clearly cannot build a file search tool that is even halfway decent, has not bought one of those tools or the companies making them.

  • by NateFromMich ( 6359610 ) on Wednesday February 23, 2022 @08:05PM (#62297305)

    Yet, we are still using antiquated file systems that were designed decades ago

    Speak for yourself.

    • Filesystems aren't broken. These Didgets kids should go and find Real Problems to solve, and get the fuck off our lawns.
    • Antiquated? The wheel is antiquated, but we still use it. Beer was old even by Egyptian times, but we still quaff it.

      As for filesystems designed decades ago, I can point to any recent filesystem, be it APFS, btrfs, ZFS, ReFS, even NTFS, that always get maintained with new features added often. For example, ZFS 2.0 has zstd, which is an excellent performer for compression.

      Finally, filesystems are the last thing I ever want to replace willy-nilly. Of all the things that have to work perfectly, no matter wh

    • by gweihir ( 88907 )

      Yet, we are still using antiquated file systems that were designed decades ago

      Speak for yourself.

      Indeed. This idiot wants to throw out hammers simply because they are "antiquated" because they were "designed decades ago". He is probably unaware that we are still using computer systems that were essentially "designed" decades ago. Maybe we should throw them all out and start over? Surely not.

  • by blahplusplus ( 757119 ) on Wednesday February 23, 2022 @08:08PM (#62297311)

    ... the computer is not a god device.

    File systems are designed around the limitations of the CPU, bus and memory. There are "some" bottlenecks that can be eased slightly (e.g. by the move to SSDs), but in general the constraints remain. The poster does not sound like he understands that the reason why things are as they are is not merely that hard drives, CPUs and RAM were slower 20 years ago.

    The reality is they had to try to balance performance against what the CPU was capable of processing. When you write a file to disk or read it into memory, those operations are bound by the limitations of the system itself.

    Windows wanted a deeper file system based on databases, but they found the "data explosion" (aka performance) was awful, the more data you store about data, the longer it takes to parse.

    You can all take a look here at WinFS (the cancelled "next gen" file system). So the idea that people haven't wanted to improve file systems before is naive.

    https://en.wikipedia.org/wiki/... [wikipedia.org]

    Parsing data is not some trivial operation, when you read directories or files from a disk, your CPU is literally reading and parsing data.

    There is definitely a case to be made for storage and the filesystem itself to become "its own separate computer" at some point in the future, but that would increase costs for the home user who is not a computer enthusiast.

    Things like network attached storage and RAID are there to multiply backup/read speeds.

    The more information you want about objects, the more CPU and memory that takes up. You can have infinite abstract details about any file. That is why file systems of the future will essentially mean that an entire computer will have to be dedicated to the file system if one is to have anything like serious performance.

    • the limitations of both hardware and system architects' imaginations, circa the era when the hard drive was invented.

      WinFS may have failed because such a thing is impossible, or it may have failed because of lack of experience.

      How about this: put tags associated with files in a SQLite database, one database per account, off of $HOME in an existing ext4 file system - many of whose features we won't use. Make an API and a file manager that gives a way to create and use tags. Have file formats with headers tha
      • by Junta ( 36770 )

        the limitations of both hardware and system architects' imaginations, circa the era when the hard drive was invented.

        The filesystem as a user facing interface is based around the limitations of the users which have not really 'evolved', the on-disk format backend has evolved greatly since those days and explicitly has been designed around the changes and advances in technology.

        WinFS was a non-starter because it's a solution looking for a problem, and it was supremely messy concept to try to use for anyone but a developer, and even then, a developer might struggle. Even advanced concepts in 'old fashioned' filesystems can

        • by narcc ( 412956 )

          WinFS was a non-starter because it's a solution looking for a problem, and it was supremely messy concept to try to use for anyone but a developer, and even then, a developer might struggle.

          It's an absolute nightmare. It dramatically over-complicates just about everything while adding no obvious value. Why someone didn't shut that down in the "napkin" stage, I'll never figure out.

      • by Entrope ( 68843 )

        How about this: put tags associated with files in a SQLite database, one database per account, off of $HOME in an existing ext4 file system

        How do you extend that to files that can be read by more than one user, especially in directories that may have different combinations of group read and execute (traverse) permissions on top of different group ownership and ACLs?

        • by narcc ( 412956 )

          In the system he describes, tags are stored in the DB, not the files. Things will work fine until someone moves or renames a file, then it's chaos.
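
The failure mode described here is easy to reproduce. What follows is a minimal sketch, assuming the tag database is a single SQLite table keyed by absolute path; the schema and the helper names (`open_tag_db`, `add_tag`, `files_with_tag`, `stale_paths`) are invented for illustration and are not taken from the proposal above.

```python
import os
import sqlite3
import tempfile


def open_tag_db(db_path: str) -> sqlite3.Connection:
    """Open (or create) a per-account tag database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        " path TEXT NOT NULL, tag TEXT NOT NULL,"
        " PRIMARY KEY (path, tag))"
    )
    return conn


def add_tag(conn: sqlite3.Connection, path: str, tag: str) -> None:
    """Attach a tag to a file, identified only by its absolute path."""
    conn.execute(
        "INSERT OR IGNORE INTO tags (path, tag) VALUES (?, ?)",
        (os.path.abspath(path), tag),
    )
    conn.commit()


def files_with_tag(conn: sqlite3.Connection, tag: str) -> list:
    """Return every recorded path carrying the tag, whether it exists or not."""
    rows = conn.execute(
        "SELECT path FROM tags WHERE tag = ? ORDER BY path", (tag,)
    )
    return [row[0] for row in rows]


def stale_paths(conn: sqlite3.Connection) -> list:
    """Return tagged paths that no longer exist on disk."""
    rows = conn.execute("SELECT DISTINCT path FROM tags ORDER BY path")
    return [row[0] for row in rows if not os.path.exists(row[0])]


def demo():
    """Tag a file, rename it behind the database's back, report the damage."""
    with tempfile.TemporaryDirectory() as home:
        conn = open_tag_db(os.path.join(home, "tags.sqlite"))
        photo = os.path.join(home, "beach.jpg")
        open(photo, "wb").close()
        add_tag(conn, photo, "vacation")
        before = files_with_tag(conn, "vacation")
        os.rename(photo, os.path.join(home, "beach-2021.jpg"))
        stale = stale_paths(conn)
        conn.close()
        return before, stale
```

After `demo()` the only "vacation" entry points at a path that no longer exists, and the renamed file has no tags at all. Keying on something more stable than the path, or hooking renames, would be needed to avoid that.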

    • by Voyager529 ( 1363959 ) <voyager529@yahoo. c o m> on Wednesday February 23, 2022 @08:44PM (#62297413)

      The poster does not sound like he understands that the reason why things are as they are is not merely that hard drives, CPUs and RAM were slower 20 years ago.

      Oh, it's worse than that.

      20 years ago was 2002. Laptops had 20GB hard disks at the time, desktops sporting 40GB drives were common as well. CPUs were already comfortably in the 1.5GHz range, and 128MB of RAM wasn't unheard of. While 1TB wasn't yet a common capacity of a single drive, SCSI arrays with that much storage could be acquired and still leave half a 42U rack free for the servers accessing it.

      Meanwhile, it's not like NTFS got coded back on NT 3.5 and everyone packed up and went home; the file system has undergone plenty of revisions over the years, and while it's still got its problems (enumerating folders with more than a thousand files gets tedious quickly), it's not like there's been no work done on it, even since 2002. While ReFS is starting to make inroads in the more recent releases of Windows, NTFS has had modifications and upgrades that make it entirely possible to leverage it on petabyte volumes without much of a problem. On the flip side, WinFS was promised to do exactly what is being described in TFS, but not only was it unable to make its way out of Longhorn beyond the earliest of beta builds, but it didn't make it into Vista, 7, 8, 10, or 11...meaning that it either was fundamentally flawed or the wrong solution to the problem.

      On the Linux/BSD side, however, there's patently no way to substantiate the claim. EXT4 handles massive file systems just fine, btrfs is making inroads and achieving maturity, and OpenZFS has limits on files and folders that well surpass physical constraints. These file systems are used day in and day out in environments very small and very large, so clearly, something is being done correctly already if AWS and Google and Facebook and Oracle have made these file systems work already.

      So, I don't know whether the poster is for some absurd reason still running FAT16 partitions, or whether the goal is to optimize something that is only a bottleneck in the most extreme cases, or if this is simply a thought exercise regarding how to reinvent the wheel, as if a whole lot of extremely smart people haven't dedicated a whole lot of thought as to how the existing systems currently work.

      • by narcc ( 412956 )

        and 128MB of RAM wasn't unheard of

        In 2002, 128MB would have been on the low-end. Hell, you could get a graphics card with that much. 256 and 512 were common. A friend of mine spent way too much money to put together a 1GB machine sometime in late 2000 or early 2001.

        It was crazy how fast things were moving then. My 486 66 was upgraded from 8 to 32mb of ram, 8mb at a time, between 1994 and 1997. It was finally replaced out of necessity with a K6-2 266 with 64mb of ram in 1998. I don't think that one made it even 3 years before being rep

    • There is definitely a case to be made for storage and the filesystem itself to become "its own separate computer" at some point in the future, but that would increase costs for the home user who is not a computer enthusiast.

      Like the disk drives used on the Commodore PET, Vic-20, and C64. They were complete computers with their own CPU, RAM, and ROM that served files to the main system using a custom filesystem kernel. These disk drives ended up costing as much money (and eventually more) as the main computer!

      What's really amusing is that demo coders often use the disk drives as coprocessors, for things like streaming music.

    • by gweihir ( 88907 )

      That is why file systems of the future will essentially mean that an entire computer will have to be dedicated to the file system if one is to have anything like serious performance.

      Yes, pretty much. Just like you often dedicate a separate computer to a database system these days. In many cases, it will not be worth it at all. The cases where it will be are already nicely addressed by DB systems today. Hence I do not see that happening anytime soon. It significantly increases complexity overall without any real gains and for the few cases where it actually is needed, it can already be done and done well.

    • There is definitely a case to be made for storage and the filesystem itself to become "its own separate computer" at some point in the future

      The 1541 disk drive was exactly that.

  • This article is just a follow up from another Slashdot article published a few weeks ago about millennials having everything stored in a single storage pool in their phones and not being familiar with a proper filesystem. The answer is no - a global object store creates more chaos than it solves. Yes, tags are useful but, if implemented, they should work within a well-structured folder structure which is the only sane answer to managing the chaos on our hard drives.

    • There's one way to find out. Put tags on an existing file system and see if anyone still cares about subfolders.

      It may be that the millennials would be even more disorganized without their tags.
      • Put tags on an existing file system and see if anyone still cares about subfolders.

        KDE did that with the introduction of the semantic desktop, and it turned out to be so annoying that I think all access to it was eventually removed (I can't find it on my systems anymore). It was, and still is, a terrible idea that no one wants. I think even both people who used it shrugged off the removal as "nothing of value was lost."

      • MacOS did that in 1991, and OSX still does it today. It hasn't caught on to any considerable degree, but it's kind of cool.
        https://en.wikipedia.org/wiki/... [wikipedia.org]

  • by ebunga ( 95613 ) on Wednesday February 23, 2022 @08:10PM (#62297317)

    Don't worry. Lennart Poettering will add something like this to systemd next week.

  • A local object store that can effectively manage hundreds of millions of files and find things in seconds based on file type and/or tags attached is possible.

    Yes, we call such things a filesystem.

    File systems are usually free and come with your operating system, so there seems to be little incentive for someone to build a new system from scratch, but just like we needed the internet to come along and change everything we need a better data storage manager.

    So ext3, ext4, jfs, iso9660, BTRFS, ZFS, venti, etc, etc, ad nauseam are just figments of our imagination?

    So, the question is "Is it time to replace file systems with other file systems?"

  • Simple answer NO, long answer HELL NO. File systems have evolved over time and are more than capable and tuned for general use; where you have very specific performance needs and use cases, there are better systems. I don't think I would ever point anyone at Didgets though, it seems a very confused response.
    • Well, we're no longer stuck with 8+3 file names in Windows or 14 characters in Unixoids. And we can do Unicode now. I like disk encryption. But how have they evolved recently? Say, in the past decade or so?

      Has the file system been perfected? Or simply exhausted its possibilities for improvement? Because I can't help dreaming of something less awkward than nested subdirectories.
  • Be File System, the file system of BeOS had a lot of modern features when it was introduced 24 years ago. One of them was database-like extended attributes which allowed it really fast file queries, and a bunch of other stuff.

    You can read more on this Ars article:
    https://arstechnica.com/inform... [arstechnica.com]

  • I have documents and images going back to 1996 on a zfs pool. I know where everything is and rarely need to search for things so indexing is disabled. I can understand the need for indexing on the organisation level but it all works just fine for me. Speed, redundancy and always online.
  • Didn't IBM try a Database (ish) tech as a primary file system at one stage?

  • As others have noted, file systems should be simple. If you want a database, install a database on top of your file system which will take a big chunk of file system space (generally called, I think, a "file") and use it as a database. Big data chunks are not a problem precisely because disks are so large now, and when they are a problem it's better to use application specific solutions that are targeted to the actual problem. Modern CPUs are actually pretty good at dealing with a 64K chunk of data even
  • What we need is a filing SYSTEM. Users have no concept of file nomenclature or filing strategies. We don't need a new container for files, we need an automated library system to store and retrieve them.

    What files should be saved? Which files are useless? What do we store in the cloud and what do we keep locally? How do we merge-purge storage media? Where can we organize photos? etc. etc.

    Take a typical phone user. Look at how often they scroll and scroll. What a bloody waste of time. We need a filing syste

    • We need a filing system that is intuitive so that they can find the pictures they want just by describing what they are looking for... etc. etc.

      I like this idea but have no idea how it might be made.

  • I think he's frustrated that no one seems to care about his blog. It's a slog to read through and I'll confess I didn't have the willpower to get through much, but at a glance I feel like he misses the fact that modern filesystems do have a lot of what he'd think novel, but he just does it in an incompatible way.

    In at-scale storage, 'object store' is a buzzword with some meaning. Basically it's 'just enough filesystem', without some of the burdens associated with network/cluster filesystems. The thing is

  • The question isn't whether we should replace filesystems, but rather if we should move core file system services *into* the filesystem. That is, should we embed all of the things that locate does into the filesystem? My answer would be "no" (I prefer single-task entities where possible), but making a filesystem "hook" wouldn't be bad (i.e., trigger X when a file is updated, where X might be an indexing operation). Perhaps we should standardize more metadata, where it is stored, and how it is accessed. T

    • by Junta ( 36770 )

      but making a filesystem "hook" wouldn't be bad (i.e., trigger X when a file is updated, where X might be an indexing operation)

      For what it is worth, that exists, in the form of inotify. An application can register to be notified should a number of events occur relative to files that the application cares to subscribe for. Technically the events *could* have more data (e.g. the IN_MODIFY only tells you that *something* changed and you have no indication of offsets/how much changed so you have to pessimistically assume the whole file changed) but it's not really worth it.
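
For anyone who has not used it, here is a rough, Linux-only sketch of what subscribing looks like. Python's standard library has no inotify wrapper, so the sketch calls the libc functions through `ctypes`; that binding and the helper name `watch_once` are choices made for illustration, while the two mask values are the ones defined in `<sys/inotify.h>`.

```python
import ctypes
import os
import struct

IN_MODIFY = 0x00000002  # file was written to
IN_CREATE = 0x00000100  # file or directory created inside a watched directory
EVENT_HEADER = struct.Struct("iIII")  # struct inotify_event: wd, mask, cookie, len

libc = ctypes.CDLL(None, use_errno=True)  # the C library of the running process
libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]


def watch_once(directory, trigger):
    """Watch `directory`, run `trigger()`, and return the queued (mask, name) events."""
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(directory), IN_CREATE | IN_MODIFY)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        trigger()                 # do something that touches the directory
        data = os.read(fd, 4096)  # blocks until at least one event is queued
    finally:
        os.close(fd)

    events = []
    offset = 0
    while offset < len(data):
        _wd, mask, _cookie, name_len = EVENT_HEADER.unpack_from(data, offset)
        offset += EVENT_HEADER.size
        name = data[offset:offset + name_len].rstrip(b"\0")  # NUL-padded name
        offset += name_len
        events.append((mask, os.fsdecode(name)))
    return events
```

Each event carries only a mask and a name, which is the limitation mentioned above: an `IN_MODIFY` says that something changed, not which bytes, so an indexer has to reread the file.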

  • by UnknownSoldier ( 67820 ) on Wednesday February 23, 2022 @08:36PM (#62297383)

    /Oblg. Those who don't understand file systems, such as BeOS's BeFS [wikipedia.org], are doomed to re-implement them.

    The reason we don't use tags for filenames is simple: How do you handle multiple files with the same tag?

    The reason we don't use 64-bit hashes for filenames is the same: how do you QUICKLY iterate over a sub-set of files??? Second, how does this schema handle filenames, specifically globbing where a user selects a sub-set of files via wildcards such as "*.txt" ? The reason we even use filenames in the first place is because people NAME things. i.e. "Basket Weaving: Dead career or Undervalued skill?"

    If you can't find files then you have a shitty organizational structure. Blaming or changing the FS is NOT going to fix that.

    A file name along with the directory path provides a UNIQUE name. This allows you to have filenames with the same name in DIFFERENT directories. i.e. The ubiquitous README.TXT in many projects.

    Every file system needs meta-data. The problem is meta-data is NOT static. Every year there are new types of file. How does this schema handle that?

    Apple tried File Types with its ProDOS [wikipedia.org] FS back in 1983 on the Apple 2 which allowed for 256 types. Mac OS extended this to a 4CC (four character codes) which caused constant annoyance as programs could read one file type but not the other. Unix "solved" this problem by using file extensions which is infinitely more flexible.

    Just because a File System "looks" like a database doesn't mean it is one. Conceptually at a high level they are similar but under the hood there are fundamental differences.

    Windows NTFS has piss poor performance [superuser.com] when many files are in the same directory. i.e. More than 5,000 files. The solution is NOT to switch to a new unproven File System but to organize your data better, namely, use the first 1 or 2 characters as a key and make sub-directories. i.e. 26 sub-directories for A-Z, etc.
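
The sub-directory trick can be sketched in a few lines. This is one possible reading of the suggestion, assuming the key is simply the lower-cased leading characters of the file name; `shard_key` and `sharded_path` are hypothetical helper names, and the `_` bucket for names that do not start with letters or digits is an added assumption.

```python
import os


def shard_key(filename: str, width: int = 1) -> str:
    """Use the first `width` characters of the name as the bucket name."""
    prefix = filename[:width].lower()
    # Names starting with punctuation (or empty names) share one catch-all bucket.
    return prefix if prefix.isalnum() else "_"


def sharded_path(root: str, filename: str, width: int = 1) -> str:
    """Return where `filename` would live under a sharded layout rooted at `root`."""
    return os.path.join(root, shard_key(filename, width), filename)


if __name__ == "__main__":
    for name in ("README.TXT", "report-2021.pdf", ".bashrc"):
        print(sharded_path("data", name, width=2))
```

File names are rarely spread evenly across the alphabet, so some buckets will still grow much faster than others. Keying on a prefix of a hash of the name spreads files more evenly, but that is a different scheme from the one proposed here and gives up the human-readable grouping.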

    The author needs to go take a Comp. Sci. Operating System 300 or 400 class because they have no fucking clue about how or why File Systems are designed the way they are, including modern features such as journaling, data integrity, pooling, TRIM support, slack, etc. Or at the very least read up on ZFS design [fujitsu.com].

    Now, get off my LAN. /s

    • by nester ( 14407 ) on Wednesday February 23, 2022 @08:45PM (#62297417)

      CP/M used *.ext, MS-DOS popularized it. Unix-like OSes have no concept of a file extension. You don't need a dot. A filename is a single string. (Unix does have special files, eg, device files, named pipes, etc, which could be called types, but real, normal files containing 0 or more bytes are all the same type.)

    • The reason we don't use tags for filenames is simple: How do you handle multiple files with the same tag?

      A filename is just a tag. Sometimes the filename gets so mangled that your terminal can't represent it and you have to play stupid tricks to even refer to the file (like referring to it by inode.) But then again, a filesystem is just a [typically] hierarchical database.
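
The inode trick mentioned above is what `find . -inum N` does on the command line. Below is a small sketch of the same idea in Python, assuming a POSIX file system where `st_ino` is meaningful; `find_by_inode` is a hypothetical helper name.

```python
import os
import tempfile


def find_by_inode(directory: str, inode: int):
    """Return the path of the directory entry with this inode number, or None."""
    with os.scandir(directory) as entries:
        for entry in entries:
            # lstat so that a symlink is matched on its own inode, not its target's
            if os.lstat(entry.path).st_ino == inode:
                return entry.path
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        awkward = os.path.join(d, "bad\tname\n.txt")  # tab and newline in the name
        open(awkward, "w").close()
        inode = os.stat(awkward).st_ino
        # The file can now be located (and renamed) without ever typing its name.
        os.rename(find_by_inode(d, inode), os.path.join(d, "fixed.txt"))
        print(os.listdir(d))
```

The name is only one handle on the file; the inode number is another, which is the sense in which a filename behaves like a tag.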

  • There are some file systems, OpenZFS for example, that solve the problem of storing a huge number of files.

    Don't believe me?

    ZFS supports:
    - L2ARC which can be used as a read cache for metadata only
    - Metadata vDevs, which can be used for small files, metadata or both
    Either can be SSD via SATA / SAS, or NVMe drive. (Or even exotic PCIe storage.)

    To be clear, the default L2ARC configuration is to cache both metadata & data. But, you can restrict it to metadata.

    That of course overlooks the AR
  • by Todd Knarr ( 15451 ) on Wednesday February 23, 2022 @08:48PM (#62297427) Homepage

    People don't organize files today. They won't fill in reasonable sets of tags for an object-store system. They'll end up with the same problems locating stuff either way.

    The complaint about the overhead of directory structure is a valid one, but that's more about the implementation of the directory structure than a question of filesystem vs. something else. I notice too that one of the things the proposed object store does is sacrifice available metadata for space. That seriously limits how people can search for objects. And really, most people don't search for metadata about files, they search based on the content of the files. The only solutions for that for binary files like images are tagging to describe what's in the image, seeing as image-recognition systems are too complex and too compute-intensive for the average desktop system. For text though it's easy enough to scan files and build a database of content vs. file path. Unix "locate" does that already. And yes the "locate" database is big. Do you expect any other database needing to store the same information to be any smaller? That's one of the trade-offs I have to make any time I build a database: speed of access vs. storage needed for the indexes that speed things up. Thankfully storage is relatively cheap so I can usually give up space for speed without running out of space. Same holds true for the "locate" database, if I have a 20TB drive then giving up 10% for the database probably won't put me in a bind as far as space goes.
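
To make the "database of content vs. file path" idea concrete, here is a toy inverted index. It is only a sketch: the tokenizer, the in-memory dictionary and the names `build_index` and `search` are illustrative assumptions, and a real indexer would persist the index and update it incrementally. Note that the stock `locate` database records path names rather than file contents, so this shows the content index the paragraph describes, not what `locate` itself stores.

```python
import os
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_index(root: str) -> dict:
    """Map each word to the set of text files under `root` that contain it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    text = fh.read()
            except (OSError, UnicodeDecodeError):
                continue  # unreadable or binary file: skip it
            for word in WORD.findall(text.lower()):
                index[word].add(path)
    return index


def search(index: dict, *words: str) -> list:
    """Return the sorted paths that contain every one of the given words."""
    sets = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*sets)) if sets else []
```

The space trade-off mentioned above is visible even at this scale: every distinct word costs an entry plus one path reference per file that contains it, and in exchange a lookup no longer has to read any files.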

    What's really needed isn't a new approach, it's a new implementation of a directory structure that's speed-efficient at handling very large numbers of files in a single directory. Because frankly the average person isn't going to do more than rudimentary filing of files in any organized way so the storage system is just going to have to deal with it.

  • The gains are just not worth the time it would take for such a change to be fully accepted and implemented. I am not even convinced that there are any gains. Just different ways to look at something. I feel this is really just some developers out there looking to "disrupt" something. Long story short, if it isn't broke, don't try to fix it.

    • by Jeremi ( 14640 )

      The gains are just not worth the time it would take for such a change to be fully accepted and implemented.

      Indeed. If history is any guide, even the snazziest all-singing, all-dancing database-filesystem would be hamstrung by the fact that people need to transfer their files to other filesystems without losing metadata every time they do so. Which means people would avoid the cool filesystem-specific functionality in favor of remaining portable, so it would go largely unused and eventually it would be abandoned in favor of a simpler, faster, traditional filesystem implementation.

  • I went to look at the website this is all about; there are very few details about the implementation.
    https://didgets.substack.com/p... [substack.com]

    Is it a graph? A tree? An index with metadata on top of existing filesystems?

    • by Entrope ( 68843 )

      Wife: New Shimmer is a floor wax!

      Husband: No, new Shimmer is a dessert topping!

      Now, improved new Shimmer is also a sandwich spread!

  • And do not forget that MS tried this several times so far and failed resoundingly each time. There is a reason for that. That reason is that there is absolutely nothing "antiquated" about modern file-systems. Even the dog-slow MS NTFS is far superior to the alternatives and the truly modern filesystems found in the Unix-world do an excellent job and are quite mature.

    As to searching the contents of file-systems, that is solved and has been for a long, long time: Just build and maintain a database of file conte

  • We were told there would be something like this in Longhorn. Perhaps when Longhorn, suitably renamed, finally arrives, it will have a database FS. But hierarchical FSs are good for many things, and aren't about to disappear any time soon.

    • So, why has Microsoft been trying this for ages? Is it because there are some people who find traditional filesystems lacking in some way and they wanted to fix it? Judging by the comments so far, there are a lot of people who think filesystems are perfect the way they are and are offended by anyone who suggests otherwise. According to them anyone who attempts to change the status quo must be a moron and doesn't understand anything.

      I never said that Didgets was a perfect system and it solves every problem
  • I was talking with someone who works on storage systems about expanding into terabytes, petabytes, and beyond. I asked whether, with so much data, anyone will go back and look at it. He said no.
    • The percentage someone will look at will decrease, but the chance that the information someone wants will be in there will increase.

  • by Lando ( 9348 )

    Ha, his file system uses 64 Bytes per file, what a waste. If only he used 64 bits per file it would use even less space for 200 Million Files.
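
    For reference, the two sizes being compared work out as follows. This is just a quick check in Python using the 64-byte record and 200 million files from the summary; the 8-byte case is the hypothetical 64-bit record from the joke, and the function name is made up for this example.

```python
FILE_COUNT = 200_000_000  # number of records quoted in the summary


def table_size_gb(record_bytes, count):
    """Total table size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return record_bytes * count / 1e9


size_64_bytes = table_size_gb(64, FILE_COUNT)  # 64-byte records: 12.8 GB
size_64_bits = table_size_gb(8, FILE_COUNT)    # 64-bit (8-byte) records: 1.6 GB
```

    The 12.8 GB result matches the "less than 13GB" figure quoted in the summary.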

  • Huge sized SSDs are still too expensive compared to huge sized HDDs.

  • NTFS, FAT, exFAT, HFS Plus, EXT. These are just some of the file systems we've seen over the past 3 decades.

    In reality, file systems ARE a type of database, known as a hierarchical database. Not all databases are SQL.

    Cloud services offer things called "blob storage" that "replace" file systems. Relational databases like SQL Server, Oracle, Postgres have blob storage. The trouble is, most software can't use the blobs directly. And worse, the performance sucks compared to actual file systems, because database

  • by noodler ( 724788 ) on Thursday February 24, 2022 @06:09AM (#62298597)

    In a Substack article, Didgets developer Andy Lawrence argues his system solves many of the problems associated with the antiquated file systems still in use today. "With Didgets, each record is only 64 bytes which means a table with 200 million records is less than 13GB total, which is much more manageable," writes Lawrence. Didgets also has "a small field in its metadata record that tells whether the file is a photo or a document or a video or some other type," helping to dramatically speed up searches.

    Do you think it's time to replace file systems with an alternative system, such as Didgets? Why or why not?

    The arguments given are that this filesystem makes searching for that file easier.
    But I have two problems with this philosophy.
    1. If you're a light user then you won't have many files to look for. A brute-force approach works pretty well and this FS is not really helpful.
    2. If you're a heavy user then you will be organizing things much better and won't need the facilities offered by this new FS.

    If you're in the middle between these just realize that organizing your files is the only good solution to the problem of not organizing your files.
