Data Storage

Dropbox Open Sources DivANS: a Compression Algorithm In Rust Compiled To WASM (dropbox.com) 33

Slashdot reader danielrh writes: DivANS is a new compression algorithm developed at Dropbox that can be denser than Brotli, 7zip or zstd at the cost of compression and decompression speed. The code uses some of the new vector intrinsics in Rust and is multithreaded. It has a demo running in the browser.

One of the new ideas is that it has an intermediate representation (IR), like a compiler, which lets developers mash up different compression algorithms and build compression optimizers that run over the IR. The project is looking for community involvement and experimentation.

This discussion has been archived. No new comments can be posted.

  • Apache License (Score:4, Informative)

    by infolation ( 840436 ) on Saturday June 23, 2018 @01:47PM (#56834330)
    It's not mentioned in the article but DivANS is released under the Apache License.
  • What's the Weissman score?

  • I like Dropbox and I'm sure they have a nice algorithm, but . . .

    Does nobody remember the ultimate compression algorithm from 1995 that could scrunch any amount of data to less than 1024 bytes? The DataFiles/16 program got quite a lot of publicity for WEB Technologies.

    As I recall there were some inconveniences; for instance, for really serious compression one had to run the software multiple times: compress, then compress the resulting file, then compress that resulting file. Nevertheless that was a lot of c

  • Even a 1% improvement in compression efficiency can make a huge difference.

    Hard Drive Cost Per Gigabyte [backblaze.com] — July 2017

    Looks like we're on track for $20/TB, if you purchase in bulk.

    Let's monetize a "huge difference" at $1000 (which I regard as the smallest available value for a "huge difference").

    Thus, your 1% extra compression needs to save 50 TB to make a "huge difference" of one large.

    Correct me if I'm wrong, but I'm thinking your dataset needs to be on the order of 5 PB for a 1% compression improvemen

    • Here are the prices [amazon.com] for Amazon cloud storage. Depending on the type of storage, it ranges from $0.025 to $0.125 per GB per month. Yes, that's a lot more than buying a hard drive, but a huge stack of hard drives is pretty useless for storing lots of data. This gives immediate availability to all your data, backups, etc.

      Let's say a company has 1 PB of data they need to store. Depending on the type of storage they need, that will cost between $25,000 and $125,000 per month. A 1% reduction in that cost cou

      • by Kjella ( 173770 )

        Let's say a company has 1 PB of data they need to store. Depending on the type of storage they need, that will cost between $25,000 and $125,000 per month. A 1% reduction in that cost could save them over $1000 per month, which is definitely meaningful.

        Except 1 PB is a lot of data. Walmart, for example, has 40 PB [forbes.com] in its data cloud, so it could save ~$40,000 on a $500,000,000,000 business. CERN has 200 PB, so that'd save ~$200,000 compared to the $9,000,000,000 budget of the LHC. It's a rounding error, and I think if you're working with that kind of data you've already worked on much more specific ways to compress it that won't leave much value in a general compression algorithm. Like Google working on a new video compression algorithm for YouTube makes se

    • by Anonymous Coward

      Correct me if I'm wrong

      You are not wrong, but you are missing the entire point of this.

      If it was about disk storage space then Dropbox would be fine with just compressing it locally. There would be no need whatsoever to compile to WASM.
      The point of having a WASM compressor is that you can compress the files on the client side without them having to install any programs for it.
      The saving isn't in disk space, it is in bandwidth.

      While compressing in the browser might be inefficient there is still an extra bonus for Dropbox here sinc
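
The back-of-the-envelope figures traded in this thread can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the prices are the ones the commenters quote ($20/TB for bulk drives, $0.025 to $0.125 per GB per month for cloud storage), and decimal units (1 PB = 1000 TB = 1,000,000 GB) are assumed.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# Prices are the commenters' numbers, not current market data.
# Decimal units are assumed: 1 PB = 1000 TB = 1,000,000 GB.

TB_PER_PB = 1000
GB_PER_PB = 1_000_000


def dataset_size_for_saving(target_saving_usd, price_per_tb_usd, improvement):
    """Dataset size in PB at which `improvement` (a fraction, e.g. 0.01)
    frees up disk space worth `target_saving_usd` at `price_per_tb_usd`."""
    tb_saved = target_saving_usd / price_per_tb_usd
    return tb_saved / improvement / TB_PER_PB


def monthly_cloud_saving(size_pb, price_per_gb_month_usd, improvement):
    """Monthly saving in USD from shrinking `size_pb` of cloud storage
    by `improvement` (a fraction) at the given per-GB monthly price."""
    return size_pb * GB_PER_PB * price_per_gb_month_usd * improvement


if __name__ == "__main__":
    # Hard-drive framing: $1000 saved at $20/TB with a 1% improvement.
    print(dataset_size_for_saving(1000, 20, 0.01), "PB")
    # Cloud framing: 1 PB at the low and high quoted prices.
    print(monthly_cloud_saving(1, 0.025, 0.01), "USD/month")
    print(monthly_cloud_saving(1, 0.125, 0.01), "USD/month")
```

Under these assumptions the hard-drive framing gives a break-even dataset of about 5 PB, matching the first comment, while the cloud framing gives roughly $250 to $1,250 per month for 1 PB, so the "over $1000 per month" figure holds only at the top of the quoted price range.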
