Spotify Is Writing Massive Amounts of Junk Data To Storage Drives (arstechnica.com) 196
An anonymous reader quotes a report from Ars Technica: For almost five months -- possibly longer -- the Spotify music streaming app has been assaulting users' storage devices with enough data to potentially take years off their expected lifespans. Reports of tens or in some cases hundreds of gigabytes being written in an hour aren't uncommon, and occasionally the recorded amounts are measured in terabytes. The overload happens even when Spotify is idle and isn't storing any songs locally. The behavior poses an unnecessary burden on users' storage devices, particularly solid state drives, which come with a finite amount of write capacity. Continuously writing hundreds of gigabytes of needless data to a drive every day for months or years on end has the potential to cause an SSD to die years earlier than it otherwise would. And yet, Spotify apps for Windows, Mac, and Linux have engaged in this data assault since at least the middle of June, when multiple users reported the problem in the company's official support forum. Three Ars reporters who ran Spotify on Macs and PCs had no trouble reproducing the problem reported not only in the above-mentioned Spotify forum but also on Reddit, Hacker News, and elsewhere. Typically, the app wrote from 5 to 10 GB of data in less than an hour on Ars reporters' machines, even when the app was idle. Leaving Spotify running for periods longer than a day resulted in amounts as high as 700 GB. According to comments left in the Spotify forum in the past 24 hours, the bug has been fixed in version 1.0.42, which is in the process of being rolled out.
Typical of today's programmer (Score:5, Insightful)
Re:Typical of today's programmer (Score:5, Interesting)
Wonder what this does to people's data plans and consumption of their monthly limits...
So glad I just use local music files and don't stream. Write once, maybe again to add some more music, then just read many...
Re: (Score:2)
That's a whole separate problem, that's only gonna get worse over the next 4 years....
Re: (Score:2)
So... after an infinite amount of time the database size would asymptotically trend towards zero?
That's some awesome compacting, man!
Re: (Score:3)
Data compaction is easy. Here's the entire NSA archive in compacted form: 1. It's a bit lossy, but with the right expansion program it'll work fine.
Re: (Score:2, Interesting)
Today's programmers? It's been rampant since at least the 1990s...
Re: (Score:3, Interesting)
It's called Gates Law, because it's the opposite of Moore's Law.
Every 18 months hardware became[1] twice as fast, and every 18 months software becomes[1] half as fast.
[1] This trend has mostly stopped for hardware, but software is still becoming slower with each new version, something I can see at the office where everybody is complaining about how slow the PCs are running with Windows 10, whereas mine is running Windows 7 just fine[2].
[2] Well, fine for Windows anyway. Of course things don't happen instan
Re: (Score:3)
Remember when the OS fit on a floppy and only did the most basic tasks rather than spying on the user and trying to be everything including the kitchen sink?
Re: (Score:3)
I remember when the "OS" was stored in ROM ICs, computers didn't even have floppy drives and could boot under one second.
Re: (Score:2)
Good point...forgot to go back that far
Re: (Score:2)
I also remember when the OS ran on top of a Basic interpreter. I never experienced it, but I understand there were OSs written in Basic.
Let's not even start with entering start points with toggle switches to get the system to boot.
Now a real old timer will show up to regale us with stories of stringing his own core memory, getting a stitch wrong...
Re: (Score:2)
I also remember when the OS ran on top of a Basic interpreter. I never experienced it, but I understand there were OSs written in Basic.
To clarify what others have said, the OS wasn't written in- nor ran under- BASIC. (#) Both the OS and the BASIC interpreter were themselves written in machine code.
What *was* the case (AFAIK) is that on many 8-bit machines there wasn't such a clear-cut distinction between the functionality of the BASIC interpreter and that of the OS itself; or, at least, much of the OS functionality was accessed through the BASIC command line by default.
For example, on the Sinclair ZX Spectrum or Commodore 64 (or most o
Re: (Score:2)
"I remember when the "OS" was stored in ROM ICs, computers didn't even have floppy drives and could boot under one second"
And accomplish what else? Compared to modern PCs, those were hopelessly archaic & difficult to use.
But what we have now is hopelessly bloated, no argument there.
Re: (Score:2)
On my Atari ST, with its OS in 192 kB of ROM and 2.5 MB of RAM (upgraded from the stock 1 MB by soldering in two 1 MB SIMMs), I could run Calamus Desktop Publishing, Finale music scoring, PureC C-compiler (very compact and fast code), Bugaboo debugger, Signum typesetting, and many, many more. All in a windowed desktop environment. It may not have been as refined as contemporary counterparts, but it was not at all as difficult to use as you indicate.
Re: (Score:2)
Unlike, say, the Amiga with its full pre-emptive multitasking OS. Did I just reopen the ST vs. Amiga holy war? I think I did!
Joking aside, while I know that the ST did later get "proper" multitasking via Mint/MultiTOS, it's interesting to note that while the Amiga hardware was more advanced in many respects, the OS itself- including the
Re: (Score:2)
If you are genuinely nostalgic for that era, you can still get a device with that level of complexity and computing power from companies like HP and Texas Instruments.
Re: (Score:2)
I routinely use the ATmega328P, it does a pretty similar job when used with the Arduino IDE.
Re: (Score:2)
Yep. (A)bort, (R)etry, (F)ail.
Good times.
Setting IRQs with tiny little DIP switches.
Swapping out the floppy drive with the spell checker.
80 x 25 screens.
Hercules Graphics cards! Whoooeeee!
33K "high speed" modems.
Indeed. Those days were the pinnacle of Western civilization.
Re:Typical of today's programmer (Score:5, Insightful)
If you can't differentiate between bad programming and high-level programming with abstractions, you're part of the problem.
PS lots of great software is written in higher level languages than you're probably capable of ever reading.
Re: (Score:2, Funny)
English isn't apparently among them.
Re: (Score:2)
I see two unclear messages because GGP mashed his two thoughts into one sentence, reusing a fragment. I know of no languages that allow that trick, nor punctuation that could make intent clear.
Also GGP appears to be a LISP snob. Functional anyhow, deserving of any shit flipped back his way.
Getting lost mid sentence and hacking your way back sometimes happens when speaking. GGP had time to read before hitting submit.
Re: (Score:2)
You're such a language expert yet you couldn't derive simple meaning in context from something that is far from understandable and written more clearly than much of the basic communication that goes on in the world today.
You have an amazing understanding of grammar and sentence construction but what you lack is just general understanding.
Re:Typical of today's programmer (Score:4, Insightful)
Bandwidth, memory, clock cycles....don't matter. Use more shitty layers of abstraction over layers built into high level languages, then kick it out the door.
Well, what do you expect? Everyone expects client programmers to support more devices, more users for less money, cheaper / free apps. The last 3 places I've worked at had no QA department whatsoever.
I know it's fashionable to shake the fist at 'lazy' programmers, but the fact is we expect more functionality from less dev time, requiring abstractions, libraries that aren't completely controlled or understood, testing skipped, etc. Programmers aren't the problem, relentless competition is.
Re: (Score:2, Insightful)
Well, what do you expect? Everyone expects client programmers to support more devices, more users for less money, cheaper / free apps. The last 3 places I've worked at had no QA department whatsoever.
I know it's fashionable to shake the fist at 'lazy' programmers, but the fact is we expect more functionality from less dev time, requiring abstractions, libraries that aren't completely controlled or understood, testing skipped, etc. Programmers aren't the problem, relentless competition is.
I certainly expect better. Open source delivers quality - again and again. Any organization with an actual budget ought to do better. And please note that the competition is on quality, not on prettiness, and not on delivery date either.
Also, this can't be a bug resulting from sloppy programming. Sloppy/quick programming results in apps that crash "occasionally" and a lot of corner cases that aren't quite right. This MASSIVE writing is something else entirely. Fortunately, Spotify is not necessary. In my ca
Re: (Score:2)
Open source is not a magic panacea that fixes all ills. It requires dedicated programmers with a lot of time, just like anything else. The many-eyes-make-all-bugs-shallow mantra has failed many times; have you followed the OpenSSL Heartbleed?
If you don't think that this can happen easily then I guess you've not been in programming very long, or at all. Computers will quite happily do something repetitive and destructive in a loop forever, and in a way that is almost invisible to the programmer unless they're
Re: (Score:2)
I certainly expect better. Open source delivers quality - again and again. Any organization with an actual budget ought to do better.
This statement demonstrates deep misunderstanding of the open source process.
Yes, successful open source projects do deliver high quality, and the result is free to users and other developers... but it is by no means effortless. In fact, open source projects require a lot of overhead that projects internal to a reasonably-efficient organization do not. Communication is slower and more difficult, individual developers tend to have less of the context and less focus, etc. Open source succeeds not because it
Re: (Score:2)
And please note that the competition is on quality, not on prettiness, and not on delivery date either.
Cheap, Fast, Good: Pick any two.
Re: (Score:2)
But-but-but... open source delivers quality, again and again. :)
So your example means nothing
Re: (Score:2)
The obvious counterpoint is that at least with clamscan, he could have commented out the OOM logging, or added exit(1) in its place, or performed some other mitigation to stop the bad behavior. Can't do that with Spotify or other closed source programs, you're just fucked until/unless the vendor releases an update.
Re: (Score:2)
I have 16GB and when clamscan automatically kicked on (Linux anti-virus scan)
Why are you running a virus scanner on Linux?
it ran out of memory and was logging out of memory errors as fast as the hard drive could take it. The same message over and over and over and over and over and over and over and over. Took me four hours to track it down because all your programs start crashing when they can't write their little temp files and it wasn't using a lot of CPU. I didn't want to restart because I had something open I needed to save.
Ran out of memory, or ran out of storage?
Running out of actual memory is fun, too--the handful of times it's happened to me on Linux, nothing actually crashes; it just stops. I could still switch between different open windows, although the screen started painting. Could still run terminal commands but trying to tab-complete would just spit out a bash error :) Trying to recall from memory how to call up a list of processes and kill them via the terminal (because
Re: (Score:2)
And those unrealistic schedules and feature requests are often caused by the business plan, which needs to survive contact with the marketplace. If people would wait for and pay for better software, many of the problems would go away.
Re: (Score:2)
When it's about the behaviour of an underlying library (as it was in this case) that's not properly understood by the programmers using it.
Re: (Score:3)
If you don't understand an API and what the functions do, it doesn't matter if the code you write has any abstraction. You're still probably going to screw up.
Re: Typical of today's programmer (Score:2)
It's poor implementation, lack of appropriate testing and, in a lot of cases, the aforementioned is a result of unrealistic deadlines.
Re: (Score:2)
high level languages
I agree. We need to fire all C programmers and go back to Assembly only. None of this high level abstraction stuff.
Re: (Score:2)
I disagree. I say we need to fire even assembly programmers and go back to only using the letters in your name.
Re: (Score:2)
Copy con: myprogram.exe
Then type in the headers, opcodes and operands with alt-numpad.
Hex editors, bah.
Re: (Score:2)
high level languages
I agree. We need to fire all C programmers and go back to Assembly only. None of this high level abstraction stuff.
Assembly?? Microcode or nothing, bitches!!
Re: (Score:2)
A magnetized needle and a steady hand.
http://xkcd.com/378/ [xkcd.com]
Re: (Score:2)
Emacs FTW!!
Re: (Score:2)
#hahaonlyserious
Actually I *am* an emacs user as well :)
Re: (Score:2)
Very much so. And when you tell them that they are doing it wrong, they first do not believe you and then they start to cry. We have far too many coders and most of them really bad.
Re: (Score:2)
Very much so. And when you tell them that they are doing it wrong, they first do not believe you and then they start to cry. We have far too many coders and most of them really bad.
We don't have too many coders, we have an industry that is immature because it's far too hard to avoid making stupid mistakes. Other industries can handle below-average participants without collapsing (/causing catastrophic outages, security leaks, whatever). You or I might be awesome, but there can only ever be so many great programmers, half of all programmers are below average. And we need them too. All industries attempt to make the skill easier & safer, and that's a *good* thing. You can always jus
Seems a bit late ... (Score:2)
... for highlighting the potential for damage as news, don't ya think?
Persistence abstracted too far? (Score:5, Informative)
This sounds like some smart software architect took the abstraction of the persistence/storage layer of the Spotify stack too far whilst at the same time storing too many minuscule data points in Spotify's objects. Because once abstracted properly, adding attributes to your objects and the entire stack is trivial.
Think of it:
If your stack's ORM neatly abstracts everything concerning persistence and on the backside syncs neatly whenever it has the opportunity, all you need is app-side developers and software designers storing every little piece of data they can find that changes every millisecond, and then you have your bandwidth/load disaster as described.
If something like this is the case with Spotify, which I do strongly suspect, it is a good example that goes to show that you can take clean-room design too far. And that a haphazard duct-tape and chickenwire approach to product development can have significant advantages, as you build around unforeseen roadblocks on a daily basis and only add the features really needed.
I see an example of this every day, as I am currently doing WordPress development and building a WordPress pipeline for an agency. Large parts of the WP legacy architecture are an abysmally convoluted mess built by people who shouldn't have been let near a keyboard 15 years ago. But having a non-developer build a production capable demo of a website in WP is significantly faster than starting with an actual UX prototype, which quickly leads our team into real-world problems that we often haven't suspected. And suddenly a proper ORM and cleanroom design would cause hassle at one end or the other.
My two eurocents.
Spotify, why (Score:5, Informative)
I use Google Play Music. Not only can it cache songs, you can also upload your own collection. And now that Google has acquired and integrated Songza, their playlists are awesome.
Re: (Score:2)
Google won't allow me to use the family plan with my custom domain, thanks for rubbing it in.
I finally just had to have everyone in the family get a gmail.com account and use the family plan on that. It's a minor annoyance on laptop/desktop, because you have to have a separate browser profile logged into the gmail account you don't use for anything else. It's not a problem at all on Android, which supports multiple Google accounts very cleanly, and I imagine it's fine on iOS because iOS has no notion of Google accounts device-wide, so you'd just log the Google Music app into the gmail account.
Re: (Score:2)
Don't they charge for playing random music that you didn't upload yourself? At least Spotify has a free tier.
Google's free tier is called YouTube :P
(Of course if you pay, YouTube gets better)
Re: (Score:2)
There's also a free tier on Google Play Music, but it's just in the USA and Canada. There are no audio ads, but there's a limit on how many skips you can do per hour. It's similar to the Pandora free tier.
has this been going on for years? (Score:4, Informative)
Here is a possibly related complaint [spotify.com] from almost three years ago.
Browsers are doing it too (Score:2)
If you leave browsers up all the time, they have the same problem. Firefox and Chrome. https://www.grc.com/sn/sn-580.... [grc.com]
disk usage consideration (Score:2)
Such unnecessary disk i/o wears my disk down, increases power use (if I'm on say battery on my laptop) and of course creates a kind of internal DoS as it hogs the disk i/o and rest of proc
Re: (Score:2)
Which means that developers often don't bother to see what happens on a slower machine with the user having limited privileges.
So it's a bug? (Score:2)
Spotify Is Writing Massive Amounts of Junk Data To Storage Drives
Or are they talking about the music files?
Warranty for software is needed (Score:2)
That will teach 'em. It is absurd what programmers can do and get away with it, simply because you click on "I agree" on EULA starting with "No warranty"
Yes, it is going to be expensive, but software is getting worse.
Re: (Score:2, Interesting)
Nah, then you'd see an increased network usage. This is probably just Firefox's fsync bug [mozilla.org] repeated: in order to ensure data integrity, SQLite has a mode that fsyncs on commit. (After all, if the data isn't written to storage, it isn't really committed.) If you combine that with autocommit after every minor transaction, you get a ton of fsyncs and massive data usage.
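The commit-per-write pattern described above can be sketched roughly as follows. This is only an illustration using Python's standard `sqlite3` module, not Spotify's or Firefox's actual code; the `events` table, the file names and the `demo` helper are made up for the example. With `synchronous = FULL`, each commit is a point where SQLite asks the OS to sync, so the sketch simply counts commits for the same set of rows written one at a time versus inside a single transaction.

```python
import os
import sqlite3
import tempfile


def write_rows(path, rows, batch):
    """Insert rows into a SQLite file; return (insert commits, row count).

    batch=False commits after every row (the autocommit pattern);
    batch=True wraps all rows in one explicit transaction.
    """
    # isolation_level=None puts the sqlite3 module in autocommit mode,
    # so transactions exist only where we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    # Hypothetical table, used only for this illustration.
    conn.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
    # Ask SQLite to sync to storage at each commit.
    conn.execute("PRAGMA synchronous = FULL")
    commits = 0
    if batch:
        conn.execute("BEGIN")
        for msg in rows:
            conn.execute("INSERT INTO events (msg) VALUES (?)", (msg,))
        conn.execute("COMMIT")
        commits = 1
    else:
        for msg in rows:
            # Each statement is its own transaction, hence its own commit.
            conn.execute("INSERT INTO events (msg) VALUES (?)", (msg,))
            commits += 1
    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return commits, count


def demo(n=50):
    """Write the same n rows both ways into throwaway database files."""
    rows = [f"event {i}" for i in range(n)]
    with tempfile.TemporaryDirectory() as tmp:
        per_row = write_rows(os.path.join(tmp, "a.db"), rows, batch=False)
        batched = write_rows(os.path.join(tmp, "b.db"), rows, batch=True)
    return per_row, batched
```

Both variants end up with the same rows on disk; the difference is that the first one reaches a commit (and so a sync point) once per row, the second only once in total.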
Re: (Score:2)
So this would mostly be small-writes for something which is essentially metadata? Talk about these people not even having a faint clue what they are doing. With the write-amplification you get in an SSD for small writes, this can probably kill a modern SSD in a week or less.
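For a rough feel of what write amplification means for small writes, here is a deliberately simplified model: it assumes every logical write is rounded up to whole pages of a fixed size and ignores garbage collection, wear levelling and the filesystem. The 4096-byte page size is an assumption for illustration (real flash page and erase-block sizes vary by drive), and the function names are invented.

```python
import math


def physical_bytes(logical_bytes, page_size=4096):
    """Bytes actually programmed if writes are rounded up to whole pages."""
    return math.ceil(logical_bytes / page_size) * page_size


def write_amplification(logical_bytes, page_size=4096):
    """Ratio of physical bytes written to logical bytes requested."""
    return physical_bytes(logical_bytes, page_size) / logical_bytes
```

Under this model a 64-byte metadata update costs a full page, so `write_amplification(64)` is 64.0, while a write of exactly one page has a factor of 1.0. Actual drives will differ; the point is only that tiny writes are disproportionately expensive.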
Re:Do not store songs locally (Score:5, Insightful)
I think the gist of it is that for every small change to the data they store on your device, they are re-writing the entirety of the dataset they are keeping. So for instance they are logging a record that says "didn't play music this minute" but are re-writing the entire multi-year log.
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
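To put numbers on the "re-write the whole log for every record" theory, a back-of-the-envelope sketch: if the i-th record triggers a rewrite of all i records so far, total bytes written grow quadratically, whereas appending grows linearly. The function names and the record sizes are hypothetical; nothing here is taken from Spotify's actual storage format.

```python
def bytes_written_rewrite(n_records, record_size):
    """Total bytes written if the whole log is rewritten after each record.

    After record i the log holds i records, so the total is
    record_size * (1 + 2 + ... + n) = record_size * n * (n + 1) / 2.
    """
    return record_size * n_records * (n_records + 1) // 2


def bytes_written_append(n_records, record_size):
    """Total bytes written if each record is simply appended."""
    return record_size * n_records
```

For one 100-byte record per minute over a day (1440 records), appending writes 144,000 bytes, while rewriting writes 103,752,000 bytes, roughly 720 times more, and the gap keeps widening as the log grows.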
Re:Do not store songs locally (Score:5, Informative)
From the comments on Ars, it seems pretty clear that there is a bug in the app causing it to repeatedly compact the SQLite database it uses. I'm sure we all know that that is something which should be done only when actually needed, so that's clearly a bug, not inefficiency.
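One way to compact "only when actually needed" might look like the sketch below. It uses SQLite's `PRAGMA freelist_count` and `PRAGMA page_count` to see how much of the file is reclaimable and runs `VACUUM` (which rewrites the whole database file) only above a threshold. The 25% threshold, the `blobs` table and the helper names are assumptions for illustration; this is not how the Spotify client is known to work.

```python
import os
import sqlite3
import tempfile


def free_fraction(conn):
    """Fraction of database pages currently sitting unused on the freelist."""
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    return free / total if total else 0.0


def maybe_vacuum(conn, threshold=0.25):
    """Run VACUUM only if enough space is reclaimable; return True if it ran."""
    if free_fraction(conn) < threshold:
        return False
    conn.execute("VACUUM")  # rewrites the entire database file
    return True


def demo():
    """Fill a throwaway database, delete everything, then compact once."""
    with tempfile.TemporaryDirectory() as tmp:
        # Autocommit mode: VACUUM cannot run inside an open transaction.
        conn = sqlite3.connect(os.path.join(tmp, "cache.db"), isolation_level=None)
        # Keep freed pages on the freelist rather than auto-truncating.
        conn.execute("PRAGMA auto_vacuum = NONE")
        conn.execute("CREATE TABLE blobs (data BLOB)")  # hypothetical table
        conn.execute("BEGIN")
        for _ in range(200):
            conn.execute("INSERT INTO blobs VALUES (zeroblob(2000))")
        conn.execute("COMMIT")
        skipped = maybe_vacuum(conn)       # nothing to reclaim yet
        conn.execute("DELETE FROM blobs")  # frees pages onto the freelist
        before = free_fraction(conn)
        ran = maybe_vacuum(conn)
        after = free_fraction(conn)
        conn.close()
    return skipped, before, ran, after
```

Calling `maybe_vacuum` on a timer is then cheap when there is nothing to reclaim, instead of rewriting the file every time.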
Re:Do not store songs locally (Score:5, Funny)
I blame XML and other formats that are used for the stupid reason that "we already have XML routines so let's use it for everything"
XML is like violence - if it doesn't solve your problem, you aren't using enough of it.
Re:Do not store songs locally (Score:5, Funny)
If you're using XML to solve a problem, you actually have two problems.
Re: (Score:2)
I remember XML in one application I worked on, things like <AVeryVeryLongFieldNameThatTakesALotOfCharacters>A</AVeryVeryLongFieldNameThatTakesALotOfCharacters> on a slow and flaky connection.
Re: (Score:2)
This is a non issue on desktops, really.
Correction: I think you meant to say this is a non-issue on desktops that are not using solid state drives.
Re:Do not store songs locally (Score:5, Insightful)
Rust spinners wear out too. This can be a particular problem if it's constantly bringing the drive out of power-down.
Re: (Score:2)
That would only be a problem on laptops. Desktop drives spin-down far less often, if at all. (Mine do not. No reason for them to.)
Re: (Score:2)
Mine do. Most energy efficient or otherwise green drives spin down very frequently. Not burning through 6W continuously spinning something that isn't doing anything constructive is a pretty good reason to.
Re: (Score:2)
You seem to have no idea how much power even an idle PC consumes. A 6W change is typically below what you can measure on mains-inlet.
Re: (Score:2)
You seem to have no idea how much power even an idle PC consumes.
I may have no idea, but the power meters on my computer do. Around 350W when I'm playing games. Around 270W when just taxing the CPU. Around 90W when the computer is sitting there idle with the screen off, 89W with the screen off and the HDDs powered down too. So the 2 HDDs in my computer use 10% of the total power load of an idle PC. And that's a 5 year old not very energy efficient one.
Now looking at my server at home it uses 47W idle. And just over 110W when serving files from both arrays. So the HDDs o
Re: (Score:2)
Also if you can't measure 6W then don't buy your measurement equipment from Alibaba.
And there the discussion stops, as you just failed EE101.
Re: (Score:2)
Actually I am an EE, have been for many years. Nice try though.
Re: (Score:2)
A lot depends on the settings. Some WD drives were too "lazy" so they were constantly parking/unparking. Google wdidle3.
Re: (Score:3)
Problem solved.
This is a non issue on desktops, really.
It takes a pretty small worldview to not be able to imagine people on limited bandwidth / unreliable internet connections.
Re: (Score:3)
You must be new here
(notices UID) err, now I'm just confused...
Re: (Score:3)
Wait, there's articles now?
Re:SSD finite write capacity help (Score:4, Funny)
Why the hell would you put a pagefile in a ramdisk? "Yo dawg, I heard you love pages?"
Re: (Score:2, Funny)
pagefile on a ramdisk is awesome because if you don't have enough ram all you have to do is either add ram to need less paging or add ram to increase the size of your ramdisk - you can't go wrong! It also saves a lot of expensive hard disk storage, especially when you put the computer to sleep.
Re: (Score:2)
It also helps when you're rebooting, loading data from RAM is a lot faster than loading from a hard drive or even a SSD.
FTFY (Score:3)
Generally there is no reason to do that, but there are some poorly coded applications that will page memory to disk, even when they don't need to.
Re: (Score:2)
And it can be simpler to put a pagefile in a ramdisk than to replace the application.
Re:SSD finite write capacity help (Score:4, Informative)
If you're writing enough to pagefiles, you need more RAM anyway.
If you're writing a lot to temporary areas, you need to stop doing so.
That said, I'm on an SSD machine at the moment that has been running for 6 months, with absolutely no special treatment, imaged from a years-old working PC without changing anything, and it's written 1.5TB. 1TB of that was the initial imaging process.
It's the main workhorse in an IT Office in a school, used for 10+ hours every single day for everything imaginable. Client machines rarely use much.
It has a write-life of 100TB. If it dies, I just hit F12 and re-image cleanly.
At current usage (not including the initial image), I count that as 1TB of writes a year, which gives a lifespan longer than the expected lifetime of the PC itself, however far out I am.
There's no need for special treatment, no need to use special SSD transfer software, no need to over-provision, or increase RAM cache or anything else. Just have a PC that isn't slogging itself to death, and slap an SSD in.
Don't expect it to last forever, but you shouldn't need to adjust ANYTHING at all.
And I've done this on all the staff work machines earlier this year - zero failures so far and it has made much more of a performance difference than doubling the amount of RAM. In fact, where machines had motherboards that were limited in RAM, we SSD'd and saw HUGE performance increases better than those clients whose RAM we doubled but are running on traditional hard disks.
At home I have a 1TB EVO 850 and that's the same. Literally imaged byte-for-byte, and is stupendously fast and no need for any software changes whatsoever, and the write numbers are predicting 20+ years of life despite a similar 10+ hours a day of usage.
Don't RELY on it never failing. But they are going to be in warranty (whether that's by number of years, or data written) for the life of your machine, under even heavy usage, unless you're doing something incredibly stupid (like use in NVR, RAID, or similar without buying a high-write-endurance model).
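The arithmetic in the comment above can be spelled out in a few lines. The figures come from the thread (100 TB write life, 1.5 TB written in 6 months of which 1 TB was imaging, and the story's 10 GB per hour upper figure for an idle Spotify); treating the Spotify rate as continuous, using decimal units (1 TB = 1000 GB) and applying it to the same 100 TB drive are assumptions made only for comparison.

```python
def years_of_endurance(rated_tb, written_tb, tb_per_year):
    """Years until the rated write life is used up at a steady write rate."""
    return (rated_tb - written_tb) / tb_per_year


# Office workhorse: 0.5 TB of normal writes in half a year, so 1 TB/year.
office_rate = (1.5 - 1.0) / 0.5
office_years = years_of_endurance(100, 1.5, office_rate)

# Spotify at the reported 10 GB/hour, assumed to run around the clock.
spotify_rate = 10 * 24 * 365 / 1000  # TB per year
spotify_years = years_of_endurance(100, 0, spotify_rate)
```

With these inputs the office machine comes out at 98.5 years of remaining write life, while the same drive under a constant 10 GB/hour would be through its rated writes in a little over a year.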
Re: SSD finite write capacity help (Score:2, Funny)
Should I also move my HOSTS file to a ram disk?
Re: (Score:2)
If it's an APK HOSTS file I would suggest a ramdisk is the perfect place for it. Don't forget to reboot after installing it. Even if software doesn't ask you to it's always a good idea to reboot a windows machine after installing software.
Re: (Score:3)
Pagefiles I don't put on software ramdisk (had to clarify that), but on HDD instead
So you put the things that benefits the most from fast i/o on your slowest storage device instead of your ssd? Why not put it on a floppy drive, or a mounted network share connected to a VPS hosted on the other side of the country if you like to slow things down?
Or maybe you just love that spinning hdd sound.
Re: (Score:2)
Wouldn't you be better served with something that isn't bottlenecked by its old connection?
Re: (Score:2)
Pagefiles I don't put on software ramdisk (had to clarify that), but on HDD instead
I place my pagefile on a "True SSD" as I call it based on DDR Ram
Dude you should pick one version and stick with it. Or let one of the voices win.
Re: (Score:2)
yes I have multiple pagefiles (2gb on IRAM & 512mb on a WD 10,000 rpm Raptor driven off of a Promise Ex-8350 128mb ECC ram caching raid sata 1/2 controller)
I see. I guess in your mind this explains why you can tell that you have your pagefile on HDD, not ram, but also that you have your pagefile on ram, not HDD. Let's call it a Schrödinger's pagefile.
Re: (Score:2)
that's just what "your kind" does, lol... apk
Oh, so you're a racist on top of everything?
Re: (Score:2)
Dude take a chill pill. Express yourself more clearly and don't contradict yourself if you want people to take you seriously. Those links you keep posting to other messages in the same thread are not supporting your points, they just make you look like an aspie with a grudge.
apk is using fake names all the time (Score:2)
I don't take FAKE NAME ONLINE
Yes you did just that in another thread, pretending to be someone else and linking back to this thread. Unfortunately your unique way to express yourself betrayed you. Next time try to write full sentences and don't constantly refer to the titles of your posts if you want to conceal your identity. The fact that you're probably one of the only persons on Slashdot who frequently posts links to other comments also was an obvious tell.
apk is a proven liar. (Score:2)
You're no longer fooling me with your torrent of babble and bragging. Skate around it as much as you want, but twice in this thread you've been caught lying, and now you've also been caught pretending to be someone else in other threads while waging your little vengeful campaign.
You're not merely the eccentric techie people assume you are. You're a dishonest, scheming individual that just happens to have a hard time expressing himself succinctly and clearly. I'm disappointed, it's like finding out that the