Libraries are still a great resource for physical media, whether it’s books, audio CDs, DVDs, or other content types. One thing libraries still haven’t figured out how to consistently make available to communities at large is digital content. There are plenty of digital libraries available online, but concerns about piracy and proper compensation for media rights holders make the experience complicated. It’s a problem Anna’s Archive, self-described as “the largest truly open library in human history,” is trying to solve.
In an absolutely stunning twist, Anna’s Archive announced it backed up almost all of the music available on Spotify. The Dec. 20 blog post reveals Anna’s Archive “discovered a way to scrape Spotify at scale,” and the group “saw a role for us here to build a music archive primarily aimed at preservation.” The data backup contains 86 million music files, which Anna’s Archive says represents 99.6% of Spotify listens.
What Anna’s Archive managed to back up
Anna’s Archive said it chose to back up Spotify tracks based on the company’s own popularity metric. There are a ton of songs on Spotify that get virtually zero listens. For perspective, the archive estimates the top three songs on Spotify have been streamed more than the bottom 20 to 100 million songs combined. In all, the backup includes metadata from 256 million tracks and audio files for 86 million songs.
Spotify defines its popularity metric as “a value between 0 and 100, with 100 being the most popular.” It’s calculated by an algorithm that is “based, in the most part, on the total number of plays the track has had and how recent those plays are.”
Using this categorization, Anna’s Archive backed up the 86 million most-popular songs, which account for 37% of Spotify’s total catalog. However, those songs also make up 99.6% of listens. In other words, while the archive backed up less than half of Spotify’s songs, it covers almost all of the tracks people actually listen to.
While Anna’s Archive backed up Spotify metadata for 99.9% of tracks, making it the largest music metadata archive in the world, it stopped at only 37% of Spotify music files due to storage constraints. The 86 million archived songs represent 300TB of storage, and the rest would’ve required 700TB of additional storage “for minor benefit,” according to the blog post.
The music files are formatted in OGG Vorbis at 160kbps for songs with a popularity metric greater than zero. Songs with a popularity of zero were re-encoded in OGG Vorbis at 75kbps. Anna’s Archive added metadata to the audio files, “including title, url, ISRC, UPC, album art, and replaygain information.” Audio files typically contain no metadata of their own, so this is significant.
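As a rough sanity check on those figures, the reported 300TB and 86 million files can be turned into an average file size and an implied track length. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (the blog post does not say whether TB or TiB is meant), treats every file as a constant 160kbps stream, and ignores embedded album art and the lower-bitrate 75kbps files. The function names are hypothetical and used only for illustration.

```python
# Back-of-the-envelope check of the reported backup size.
# Assumptions (not from the source): decimal terabytes, constant bitrate,
# no allowance for embedded metadata or the 75kbps re-encoded files.

TB = 10**12  # bytes in a decimal terabyte (assumed unit)


def average_track_bytes(total_bytes: float, track_count: int) -> float:
    """Average file size in bytes across the whole backup."""
    return total_bytes / track_count


def implied_duration_seconds(file_bytes: float, bitrate_kbps: float) -> float:
    """Playback length a file of this size would have at a constant bitrate."""
    bits = file_bytes * 8
    return bits / (bitrate_kbps * 1000)


avg = average_track_bytes(300 * TB, 86_000_000)
print(f"average file size: {avg / 1e6:.2f} MB")
print(f"implied length at 160kbps: {implied_duration_seconds(avg, 160):.0f} s")
```

Under these assumptions the average file comes out to roughly 3.5MB, or a little under three minutes of audio at 160kbps, which is a plausible length for a typical song.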
Spotify says this is just scraping using ‘illicit tactics’
We have to point out that the Anna’s Archive backup is illegal for a variety of reasons. The scraping of Spotify’s databases violates the company’s terms of service, and the removal of digital rights management (DRM) features and the sharing of copyrighted material both violate copyright law. By definition, the Anna’s Archive music backup is piracy.
Spotify seems to agree, as it made statements to both Android Authority and Ars Technica commenting on the Anna’s Archive release.
“An investigation into unauthorized access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM to access some of the platform’s audio files,” Spotify told Android Authority. “We are actively investigating the incident.”
Notably, Spotify doesn’t confirm the scope of the Anna’s Archive backup, only saying that “some” of the site’s audio files were accessed. In a separate statement, Spotify said it is taking action to prevent something like this from happening again.
“We’ve implemented new safeguards for these types of anti-copyright attacks and are actively monitoring for suspicious behavior,” a Spotify spokesperson told Ars Technica. “Since day one, we have stood with the artist community against piracy, and we are actively working with our industry partners to protect creators and defend their rights.”
While Anna’s Archive cites altruistic motivations as its reasons for trying to “preserve” Spotify’s music catalog, there are major concerns for artists, record labels, and streaming services. The backup could create ways for listeners to stream music without paying for it, hurting the music industry. As it is currently released, it would be difficult for the average listener to find or stream individual songs within the 300TB backup, but that could change.
“For now this is a torrents-only archive aimed at preservation, but if there is enough interest, we could add downloading of individual files to Anna’s Archive,” the archive’s blog post notes. “Please let us know if you’d like this.”
It’s currently unclear what, if any, legal action will be taken against Anna’s Archive as a result of this move. Theoretically, the archive’s decentralized network structure prevents it from being shuttered completely. However, when it comes to music, there’s a lot of money on the line, which gives rights holders and regulators an incentive to protect copyrighted material.
In September 2025, the Internet Archive settled a lawsuit claiming it served as an “illegal record store” for 4,000 songs (via Reuters). As a reminder, Anna’s Archive just backed up 86 million.
Is this preservation or piracy?
As a music enjoyer and someone who closely follows the industry, I see both sides here. There is a valid argument to be made for the need to preserve digital media.
On a high level, songs can quickly become “lost media” without preservation. Lost media is typically defined as “any type of media thought to no longer exist in any format, or for which no copies can be located, partial or otherwise.” The idea of music becoming lost media is terrifying, and if archival can prevent that from happening, a preservation angle starts to make sense.
Just this month, Taylor Swift replaced the original versions of two songs with new recordings featuring altered lyrics. Without physical media or digital archives, these original recordings could disappear forever.
Another reason I buy the altruistic goal of the Anna’s Archive backup is the quality of the songs scraped. At 160kbps, even the highest-quality files in the backup are fairly low-quality, making them less appealing to listeners. These music files are lower-quality than 256kbps AAC and far worse than any lossless format. The archive could’ve backed up fewer songs at higher quality, but it didn’t, which tells me this really was about preservation.
Here’s the problem: from a legal perspective, it doesn’t matter. This is piracy according to U.S. copyright law. I can’t tell you whether Anna’s Archive is correct on a moral or ethical basis, but I can tell you its actions are illegal. And if the songs stripped from Spotify are made easily available to users as an alternative to paying for music, it could do irreparable harm to the music industry.
Android Central doesn’t condone the sharing or distribution of copyrighted material. You are responsible for following the local copyright laws in your country or region.
