Massive Spotify hack: how the biggest leak of its music catalog happened

  • The Anna's Archive collective claims to have copied some 86 million songs and the metadata for 256 million tracks from Spotify, close to 99.6% of the music listened to on the platform.
  • Spotify acknowledges unauthorized access through scraping of public data and evasion of DRM systems to access some audio files, but does not confirm the extent of the attack.
  • The group justifies the operation as a “cultural preservation” project, while the industry warns about piracy, its use for training AI, and the impact on copyright.
  • The incident calls into question the security of the streaming model and reopens the debate about who should guarantee the preservation of digital musical heritage.

A massive hack at the heart of Spotify's catalog has shaken the music industry in the middle of the Christmas season. The activist group Anna's Archive claims to have copied virtually the entire music library of the Swedish platform, in an operation that combines large-scale scraping, evasion of protection systems, and a rhetoric of "cultural preservation" that clashes head-on with copyright law.

According to the group's own account, the dump amounts to around 300 TB of information, with metadata for 256 million tracks and audio files for approximately 86 million songs, which, according to its calculations, would cover around 99.6% of the music typically listened to on the service. Spotify, for its part, acknowledges unauthorized access and data extraction, but it tones down the language and avoids, for now, confirming the size of the alleged haul.

What exactly happened and what does Spotify acknowledge?

The incident became public when Anna's Archive published a statement on its blog claiming to have “backed up” virtually the entire Spotify library. There the group details that it has archived metadata for 256 million tracks and 86 million audio files, packaged in large torrents ranked by popularity according to the streaming platform's own metrics.

The Stockholm-based company has confirmed to several media outlets, including international news agencies and technology publications, that it is investigating a case of unauthorized access. In its version, a third party collected public metadata through scraping and used “illicit tactics” to circumvent its Digital Rights Management (DRM) systems and reach “some audio files”, without offering specific figures.

At the same time, Spotify emphasizes that there is no evidence of a direct impact on users: no personal data, passwords, or financial information were compromised. According to its official statement, the incident was limited to music content and data associated with the catalog, with no apparent impact on individual accounts.

The company says it has deactivated the accounts involved in the scraping, reinforced its mechanisms for detecting anomalous behavior, and deployed new security measures against this type of copyright infringement. Meanwhile, the internal investigation remains open, and legal action against those responsible is not ruled out.

How the dump was performed: massive scraping and DRM in the spotlight

The technical heart of the case lies in the combined use of scraping and DRM evasion. Anna's Archive explains that it identified a way to extract information from Spotify on a large scale using automated software and bots capable of traversing the platform's systems, collecting data, and converting it into a structured set ready for archiving and distribution.

Scraping public data (such as song metadata, artist names, identifiers, or album art) is relatively common on the internet. In this case, however, the group admits to having found a way to circumvent protective barriers and also access some of the DRM-protected audio files, a step that fully opens the legal front.

According to the activists themselves, the result is an archive of almost 300 terabytes that is being distributed through “massive torrents”. The material is being released in phases: first the metadata, then the most-played songs, and finally the long tail of least-listened-to tracks, until the release covers what they describe as 99.6% of Spotify listens.
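
As a rough sanity check on those figures, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the constants are the numbers claimed by Anna's Archive and quoted in this article (not confirmed by Spotify), the function names are hypothetical, and the calculation assumes decimal terabytes and that the whole dump is audio, so the average file size is an upper bound.

```python
# Figures as claimed by Anna's Archive and quoted in this article.
# Spotify has not confirmed any of them.
TOTAL_BYTES = 300e12            # ~300 TB, assuming decimal terabytes
AUDIO_FILES = 86_000_000        # tracks for which audio was reportedly copied
METADATA_TRACKS = 256_000_000   # tracks for which metadata was reportedly copied


def average_file_size_mb(total_bytes: float, n_files: int) -> float:
    """Upper bound on the mean audio file size in megabytes.

    Treats the entire dump as audio, ignoring the space used by metadata.
    """
    return total_bytes / n_files / 1e6


def catalog_share(audio_files: int, metadata_tracks: int) -> float:
    """Fraction of the tracks with metadata that also have audio in the dump."""
    return audio_files / metadata_tracks


avg_mb = average_file_size_mb(TOTAL_BYTES, AUDIO_FILES)
share = catalog_share(AUDIO_FILES, METADATA_TRACKS)

print(f"Average size per audio file: at most ~{avg_mb:.1f} MB")
print(f"Share of tracks with audio: ~{share:.1%}")
```

Under these assumptions, each audio file averages at most about 3.5 MB, and roughly a third of the tracks would account for the 99.6% of listens the group cites, which is consistent with the heavily skewed consumption the group itself describes.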

From a legal standpoint, the operation is difficult to defend: the songs hosted on Spotify are subject to copyright and belong to record labels, artists, and other rights holders. Copying and redistributing these files without permission violates not only the platform's terms of service but also European and Spanish intellectual property laws.

Anna's Archive: from text to music under the banner of “preservation”

The group responsible for the leak, Anna's Archive, is no stranger to controversy. It presents itself as an open-source, non-profit digital library that centralizes books, academic articles, and other types of content, much of it still under copyright. Several governments and organizations had already pointed the finger at the platform for distributing copyrighted material.

This time, the group states that its goal is to create “the world’s first fully open music preservation archive”. Its members maintain that their mission is to preserve the knowledge and culture of humanity, without distinguishing between types of media, and that Spotify is "a great start" even though it does not contain all the music that exists.

One of the central arguments of Anna's Archive is the risk that lesser-known music disappears. If platforms lose licenses, close down, or modify their catalogs, millions of works that receive very little attention could vanish without a trace when streaming services are relied upon as the only access layer.

The group argues that its archive guarantees that small artists, local scenes, and experimental projects that lack the backing of large companies are also preserved. However, the way in which this material has been obtained and distributed, through hacking and torrents, clearly places the operation outside the legal framework and strains the balance between public interest and copyright.

The long tail of streaming: millions of songs that almost no one listens to

Beyond the reputational blow, the analysis of the extracted metadata reveals an uncomfortable reality for the industry: the enormous asymmetry in music consumption within Spotify. Anna's Archive itself points to a scenario in which a minority of songs concentrates the vast majority of plays, while a vast "long tail" barely registers any activity.

Some data circulating in the sector suggests that a very significant part of the catalog does not even reach 1,000 plays. This is an indicator of the extent to which the streaming model favors major hits and leaves most works in digital limbo. This reality reinforces recent debates, such as the one over the minimum listening threshold to generate royalties that Spotify itself has introduced.

For the activist group, these numbers confirm that the democratization promised by streaming has many downsides. In its narrative, the mass copying of the catalog would serve to preserve precisely those marginal, experimental, or minority songs that, without an external copy, could disappear without a trace if the commercial or technical conditions of the platform change.

On the industry side, however, the fact that this entire data structure (including popularity information, relationships between artists, versions, and audio analysis) can circulate freely represents a risk of unfair competition and a serious headache for business models based on subscription and controlled access.

Access is not preservation: the Achilles' heel of the streaming model

The case reopens a debate that libraries, archives, and preservation experts have been raising for some time: having access on a platform does not equate to preserving the work. Digital catalogs depend on temporary licenses, commercial agreements, and business decisions that can change from one day to the next, leaving out titles of which no other copy is accessible to the public.

Anna's Archive's operation exploits precisely that gap. The collective presents itself as a form of cultural insurance against the volatility of platforms, arguing that if Spotify were to disappear tomorrow or its licenses were to be reduced, large portions of the digital music heritage could be degraded or lost.

However, the existence of a real preservation problem does not automatically make the mass copying and redistribution of copyrighted works acceptable. The clash between the aspiration for open archives and the legal protection of creators remains a minefield, both in Europe and in the rest of the world.

In this context, the leak also puts pressure on public institutions, which have de facto delegated a large part of cultural access to private platforms. The episode forces us to ask who should assume responsibility for the long-term preservation of a catalog that is already a central part of contemporary musical memory.

Impact on the European music industry and reaction from artists

In the European ecosystem, where Spotify plays a particularly dominant role, the hack has set off alarm bells among record labels, collecting societies, and artists' associations. Beyond the potential for piracy of the content, there is concern about the precedent it sets for any subscription-based service with DRM.

This incident adds to an already tense atmosphere. In recent years, independent musicians and labels in Spain and the rest of Europe have denounced inadequate per-stream payments, opaque recommendation algorithms, and a distribution of income that they consider unbalanced in favor of major record labels and superstars.

The unease has translated into public campaigns, targeted catalog withdrawals, and a growing political debate about regulating streaming. The Spotify-Anna's Archive case could become additional ammunition for those who demand stricter rules on transparency and the protection of rights in the digital environment.

At the same time, some artists and cultural groups see the leak as a symptom of a larger problem: the almost total dependence on private infrastructure for the circulation of music, in the face of the weakness of public networks for archiving and sound preservation in the European Union.

Does the hack affect Spotify users?

From the perspective of the average user, the company insists that there is no immediate cause for alarm. There have been no reports of leaks of credentials, email addresses, bank details, or listening histories that could be linked to this specific operation, and the service continues to operate normally in Spain and the rest of Europe.

According to Spotify, the deactivated accounts belong to malicious users involved in scraping, not to legitimate customers. Nor have any access restrictions or feature reductions been announced as a direct result of the incident.

Where the impact could be noticeable is in the tightening of the platform's policies and APIs. The company is expected to further restrict automated access to certain data, increase pressure on third-party applications, and reinforce its DRM layers, with potential side effects on developers, researchers, and legitimate projects working with public information from the service.

In any case, the main short-term risk for European users lies in the appearance of unauthorized copies of the catalog on pirate services that promise “free Spotify” or similar variations. Besides the legal problems, these types of platforms often involve exposure to malware, theft of personal data, and an experience far removed from the security assumed in regulated services.

The perfect loot for training artificial intelligence

One particularly sensitive consequence of this case is the potential use of the leaked dataset for training generative artificial intelligence models specialized in music. The combination of millions of audio files and 256 million detailed metadata records constitutes exceptional training material for technology companies.

AI ethics and intellectual property experts have long warned that pirated content is often used to feed algorithms, evading license fees and without the creators' consent. The publishing industry itself has already denounced similar practices in the training of text models by large technology companies.

In the music industry, a dataset like Spotify's could boost models capable of imitating styles, voices or sounds based on pre-existing works, increasing the pressure on artists' income and further complicating the traceability of unauthorized uses.

So far Spotify has declined to comment on this scenario, but the European cultural sector is increasingly calling for specific regulations on the use of copyrighted works in AI training, a field where EU regulation is progressing but still lags behind the speed of technology.

A serious warning for the entire streaming model

If the figures reported by Anna's Archive are close to reality, the episode would be the biggest leak of music content in history and a direct blow to the security narrative that has accompanied streaming over the last decade. The idea that a catalog protected by DRM, licensing agreements, and complex anti-fraud systems can be copied almost in its entirety challenges the assumption that controlled access is inherently more secure than ownership.

For the major players in the digital economy, from video platforms to subscription-based reading services, the Spotify case functions as a reminder that no system is infallible. Any service that bases its model on cloud-based access control, rather than on local copies in the user's hands, shares the same vulnerabilities to a greater or lesser extent.

At the same time, the leak brings the fundamental question back to the forefront: who should be responsible for preserving digital cultural heritage? If private platforms fail, should states, public institutions, the creators themselves, or activist groups outside the law take over? None of these options is free of legal, ethical, and practical problems.

While Spotify continues its investigation and Anna's Archive maintains its plan to release data in phases via torrents, the European music industry is watching with concern as what is probably the largest music catalog on the planet could end up circulating on P2P networks. What is at stake goes far beyond a specific company: it affects the balance between technology, copyright, business, and citizen access to culture in the streaming era.
