NewsNTech
Atlantic reporter Alex Reisner has identified four datasets of music being used to train artificial intelligence models and built a fully searchable public database from them.
The disclosure puts previously opaque training pipelines under direct scrutiny. Google and Stability AI have both confirmed in published research papers that they used the datasets.
Scale of the Datasets
Two of the four collections are vast by any measure. One contains roughly 12 million tracks; a second holds approximately 9 million.
The remaining two are smaller but still substantial, each exceeding 100,000 songs.