Search LibGen, the Pirated-Books Database That Meta Used to Train AI

The Verge Alex Reisner March 30, 2025
Millions of books and scientific papers are captured in the collection’s current iteration.

Editor’s note: This search tool is part of The Atlantic’s investigation into the Library Genesis data set. You can read an analysis about LibGen and its contents here. Find The Atlantic’s search tool for movie and television writing used to train AI here.

Disclaimer: LibGen contains errors. You may, for example, find books that list incorrect authors. This search tool is meant to reflect material that could be used to train AI programs, and that includes material containing mistakes and inaccuracies. It’s impossible to know exactly which parts of LibGen Meta used to train its AI, and which parts it might have decided to exclude; this snapshot was taken in January 2025, after Meta is known to have accessed the database, so some titles here would not have been available to download.