
Permanent Project Gutenberg Archive
A complete preservation of Project Gutenberg's 75,945 public domain books on the permaweb, accessible via gutenberg.ar.io - demonstrating decentralized infrastructure for cultural heritage at scale with zero ongoing costs.
Overview
During Banned Books Week 2025, AR.IO founder Phil Mataras personally developed and launched the Permanent Gutenberg Archive—a complete preservation of Project Gutenberg's 75,945 public domain books on the Arweave permaweb. The archive at https://gutenberg.ar.io demonstrates that decentralized infrastructure can protect cultural heritage at scale, with zero ongoing costs and no central point of failure.
Project Gutenberg, founded in 1971, is the oldest digital library in the world. For over fifty years it has provided free access to classic literature. But centralized platforms face risks: funding shortfalls, legal disputes, and jurisdictional restrictions. In 2018, after a German court ruling in a copyright dispute, Project Gutenberg blocked all access to its website from Germany; Italy has blocked access since 2020. The Permanent Gutenberg Archive provides a resilient backup accessible through any of the 600+ AR.IO gateways worldwide.
The Challenge
Preserving a literary archive of this scale on permanent infrastructure required solving several practical problems:
- Censorship Resistance: The archive must remain accessible regardless of legal challenges, political pressure, or content removal requests. Books that are public domain in one jurisdiction may face restrictions in another.
- Geographic Availability: With Project Gutenberg blocked in multiple countries, the solution must provide globally distributed access points so readers can route around regional restrictions.
- Platform Independence: The archive cannot depend on any single organization's continued operation. If Project Gutenberg ever loses funding or ceases operations, the preserved collection must remain accessible.
- Sustainable Funding Model: Traditional hosting requires perpetual funding. A one-time upload cost eliminates the risk that subscription lapses could take the archive offline.
- Functional Access: Permanent storage alone is not enough. Readers need to search, browse, and read books through a usable interface, not just retrieve files from a transaction log.
The Solution
The Permanent Gutenberg Archive is a self-contained web application stored entirely on Arweave and accessible via any AR.IO gateway.
Key elements:
- Every book preserved as an individual plain-text file with permanent metadata
- Fast in-browser search across the entire catalog—no backend servers required
- Immersive reader with adjustable fonts, reading progress tracking, and a distraction-free interface
- Human-readable URLs via ArNS (gutenberg.ar.io)
- Gateway-agnostic design that works across the entire AR.IO network
Implementation Highlights
Serverless Architecture: The entire application runs in your browser. Search, filtering, and sorting happen locally against a compact 7 MB index file.
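To make the idea of local search concrete, here is a minimal sketch of filtering a catalog index in the browser. The `BookEntry` shape, the `searchBooks` function, and the sample entries are assumptions for illustration; the article does not describe the real index format or the search algorithm the archive uses.

```typescript
// Hypothetical index entry; the archive's actual index schema is not documented here.
interface BookEntry {
  id: number; // Project Gutenberg ID
  title: string;
  author: string;
  language: string;
}

// Return entries whose title or author contains every whitespace-separated query term.
// An optional language code narrows the results further.
function searchBooks(index: BookEntry[], query: string, language?: string): BookEntry[] {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0);
  return index.filter((book) => {
    if (language !== undefined && book.language !== language) {
      return false;
    }
    const haystack = (book.title + " " + book.author).toLowerCase();
    return terms.every((term) => haystack.indexOf(term) !== -1);
  });
}

// A three-book sample standing in for the full index file.
const sampleIndex: BookEntry[] = [
  { id: 1342, title: "Pride and Prejudice", author: "Austen, Jane", language: "en" },
  { id: 84, title: "Frankenstein; Or, The Modern Prometheus", author: "Shelley, Mary Wollstonecraft", language: "en" },
  { id: 2000, title: "Don Quijote", author: "Cervantes Saavedra, Miguel de", language: "es" },
];

const austenHits = searchBooks(sampleIndex, "austen pride");
const spanishHits = searchBooks(sampleIndex, "", "es");
```

Because the whole index is already in memory, a linear scan like this needs no server round trip; a production implementation might precompute lowercase strings or use a prebuilt inverted index instead.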
Plain Text Format: Every book is stored as a clean, consistently formatted text file—useful for reading, AI training, research, and data analysis.
Permanent Metadata: Each book includes tags for title, author, language, and Project Gutenberg ID stored directly on Arweave. Discovery and verification work without external databases.
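As an illustration of tag-based discovery, the sketch below builds a query for the Arweave GraphQL `transactions` field, which accepts a list of tag name/values filters. The tag name `Gutenberg-Id` and the helper `buildTagQuery` are hypothetical; the article does not list the exact tag names the archive writes.

```typescript
// One tag constraint: transactions must carry a tag with this name and value.
interface TagFilter {
  name: string;
  value: string;
}

// Build a GraphQL query string that asks for up to `first` transactions
// matching all of the given tag filters, returning each ID and its tags.
function buildTagQuery(filters: TagFilter[], first: number): string {
  const tagList = filters
    .map((f) => "{ name: " + JSON.stringify(f.name) + ", values: [" + JSON.stringify(f.value) + "] }")
    .join(", ");
  return (
    "query { transactions(tags: [" + tagList + "], first: " + first + ") " +
    "{ edges { node { id tags { name value } } } } }"
  );
}

// "Gutenberg-Id" is an assumed tag name used only for this example.
const exampleQuery = buildTagQuery([{ name: "Gutenberg-Id", value: "1342" }], 1);
```

The resulting string could be sent as the `query` field of a JSON POST to the `/graphql` endpoint of a gateway that exposes it; the response would then contain the transaction ID and the stored tags for the matching book.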
Gateway Independence: The application detects which AR.IO gateway you're using and loads all content from it. If one gateway is unavailable, simply access through another.
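One way gateway detection could work is sketched below. It assumes the ArNS name appears as the first subdomain label of the gateway host (as in gutenberg.ar.io) and that gateways serve data at `https://<gateway>/<transaction-id>`. The function names and the placeholder transaction ID are illustrative, not taken from the archive's source.

```typescript
// Derive the gateway host from the page's hostname by stripping the ArNS name label.
// If the hostname does not start with the ArNS name, it is returned unchanged.
function gatewayHostFrom(hostname: string, arnsName: string): string {
  const host = hostname.toLowerCase();
  const prefix = arnsName.toLowerCase() + ".";
  return host.indexOf(prefix) === 0 ? host.slice(prefix.length) : host;
}

// Build a data URL for a transaction ID on the same gateway the page was loaded from.
function bookUrl(hostname: string, arnsName: string, txId: string): string {
  return "https://" + gatewayHostFrom(hostname, arnsName) + "/" + txId;
}

// In a browser, the first argument would typically be window.location.hostname.
const viaArIo = bookUrl("gutenberg.ar.io", "gutenberg", "TX_ID_PLACEHOLDER");
const viaArweaveNet = bookUrl("gutenberg.arweave.net", "gutenberg", "TX_ID_PLACEHOLDER");
```

Since every URL is derived from the current hostname, the same application bundle works unchanged on whichever gateway served it.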
Single Payment Model: All content was uploaded with one payment. No subscriptions, no renewals, no ongoing costs.
Results
- 75,945 books permanently preserved on Arweave
- 28 GB of literary content across multiple languages
- Accessible through 600+ AR.IO gateways worldwide
- Zero ongoing hosting costs after initial upload
- Fully functional reading experience with search, browse, and immersive reader
- Complete backup of the world's oldest digital library
Why It Matters
The Permanent Gutenberg Archive validates decentralized storage for cultural preservation at scale. It demonstrates that:
- Public domain works can be preserved permanently without ongoing funding or institutional support
- Geographic restrictions can be bypassed through distributed gateway infrastructure
- Platform risk can be eliminated by removing dependence on any single organization
The project also establishes a replicable pattern. The same approach can preserve other collections: historical archives, government documents, scientific datasets, or any corpus that deserves protection from platform failure and content restrictions.
Johannes Gutenberg's printing press made books affordable. This archive extends that legacy by ensuring public domain literature remains free and accessible, permanently.
Takeaway
The AR.IO founder built the Permanent Gutenberg Archive to prove a point: permanent, decentralized preservation works at scale, today.
75,945 books. 28 gigabytes. One payment. Forever accessible.
This is what freedom to read looks like.
Resources:
- Live Archive: https://gutenberg.ar.io
- Source Code: https://github.com/vilenarios/perma-gutenberg