Link Rot 2: Under the Surface (Part 2 of 3)
Up to this point, the solutions to invasive link rot on the web have been superficial: fixing broken links and archiving pages. But the underlying problem remains, and approximately 66.5% of all links on the internet disappear within a few short years of being published.
If you want to get rid of the weeds in your garden, you need to pull out the roots, not just the plants on the surface.
To understand link rot and how to fix it, we need to look at the root of the problem - the actual underlying infrastructure of the internet.
In our last article we highlighted many of the surface-level issues, exploring existing attempts to fix link rot on the internet. We also introduced the Permaweb and how it goes to the source of the problem and changes the reliability of data, effectively eliminating link rot.
In the second part of the series, we will look below the surface at the foundation of the modern internet and compare that with the Permaweb.
The Problem with the Current Internet Infrastructure
Current internet infrastructure relies heavily on IP addresses and DNS (Domain Name System) to locate and access web content.
IP addresses come in several forms, but often look something like 70.66.211.32. This IP address is akin to a street address and phone number for your computer, enabling other computers to both locate and communicate with it.
However, just like in real life where people move and change phone numbers, the IP address for your data can move as well. More often than not, the move leaves ‘no forwarding address’ and the ‘phone is disconnected’ so computers can no longer find or communicate with your data. The prevalence of link rot shows that data moves much more than people do!
Here are several key infrastructure issues in the current internet that contribute to link rot:
IP Addresses as Locations: A given domain name typically resolves to a number of IP addresses, and whoever runs that set of servers controls what a link to them returns. Links to the data itself, by contrast, give users control, because they can keep the data alive.
Centralized DNS: DNS maps domain names to IP addresses. If the DNS records change or are deleted, the link to the content is lost.
Temporary Hosting: Web content is often hosted on servers that are only as reliable as the next monthly payment. They can easily be shut down, moved, or repurposed, leading to broken links.
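The three issues above can be sketched with a toy model of location-based lookup. This is a simplified illustration, not how real DNS resolvers or web servers are implemented: the dictionaries, the domain `example-blog.test`, and the functions `fetch` and `link_status` are all hypothetical names made up for this example.

```python
def fetch(dns: dict, servers: dict, domain: str, path: str) -> str:
    """Resolve a domain to an IP address, then ask that server for the path."""
    ip = dns.get(domain)
    if ip is None:
        raise LookupError(f"no DNS record for {domain}")
    server = servers.get(ip)
    if server is None:
        raise ConnectionError(f"no server answering at {ip}")
    if path not in server:
        raise FileNotFoundError(f"404 for {path}")
    return server[path]


def link_status(dns: dict, servers: dict, domain: str, path: str) -> str:
    """Return 'ok', or the reason the link no longer works."""
    try:
        fetch(dns, servers, domain, path)
        return "ok"
    except (LookupError, ConnectionError, FileNotFoundError) as err:
        return f"broken: {err}"


# Hypothetical domain; the IP is the sample address used earlier in the article.
dns = {"example-blog.test": "70.66.211.32"}
servers = {"70.66.211.32": {"/post-1": "Hello, world"}}

print(link_status(dns, servers, "example-blog.test", "/post-1"))

# The hosting bill goes unpaid and the server is shut down.
del servers["70.66.211.32"]
print(link_status(dns, servers, "example-blog.test", "/post-1"))
```

The link survives only while every hop in the chain (DNS record, server, path) stays exactly where it was. Nothing in the link itself describes the content it was meant to reach.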
Why Link Rot Happens
The above infrastructure issues till the soil and plant the seeds of link rot. But there are other factors that help link rot become prevalent:
Website Deletions: Websites or specific pages can be taken down intentionally by the host or maliciously through censorship, making links pointing to them invalid.
URL Changes: Website restructuring or migration to new servers can radically change a site’s URLs.
Server Issues: Servers can go offline or be decommissioned, making the content inaccessible.
Beneath all of this is a faulty premise: data on the modern internet is tied to server locations. This means that the stability of all links on the internet is tied to the stability of the servers.
Moreover, a site usually sits on a group of servers, and the primary issue is who controls them. Those who control the servers are the ones who control the content (and the links).
But since the content on these servers can easily be changed or moved, there is no end user control or foundation for long-term data reliability.
In the end, this is why roughly two-thirds of links end up broken.
So we need a different foundation for our data. How can the Permaweb solve this problem?
How the Permaweb Works
As mentioned before, the Permaweb stands for the permanent web, a collection of all the webpages, apps, and files stored on top of the Arweave blockchain network. It is similar in feel and use to the current internet, except that everything uploaded to it remains permanent and cannot be deleted. This effectively eliminates link rot, as every piece of data is designed to remain constant for centuries.
How is this possible? What is happening below the surface to prevent the noxious weeds of link rot?
With the Permaweb, users reference data through transaction IDs or content addresses instead of server locations. This allows for more resilient and stable connections to information.
A transaction ID is a special code that identifies a specific action or data transfer on a blockchain. It's like a receipt number you get when you buy something online, showing that your transaction (like uploading or storing a file) has been successfully processed and recorded securely.
On the Permaweb, every piece of data gets a unique transaction ID to ensure it can be found anywhere, anytime. In other words, data on the Permaweb is location independent - it is identified through a unique ID instead of being bound to a specific server location. So instead of having a street address and phone number for your data, the data itself is given something like a ‘bar code’ that goes wherever it goes.
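The ‘bar code’ idea can be sketched in a few lines. Note the assumptions: this example simply hashes the content with SHA-256 to get an identifier, which is a generic content-addressing technique and not the way Arweave actually derives its transaction IDs. The `Node` class and the `content_id` and `fetch_by_id` functions are hypothetical names for illustration only.

```python
import base64
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (SHA-256, base64url)."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


class Node:
    """A hypothetical storage node holding data keyed by identifier."""

    def __init__(self):
        self.store = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self.store[cid] = data
        return cid

    def get(self, cid: str):
        return self.store.get(cid)


def fetch_by_id(nodes: list, cid: str) -> bytes:
    """Ask each node in turn; accept the first copy whose hash matches the ID."""
    for node in nodes:
        data = node.get(cid)
        if data is not None and content_id(data) == cid:
            return data
    raise LookupError(f"no valid copy found for {cid}")


node_a, node_b = Node(), Node()
page = b"<h1>Field notes</h1>"
cid = node_a.put(page)
node_b.put(page)  # the same bytes get the same ID on any node

del node_a.store[cid]  # one node drops its copy
print(fetch_by_id([node_a, node_b], cid) == page)
```

Because the identifier is computed from the data, it does not matter which node answers, and a reader can check that the bytes received are the bytes that were originally linked.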
What is Location Independence?
Location independence means that data can be accessed without being tied to a specific physical location. This method is permissionless and relies on decentralized systems like ArNS (Arweave Name System) and digital signatures to identify content.
The Current Internet
Locations: Websites are accessed through specific locations (IP addresses).
Permissioned Location References: IP addresses are managed through a centralized system, DNS, that maps domain names to IP addresses.
The Permaweb
Values: Websites are accessed through unique identifiers (Transaction IDs and Content Addresses) that are not tied to specific locations.
Permissionless Value References: Decentralized services like ArNS map human-readable names to these unique identifiers without needing central permission.
When you visit a website today, your browser uses an IP address to find the server with the website’s data. Let’s hope that server is up and running! In the future, as content moves to the Permaweb, websites will be found using Transaction IDs and content addresses independent of any one server - a more flexible and resilient approach.
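The difference between a location reference and a value reference can be illustrated with a small sketch of a name that points at an identifier rather than at a server. This is a hedged, simplified model: it does not reproduce the real ArNS record format or API, the identifier here is a plain SHA-256 hash (an assumption for illustration), and `PermanentStore`, `NameRegistry`, and the name `my-garden` are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Stand-in identifier: hex SHA-256 of the bytes (an assumption)."""
    return hashlib.sha256(data).hexdigest()


class PermanentStore:
    """Append-only store: data can be added but never changed or removed."""

    def __init__(self):
        self._items = {}

    def add(self, data: bytes) -> str:
        cid = content_id(data)
        self._items.setdefault(cid, data)
        return cid

    def get(self, cid: str) -> bytes:
        return self._items[cid]


class NameRegistry:
    """Hypothetical registry mapping a readable name to an identifier."""

    def __init__(self):
        self._records = {}

    def point(self, name: str, cid: str) -> None:
        self._records[name] = cid

    def resolve(self, name: str) -> str:
        return self._records[name]


store = PermanentStore()
names = NameRegistry()

v1 = store.add(b"<h1>My garden, 2023</h1>")
names.point("my-garden", v1)

# The site is updated: the name moves on, but the old value is untouched.
v2 = store.add(b"<h1>My garden, 2024</h1>")
names.point("my-garden", v2)

print(store.get(names.resolve("my-garden")))  # the current version
print(store.get(v1))                          # the original, still reachable
```

In this model a link to the name follows the site as it changes, while a link to the identifier always returns the exact version that was cited.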
Why does it matter?
The current internet may seem stable from day to day, but over time it is highly fluid and ever changing, resulting in the disappearance of countless cultural, economic and scientific works.
The problems listed below are not something we should simply ‘try and live with’:
Links can break at any time
Links can mutate at any time as sites change
There is no way to permanently preserve the content of outbound links
Consumers have little recourse for lost data
Few apps support credible exit (if you leave an app, you leave all your data behind)
There are no immutable values on the web
Few are concerned with these issues and software developers are used to working around them. But a better way to organize the foundations of the internet has arrived, and the potential implications are vast. Our digital gardens may soon look much more beautiful.
ar.io