The world produces an overwhelming amount of digital data each day.
For instance, it is estimated that 1.1 trillion MB of data (roughly 1.1 exabytes) is uploaded to the internet every day. That number is difficult to comprehend, but it is easy to look at our own lives and see how many pictures, videos, documents, and other forms of data our personal and business lives generate daily.
All of this data creates a real storage problem. When our phones or cloud accounts fill up, where do we store our files without getting locked into monthly subscriptions?
It is easy to find places to store data for short periods of time, whether on your phone or in a popular cloud service. The more difficult problem has always been: how can I store this data long-term?
A fragile internet
When we upload files to the internet, it’s easy to assume they will be there forever. But an exploration of the current state of information storage on the internet reveals a very different picture. In truth, the internet is a very fragile system.
Let’s take a brief look at the fragility of our data storage systems from three perspectives: the fluid internet, NFT storage in Web3, and personal and business storage problems.
The fluid internet
When the average person uploads their data to the internet, what do they think is happening? Are we simply dumping another bucket into the ever-expanding ocean of data? Or is there as much data coming out of that data ocean as there is being put in?
People are often surprised when they look at how stable information on the internet really is. The information on the web is anything but static, and can best be described as fluid.
Consider these factors:
- Information is transient. One third of all the information on the internet is changed or gone within two years of being put up, and after 20 years the majority of it has turned over.
- Link rot is common. Link rot refers to links that are broken because the web page they point to no longer exists. Of the links that existed on the internet in 1998, 72% are now dead. Overall, more than half of all articles in the New York Times have at least one rotted link.
- Even highly important web pages suffer from link rot. A study at Harvard discovered that more than 70% of the URLs within three legal journals, and 50% of the URLs within U.S. Supreme Court opinions, suffer from reference rot.
- Changing terms of service. Companies continually change their terms of service from what you originally signed up for. Many people also do not realize how broad the rights are that platforms often claim over the data uploaded to them.
- High-profile data loss. Social media giants such as Facebook, Instagram, Twitter, and Myspace, and tech titans such as Dropbox and Google, have all had high-profile cases of lost data.
What this all points to is that our personal and business data is not as safe as we think it is. This potential loss has short-term implications, but it also has long-term consequences for how we are going to pass our most valuable resources on to future generations.
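Link rot, as described above, is something you can measure for your own pages. The following is a minimal Python sketch, not a definitive tool. The function names `fetch_status` and `link_rot_report` are made up for this illustration, and the rule that any failed request or any status of 400 and above counts as "rotted" is a simplifying assumption.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    # A HEAD request asks for headers only, so the page body is not downloaded.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 410.
        return error.code
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return 0


def link_rot_report(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, object]:
    """Check each URL once and summarize how many look rotted."""
    statuses = {url: fetch(url) for url in urls}
    # Assumption for this sketch: no response, or any 4xx/5xx status, is rot.
    rotted = [url for url, status in statuses.items() if status == 0 or status >= 400]
    checked = len(statuses)
    rot_rate = len(rotted) / checked if checked else 0.0
    return {"checked": checked, "rotted": rotted, "rot_rate": rot_rate}
```

Calling `link_rot_report(my_urls)` on a list of links from an old blog post gives a rough rot rate. Two caveats apply: some servers reject HEAD requests, which this sketch would miscount as rot, and a page that still loads may no longer contain what was originally cited, which a status check cannot detect.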
Read a Related Article: Internet is a Collective Hallucination: the rotting of the Internet and Loss of Data
Read a Related Article: Raiders of the Lost Web: How a Pulitzer Prize finalist's 34-page essay got lost from the web
An example from Web3: Where is my NFT stored?
Another example of data fragility (or, our lack of knowledge about how it is actually being stored) can be seen clearly in NFT storage.
The past few years have introduced a new term into our vocabulary: NFT, or non-fungible token.
The most popular way that NFTs have been used, up to this point, is as digital art. NFTs have certainly caught the attention of a segment of our population and have been bought and sold for millions of dollars.
If you buy an NFT for a significant amount of money, you would think you would want to know where the digital file is stored. However, this question is ignored more than it should be.
It’s not that people don’t care whether the file is stored safely: they just assume that it is on some type of blockchain somewhere and that this is good enough. That assumption is correct in some cases, such as NFTs that are stored permanently on Arweave.
However, more often than not, NFTs are simply stored on centralized servers such as AWS or Google, or even in someone's personal Google Drive, Dropbox, or OneDrive account.
Of course, the main problem lies in who maintains the account and keeps paying for it to stay active. The person who bought the NFT is relying on the seller of the NFT to maintain the account for decades to come. It is not hard to imagine the negative outcomes that will result from this.
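One rough way to see where an NFT's media file lives is to look at the address recorded in its metadata. The Python sketch below is illustrative only. The function name `classify_media_uri` and its category labels are invented for this example, and it assumes that Arweave-hosted media is referenced either with an `ar://` address or through the `arweave.net` gateway. Real metadata can use other gateways and schemes that this sketch does not cover.

```python
from urllib.parse import urlparse

# Assumption for illustration: a single well-known Arweave gateway host.
ARWEAVE_GATEWAY = "arweave.net"


def classify_media_uri(uri: str) -> str:
    """Guess the kind of storage behind an NFT media URI.

    Returns "arweave", "centralized", or "other".
    """
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    host = (parsed.hostname or "").lower()

    # ar:// addresses and arweave.net gateway URLs point at data on Arweave.
    if scheme == "ar":
        return "arweave"
    if host == ARWEAVE_GATEWAY or host.endswith("." + ARWEAVE_GATEWAY):
        return "arweave"

    # Any other web address is served by whoever runs (and pays for) that host.
    if scheme in ("http", "https"):
        return "centralized"

    return "other"
```

For example, `classify_media_uri("https://drive.google.com/file/d/xyz")` returns `"centralized"`, which is a signal that the file only survives as long as someone keeps that account alive.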
The obvious question becomes: what happens to the value of the NFT if the underlying media file disappears? For example, when the crypto company FTX became insolvent, the assets on its NFT platform also became inaccessible, including high-value items like Coachella lifetime passes.
Moreover, what happens when NFTs move beyond digital art and become key financial instruments or are integrated into core business processes? What kind of underlying storage is needed for such transactions to take place so that the value of the NFT is preserved or can grow?
For an NFT to make sense to its purchaser in the long run, the storage underneath needs to be as solid as the business on top.
Personal and business storage problems
Everyone has a horror story about data loss from some point in their business or personal lives.
The next time you are at a party, ask people about a time when they lost important files. Everyone has a painful story.
Furthermore, a survey of enterprise storage problems found:
- 60% of enterprises experienced public cloud outages, with 22% of those experiencing data loss
- 41% of enterprises experienced high or unexpected egress costs
- 53% of enterprises were unable to use public cloud for data storage due to regulatory and compliance requirements
- 70% of enterprises are interested in their data storage solution being sustainable
A survey of the ar.io team revealed these personal storage problems:
- Wedding photos lost forever after a computer failure
- Important business files not being backed up
- USB failures after only a few years
- Faulty hard drives
- Social media accounts being indiscriminately closed
The maintenance of personal and business data is a complex problem that has been difficult to solve.
Reliable long-term storage can be achieved, but it involves a multifaceted approach: backups on multiple hard drives, USB drives, and subscription cloud services, all of which need to be updated frequently for true data preservation.
This manual approach requires significant amounts of time and money to be done right.
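Part of that manual work is checking that each backup copy still matches the original, since drives and USB sticks can fail silently. Here is a minimal Python sketch of one way to do that with SHA-256 checksums. The helper names `sha256_of` and `verify_copies` are invented for this example, and the sketch assumes every copy is a plain file you can read from your computer.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Reading in chunks keeps memory use low even for large video files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original: Path, copies: Iterable[Path]) -> Dict[str, List[Path]]:
    """Sort backup copies into "ok", "missing", and "corrupt" buckets."""
    expected = sha256_of(original)
    result: Dict[str, List[Path]] = {"ok": [], "missing": [], "corrupt": []}
    for copy in copies:
        if not copy.is_file():
            result["missing"].append(copy)
        elif sha256_of(copy) == expected:
            result["ok"].append(copy)
        else:
            # The file exists but its contents differ from the original.
            result["corrupt"].append(copy)
    return result
```

Running `verify_copies` on a schedule against every drive and synced cloud folder is exactly the kind of recurring chore that makes do-it-yourself preservation expensive in time.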
The New Library of Alexandria
Thus far, our survey of long-term data storage has been bleak. We discussed the fragility of data on the internet, the rapid loss of information in the digital world and the increased censorship of data. Are there any positive examples of how data can be stored for long periods of time?
The Library of Alexandria was founded in the early 3rd century BC as a storehouse of the world’s cumulative knowledge. It became the largest library of its time, with as many as 400,000 scrolls, including many of the world's greatest literary and scientific treasures. The Library remained in existence for more than 400 years before being burned.
The longevity of the Library of Alexandria has inspired a new group of computer scientists, led by Sam Williams, to reimagine the problem of long-term data storage. They think of their work as building a New Library of Alexandria, but this time in a digital form where it cannot easily be destroyed.
How do they do this?
Turn to our next article, What is Arweave?, to find out!