A state of the art of decentralized web — Part 2

2. File storage

This article is the second chapter of a series about web decentralization. Here we are going to focus on how to store files. I believe storage is the most advanced field of decentralization because it relies on mature and robust technologies: peer-to-peer networking and cryptography. If you are an application developer (web or mobile), you can already replace your storage layer with a decentralized one.

And wherever you store your data, always anticipate worst-case scenarios 😉

Series articles

  1. Introduction
  2. File storage
  3. Blockchain and smart contracts
  4. Databases — coming soon
  5. Going mobile — coming soon

Why decentralize file storage?

In a typical cloud architecture, we would use a service like Amazon S3, Azure Files or Google Cloud Storage. In terms of security and reliability, this is a really good choice: files can be encrypted server-side (and if you manage the keys yourself, the cloud provider cannot read your files), and you can configure backups and cross-region replication. You also get virtually unlimited scalability and a good set of APIs to access your files from any program.

However, I see two issues with these services:

  • They are centralized, which means that in case of an outage, the whole system is impacted. Sometimes, a typo can take the web down for four hours.
  • They come at a significant price (generally in $/GB/month), which makes it hard for a non-profit organization to offer an application that relies on a cloud storage service.

On the other hand, your brand new computer might have a 2TB hard drive while you are only using a tenth of it. So why not use this free space to support a collaborative file storage system?

Let's have a look at the main candidates offering such a solution:

  1. InterPlanetary File System
  2. Swarm
  3. Storj
  4. DADI
  5. Dat Project
  6. Sia
  7. Blockstack

That's a lot of options, and they sometimes offer similar features, so I will focus on what makes each of them unique. Ready?

1. InterPlanetary File System

IPFS is the Distributed Web

Behind this enigmatic name stands what is, in my opinion, the most mature system so far.

Features:

  • Both a peer-to-peer protocol and a network. It is a combination of Kademlia, BitTorrent, and Git.
  • IPFS can be used standalone, so it does not tie you to a specific stack.
  • IPFS provides a naming system (IPNS), a CDN and can directly serve websites.
  • Provides no guarantees about data availability or redundancy. Filecoin, one of IPFS's sister projects, proposes to solve that by introducing an incentive system based on a blockchain (users who store data will earn Filecoin tokens). We can see this as an “Enterprise edition” of IPFS with an on-chain SLA.
  • Open source. Some modules of IPFS have already become independent projects (libp2p, IPLD).
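
One idea IPFS shares with Git is content addressing: a file is identified by a hash of its content rather than by its location. The sketch below is only a toy illustration of that principle. It uses a plain SHA-256 hex digest and an in-memory dictionary, whereas real IPFS splits files into blocks, links them in a Merkle DAG and encodes addresses as CIDs; the `ContentStore` class is a hypothetical name, not part of any IPFS library.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, not an IPFS API)."""

    def __init__(self):
        # Maps an address (hex digest) to the raw bytes it identifies.
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a location.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # A receiver can verify the bytes by re-hashing them.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"hello decentralized web")
same_addr = store.put(b"hello decentralized web")  # identical content
other_addr = store.put(b"hello centralized web")   # different content

print(addr == same_addr)   # identical content maps to the same address
print(addr == other_addr)  # different content maps to a different address
```

Because the address depends only on the bytes, any peer holding the content can serve it, and the requester does not need to trust that peer to detect tampering.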

Pricing: IPFS is free to use. Users contribute to the network by serving files to their peers. Filecoin's price will depend on the market (the more people offer disk space, the cheaper it will be).

Status: Even though it has been in alpha since 2015, the protocol seems pretty stable and fast. Skeptical? Then watch this funny cats compilation on DTube, a YouTube-like platform based on IPFS for storage and Steemit to manage content and reward authors. Filecoin, however, is much younger and not ready to use yet (see the last update of the roadmap).

Going further: For a good theoretical and practical introduction to IPFS, I recommend this article:

A Hands-on Introduction to IPFS

And if you are interested in understanding how IPFS works under the hood (with its name service and various protocols), then watch Juan Benet (founder of IPFS and Filecoin) speaking about the different mechanisms they implemented.

2. Swarm

Swarm

Features:

From a functional and technical point of view, Swarm is very similar to IPFS, with two major differences:

  • Swarm is deeply integrated with the Ethereum blockchain. This is an advantage if you are already using Ethereum in your project, but it can complicate your stack otherwise.
  • Swarm has a built-in incentive system to encourage people to host data. Functionally, using Swarm is equivalent to using IPFS along with Filecoin.

To better understand the similarities and differences between the two projects, a good summary is available here. Ethereum's choice to start its own project has been the subject of many debates, but this kind of competition can also push both projects to produce better results (I hope so). Swarm is open source (part of the Ethereum code base).

Pricing: Once again, the price will depend on the market.

Status: Swarm is still experimental. The Proof-of-Concept 3 has been released, and a testnet is available. The roadmap is available here. Keep a close eye on this project, but don’t use it in production yet.

Going further: To get started with Swarm, read the official documentation.

3. Storj

Decentralized Cloud Storage - Storj

Features:

  • S3 compatible. This means it will have the same concept of Buckets and Objects, but no folders like in a standard file system. This will make it super easy to decentralize the storage layer of an existing application using Amazon S3!
  • Built-in end-to-end encryption.
  • Open source.
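
To make the "buckets and objects, but no folders" point concrete, here is a minimal in-memory sketch of a flat object namespace. It is only a model of the concept, not Storj's or Amazon's client library: the `Bucket` class, its method names and the bucket name are assumptions chosen for illustration.

```python
class Bucket:
    """Toy model of an S3-style bucket: a flat mapping from keys to bytes."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put_object(self, key, body):
        # The whole key, slashes included, is just an opaque string.
        self._objects[key] = body

    def list_objects(self, prefix=""):
        # "Folders" are a naming convention: listing filters keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = Bucket("my-app-assets")  # hypothetical bucket name
bucket.put_object("images/logo.png", b"...")
bucket.put_object("images/banner.png", b"...")
bucket.put_object("index.html", b"<html></html>")

images = bucket.list_objects(prefix="images/")
print(images)  # ['images/banner.png', 'images/logo.png']
```

An application written against this kind of interface does not care what sits behind it, which is why S3 compatibility should make the migration path short.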

Pricing:

  • Storage: $0.015 per GB per month
  • Bandwidth: $0.05 per GB downloaded
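
To get a feeling for what these two rates mean in practice, here is a small estimate using only the figures quoted above. The workload (500 GB stored, 200 GB downloaded per month) is a made-up example, and the function ignores anything not listed here, such as possible minimum fees or free tiers.

```python
STORAGE_USD_PER_GB_MONTH = 0.015  # storage rate quoted above
BANDWIDTH_USD_PER_GB = 0.05       # download rate quoted above


def monthly_cost(stored_gb: float, downloaded_gb: float) -> float:
    """Estimated monthly bill in USD for a given storage and download volume."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    bandwidth = downloaded_gb * BANDWIDTH_USD_PER_GB
    return storage + bandwidth


# Hypothetical workload: 500 GB stored, 200 GB downloaded per month.
cost = monthly_cost(stored_gb=500, downloaded_gb=200)
print(f"${cost:.2f}")  # $17.50
```

Note that in this example bandwidth ($10.00) weighs more than storage ($7.50), so the download pattern of your application matters as much as its size.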

Storj offers a partnership program for open source projects. In short, every open source software that uses Storj will be rewarded with 10% of the profits.

Status: The Storj network will be launched in early 2019. See the roadmap here.

Going further: Whitepaper

4. DADI

The local global network

A good description of DADI would be "a decentralized cloud provider". DADI already offers a CDN, and a cloud storage service is coming really soon. Let's focus on the CDN here.

Features:

  • Serves static files (assets, websites, …).
  • Data are cached at the edge.
  • Full support for caching, header control, image manipulation, image compression, and image format conversion.
  • Open source.

Pricing: This document details the pricing, which depends on the volume of requests. DADI is based on proof of stake, so to become a DADI host you must stake 5,000 $DADI tokens (equivalent to $216 as of now).

Status: The CDN is already available, and the cloud storage should be available at the beginning of 2019. See their extremely ambitious roadmap here. They recently raised $29M with an ICO.

Going further: A bunch of tutorials is available here (look for the CDN tag).

5. Dat Project

Dat Project - A Community-Driven Web Protocol

Features: What’s Dat? Dat is a peer-to-peer protocol. It offers a CLI, a Node.js API, and a desktop application. Hosted files can be accessed through a web browser at this address: https://datproject.org/{dat-key}. Nothing new? Well, the particularity here is that you share a folder instead of individual files, like a repository with git (dat clone, dat create, dat share), and you can also update your files. With this in mind, we can understand that it is not the best fit for building a decentralized web application (compared to the previous solutions). However, it can bring real added value to researchers and scientists willing to publish data (experiment results, training datasets, …) to the community and apply changes easily. Sources are available here.

Pricing: Free to use. However, you are responsible for keeping the data alive (peers only share the data they have cloned, like in a classic BitTorrent system).

Status: Dat is ready to use. See dat.land for examples of usage.

Going further: Dat documentation and tutorials are available here.

6. Sia

Sia

Features: Once again, Sia is blockchain-based and pays hosts in Siacoin. When you upload a file, it is guaranteed to stay available for a period of 3 months, and the contract can be renewed automatically. It is open source too (recently moved to GitLab). Even though Sia offers an API, I believe its main use case is to store users' personal data and get rid of services like Dropbox or Google Drive.

Pricing: About $0.002 per GB per month.

Status: Sia is ready to use to store non-critical files. In the next year, they plan to improve speed, support warm storage and add a CDN feature.

Going further: Get started using the API (CLI or JS) or the GUI.

7. Blockstack

Blockstack

Blockstack is more than storage. It is a toolkit that aims to provide everything needed to create a new decentralized internet. For now, it is composed of a naming system (BNS), a file storage system (Gaia), an authentication system (you can use the same Blockstack ID on every app built on Blockstack), and a web browser. Let's focus on the storage layer.

Features:

  • Flexible driver system (compatible with current cloud storage providers).
  • When using the Blockstack browser, users can choose where to store their data by running a Gaia hub and using a driver (either on a local disk or on a cloud provider). What's interesting is that the storage solution you choose will be used by all Blockstack applications that need to store data. So you are in control of all your data: you can back it up, delete it, host it yourself or use a third party. You simply own your data!
  • Open source.
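
The driver idea can be illustrated with a small sketch: the application talks to one interface, and the user decides which backend sits behind it. Everything below is hypothetical (`StorageDriver`, `LocalDiskDriver`, `InMemoryDriver` and `save_profile` are names I made up) and is not Gaia's actual API; it only shows the pattern.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import tempfile


class StorageDriver(ABC):
    """Hypothetical driver interface (not Gaia's actual API)."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class LocalDiskDriver(StorageDriver):
    """Stores files under a root directory on the local disk."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def write(self, path: str, data: bytes) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def read(self, path: str) -> bytes:
        return (self.root / path).read_bytes()


class InMemoryDriver(StorageDriver):
    """Stand-in for any other backend, e.g. a cloud provider."""

    def __init__(self) -> None:
        self._files = {}

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def read(self, path: str) -> bytes:
        return self._files[path]


def save_profile(driver: StorageDriver, user: str, data: bytes) -> str:
    # The app only knows the interface; the user picked the backend.
    path = f"{user}/profile.json"
    driver.write(path, data)
    return path


results = []
with tempfile.TemporaryDirectory() as tmp:
    for driver in (LocalDiskDriver(Path(tmp)), InMemoryDriver()):
        path = save_profile(driver, "alice", b'{"name": "Alice"}')
        results.append(driver.read(path))

print(results[0] == results[1])  # same data, whatever the backend
```

Swapping the backend does not require touching `save_profile`, which is the property that lets a single storage choice serve every application.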

Pricing: The default storage provider is free. If you want to store all your data on S3, then you will pay S3 fees.

If you are a developer, Blockstack just launched its App Mining program: get paid to develop apps on Blockstack. Starting in December, $100,000 in aggregate will be distributed every month (20% for the 1st ranked app, 20% of the remaining 80% for the 2nd, …). That's a lot, and you can use this money to incentivize users to use your app too (by redistributing part of the money). More information on https://app.co/mining.
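
The App Mining split described above is a geometric series: each rank takes 20% of whatever is left after the ranks above it. Here is a quick sketch of the arithmetic. The function name is mine, and how the program treats the tail of the ranking or rounds amounts is not covered here, so take it as an illustration of the rule as stated.

```python
def app_mining_payouts(pool: float, ranks: int, share: float = 0.20) -> list:
    """Each rank receives `share` of what remains after the ranks above it."""
    payouts = []
    remaining = pool
    for _ in range(ranks):
        payout = remaining * share
        payouts.append(round(payout, 2))
        remaining -= payout
    return payouts


# Monthly pool of $100,000, first five ranks.
payouts = app_mining_payouts(100_000, ranks=5)
for rank, amount in enumerate(payouts, start=1):
    print(f"rank {rank}: ${amount:,.0f}")
```

The first three apps receive $20,000, $16,000 and $12,800. Put differently, rank k gets 100,000 × 0.2 × 0.8^(k-1), so the reward drops by a fifth with every position lost.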

Status: Ready to use. At the beginning of 2019, Gaia will be released as an independent project so we will be able to use the storage layer in any project.

Going further: Have a look at the Gaia README and the blockstack.js documentation.

What’s next?

Well, you now have an amazing choice of technologies to store your files, depending on your application's needs. The next chapter is about decentralizing the logic of your application using the blockchain.

I hope you are ready because it’s gonna be tricky 🤓

Give some 👏👏👏 if you appreciated this one and are impatient for the continuation. These articles reflect my understanding of the subject, and discussions are very welcome in the comments.

A state of the art of decentralized web — Part 2 was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.

Publication date: 
02/12/2019 - 00:39