Understanding The Elusive Blockchain Technology — Part 1

Despite the popular notion, Blockchains are not that complex once you get the general idea behind them.

What makes Blockchain so interestingly difficult to implement is that it is made up of so many different components of varying complexity, and a good Blockchain developer needs to know how to connect these components in the right order to build good Blockchains.

In this post, I will explain what a Blockchain actually is, then start talking about various features of Blockchain such as Hashing, Immutability, and Distributed P2P Network.

This is Part 1 of 2 of my series on Blockchain. Part 2 will be out soon!

Defining Blockchain

The credit for inventing Blockchain technology is generally given to Satoshi Nakamoto, an anonymous entity (could be one person or a group, nobody knows), and the paper titled “Bitcoin: A Peer-to-Peer Electronic Cash System”, published in 2008.

But there were two other people before Satoshi who wrote about something very similar to Blockchain. These individuals were Stuart Haber and Scott Stornetta, and in 1991 they released a paper titled “How to Time-Stamp a Digital Document”.

While this paper doesn’t openly discuss anything related to Blockchain, several concepts, features, and ideas of what makes Blockchain… Blockchain are present in this paper.

But what is a Blockchain?

If you go to Wikipedia, the universal collection of definitions, you will see that it is defined as “a growing list of records, called blocks, which are linked using cryptography”.

In simple terms, it is a list of blocks, where the blocks are linked to each other in the form of a chain.

The next question in your mind should be:

What is a Block?

A Block in a Blockchain is defined as a record that consists of some data, a pointer called previous hash that addresses the block that comes before it, and another pointer called hash that addresses the block itself.

Here is an example of a simple blockchain made of 2 blocks:

The first block is called the genesis block. Once this block is created, it will stay the same for all of eternity. There is no way any other block can take its place. Like any other block, it will have some data stored in it. But because it is the first block in the chain, and there is obviously no block before it, its previous hash value will be a string of zeros. The block also has a hash value, which is a string of hexadecimal characters.

The second block will also have some data stored inside it. Its previous hash value will be the same as the previous block’s hash value. Apart from that, it will also have a hash value of its own, which will be the previous hash value of the next block in the chain, and so on.
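The two blocks described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular blockchain does it: the names Block, make_block and compute_hash are my own, and I am assuming that a block's hash is the SHA-256 of its previous hash joined with its data.

```python
import hashlib
from dataclasses import dataclass

# The genesis block has no predecessor, so its previous hash is a string of zeros.
GENESIS_PREVIOUS_HASH = "0" * 64


def compute_hash(data: str, previous_hash: str) -> str:
    """Assumed scheme: SHA-256 over the previous hash followed by the data."""
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Block:
    data: str
    previous_hash: str  # identity of the block before this one
    hash: str           # identity of this block


def make_block(data: str, previous_hash: str) -> Block:
    return Block(data, previous_hash, compute_hash(data, previous_hash))


genesis = make_block("first block", GENESIS_PREVIOUS_HASH)
second = make_block("second block", genesis.hash)

print(genesis.previous_hash)                 # 64 zeros
print(second.previous_hash == genesis.hash)  # True
```

The dataclass is frozen on purpose: once a block object exists, Python will refuse to reassign its fields.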

The hashes are what link the blocks to each other. Each block knows its own identity and the identity of the block before it. This method of linking also secures the blockchain from any changes to the data of the block.

The hash of a block is computed from the data stored inside it and the previous hash. So if I were somehow able to change the data of one block, that block’s hash would also change. The next block’s previous hash value would then no longer match the new hash of the changed block. That mismatch reveals that the block has been tampered with, and the blockchain becomes invalid.
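Here is a rough sketch of that check. Blocks are plain dictionaries, and build_chain, is_valid and compute_hash are hypothetical helpers of mine; the way the data and previous hash are combined before hashing is also an assumption made for the example.

```python
import hashlib

ZEROS = "0" * 64  # previous hash of the genesis block


def compute_hash(data: str, previous_hash: str) -> str:
    # Assumed scheme: SHA-256 over the previous hash followed by the data.
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


def build_chain(items):
    chain, previous_hash = [], ZEROS
    for data in items:
        block = {"data": data, "previous_hash": previous_hash,
                 "hash": compute_hash(data, previous_hash)}
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def is_valid(chain) -> bool:
    previous_hash = ZEROS
    for block in chain:
        # The link to the previous block must match...
        if block["previous_hash"] != previous_hash:
            return False
        # ...and the stored hash must match the block's actual contents.
        if block["hash"] != compute_hash(block["data"], block["previous_hash"]):
            return False
        previous_hash = block["hash"]
    return True


chain = build_chain(["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"])
print(is_valid(chain))  # True

# Tamper with the first block and recompute only its own hash.
chain[0]["data"] = "Alice pays Bob 500"
chain[0]["hash"] = compute_hash(chain[0]["data"], chain[0]["previous_hash"])
print(is_valid(chain))  # False: the second block's previous hash no longer matches
```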

Hashing

The previous few paragraphs describe how hashing connects the blocks to each other, and how it can also be used to secure the blocks.

But what exactly is Hashing?

When it comes to human beings, there are a couple of things that can be used to differentiate and identify each of them:

DNA
Fingerprint

And while identical twins share the same DNA, they do not have the same fingerprints. So fingerprints can be considered as the ultimate unique identifier of any individual.

Hashing is a blockchain’s version of fingerprints!

There are different algorithms that can be used for Hashing. The most popular one is SHA-256, and it is what is used in most blockchains and cryptocurrencies.

SHA stands for Secure Hash Algorithm and it has many different versions. SHA-2 is a set of cryptographic hash functions that were designed by the National Security Agency (NSA), and it contains six hash functions.

The “256” in SHA-256 refers to the size of the hash it produces: 256 bits. The hash created by this algorithm is written as 64 characters in hexadecimal format. Hexadecimal means base 16, so each character of the hash represents 4 bits (2⁴ = 16, and 64 * 4 = 256).
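If you want to check the bit arithmetic yourself, here is a small sketch using Python's standard hashlib module. The input string is arbitrary; any input gives the same sizes.

```python
import hashlib

raw = hashlib.sha256(b"Blockchain").digest()         # raw bytes of the hash
as_hex = hashlib.sha256(b"Blockchain").hexdigest()   # same hash, written in hexadecimal

# A hexadecimal digit has 16 possible values (0 to 15), which needs 4 bits.
bits_per_hex_char = (16 - 1).bit_length()

print(len(raw) * 8)                     # 256 bits
print(len(as_hex))                      # 64 hexadecimal characters
print(bits_per_hex_char)                # 4
print(len(as_hex) * bits_per_hex_char)  # 256
```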

Hashing can be used on any kind of data, be it money, documents, programs or videos.

SHA-256 hash calculator. Online SHA-256 hash generator. Mining Bitcoin

The above link takes you to an online hash generating website. Try entering different words in the input area and see what SHA-256 Hash you get. Here are a few that I got:

Rajat -> 126e11da02566f50f0f9449586cd9069ba908ce82ebfd69d6c3b9f48f56853a5

rajat -> a2acfac793fcfd3ced6506d76fabfcaea56fcb07d59a173663f003b85465a18f

Block -> 211d0bb8cf4f5b5202c2a9b7996e483898644aa24714b1e10edd80a54ba4b560

You will notice that both “Rajat” and “rajat”, while being the same word, have completely different hashes due to one of them being capitalized. This complete change of hashes due to slight changes in data is known as the avalanche effect.

It doesn’t matter if you hash a single letter, a single word, or an entire Harry Potter book; the length of the hash will always be 64 characters.
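You can also try this locally instead of on a website. The sketch below uses Python's hashlib; sha256_hex and differing_bits are small helpers I made up for the demonstration. Keep in mind that the exact hash depends on the exact bytes you feed in (text encoding, a trailing newline and so on), so the output may not match what an online tool shows.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two hex strings disagree."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Fixed length: one letter, one word, or a very long text all give 64 characters.
for text in ["R", "Rajat", "rajat " * 100_000]:
    print(len(text), "characters in ->", len(sha256_hex(text)), "characters out")

# Avalanche effect: changing one letter's case changes a large share of the 256 bits.
changed = differing_bits(sha256_hex("Rajat"), sha256_hex("rajat"))
print(changed, "of 256 bits differ")
```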

There are a few things you need to keep in mind while working with a hash function:

The hash function works one way only. You will not be able to figure out what the original data was using the hash.
The hash function should give the same output regardless of time. So if I hash “Rajat” and get some hash, and then a couple of days later I do the same thing again, I should get the same hash.
The hash function should compute the hash quickly.
Even the smallest changes in data should completely change the hash.
Because the hash is always 256 bits (64 hexadecimal characters) long, there is a limited number of possible hashes to work with: 2^256 of them. So in principle, two different pieces of data can end up with the same hash value. Such collisions are extremely rare, and there are ways to work around them in case they happen. But attackers can also try to create collisions deliberately, which leads to further problems in a blockchain.
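To get a feel for why a limited number of possible hashes means collisions must exist, here is a sketch that deliberately shrinks the hash. I keep only the first 4 hexadecimal characters of SHA-256 (that is an assumption made purely for the demo, nobody should use a hash this short), which leaves just 65,536 possible values, so two different inputs collide after a few hundred tries. The helper names and the "block-N" inputs are made up.

```python
import hashlib
from itertools import count


def truncated_hash(text: str, hex_chars: int = 4) -> str:
    """A deliberately tiny hash: the first few hex characters of SHA-256."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4):
    """Hash 'block-0', 'block-1', ... until two inputs share a truncated hash."""
    seen = {}
    for i in count():
        text = f"block-{i}"
        h = truncated_hash(text, hex_chars)
        if h in seen:
            return seen[h], text, h
        seen[h] = text


first_text, second_text, shared_hash = find_collision()
print(first_text, "and", second_text, "both hash to", shared_hash)

# The full hash has this many possible values, which is why real collisions are so rare:
print(2 ** 256)
```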

You can read more about Secure Hash Algorithms here.

Immutability

Immutability is one of the most popular features of Blockchain. To be immutable means to remain unchanged. So in terms of Blockchain, once you create a block, no one can make any changes to it. Not even the person who created it.

This is a great way of maintaining trust in Blockchain.

In the real world, when you buy something big, like a house, you are actually exchanging a fixed amount of money for a deed that says that the house is yours.

But then anyone can write up a deed saying that the house you just bought is actually theirs!

To prevent such practices, you go to a central authority such as the city council and register your ownership. And now you legally own that house!

Similarly, in the blockchain, there needs to be some way to prevent anyone from taking unauthorized ownership of a block or the entire chain.

But Blockchain doesn’t have any central authority where users can register their blocks!

While you may think of this as a drawback of the blockchain, it is actually an advantage!

If a blockchain had a central authority that recorded data about the blocks, then a hacker would only need to break into that one place to make unauthorized changes. Instead, a blockchain is decentralized.

This decentralization of systems was one of the main things that people like Satoshi Nakamoto and Vitalik Buterin (the guy behind Ethereum) spoke about.

Previously, I spoke about how changing the data on a block will change its hash value. If someone manages to do this, they will have to change the previous hash value of the next block as well, which will then change the hash value of that block too. The hacker will then have to change the previous hash value of the block after that, and so on.

So in order to change the data of one block, the hacker will have to make changes to every block that comes after it as well! This will require some serious computing power and hence is considered to be an impossible task for an average hacker.
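The ripple effect can be sketched like this. Blocks are plain dictionaries again, and rewrite_block is a hypothetical helper that changes one block and then repairs every later block so the links match again. In this toy version each repair is a single cheap SHA-256 call; real networks make every block far more expensive to produce, which ties into mining, one of the topics planned for the next post.

```python
import hashlib


def compute_hash(data: str, previous_hash: str) -> str:
    # Assumed scheme: SHA-256 over the previous hash followed by the data.
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


def build_chain(items):
    chain, previous_hash = [], "0" * 64
    for data in items:
        block = {"data": data, "previous_hash": previous_hash,
                 "hash": compute_hash(data, previous_hash)}
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def rewrite_block(chain, index: int, new_data: str) -> int:
    """Change one block, then fix up every block from there to the end.

    Returns how many block hashes had to be recomputed.
    """
    chain[index]["data"] = new_data
    previous_hash = chain[index]["previous_hash"]
    recomputed = 0
    for block in chain[index:]:
        block["previous_hash"] = previous_hash
        block["hash"] = compute_hash(block["data"], previous_hash)
        previous_hash = block["hash"]
        recomputed += 1
    return recomputed


chain = build_chain([f"transaction {i}" for i in range(1000)])
print(rewrite_block(chain, 10, "forged transaction"))  # 990
```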

And this is how Hashing is used to implement immutability in a Blockchain!

Distributed P2P Network

A Blockchain network is a distributed Peer-to-Peer (P2P) network: multiple computers (aka nodes) are interconnected with each other, with no central authority. Here is an example:

Each node will have a copy of the Blockchain. Here you see a small network of 5 computers. But in reality, Blockchain networks can have anywhere from hundreds to thousands of computers connected to each other. And these computers don’t have to be high-end. Any average laptop or desktop can become a node in a blockchain network.

While each node has a copy of the blockchain, they cannot actually access someone else’s data without proper authorization.

When a node adds a block to the end of the blockchain, other nodes are notified by the network about it, and all the copies of the blockchain are updated accordingly.

It’s not just hackers that we need to worry about. Sometimes, glitches occur that can lead to different kinds of errors in one of the blockchain copies.

Due to Blockchain’s immutability, users cannot make changes to the blocks once they are added to the blockchain. How can we then correct the chain if one of its copies shows different data than the rest, without there being any malicious force behind it?

The Blockchain network is constantly syncing all the nodes with each other. So when one node’s copy of the Blockchain gets corrupted, the network will realize that something is wrong with this copy, and it will replace the corrupted blocks of that chain with the uncorrupted blocks from the other copies.

The only way to fool the Blockchain network is by making changes to a majority of the nodes in the network. So if a network has 100 nodes, the hacker will have to change more than 50 of them, which is still a huge number.
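A very simplified way to picture this syncing: every node reports its copy, and the copy that most nodes agree on wins. The sketch below is only that, a picture. Real networks rely on consensus protocols rather than a simple vote, each chain here is reduced to a list of made-up block hashes, and the sync function and node names are hypothetical.

```python
from collections import Counter


def sync(nodes):
    """Replace every node's copy with the copy held by a strict majority of nodes."""
    votes = Counter(tuple(chain) for chain in nodes.values())
    majority_chain, count = votes.most_common(1)[0]
    if count <= len(nodes) // 2:
        raise ValueError("no copy is shared by a majority of nodes")
    return {name: list(majority_chain) for name in nodes}


good_copy = ["hash-1", "hash-2", "hash-3"]
nodes = {
    "node-A": list(good_copy),
    "node-B": list(good_copy),
    "node-C": ["hash-1", "corrupted", "hash-3"],  # a glitch hit this copy
    "node-D": list(good_copy),
    "node-E": list(good_copy),
}

synced = sync(nodes)
print(synced["node-C"])  # ['hash-1', 'hash-2', 'hash-3']
```

Four of the five nodes agree, so the odd one out is overwritten with the majority copy.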

Hence the security of a Blockchain network increases as its number of users increases.

To be continued…

This seems like a good place to pause.

Stick around for more posts about Blockchain. In the next post, I will explain about:

Mining
Consensus

Thanks for reading this long post! I hope this post helped you understand Blockchains a little better. If you liked this post, then please do give me a few 👏 and please feel free to comment below. Cheers!

This concludes Part 1 of 2 of my series on Blockchain. Part 2 will be out soon!

Understanding The Elusive Blockchain Technology — Part 1 was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.

Publication date

02/11/2019 - 15:06