This project is a proof of concept for building a decentralized key-value store on top of IPFS. It was built as part of the course Building Scalable Blockchain Applications with Big Data Technology at the Hasso Plattner Institute.
The main focus of this project is a consensus algorithm that lets a network of nodes agree on a consistent order of insert transactions. The consensus algorithm is based on ELASTICO by Luu et al. (2016), which promises near-linear agreement throughput with an increasing node count. Akka Cluster is used for network communication. The results were evaluated using networks of different sizes with up to 200 nodes.
For development, you can run the service locally with additional dummy actors using:

```
gradle -p store appRun
```

This requires a running IPFS instance. To run the service with a mocked IPFS instance instead, run:

```
MOCK_IPFS="true" gradle -p store appRun
```
You can interact with the store through its REST API.

To run a multi-node network with Docker Compose:

1. Adjust `community-assigment-server/communityAssignmentService.py` to the number of communities you want to have.
2. Run `docker-compose -f docker-compose.yml build` in the root directory.
3. Run `docker-compose -f docker-compose.yml up --scale store=<number_of_nodes>`.
A store node consists of two major components, the Datastore Service and the Consensus Service. The Datastore Service handles incoming transactions and manages the in-memory index of the keys as well as the Transaction Log. The Consensus Service communicates with the network and takes part in the consensus rounds.
The Datastore Service manages the high-level interactions. It handles incoming read and write requests, interacts with IPFS, and forwards new transactions to the Consensus Service. The Datastore Service maintains an index containing the mapping of keys to IPFS content hashes, which are required to address the values stored in IPFS. Besides that, it manages the Transaction Log. The Transaction Log contains the immutable ordered list of all write transactions. Because all transactions go through a consensus round, the Transaction Log is eventually consistent on all nodes in the network.
The Consensus Service starts and takes part in consensus rounds. For network communication we use Akka Cluster together with its PubSub implementation. Once a previous consensus round is finished, the Consensus Service pulls the next pending transaction from the Transaction Backlog and starts a new consensus round. The consensus algorithm is based on a simplified version of ELASTICO using PBFT internally. Due to the scope of this seminar, a couple of simplifications have been made, such as leaving out dynamic community formation, identity management, and cryptographic protocols. Once the network agrees on the next transaction, the Consensus Service notifies the Datastore Service, which integrates the new transaction into its Transaction Log and local index.
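The round sequencing described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: class and method names are made up, and the PBFT voting over Akka PubSub is replaced by an immediate agreement callback.

```python
from collections import deque

class ConsensusServiceSketch:
    """Illustrative sketch: drives at most one consensus round at a time
    over a backlog of pending transactions."""

    def __init__(self):
        self.backlog = deque()   # pending transactions awaiting a round
        self.agreed = []         # transactions in their agreed order
        self.round_active = False

    def submit(self, tx):
        """A new transaction arrives from the Datastore Service."""
        self.backlog.append(tx)
        self._maybe_start_round()

    def _maybe_start_round(self):
        # A new round only starts once the previous one has finished.
        if not self.round_active and self.backlog:
            self.round_active = True
            tx = self.backlog.popleft()
            # In the real service a PBFT round runs here over Akka PubSub;
            # this sketch "agrees" immediately.
            self._on_agreed(tx)

    def _on_agreed(self, tx):
        self.agreed.append(tx)
        self.round_active = False
        self._maybe_start_round()
```

Because a round must finish before the next one starts, the agreed order matches the order in which transactions leave the backlog.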
When adding a new key-value pair to a node, the following steps are done:

1. The node stores the value in IPFS and receives its content hash.
2. The resulting transaction is forwarded to the Consensus Service and placed in the Transaction Backlog.
3. A consensus round establishes the transaction's position in the global order.
4. The agreed transaction is appended to the Transaction Log and the local index is updated.
When reading a key that has reached consistency in the network, the node only has to look up the respective content hash in its index. Using this content hash, it can return the value stored in IPFS. Note that in order to guarantee consistent reads, further measures such as quorum reads would have to be implemented.
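The write and read paths can be sketched in a few lines of Python with a mocked, content-addressed store standing in for IPFS. All names here are illustrative rather than the project's actual API, and the consensus round is elided: the write is applied directly.

```python
import hashlib

class MockIPFS:
    """Stand-in for a real IPFS daemon: a content-addressed blob store."""

    def __init__(self):
        self._blobs = {}

    def add(self, value: bytes) -> str:
        content_hash = hashlib.sha256(value).hexdigest()
        self._blobs[content_hash] = value
        return content_hash

    def cat(self, content_hash: str) -> bytes:
        return self._blobs[content_hash]

class DatastoreSketch:
    def __init__(self, ipfs):
        self.ipfs = ipfs
        self.index = {}            # key -> IPFS content hash
        self.transaction_log = []  # ordered, append-only list of writes

    def put(self, key: str, value: bytes) -> None:
        # 1. Store the value in IPFS to obtain its content hash.
        content_hash = self.ipfs.add(value)
        # 2. In the real system the (key, hash) transaction now goes
        #    through a consensus round; here it is applied directly.
        self.transaction_log.append((key, content_hash))
        self.index[key] = content_hash

    def get(self, key: str) -> bytes:
        # Reads only need an index lookup plus an IPFS fetch.
        return self.ipfs.cat(self.index[key])
```

Note how the value itself never enters the index or the log: both only carry content hashes, which keeps the consensus payload small.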
For our experiments, the Community Assignment Server simplifies the dynamic community assignment by registering new nodes joining the cluster and statically assigning a community number to each node.
The following HTTP GET routes are supported:

- `/join` – Get a community assignment
- `/members` – List all registered nodes and their assignments
- `/members/count` – Count the currently registered nodes
- `/clear` – Clear the current state of the server
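A minimal sketch of the assignment logic behind these routes might look like the following. The class and method names are made up, and the round-robin policy is an assumption based on the nodes being distributed equally between communities, not necessarily the server's actual implementation.

```python
class CommunityAssignmentSketch:
    """Illustrative static community assignment: nodes register once and
    receive a community number round-robin over registration order."""

    def __init__(self, number_of_communities: int):
        self.number_of_communities = number_of_communities
        self.members = {}  # node id -> assigned community number

    def join(self, node_id: str) -> int:
        # /join: assign the next community in round-robin order.
        community = len(self.members) % self.number_of_communities
        self.members[node_id] = community
        return community

    def members_count(self) -> int:
        # /members/count
        return len(self.members)

    def clear(self) -> None:
        # /clear: reset the server state between experiments.
        self.members.clear()
```

With this policy, `n` nodes and `c` communities yield communities of size `n / c` (up to rounding), matching the equal distribution used in the experiments.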
For our experiments, we initially set up two seed nodes and then dynamically added further nodes for the different experiment setups. Nodes can be deployed to various targets, including container engines; in our case, we used DigitalOcean with one `s-1vcpu-1gb` instance per store node. After all nodes had joined the network, we added new random key-value pairs to the nodes using round-robin scheduling. Finally, the start and end events of the consensus rounds for each transaction were pulled from the logs of all nodes. The duration of a consensus round is defined as the difference between the timestamp of the event starting the consensus round and the latest timestamp of a node agreeing on this transaction.
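The duration computation can be illustrated as follows. The event tuples and timestamps are made up for the example; the real data comes from the nodes' log files.

```python
from datetime import datetime

# Hypothetical log events: (node, tx_id, event, timestamp)
events = [
    ("node-1", "tx-42", "start",  datetime(2019, 1, 1, 12, 0, 0)),
    ("node-1", "tx-42", "agreed", datetime(2019, 1, 1, 12, 0, 1)),
    ("node-2", "tx-42", "agreed", datetime(2019, 1, 1, 12, 0, 3)),
]

def round_duration(events, tx_id):
    """Seconds from the round's start event to the LAST node agreeing."""
    start = min(t for _, tx, e, t in events if tx == tx_id and e == "start")
    latest_agree = max(t for _, tx, e, t in events if tx == tx_id and e == "agreed")
    return (latest_agree - start).total_seconds()
```

Taking the latest agreement timestamp means a single slow node stretches the measured duration, which is exactly the kind of straggler effect the experiments surface.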
The following table shows some of the results based on the experiment setup described above. For these experiments, the nodes are distributed equally across the communities.
| Setup | Median duration (s) | Mean duration (s) |
|---|---|---|
| 5 Nodes, 1 Community | 0.35 | 0.5 |
| 25 Nodes, 2 Communities | 0.59 | 0.89 |
| 100 Nodes, 25 Communities | 3.04 | 9.49 |
| 200 Nodes, 50 Communities | 17.89 | 45.64 |
The chart shows that the median and the mean grow at different rates. These differences might indicate increasing network overhead and the resulting outliers in consensus durations.
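The gap between median and mean as an outlier signal can be illustrated with Python's `statistics` module; the durations below are made up for the example, not measured values.

```python
from statistics import mean, median

# Illustrative consensus-round durations in seconds: four typical
# rounds plus one slow outlier round.
durations = [0.3, 0.35, 0.38, 0.4, 9.5]

m_median = median(durations)  # robust: ignores the single outlier
m_mean = mean(durations)      # pulled up strongly by the outlier
```

A mean far above the median, as in the 100- and 200-node rows of the table, is therefore consistent with a few very slow rounds rather than uniformly slower consensus.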
More details on all experiments and the raw log files can be found here.