This repo contains a distributed and replicated key-value (KV) store with strong consistency, using B+ trees for storage.

saqib22/distributed-replicated-kv-store

Distributed & Replicated p2p Data Storage System with Strong Consistency

How to Run the Service

ECS

docker run --rm -p 39595:39595 --name ms5-gr5-ecs-server --network ms5/gr5 ms4/gr5/ecs-server -a <Address> -p <Port> -ll <LogLevel>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)

Always make sure to run the ECS first, and then run as many KV-Servers as desired.

Server

docker run --rm -p 32933:32933 --name ms5-gr5-kv-server0 --network ms5/gr5 ms4/gr5/kv-server -a <Address> -p <Port> -ll <LogLevel> -d <Directory> -s <CachingStrategy> -c <CacheSize> -b <ECSAddress> <-sdc>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)
<Directory> : Storage directory where you want to save kv pairs.
<CachingStrategy> : Your desired caching strategy. (One of the following, "FIFO", "LRU", "LFU")
<CacheSize> : Size of Cache, in bytes
<ECSAddress> : IP Address and port of ECS server
<-sdc> : This is an optional flag that, if used, turns on stronger data consistency at the cost of performance.
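
The three caching strategies differ only in which entry they evict when the cache is full. The following is a minimal Python sketch of that difference, not the repository's implementation: the class name `SketchCache` is hypothetical, and capacity is counted in entries rather than bytes to keep the example short.

```python
from collections import Counter, OrderedDict


class SketchCache:
    """Tiny in-memory cache with a selectable eviction strategy.

    Hypothetical illustration: capacity is a number of entries (the real
    -c option is given in bytes) and no name here comes from the repo.
    """

    def __init__(self, strategy, capacity):
        if strategy not in ("FIFO", "LRU", "LFU"):
            raise ValueError(f"unknown strategy: {strategy}")
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.strategy = strategy
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order (recency order for LRU)
        self.hits = Counter()      # access counts, consulted only by LFU

    def get(self, key):
        if key not in self.data:
            return None
        self.hits[key] += 1
        if self.strategy == "LRU":
            self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            # Update in place; FIFO and LFU keep the original position.
            self.data[key] = value
            self.hits[key] += 1
            if self.strategy == "LRU":
                self.data.move_to_end(key)
            return
        if len(self.data) >= self.capacity:
            self._evict()
        self.data[key] = value
        self.hits[key] = 1

    def _evict(self):
        if self.strategy == "LFU":
            # Least frequently used; ties go to the oldest entry.
            victim = min(self.data, key=lambda k: self.hits[k])
        else:
            # FIFO: oldest inserted entry. LRU: least recently used entry.
            victim = next(iter(self.data))
        del self.data[victim]
        del self.hits[victim]
```

With a capacity of two, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `a` under FIFO but `b` under LRU and LFU.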

You can run as many KV-Servers as you like by giving each one a different <Address> <Port> pair and a different --name. For stronger data consistency on a server, add the -sdc argument when starting it. When more than two servers are running, replication is also turned on.
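
Starting several servers by hand means repeating the same command with a new name and port each time. A small Python helper can generate those commands. This is only a sketch: the function names, the default values, and the example address, directory, and port range are assumptions, and only the flags shown in the command above are used.

```python
import shlex


def kv_server_command(index, address, port, ecs_address, directory,
                      strategy="FIFO", cache_size=1024,
                      log_level="INFO", sdc=False):
    """Build the docker run argument list for one KV-Server.

    All default values are illustrative assumptions, not repo defaults.
    """
    argv = [
        "docker", "run", "--rm",
        "-p", f"{port}:{port}",
        "--name", f"ms5-gr5-kv-server{index}",  # must be unique per container
        "--network", "ms5/gr5",
        "ms4/gr5/kv-server",
        "-a", address, "-p", str(port), "-ll", log_level,
        "-d", directory, "-s", strategy, "-c", str(cache_size),
        "-b", ecs_address,
    ]
    if sdc:
        argv.append("-sdc")  # optional stronger-consistency flag
    return argv


def cluster_commands(count, first_port, **kwargs):
    """One command per server, with consecutive ports and distinct names."""
    return [kv_server_command(i, port=first_port + i, **kwargs)
            for i in range(count)]


if __name__ == "__main__":
    # Placeholder values; replace them with your own address and directory.
    for argv in cluster_commands(3, 32933, address="0.0.0.0",
                                 ecs_address="<ECSAddress>",
                                 directory="/data"):
        print(shlex.join(argv))
```

Building an argument list and quoting it with `shlex.join` keeps values containing spaces or shell metacharacters safe to paste into a shell.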

Client

docker run --rm -p 39565:39565 --name ms5-gr5-client --network ms5/gr5 ms4/gr5/client -a <Address> -p <Port> -ll <LogLevel>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)

The client provides a command-line interface for interacting with the KV-Servers. Run the client, then enter help for the list of available commands.

Running FIFO Consistency

To run the servers with FIFO consistency, run them as described above with the -sdc argument. The servers will start with FIFO consistency turned on. You can check this status in the logs generated by the server.
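
FIFO consistency is commonly understood to mean that every replica observes the writes of a single sender in the order that sender issued them. The Python sketch below shows one way a replica could enforce this, assuming each sender tags its updates with a sequence number starting at 0. The class name and the sequence-number scheme are assumptions; the source does not describe how the servers implement it.

```python
class FifoReplica:
    """Applies each sender's updates strictly in sequence-number order.

    Hypothetical sketch: updates that arrive early are held back until
    all earlier updates from the same sender have been applied.
    """

    def __init__(self):
        self.store = {}
        self.next_seq = {}  # sender -> next expected sequence number
        self.pending = {}   # sender -> {seq: (key, value)} held back

    def receive(self, sender, seq, key, value):
        """Handle one update; return the sequence numbers applied now."""
        expected = self.next_seq.get(sender, 0)
        if seq < expected:
            return []  # duplicate of an update that was already applied
        queue = self.pending.setdefault(sender, {})
        queue[seq] = (key, value)
        applied = []
        # Drain the buffer for as long as the next expected update is there.
        while expected in queue:
            k, v = queue.pop(expected)
            self.store[k] = v
            applied.append(expected)
            expected += 1
        self.next_seq[sender] = expected
        return applied
```

If update 1 from a sender arrives before update 0, the replica buffers it and applies both once update 0 arrives, so the later write always wins.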

Running the Leader Election Protocol

To test the leader election protocol, use the following example configuration as a template. (Note: this example is only a demonstration; the protocol works for any number of servers and any ECS configuration.)

Start the ECS

docker run --rm -p 39595:39595 --name ms5-gr5-ecs-server --network ms5/gr5 ms4/gr5/ecs-server -a <Address> -p <Port> -ll <LogLevel>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)

Start KV1

docker run --rm -p 32933:32933 --name ms5-gr5-kv-server1 --network ms5/gr5 ms4/gr5/kv-server -a <Address> -p <Port> -ll <LogLevel> -d <Directory> -s <CachingStrategy> -c <CacheSize> -b <ECSAddress> <-sdc>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)
<Directory> : Storage directory where you want to save kv pairs.
<CachingStrategy> : Your desired caching strategy. (One of the following, "FIFO", "LRU", "LFU")
<CacheSize> : Size of Cache, in bytes
<ECSAddress> : IP Address and port of ECS server
<-sdc> : This is an optional flag that, if used, turns on stronger data consistency at the cost of performance.

Start KV2

docker run --rm -p 32934:32934 --name ms5-gr5-kv-server2 --network ms5/gr5 ms4/gr5/kv-server -a <Address> -p <Port> -ll <LogLevel> -d <Directory> -s <CachingStrategy> -c <CacheSize> -b <ECSAddress> <-sdc>

<Address> : IP Address where you want to run the server
<Port> : Port where you want to run the service
<LogLevel> : Logging level configuration (INFO, SEVERE, etc.)
<Directory> : Storage directory where you want to save kv pairs.
<CachingStrategy> : Your desired caching strategy. (One of the following, "FIFO", "LRU", "LFU")
<CacheSize> : Size of Cache, in bytes
<ECSAddress> : IP Address and port of ECS server
<-sdc> : This is an optional flag that, if used, turns on stronger data consistency at the cost of performance.

Now stop the ECS server and wait for the protocol to finish. To check the new ECS logs after the restart, refer to echo.log on the KV instance that was elected as the leader.

Evaluation

Effect of Cache sizes on PUT

There doesn't seem to be much of a difference between the three caching strategies for a small number of KV-servers. As the number grows, the FIFO cache provides the best performance.

Figure: Put_Cache

Effect of Consistency Level on PUT

Increasing the level of data consistency degrades performance quite significantly. This is because FIFO consistency requires requests to be processed at the replicas before the result is confirmed to the client.

Figure: Put_Consistency
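
The cost of confirming a PUT only after the replicas have processed it can be illustrated with a simple latency model. The Python sketch below assumes the replicas are contacted in parallel, so the client waits for the slowest one. The function name and all numbers are hypothetical and are not measurements from these experiments.

```python
def put_latency_ms(local_write_ms, replica_round_trips_ms, wait_for_replicas):
    """Client-visible latency of one PUT under a simple model.

    Hypothetical model: replicas are updated in parallel, so waiting for
    them adds the slowest round trip on top of the local write.
    """
    if wait_for_replicas and replica_round_trips_ms:
        return local_write_ms + max(replica_round_trips_ms)
    return local_write_ms


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(put_latency_ms(2, [5, 9], wait_for_replicas=False))
    print(put_latency_ms(2, [5, 9], wait_for_replicas=True))
```

In this model a GET is unaffected, because it is answered from a single server without waiting for any replica.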

Effect of Cache sizes on GET

As with PUT requests, there isn't much of a difference with a small number of KV-servers. However, as this number increases, the FIFO strategy seems to perform best, while LRU's performance degrades sharply.

Figure: Get_Cache

Effect of Consistency Level on GET

The level of consistency doesn't have much of an effect on the performance of GET requests, as expected. The small difference in our experiments can mostly be attributed to hardware noise or other minute differences in experimental conditions.

Figure: Get_Consistency

Effect of Clustered B+ Trees vs B+ Trees

Clustered B+ trees outperform B+ trees for any number of servers when repartitioning. However, this effect is most pronounced with a small number of servers and shrinks as more servers are added to the ring. This is because less data is moved around as the number of servers increases, since every chunk in the hash ring gets smaller.

Figure: CBTrees
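
The repartitioning argument can be made concrete with a small hash ring. The Python sketch below places servers on a ring and reports which keys change owner when a server joins. Using MD5 for ring positions and the class and function names are assumptions; the source does not say how the ring is built.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a string to a ring position (MD5 is an assumption here)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical ring: a key belongs to the next server clockwise."""

    def __init__(self, servers):
        self.points = sorted((ring_position(s), s) for s in servers)
        self.positions = [p for p, _ in self.points]

    def owner(self, key):
        """First server at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self.positions, ring_position(key))
        return self.points[i % len(self.points)][1]


def moved_keys(keys, servers, new_server):
    """Keys whose owner changes when new_server joins the ring."""
    before = HashRing(servers)
    after = HashRing(list(servers) + [new_server])
    return [k for k in keys if before.owner(k) != after.owner(k)]
```

Only the keys that fall into the new server's range move, and all of them move to the new server. With positions spread evenly, each of n servers covers roughly 1/n of the ring, so that range shrinks as the ring grows.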
