Introduction to In-Memory Data Grids


In-memory data grids are gaining a lot of attention in the developer community because of their high performance and dynamic scalability. In this article, we will look at In-Memory Data Grids (IMDG) in more detail.

In-Memory Data Grids share the following properties:

  • They store large volumes of data in memory.
  • The data lives in a distributed cluster; there is no master-slave topology.
  • The data model is object-based, not relational.
  • They provide ACID support.

Popular IMDG solutions include Hazelcast, Oracle Coherence, Infinispan, GemFire, and GridGain.

In an earlier article, we saw Memcached used as a cache to improve application performance. In-Memory Data Grids (IMDG) serve a similar purpose, so why do we need them? Memcached is well suited for read-mostly data, whereas IMDGs are suitable for read-write data. The servers in a Memcached cluster are independent (they don't know about each other), so if one server goes down, its data is lost. In an IMDG cluster, the servers know about each other and the data is partitioned across them: each node carries some of the data along with a backup of data owned by other nodes. If one server leaves the IMDG cluster, its data is redistributed from the backups to the remaining live servers, so no data is lost. Data can be persisted with both Memcached and IMDGs, but with an IMDG the persistence can happen asynchronously. With Memcached, we first need to persist the data and then warm up the cache, which is synchronous.
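As a sketch of how the backup and asynchronous-persistence behavior described above is configured in Hazelcast (the IMDG covered below), both can be set per map in hazelcast.xml. The map name and MapStore class below are illustrative assumptions, not part of the article's code:

```xml
<hazelcast>
    <map name="customers">
        <!-- keep one synchronous backup copy of each partition on another node -->
        <backup-count>1</backup-count>
        <map-store enabled="true">
            <!-- hypothetical persistence class; a write-delay-seconds greater
                 than zero makes Hazelcast persist asynchronously (write-behind) -->
            <class-name>org.smarttechie.persistence.CustomerMapStore</class-name>
            <write-delay-seconds>5</write-delay-seconds>
        </map-store>
    </map>
</hazelcast>
```

With write-delay-seconds set to 0, the same MapStore would be called synchronously on every write (write-through) instead.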

Now, we will look at one of the IMDG solutions, Hazelcast, and see how it fits into our application architecture.

Application Architecture with Hazelcast

To use Hazelcast, first download the Hazelcast distribution from the Hazelcast website. The available version at the time of writing is 3.1.2. Put the hazelcast-3.1.2.jar and hazelcast-client-3.1.2.jar files on the classpath. The sample code below starts a Hazelcast server; if we run the same code multiple times, the instances will form a Hazelcast cluster.


package org.smarttechie.server;

import com.hazelcast.core.Hazelcast;

public class HazelcastServer {

    /**
     * Starts a Hazelcast cluster member.
     * @param args
     */
    public static void main(String[] args) {
        // passing null makes Hazelcast use hazelcast.xml or the built-in defaults
        Hazelcast.newHazelcastInstance(null);
    }
}

Once we run the above code, we will see the below output.

INFO: [127.0.0.1]:5701 [dev] Hazelcast Community Edition 3.1.2 (20131120) starting at Address[127.0.0.1]:5701
Nov 28, 2013 5:14:59 PM com.hazelcast.system
INFO: [127.0.0.1]:5701 [dev] Copyright (C) 2008-2013 Hazelcast.com
Nov 28, 2013 5:14:59 PM com.hazelcast.instance.Node
INFO: [127.0.0.1]:5701 [dev] Creating MulticastJoiner
Nov 28, 2013 5:14:59 PM com.hazelcast.core.LifecycleService
INFO: [127.0.0.1]:5701 [dev] Address[127.0.0.1]:5701 is STARTING
Nov 28, 2013 5:15:09 PM com.hazelcast.cluster.MulticastJoiner
INFO: [127.0.0.1]:5701 [dev]

Members [1] {
    Member [127.0.0.1]:5701 this
}

Nov 28, 2013 5:15:09 PM com.hazelcast.core.LifecycleService
INFO: [127.0.0.1]:5701 [dev] Address[127.0.0.1]:5701 is STARTED

The above output indicates that there is only one server in the Hazelcast cluster. Now, if we run the server code again, we will see one more server instance added to the cluster.


Members [2] {
Member [127.0.0.1]:5701
Member [127.0.0.1]:5702 this
}

Now, we will try to write Hazelcast client code to interact with the cluster.

package org.smarttechie.client;

import java.util.Map;

import org.smarttechie.model.Customer;

import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class HazelcastClient {

    /**
     * Connects to the Hazelcast cluster and stores a few customers.
     * @param args
     */
    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        clientConfig.getGroupConfig().setName("dev").setPassword("dev-pass");
        clientConfig.addAddress("localhost:5701");

        // The client supports all cluster operations available on an ordinary HazelcastInstance
        HazelcastInstance client = com.hazelcast.client.HazelcastClient.newHazelcastClient(clientConfig);
        Map<String, Customer> mapCustomers = client.getMap("customers");
        mapCustomers.put("1", new Customer("1", "xyz", "xyz", "xyz"));
        mapCustomers.put("2", new Customer("2", "abc", "abc", "abc"));
        mapCustomers.put("3", new Customer("3", "123", "123", "123"));

        Map<String, Customer> colCustomers = client.getMap("customers");
        System.out.println(colCustomers.size());
    }
}
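The Customer class referenced by the client is not shown in the article. A minimal sketch is given below; the field names are assumptions inferred from the four-argument String constructor, and the class must be serializable so Hazelcast can move it across the cluster:

```java
import java.io.Serializable;

// Hypothetical sketch of the Customer model (belongs in org.smarttechie.model).
// Field names are assumptions; only the four-String constructor is known.
public class Customer implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String id;
    private final String firstName;
    private final String lastName;
    private final String city;

    public Customer(String id, String firstName, String lastName, String city) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.city = city;
    }

    public String getId() { return id; }
    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public String getCity() { return city; }
}
```

Java serialization is the simplest option; Hazelcast also supports faster custom serialization mechanisms for production use.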

Once the client code runs, the partition service will start and distribute the data across the nodes. The output of the server node is given below.


Nov 28, 2013 5:25:58 PM com.hazelcast.partition.PartitionService
INFO: [127.0.0.1]:5702 [dev] Initializing cluster partition table first arrangement...

Now, we will bring down one of the server nodes in the cluster. The data that was on that node will get redistributed from the backups to the live server node. The log of the live server node is given below.


Nov 28, 2013 5:42:38 PM com.hazelcast.cluster.ClusterService
INFO: [127.0.0.1]:5702 [dev] Removing Member [127.0.0.1]:5701
Nov 28, 2013 5:42:38 PM com.hazelcast.cluster.ClusterService
INFO: [127.0.0.1]:5702 [dev]

Members [1] {
    Member [127.0.0.1]:5702 this
}

Nov 28, 2013 5:42:40 PM com.hazelcast.partition.PartitionService
INFO: [127.0.0.1]:5702 [dev] Partition balance is ok, no need to re-partition cluster data...
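Conceptually, the failover behavior shown in the log above can be sketched in plain Java, with no Hazelcast dependency. The maps below are stand-ins for partition replicas, not Hazelcast APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of IMDG failover: node 2 holds its own primary entries
// plus a backup copy of node 1's entries. When node 1 dies, node 2 promotes
// the backup to primary, so no data is lost.
public class BackupRedistributionSketch {

    public static void main(String[] args) {
        Map<String, String> node1Primary = new HashMap<>();
        Map<String, String> node2Primary = new HashMap<>();
        Map<String, String> node2Backup = new HashMap<>();

        // A write lands on its primary owner and is copied to a backup node.
        node1Primary.put("1", "xyz");
        node2Backup.put("1", "xyz");
        node2Primary.put("2", "abc");

        // Node 1 goes down: the surviving node promotes its backup copies.
        node2Primary.putAll(node2Backup);
        node2Backup.clear();

        System.out.println(node2Primary.size()); // prints 2: nothing was lost
    }
}
```

In a real cluster Hazelcast also re-creates fresh backups of the promoted data on the remaining members, which is why the log reports the partition balance check.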

The code used in this article is available here.

For further information, you can go through the references below.

http://highscalability.com/blog/2011/12/21/in-memory-data-grid-technologies.html

http://www.infoq.com/articles/in-memory-data-grids


Siva Janapati is an Architect with experience in building Cloud Native Microservices architectures, Reactive Systems, large-scale distributed systems, and Serverless Systems. Siva has hands-on experience in the architecture, design, and implementation of scalable systems using Cloud, Java, Go lang, Apache Kafka, Apache Solr, Spring, Spring Boot, the Lightbend reactive tech stack, APIGEE edge & on-premise, and other open-source and proprietary technologies. He has expertise in working with and building RESTful and GraphQL APIs. He has successfully delivered multiple applications in the retail, telco, and financial services domains. He manages the GitHub account (https://github.com/2013techsmarts) where he puts the source code related to his blog posts.
