Thursday, April 16, 2015

ElasticSearch and Mongo through river – scalable data store and search engine platform

1. MONGODB
MongoDB is an open-source document database with built-in replication, high availability, auto-sharding and map/reduce mechanisms. Install MongoDB by following the instructions at
http://docs.mongodb.org/manual/tutorial/install-mongodb-on-os-x/

2. ELASTICSEARCH
Elasticsearch is a powerful open-source, real-time search and analytics engine, designed from the ground up for distributed environments where reliability and scalability are must-haves. It looks great as a search engine.

Download Elasticsearch from https://www.elastic.co, unpack it, and point ES_HOME at the install directory from the terminal:
export ES_HOME=/Users/xxx/Downloads/elasticsearch-1.4.2

3. MONGODB RIVER PLUGIN FOR ES
Elasticsearch lets you extend its basic functionality with plugins, which are easy to use and develop. They can be used for analysis, discovery, monitoring, data synchronization and much more. Rivers are a group of plugins used for data synchronization between a database and Elasticsearch.
Two plugins are needed. The first is a dependency called Mapper Attachments. You can install it via the ES plugin script:
$ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/2.4.3

The second plugin is the ES 'river' for Mongo. The syntax to install it is slightly different as it's a third-party plugin:
$ES_HOME/bin/plugin -install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.9

Restart the server 
                        sh elastic search restart

Elasticsearch is up and running, 3 minutes in my case. A second machine is recommended so that MongoDB and Elasticsearch do not share resources, but for test deployments a single machine is good enough.

Tell Elasticsearch to index the “person” collection in the testmongo database by issuing the following command in your terminal.

    "type": "mongodb",
    "mongodb": {
        "db": "testmongo",
        "collection": "person"
    },
    "index": {
        "name": "mongoindex",
        "type": "person"
    }
}'
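
For readers who would rather register the river from application code, the request above can be sketched in Java. This is a minimal sketch, not the plugin's own API: it only builds the request path and JSON body, and it assumes the Elasticsearch 1.x convention of storing a river definition under /_river/<name>/_meta. The class name RiverDefinition and the river name my_river are hypothetical placeholders, not values from this post.

```java
// Sketch only: builds the path and JSON body for registering a MongoDB river.
// "my_river" below is a hypothetical river name, not a value from this post.
class RiverDefinition {

    // Path of the river's _meta document (Elasticsearch 1.x river convention).
    static String metaPath(String riverName) {
        return "/_river/" + riverName + "/_meta";
    }

    // JSON body with the same fields as the command above.
    // Assumes the arguments are plain identifiers that need no JSON escaping.
    static String metaBody(String db, String collection, String indexName, String indexType) {
        return "{"
            + "\"type\": \"mongodb\", "
            + "\"mongodb\": {\"db\": \"" + db + "\", \"collection\": \"" + collection + "\"}, "
            + "\"index\": {\"name\": \"" + indexName + "\", \"type\": \"" + indexType + "\"}"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println("PUT " + metaPath("my_river"));
        System.out.println(metaBody("testmongo", "person", "mongoindex", "person"));
    }
}
```

The resulting body can then be sent with any HTTP client as a PUT request to the Elasticsearch node.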

We’ve got Elasticsearch automatically synchronizing data with MongoDB, 2 minutes in my case.

4. SET UP MONGODB AS A REPLICA SET

The river needs MongoDB to run as a replica set. To set one up on the local host (Mac OS X), start mongod with the --replSet option:

mongod --port 27017 --dbpath /data/db --replSet rs0

Then, from the mongo shell, initiate the new replica set:

rs.initiate()

Create a Mongo database: testmongo
Create a Mongo collection under testmongo: person

5. NOW FINALLY START THE SEARCH

Search the data from the terminal, or open the search URL directly in a browser, to see a response like the one below.

Response from Elasticsearch:

{
  "took": 42,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.4054651,
    "hits": [
      {
        "_index": "mongoindex",
        "_type": "person",
        "_id": "552ee22d9f829d905d5f180f",
        "_score": 1.4054651,
        "_source": {
          "lastName": "Doe",
          "_id": "552ee22d9f829d905d5f180f",
          "firstName": "John"
        }
      }
    ]
  }
}

After a document is inserted into MongoDB configured as a replica set, the operation is also recorded in the oplog collection. The oplog (operations log) is a capped collection that keeps a rolling record of all operations that modify the data stored in the databases. The river plugin monitors this collection and forwards new operations to Elasticsearch according to its configuration. That means all insert, update and delete operations are forwarded to Elasticsearch automatically.
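
The oplog-forwarding idea can be illustrated with a toy model in Java. This is only a sketch under simplifying assumptions: it is not the river plugin's code, the log and the index are plain in-memory collections, and the names OplogSketch, CappedLog, Op and replay are hypothetical. The one-letter operation codes follow the oplog's own convention ("i" for insert, "u" for update, "d" for delete).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model only: in-memory stand-ins for the oplog and the river, not the plugin's code.
class OplogSketch {

    // One logged operation: "i" = insert, "u" = update, "d" = delete.
    static final class Op {
        final String kind;
        final String id;
        final String doc;

        Op(String kind, String id, String doc) {
            this.kind = kind;
            this.id = id;
            this.doc = doc;
        }
    }

    // Capped log: once full, the oldest entry is dropped to make room for the newest.
    static final class CappedLog {
        private final int capacity;
        private final Deque<Op> entries = new ArrayDeque<>();

        CappedLog(int capacity) {
            this.capacity = capacity;
        }

        void append(Op op) {
            if (entries.size() == capacity) {
                entries.removeFirst();
            }
            entries.addLast(op);
        }

        Deque<Op> entries() {
            return entries;
        }
    }

    // The "river": replays logged operations, oldest first, against a map
    // that stands in for the search index.
    static Map<String, String> replay(CappedLog log) {
        Map<String, String> index = new HashMap<>();
        for (Op op : log.entries()) {
            if (op.kind.equals("d")) {
                index.remove(op.id);
            } else {
                index.put(op.id, op.doc);
            }
        }
        return index;
    }
}
```

Appending an insert, an update and a delete to the log and then calling replay leaves the stand-in index in the same state as the source data, which is the behaviour described above.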

We can easily check what we have in ES using the head plugin, which can be installed from the Elasticsearch bin directory with the command:
./plugin -install mobz/elasticsearch-head

Some Elasticsearch plugins provide a web interface that can be reached under the /_plugin endpoint; the head plugin, for example, is served at /_plugin/head/.

To summarize: we have MongoDB configured as a replica set and Elasticsearch with a river that pulls data from the database into the index, and everything is prepared for sharding and replication.

Tuesday, April 14, 2015

Singleton Pattern

Singleton Pattern in Java

The Singleton pattern ensures a class has only one instance and provides a global point of access to it.
The constructor of the class is made private, which prevents other classes from instantiating it directly.

The method that returns the instance is declared static, which makes it a class-level method that can be called without first creating an object.

public class BookingFactory {

    // The single shared instance, created lazily on first use.
    private static BookingFactory instance;
    // An instance attribute.
    private int data = 0;

    private BookingFactory() {
        // Initialize any other attributes if needed.
    }

    public static BookingFactory getInstance() {
        if (instance == null) {
            instance = new BookingFactory();
        }
        return instance;
    }

    public int getData() {
        return data;
    }

    public void setData(int data) {
        this.data = data;
    }
    // other methods....
}

If a future requirement calls for more than one instance, the singleton class can be changed to manage several instances without affecting its clients. All you need is a small change inside the singleton class; the client code stays the same.

Note that the singleton instance is only created when needed. This is called lazy instantiation.

public class SingletonDemo {

    public static void main(String[] args) {
        // Get a reference to the single instance of BookingFactory.
        BookingFactory bookingFactory = BookingFactory.getInstance();

        // Set the data value.
        bookingFactory.setData(34);

        System.out.println("First reference: " + bookingFactory);
        System.out.println("Singleton data value is: " + bookingFactory.getData());
    }
}

How can we break Singleton:

1. The access method may be called by two threads at the same time, so that more than one object is created. This violates the principle of the design pattern.

Solution :

To prevent two threads from invoking the access method at the same time, add the synchronized keyword to the method declaration:

public static synchronized BookingFactory getInstance()

Synchronization is expensive, however, and is really only needed the first time, when the unique instance is created.
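
One well-known way to pay the locking cost only during first creation is double-checked locking. The post itself does not use this technique, so the following is a hedged sketch rather than the author's approach; LazyBookingFactory is a hypothetical class name, and the sketch assumes Java 5 or later, where a volatile field gives the visibility guarantees the idiom depends on.

```java
// Sketch of double-checked locking; LazyBookingFactory is a hypothetical name.
class LazyBookingFactory {

    // volatile ensures a fully constructed instance is visible to all threads (Java 5+).
    private static volatile LazyBookingFactory instance;

    private LazyBookingFactory() {
        // Prevents direct instantiation by other classes.
    }

    static LazyBookingFactory getInstance() {
        LazyBookingFactory local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (LazyBookingFactory.class) {
                local = instance;                     // second check, under the lock
                if (local == null) {
                    local = new LazyBookingFactory();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

After the instance exists, callers return on the first check and never enter the synchronized block.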

Alternatively, use eager instantiation rather than lazy instantiation.

You can instantiate the singleton as soon as the class loads, e.g. in a static block, to ensure that it happens only once.

public class BookingFactory {

    private static BookingFactory instance;
    // An instance attribute.
    private int data = 0;

    // Runs once, when the class is initialized.
    static {
        instance = new BookingFactory();
    }

    private BookingFactory() {
        // Initialize any other attributes if needed.
    }

    public static BookingFactory getInstance() {
        return instance;
    }

    public int getData() {
        return data;
    }

    public void setData(int data) {
        this.data = data;
    }
    // other methods....
}

Instead of synchronizing the whole method, you can also declare the instance variable static final and initialize it where it is declared. This is thread-safe because the JVM initializes static fields exactly once, when the class is initialized, which happens the first time the class is used. You get a thread-safe implementation without any explicit synchronization, and the instance is not created until the class is first used.

public class BookingFactory {

    private static final BookingFactory instance = new BookingFactory();

    private BookingFactory() {
        // Exists only to defeat instantiation.
    }

    public static BookingFactory getInstance() {
        return instance;
    }
}

But you lose the flexibility of allowing more than one instance in the future without changing the clients' code.
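
If creation should be deferred until getInstance() is first called, rather than whenever the class itself is first initialized, a common variant is the initialization-on-demand holder idiom. It is not part of the original post, so treat this as a sketch; HolderBookingFactory and Holder are hypothetical names.

```java
// Sketch of the initialization-on-demand holder idiom; names are hypothetical.
class HolderBookingFactory {

    private HolderBookingFactory() {
        // Prevents direct instantiation by other classes.
    }

    // Holder is not initialized until getInstance() first reads Holder.INSTANCE,
    // and the JVM guarantees that this initialization happens exactly once.
    private static final class Holder {
        static final HolderBookingFactory INSTANCE = new HolderBookingFactory();
    }

    static HolderBookingFactory getInstance() {
        return Holder.INSTANCE;
    }
}
```

The idiom relies on the same class-initialization guarantee as a static final field, so no synchronized keyword is needed.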

2. If you are using multiple class loaders, this can defeat the singleton implementation and result in multiple instances.

Solution:

Because multiple class loaders are commonly used in many situations, including servlet containers, you can wind up with multiple singleton instances no matter how carefully you've implemented your singleton classes. If you want to make sure the same class loader loads your singletons, you must specify the class loader yourself; for example:

private static Class<?> getClass(String classname)
        throws ClassNotFoundException {
    // Prefer the class loader associated with the current thread.
    ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
    if (classLoader == null) {
        // Fall back to the loader that loaded the singleton base class.
        classLoader = Singleton.class.getClassLoader();
    }
    return classLoader.loadClass(classname);
}

The preceding method first tries the class loader associated with the current thread; if that class loader is null, the method uses the same class loader that loaded a singleton base class (Singleton here). The preceding method can be used instead of Class.forName().

3. If SingletonClass implements the java.io.Serializable interface, the class's instances can be serialized and deserialized. However, if you serialize a singleton object and subsequently deserialize that object more than once, you will have multiple singleton instances. 

Solution:

To avoid this, implement the readResolve() method:

private Object readResolve() {
    return INSTANCE;
}

The previous singleton implementation returns the lone singleton instance from the readResolve() method; therefore, whenever the Singleton class is deserialized, it will return the same singleton instance.
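
The effect of readResolve() can be checked with a small round trip through Java serialization. The following is a self-contained sketch; SerializableSingleton and its roundTrip helper are hypothetical names introduced only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch only: SerializableSingleton is a hypothetical class showing readResolve() at work.
class SerializableSingleton implements Serializable {

    private static final long serialVersionUID = 1L;
    private static final SerializableSingleton INSTANCE = new SerializableSingleton();

    private SerializableSingleton() {
        // Prevents direct instantiation by other classes.
    }

    static SerializableSingleton getInstance() {
        return INSTANCE;
    }

    // Called during deserialization; its result replaces the freshly read object.
    private Object readResolve() {
        return INSTANCE;
    }

    // Serializes the given object to a byte array and reads it back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With readResolve() in place, roundTrip(getInstance()) hands back the very same object; without it, deserialization would produce a second instance.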

4. A copy of the object can be created by cloning it using Object's clone method, provided the singleton class (or one of its superclasses) implements Cloneable:

SingletonDemo clonedObject = (SingletonDemo) obj.clone();

This again violates the Singleton design pattern's objective.

Solution:

Override Object's clone method so that it throws a CloneNotSupportedException, as shown below.

@Override
public Object clone() throws CloneNotSupportedException {
    throw new CloneNotSupportedException();
}
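
To see the override in context, here is a minimal sketch of a singleton that is marked Cloneable purely for demonstration and still refuses to be cloned. NonCloneableSingleton and cloneIsRejected are hypothetical names, not part of the classes shown earlier.

```java
// Sketch only: NonCloneableSingleton is a hypothetical class showing the clone() override in context.
class NonCloneableSingleton implements Cloneable {

    private static final NonCloneableSingleton INSTANCE = new NonCloneableSingleton();

    private NonCloneableSingleton() {
        // Prevents direct instantiation by other classes.
    }

    static NonCloneableSingleton getInstance() {
        return INSTANCE;
    }

    // Refuses to clone even though the class is (for demonstration) marked Cloneable.
    @Override
    public Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException("singleton cannot be cloned");
    }

    // Returns true when an attempt to clone the singleton is rejected.
    static boolean cloneIsRejected() {
        try {
            getInstance().clone();
            return false;
        } catch (CloneNotSupportedException e) {
            return true;
        }
    }
}
```

Any caller that tries to clone the instance gets a CloneNotSupportedException instead of a second object.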

When to use:
Use a singleton for objects such as thread pools and caches, where a single shared instance avoids wasting resources.

Java Server Side Templating using Mustache.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

import com.samskivert.mustache.Mustache;
import com.samskivert.mustache.Template;

/**
 * Simple java server side templating
 * @author Samar Aarkotti
 *
 */
public class Sample {

    public static void main(String[] args) {
        // JSON data that supplies the values for the template variables.
        String json = "{\"products\": \"iPhone\"}";
        // Mustache template text; {{products}} is replaced with the matching value.
        String text = "How do I get my {{products}} serviced?";
        Template tmpl = Mustache.compiler().compile(text);
        try {
            JSONObject jsonObj = new JSONObject(json);
            Map<String, Object> map = getTemplateFromJson(jsonObj);
            System.out.println(tmpl.execute(map));
        } catch (JSONException e) {
            e.printStackTrace();
        }
    }

private static Map<String, Object> getTemplateFromJson(JSONObject json) throws JSONException {
    Map<String,Object> out = new HashMap<String,Object>();
    Iterator<?> it = json.keys(); 
    while (it.hasNext()) {
        String key = (String)it.next();

        if (json.get(key) instanceof JSONArray) {

            // Copy an array
            JSONArray arrayIn = json.getJSONArray(key);
            List<Object> arrayOut = new ArrayList<Object>();
            for (int i = 0; i < arrayIn.length(); i++) {
                JSONObject item = (JSONObject)arrayIn.get(i);
                Map<String, Object> items = getTemplateFromJson(item);
                arrayOut.add(items);
            }
            out.put(key, arrayOut);
        }
        else {

            // Copy a primitive string
            out.put(key, json.getString(key));
        }
    }
    return out;
}

}

Expected Output : How do I get my iPhone serviced?

Create ElasticSearch cluster on single machine

I wanted to figure out how to create a multi-node ElasticSearch cluster on a single machine. So I followed these instructions. First I did...