April 1, 2020

How to Set Up Clustering In CQ/AEM 6 using MongoDB

Background:
With CQ / AEM 6, TarPM is no longer supported. AEM 6 ships with Oak, which for now supports the TarMK and MongoMK microkernels out of the box (OOTB). More information about what is new can be found at http://www.slideshare.net/AEMHub2014/oak-michael-marth . With this change, support for clustering moves to the storage layer itself (which makes more sense, given all the clustering issues that had to be supported in earlier versions). TarMK has no replication or sharding feature, so clustering comes down to MongoDB, which supports both and hence enables high availability (HA, through replication) and scalability (through sharding, though this is still an open question; see the notes below) in CQ / AEM 6.

Here are step-by-step instructions for setting up clustering using MongoDB in CQ / AEM 6.

Prerequisite:
Make sure that you have the CQ / AEM 6 jar file
Install MongoDB using the instructions at http://docs.mongodb.org/manual/installation/
Make sure that MongoDB is up and running
Read about replication: http://docs.mongodb.org/manual/core/replication-introduction/
Read about deployment strategy: http://docs.adobe.com/docs/en/aem/6-0/deploy/recommended-deploys.html

There are two cases for setting up a replica set:
Set up a new MongoDB instance:
Set up the additional MongoDB instances using the installation instructions above
Start any one of the instances using ./mongod --port <Your Port> --dbpath <Your DB Path> --replSet <Replica Set Name, can be anything> & (the other instances must be started with the same --replSet name before they can be added)
You can also use a configuration file to do that. More instructions here: http://docs.mongodb.org/manual/tutorial/deploy-replica-set/
Once MongoDB is started, you can add the additional replica set members using the following instructions

cd <Your Mongo bin Location>
# You can also configure the port using a configuration file
./mongo --port <PORT>
rs.initiate()
# Check the config; at this point only the current instance is a member of the replica set
rs.conf()
# This is your second Mongo instance. Make sure that it is already set up and running
rs.add("<HOST>:<PORT>")
# This is your third Mongo instance (you need an odd number of voting members for elections to work reliably)
rs.add("<HOST>:<PORT>")
# If you do not have space or do not want an additional data-bearing instance, you can add an arbiter
# An arbiter is another Mongo instance which does not store data but takes part in the election process when one Mongo instance is down
# You can add an arbiter using the following command
# rs.addArb("<HOST>:<PORT>")
# Check the config to make sure that everything is right
rs.conf()
# Check the status of replication
rs.status()
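
As an alternative to calling rs.add() once per member, rs.initiate() also accepts a configuration document that lists all members up front. The following is a minimal sketch, not part of the original steps: the helper name rs_initiate_doc and the example.com hosts are made up for illustration, and the script only prints the command so you can review it before pasting it into the mongo shell. The _id value must match the name passed to --replSet.

```shell
#!/bin/sh
# Hypothetical helper: print an rs.initiate() call that lists every member up front.
# Usage: rs_initiate_doc <replica set name> <host:port> [<host:port> ...]
rs_initiate_doc() {
    name=$1
    shift
    members=""
    id=0
    for member in "$@"; do
        # Separate member entries with a comma after the first one
        [ -n "$members" ] && members="$members, "
        members="${members}{ _id: $id, host: \"$member\" }"
        id=$((id + 1))
    done
    printf 'rs.initiate({ _id: "%s", members: [ %s ] })\n' "$name" "$members"
}

# Example with placeholder hosts; this prints the command only, nothing is sent to MongoDB
rs_initiate_doc rs0 mongo1.example.com:27017 mongo2.example.com:27017 mongo3.example.com:27017
```

All listed instances need to be running with the same --replSet name before the printed command is executed on one of them.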

Once the replica set is up, set up AEM
# Unpack the AEM jar using
java -jar <AEM Jar> -unpack

# Go to bin directory
cd crx-quickstart/bin

# Change your start script
vi start

#Set runmode to crx3mongo
CQ_RUNMODE='<your run modes>,crx3mongo'

# Add the JVM arguments -Doak.mongo.uri=mongodb://<SERVER:PORT> -Doak.mongo.db=<NAME OF DB>
# Note that all other Java parameters remain as is
CQ_JVM_OPTS='-server -Xms2048m -Xmx2048m -XX:MaxPermSize=512M -Djava.awt.headless=true -Doak.mongo.uri=mongodb://<HOST1>:<PORT1>,<HOST2>:<PORT2>,<HOST3>:<PORT3> -Doak.mongo.db=<This is the DB name you created while creating Mongo>'

# Then start your Instance
./start

Then you can go to each Mongo instance and check whether data is coming in using the Mongo log
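
The comma-separated host list in oak.mongo.uri is easy to mistype by hand. Below is a small sketch that assembles the two JVM arguments from a list of replica set members. The function names oak_mongo_uri and oak_jvm_args, the database name aem-author, and the example.com hosts are assumptions for illustration only; the output is meant to be copied into CQ_JVM_OPTS next to the other parameters.

```shell
#!/bin/sh
# Hypothetical helper: join host:port pairs into a MongoDB connection URI.
# Usage: oak_mongo_uri <host:port> [<host:port> ...]
oak_mongo_uri() {
    hosts=""
    for member in "$@"; do
        # Add a comma between hosts, but not before the first one
        [ -n "$hosts" ] && hosts="$hosts,"
        hosts="$hosts$member"
    done
    printf 'mongodb://%s\n' "$hosts"
}

# Hypothetical helper: print the two Oak JVM arguments used in the start script.
# Usage: oak_jvm_args <db name> <host:port> [<host:port> ...]
oak_jvm_args() {
    db=$1
    shift
    printf '%s=%s %s=%s\n' "-Doak.mongo.uri" "$(oak_mongo_uri "$@")" "-Doak.mongo.db" "$db"
}

# Example with placeholder values
oak_jvm_args aem-author mongo1.example.com:27017 mongo2.example.com:27017 mongo3.example.com:27017
```

Listing every replica set member in the URI lets the driver find the current primary even when one of the listed hosts is down.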

Convert Existing Mongo Instance:
Stop your AEM instance
Use the following instructions to convert Mongo to a replica set

cd <Mongo bin directory>
# Connect to the running instance and shut it down cleanly
./mongo --port 27017
use admin
db.shutdownServer()
# Exit the mongo shell, then take a backup
./mongodump --dbpath ../data/<NAME OF DB> -o dataout
# Now set up the new MongoDB servers that you want to add to the replica set, using the instructions above
# Now start your existing Mongo using the following command. Note that this time we are using --replSet
# You can also do this using the MongoDB configuration file
./mongod --port 27017 --dbpath <your DB path> --replSet <Replica set Name> &
# Connect to the MongoDB instance with the mongo shell
./mongo --port 27017
rs.initiate()
# Make sure that the other MongoDB instances are up; you can now add them to the replica set
rs.add("<HOST2>:<PORT2>")
# Add a third member so that you have an odd number of instances
rs.add("<HOST3>:<PORT3>")
# If you don't have space or don't want the third instance to store data, you can add an arbiter instead
# rs.addArb("<HOST4>:<PORT4>")
# Check config again
rs.conf()
# Check status of Replica
rs.status()

# If something goes wrong, you can restore from the backup
cd <Mongo bin location>
./mongorestore --dbpath <database path> <path to the backup>

Once this is set up, change the AEM start script to add the Mongo replica set instances as shown in the first approach
Start your AEM instance
AEM should now be using the replica set

Backup and Restore
Please check https://docs.mongodb.org/v3.0/tutorial/backup-and-restore-tools/ for the MongoDB backup and restore instructions.

An automated script can be found here: https://github.com/micahwedemeyer/automongobackup/blob/master/src/automongobackup.sh . Put this script under /etc/cron.daily and you are set for daily backups.
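
If the full script is more than you need, the core idea (one dated mongodump directory per day) can be sketched in a few lines. This is only an illustration under stated assumptions, not a replacement for the script linked above: the variable names BACKUP_ROOT, MONGO_HOST, MONGO_PORT and RUN_BACKUP and the default path are hypothetical. By default it only prints the mongodump command; set RUN_BACKUP=1 to actually run it.

```shell
#!/bin/sh
# Hypothetical daily backup wrapper around mongodump.
# BACKUP_ROOT, MONGO_HOST, MONGO_PORT and RUN_BACKUP are assumed names, not from the post.
BACKUP_ROOT=${BACKUP_ROOT:-/var/backups/mongodb}
MONGO_HOST=${MONGO_HOST:-localhost}
MONGO_PORT=${MONGO_PORT:-27017}

# Build the dated target directory for a given date stamp
backup_dir() {
    printf '%s/%s\n' "$BACKUP_ROOT" "$1"
}

# Print the mongodump command line for a given date stamp
backup_command() {
    printf 'mongodump --host %s --port %s --out %s\n' \
        "$MONGO_HOST" "$MONGO_PORT" "$(backup_dir "$1")"
}

stamp=$(date +%Y-%m-%d)
if [ "${RUN_BACKUP:-0}" = "1" ]; then
    # Actually run the dump; requires mongodump on the PATH
    mkdir -p "$(backup_dir "$stamp")" &&
        mongodump --host "$MONGO_HOST" --port "$MONGO_PORT" --out "$(backup_dir "$stamp")"
else
    # Dry run: only show what would be executed
    backup_command "$stamp"
fi
```

A real backup job would also need to prune old directories and report failures, which is what the linked script takes care of.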

Some Common Questions
Should I set up my AEM author instance on MongoDB?
Unless you have a clustering requirement, I would not suggest setting up your author instance with MongoDB, mainly because of the administrative overhead.

Should I set up my AEM publish instance on MongoDB?
Same as above: unless you have a requirement for shared content generation, I would suggest not using MongoDB. With AEM Communities, you now have the option to add Mongo persistence for the community features at any time. More detail here: https://docs.adobe.com/docs/en/aem/6-1/administer/communities/srp/msrp.html and https://docs.adobe.com/docs/en/aem/6-1/administer/communities/srp/msrp/demo-mongo.html

Should I store blobs in MongoDB as well in AEM?
It is not recommended to store blob data in MongoDB. You can use other options such as local storage, NAS, or AWS in that case. More detail: https://docs.adobe.com/content/docs/en/aem/6-1/deploy/platform/aem-with-mongodb.html#AEM Configuration and https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html

How can I secure my MongoDB deployment with AEM?
Check the documentation here: https://blogs.adobe.com/security/2015/07/securely-deploying-mongodb-3-0.html

Notes:
1) Mongo replication only provides high availability (HA); it does not provide scalability. For scalability you need to use the sharding feature provided by Mongo. However, I am not sure what the best shard key would be. You can create shards based on the _id attribute. More information about sharding can be found here: http://docs.mongodb.org/manual/sharding/ . If you use sharding, I would suggest combining it with replication (shard first, then replicate each shard instance) to provide both HA and scalability.

2) Mongo replication has many features, for example making certain replica instances read-only (a data center replica). You can use this to avoid high latency across data centers. All the configuration you can do on replica set members is described here: http://docs.mongodb.org/manual/administration/replica-set-member-configuration/

3) MongoDB recently released MMS https://mms.mongodb.com/ to monitor and deploy Mongo clusters easily. This will be useful if you are worried about the administrative cost of Mongo.

4) If you don't want to store large documents in Mongo, feel free to use a custom data store using the instructions here: http://jackrabbit.apache.org/oak/docs/osgi_config.html

5) Mongo recently launched another feature, pluggable storage engines. You can use this for faster reads and writes based on your requirements (for example, a primary on write-optimized storage such as SSD, with reads served from cheaper storage). More info here: https://www.mongosoup.de/blog-entry/A-closer-look-at-pluggable-storage.html (official documentation yet to come)

6) Official AEM Documentation: https://docs.adobe.com/content/docs/en/aem/6-1/deploy/platform/aem-with-mongodb.html

Here are a few more Mongo commands
cd <Mongo bin directory>
./mongo
# Show all databases
show dbs
# Check the status of the replica set
rs.status()
# If this is a secondary, you first need to run
rs.slaveOk()
# Use the database (the AEM DB or the DB you chose)
use <data-base-name>
# Show all collections
show collections
# Check the status of the blobs collection
db.blobs.stats()
# Check all cluster nodes
db.clusterNodes.find().pretty()
# Will add more later


By aem4beginner