April 1, 2020

How to Perform System Clean Up in Adobe CQ / AEM (CQ5.5)

Use Case:
A CQ system grows over time as data is added, modified, and removed. CQ uses an append-only model for the datastore, so data is not removed from the datastore when it is deleted from the console; it stays there until datastore garbage collection runs. Deployments and migrations also leave behind many unnecessary packages over time. On top of that, adding many DAM assets creates a lot of workflow data that is no longer needed afterwards.

As a result, disk usage increases. If you plan to run many instances on the same hardware (especially dev instances), it makes sense to reduce the size of each instance from time to time.

Solution:
You can use the following script to clean your data from time to time.

Prerequisite:
Get workflow purge script from here

Step 1:
Create a file that lists your instances, one HOST:PORT entry per line (in this example the file is named host_list.txt)

# This file is used to feed the cleanup package script
#FORMAT HOST:PORT
<YOUR SERVER>:<PORT>
#END
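
The script below reads this file with `grep -v '#'`, which drops every line containing a `#`. As a minimal sketch of a stricter reader, the following helper skips comments and blank lines and only prints well-formed HOST:PORT entries. The function name `read_hosts` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch only: read_hosts is a hypothetical helper, not part of the original script.
# Prints one HOST:PORT entry per line from the given host list file.
read_hosts() {
    # $1: path to the host list file
    while IFS= read -r rh_entry || [ -n "${rh_entry}" ]; do
        # Remove spaces, tabs and carriage returns (for files edited on Windows)
        rh_entry=$(printf '%s' "${rh_entry}" | tr -d ' \t\r')
        # Skip blank lines and comment lines
        case "${rh_entry}" in
            ''|'#'*) continue ;;
        esac
        # Require a colon separating a non-empty host from the port
        case "${rh_entry}" in
            ?*:*) ;;
            *) printf 'Skipping malformed entry: %s\n' "${rh_entry}" >&2; continue ;;
        esac
        # Require the port (text after the first colon) to be digits only
        rh_port=${rh_entry#*:}
        case "${rh_port}" in
            ''|*[!0-9]*) printf 'Skipping malformed entry: %s\n' "${rh_entry}" >&2; continue ;;
        esac
        printf '%s\n' "${rh_entry}"
    done < "$1"
}
```

It could then replace the `cat ... | grep -v '#'` pipeline in a loop such as `for MY_HOST in $(read_hosts host_list.txt); do ...; done`.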


Step 2:
Actual Script

#!/bin/bash
#
# Description:
# Purge workflows (master author only)
# Clean old packages
# Run datastore GC

PURGE_WORK_FLOWS_FILE="purge-workflows-2.zip"
CURL_USER='admin:my_super_secret'
IS_PURGE_PAK_FOUND=NO
MY_HOST_LIST=host_list.txt
# Name of package group that you want to clear
PACKAGE_GROUP="<MY PACKAGE GROUP>"

if [ ! -f "${MY_HOST_LIST}" ]; then
echo "Error cannot find host list file: ${MY_HOST_LIST}"
echo "Exiting ..."
exit 1;
fi

function run_purge_job()
{
MY_HOST="<YOUR HOST NAME>"
IS_PURGE_PAK_FOUND=$(curl -su "${CURL_USER}" "http://${MY_HOST}:4502/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep "purge-workflows-2" | tr -d ' \t\n\r\f')

if [ -z "${IS_PURGE_PAK_FOUND}" ]; then
IS_PURGE_PAK_FOUND=NO
else
IS_PURGE_PAK_FOUND=YES
fi

if [ "${IS_PURGE_PAK_FOUND}" = "NO" ] && [ -f "${PURGE_WORK_FLOWS_FILE}" ]; then
MY_PAK_NAME=$(basename "${PURGE_WORK_FLOWS_FILE}" .zip)
MY_STATUS=$(curl -su "${CURL_USER}" -f -F"install=true" -F "name=${MY_PAK_NAME}" -F "file=@${PURGE_WORK_FLOWS_FILE}" "http://${MY_HOST}:4502/crx/packmgr/service.jsp" | grep 'code="200"' | tr -d ' \t\n\r\f')

if [ -z "${MY_STATUS}" ]; then
echo "Error uploading ${PURGE_WORK_FLOWS_FILE}, exiting..."
exit 1
fi
# The package is installed now, so the purge below can run in the same invocation
IS_PURGE_PAK_FOUND=YES
fi

if [ "${IS_PURGE_PAK_FOUND}" = "YES" ]; then
curl -su "${CURL_USER}" -X POST --data "status=COMPLETED&runpurge=1&Start=Run" http://${MY_HOST}:4502/apps/workflow-purge/purge.html > /dev/null 2>&1
sleep 10
curl -su "${CURL_USER}" -X POST --data "status=ABORTED&runpurge=1&Start=Run" http://${MY_HOST}:4502/apps/workflow-purge/purge.html > /dev/null 2>&1
fi
}

function clean_old()
{
for MY_HOST in $(cat $MY_HOST_LIST|grep -v '#')
do
IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -X POST "http://${MY_HOST}/crx/packmgr/service.jsp?cmd=ls" | grep "name" | grep -i "${PACKAGE_GROUP}" | tr -d ' \t\n\r\f')

if [ -z "${IS_INSTANCE_UP}" ]; then
continue
fi

# You can delete multiple package groups here
# Or you can use commands from here
echo "deleting package group"
curl -su "${CURL_USER}" -F":operation=delete" "http://${MY_HOST}/etc/packages/${PACKAGE_GROUP}" > /dev/null 2>&1
sleep 10
done
}

function clean_datastore_gc()
{
for MY_HOST in $(cat $MY_HOST_LIST|grep -v '#')
do

IS_INSTANCE_UP=$(curl --connect-timeout 20 -su "${CURL_USER}" -Is "http://${MY_HOST}/crx/packmgr/index.jsp" | grep HTTP | cut -d ' ' -f2)

# Skip instances that do not answer with HTTP 200
if [ "${IS_INSTANCE_UP}" != "200" ]; then
continue
fi
echo "running datastore gc"
curl -su "${CURL_USER}" -X POST --data "delete=true&delay=2" http://${MY_HOST}/system/console/jmx/com.adobe.granite%3Atype%3DRepository/op/runDataStoreGarbageCollection/java.lang.Boolean > /dev/null 2>&1
done
}

case "$1" in
'purge')
run_purge_job
;;
'clean_paks')
clean_old
;;
'clean_ds')
clean_datastore_gc
;;
*)
echo "Usage: $0 {purge|clean_paks|clean_ds}"
exit 1
;;
esac
exit 0
#
#end
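
The instance check in clean_datastore_gc extracts the status code from the response headers with grep and cut. As an alternative sketch, curl can print the status code itself through its `--write-out` option. The helper names `instance_http_status` and `instance_is_up` are hypothetical, and the `CURL_USER` default below is only a placeholder.

```shell
#!/bin/bash
# Sketch only: these helpers are hypothetical and not part of the original script.
# Assumption: credentials come from the environment; the default is a placeholder.
CURL_USER="${CURL_USER:-admin:changeme}"

instance_http_status() {
    # $1: HOST:PORT entry from the host list
    # -o /dev/null discards the response body; -w '%{http_code}' prints only the status code.
    # curl prints 000 when no HTTP response was received (for example, connection refused).
    curl -s -o /dev/null --connect-timeout 20 -u "${CURL_USER}" \
        -w '%{http_code}' "http://$1/crx/packmgr/index.jsp"
}

instance_is_up() {
    # Succeeds only when the Package Manager page answers with HTTP 200
    [ "$(instance_http_status "$1")" = "200" ]
}
```

A loop over the host list could then skip unreachable instances with `instance_is_up "${MY_HOST}" || continue`.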

Manual Cleaning:
CQ5.5 and before:
1) Download the workflow purge script from here
2) Install the purge script using Package Manager
3) Log in as admin or as a user with administrative access
4) Go to http://<YOUR HOST NAME>:4502/apps/workflow-purge/purge.html
5) Select "completed" from the drop-down and run the workflow purge.
6) You might have to run it multiple times to make sure that everything is deleted.
7) Using CRXDE Lite or CRX Explorer with an admin session, go to /etc/packages/<Your package group>
8) Delete the packages you want to remove
9) After deleting, click Save All
10) To run datastore GC, follow http://www.wemblog.com/2012/03/how-to-run-online-backup-using-curl-in.html or http://www.cqtutorial.com/courses/cq-admin/cq-admin-lessons/cq-maintenance/cq-datastore-gc
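
Steps 7 to 9 can also be approximated over HTTP, using the same `:operation=delete` POST that the script uses for a whole package group. The following is a minimal sketch for a single package, assuming the package is stored as a node under /etc/packages/<group>/. The helper name `delete_package`, the example group and file names, and the `CURL_USER` default are all hypothetical placeholders.

```shell
#!/bin/bash
# Sketch only: delete_package is a hypothetical helper, not part of the original script.
# Assumption: credentials come from the environment; the default is a placeholder.
CURL_USER="${CURL_USER:-admin:changeme}"

delete_package() {
    # $1: HOST:PORT
    # $2: package group (node name under /etc/packages)
    # $3: package file name, for example my-package-1.0.zip
    # Names containing spaces or special characters would need URL encoding first.
    curl -s -u "${CURL_USER}" -F":operation=delete" \
        "http://$1/etc/packages/$2/$3"
}
```

For example, `delete_package localhost:4502 my_group my-package-1.0.zip` would request deletion of that one package node and leave the rest of the group in place.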

In CQ 5.6, audit and workflow purge can be configured out of the box; see the instructions at http://helpx.adobe.com/cq/kb/howtopurgewf.html


By aem4beginner
