October 2, 2020

Workflow Purge Scheduler

On a project, workflows can become unmanageable when a multitude of workflow instances get triggered. This causes problems that must be resolved to keep the server healthy and the website performing well. In this blog, we will discuss such situations and the features that let us purge workflow instances on a periodic basis.

Use Case:
Content syncing, content migration, development issues, bulk execution, or other unpredictable events can leave a massive number of workflow instances running concurrently on an environment. This can lead to performance issues on that server.

One such situation occurred while syncing content from one environment to another. When the DAM was synced, a large number of ‘DAM Update Asset’ and ‘DAM MetaData Writeback’ workflow instances were triggered as assets were added to the DAM folder, because the DAM Update Asset workflow is triggered whenever a new asset is added to the DAM. The resulting volume of instances of these two workflows had a performance impact on the server.

Why should we purge unwanted or completed workflow instances?
A multitude of concurrently running workflow instances leads to performance issues on the server, and purging them also keeps the size of the JCR repository in check. Running instances pose the immediate performance threat, but completed instances should be purged too: their nodes accumulate in the repository, and cleaning them up helps manage disk space and performance in the long run. A bloated repository also slows down the queries that search the workflow hierarchies. So, workflow purging should be set up as a maintenance task that is carried out regularly.

Workflow Purge Scheduler
Workflow Purge Scheduler is categorized as a Sling Service Factory, so we can create multiple instances of it. It is a configuration in the Felix Console that lets us manage the workflow purge schedule as per the requirements.
  1. Log in to the Apache Felix web console at host:port/system/console/configMgr
  2. Search for ‘Adobe Granite Workflow Purge Configuration’, which manages the purge schedule of the workflow instances. Since this is a Service Factory, we can create multiple instances of this configuration to cater to different criteria in parallel. For example, we can run the scheduler on different models, each with a different age after which a workflow instance is marked for purging. To add another instance of this configuration, click on the ‘+’ sign next to the original configuration. The factory PID for this configuration is ‘com.adobe.granite.workflow.purge.Scheduler’. Note that the PID of an instance is set when the configuration instance is saved: a unique ID is appended to the Factory PID, as in com.adobe.granite.workflow.purge.Scheduler-myidentifier, where myidentifier is a unique id.
  3. Configure this to set up the scheduler to purge the workflows. It has the following fields:
    1. Job Name: A descriptive name for this particular purge instance.
    2. Workflow Status: A dropdown with the options ‘Completed’ and ‘Running’. Choose which status of workflow instances to purge.
    3. Models to purge: A multifield in which we can set the ids of the models whose instances are to be purged. If left blank, the purge applies to all models. For example, the workflow model id of the DAM Update Asset workflow is ‘/etc/workflow/models/dam/update_asset/jcr:content/model’. Model ids can be looked up at host:port/libs/cq/workflow/content/console.html.
    4. Workflow Age: The number of days after which the workflow instance(s) will be purged.
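To make the interplay of these fields concrete, here is a minimal Python sketch of the selection rule they describe. It is only an illustration, not the Granite implementation: the `WorkflowInstance` class and `eligible_for_purge` function are hypothetical names, and measuring the age from the start time of the instance is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class WorkflowInstance:
    """Hypothetical stand-in for a workflow instance (not an AEM class)."""
    model_id: str
    status: str        # 'Completed' or 'Running', as in the dropdown
    started: datetime  # assumption: age is measured from the start time


def eligible_for_purge(instance, status, model_ids, days_old, now):
    """Return True if the instance meets all configured criteria."""
    if instance.status != status:
        return False
    # An empty model list means the purge applies to all models.
    if model_ids and instance.model_id not in model_ids:
        return False
    return now - instance.started >= timedelta(days=days_old)


DAM_UPDATE_ASSET = "/etc/workflow/models/dam/update_asset/jcr:content/model"
now = datetime(2020, 10, 2, tzinfo=timezone.utc)
old = WorkflowInstance(DAM_UPDATE_ASSET, "Completed", now - timedelta(days=3))
fresh = WorkflowInstance(DAM_UPDATE_ASSET, "Completed", now - timedelta(hours=2))

print(eligible_for_purge(old, "Completed", [DAM_UPDATE_ASSET], 1, now))    # True
print(eligible_for_purge(fresh, "Completed", [DAM_UPDATE_ASSET], 1, now))  # False
```

An instance is selected only when status, model and age all match, which is why separate configuration instances are useful for different models or ages.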


We can make this configuration part of the codebase by adding a config file to it. This is an XML file placed inside a config node under /apps. Note that the name of the XML file must match the PID of the configuration in the Felix console, for example ‘com.adobe.granite.workflow.purge.Scheduler.xml’. For an instance of the ‘Workflow Purge Scheduler’ factory, the file name is the Factory PID followed by a hyphen and an identifier, something similar to: ‘com.adobe.granite.workflow.purge.Scheduler-[some-arbitrary-id].xml’.

We can add attributes to the XML file giving the default values of the fields, which are then shipped as part of the code package.
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:OsgiConfig"
    scheduledpurge.name="DAM Purge"
    scheduledpurge.workflowStatus="Running"
    scheduledpurge.modelIds="/etc/workflow/models/dam/update_asset/jcr:content/model"
    scheduledpurge.daysold="1" />
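A stray quote or space inside an attribute value is easy to miss in a file like this, so a small sanity check before packaging can help. The following Python sketch is not an AEM tool; `check_purge_config` is a hypothetical helper and the rules it applies (model id starts with ‘/’ and has no surrounding whitespace, days is a non-negative integer) are assumptions about what a sensible value looks like.

```python
import xml.etree.ElementTree as ET

JCR_NS = "http://www.jcp.org/jcr/1.0"

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:OsgiConfig"
    scheduledpurge.name="DAM Purge"
    scheduledpurge.workflowStatus="Running"
    scheduledpurge.modelIds="/etc/workflow/models/dam/update_asset/jcr:content/model"
    scheduledpurge.daysold="1" />
"""


def check_purge_config(xml_text):
    """Return a list of problems found in a purge config file (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.get(f"{{{JCR_NS}}}primaryType") != "sling:OsgiConfig":
        problems.append("jcr:primaryType should be sling:OsgiConfig")
    model_id = root.get("scheduledpurge.modelIds", "")
    # Assumption: a model id is a clean absolute path; blank means all models.
    if model_id and (model_id != model_id.strip() or not model_id.startswith("/")):
        problems.append("modelIds is not a clean absolute path: %r" % model_id)
    try:
        days_ok = int(root.get("scheduledpurge.daysold", "")) >= 0
    except ValueError:
        days_ok = False
    if not days_ok:
        problems.append("daysold should be a non-negative integer")
    return problems


print(check_purge_config(SAMPLE))  # []
```

Running it against a file with a trailing space or a stray quote in the model id returns a non-empty list of problems.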

Workflow Remover
There might be situations when we do not need a regular service or scheduler for purging workflows but only need to remove workflow instances once, on a particular occasion. This might be the case during a content sync, when many instances of a workflow get triggered and need to be cleaned up in one go. ACS (Adobe Consulting Services) AEM Commons provides Workflow Remover, a user interface for removing unwanted workflow instances as per the requirement.
  1. Navigate to Tools and go to ACS AEM Commons. Click on Workflow Remover.
  2. Configure the criteria for the workflow instances to remove. It has the following fields:
  • Status: Any status among Completed, Aborted, Running, Suspended or Stale can be selected.
  • Payload Paths: Regular expressions that are matched against the payload path of the workflow instance.
  • Batch Size: The JCR batch size in which the workflow instances are removed. The default value is 1000.
  • Max Duration: The time in minutes after which the workflow removal process is forcibly terminated. We can set it to 0 to disable the limit, or leave it blank for no limit on the removal process.
  • Workflows older than: Only workflow instances created before the date specified here are removed.
  • Workflow Models: Lists all the workflow models whose instances can be removed. If the user selects no option, instances of all workflow models that meet the defined criteria are eligible for removal.
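The Batch Size and Max Duration fields can be pictured with a short Python sketch. This is not the ACS AEM Commons code: `remove_in_batches` is a hypothetical function, the "save" is only counted rather than performed, and treating 0 or a blank value as "no limit" is an assumption based on the field descriptions.

```python
import time


def remove_in_batches(instance_ids, batch_size=1000, max_minutes=0,
                      clock=time.monotonic):
    """Simulate batched removal; return (removed, saves, timed_out)."""
    started = clock()
    removed = saves = pending = 0
    for _ in instance_ids:
        # Assumption: 0 (or None) means the run has no time limit.
        if max_minutes and clock() - started >= max_minutes * 60:
            if pending:
                saves += 1  # persist what was removed before stopping
            return removed, saves, True
        removed += 1
        pending += 1
        if pending == batch_size:
            saves += 1      # one repository save per full batch
            pending = 0
    if pending:
        saves += 1          # final partial batch
    return removed, saves, False


print(remove_in_batches(range(2500), batch_size=1000))  # (2500, 3, False)
```

A larger batch size means fewer saves but more pending changes held at once, while a maximum duration keeps a very large cleanup from running indefinitely.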




The Workflow Remover of ACS AEM Commons can also be set up as a scheduled service through the configuration ‘ACS AEM Commons- Workflow Instance Removal- Scheduled Service’ in the Felix Console.

It is configured similarly to the UI in Tools. It has the following fields:
  1. Cron Expression: The cron expression that defines when the scheduled removal service runs.
  2. Workflow Status: A multifield for the statuses of the workflow instances to be removed by this scheduled service (ABORTED, COMPLETED, STALE, SUSPENDED, RUNNING). ABORTED and COMPLETED are set by default.
  3. Workflow Models: The paths, up to jcr:content/model, of the workflow models whose instances are to be removed by this service.
  4. Payload Patterns: A regular expression; workflow instances whose payload matches it are removed. Example: /content/dam/.*/.*\.mp4(/.*)?
  5. Older than UTC TS: A time in milliseconds; workflow instances older than this UTC time are removed.
  6. Batch Size: The number of workflow nodes removed per batch. The default size is 1000.
  7. Max duration: The maximum time (in minutes) for which the removal process keeps executing. It defaults to 0, which means there is no limit on how long the scheduled service runs and removes workflow instances.
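Two of these fields are easy to get wrong: the payload pattern and the millisecond timestamp. The Python sketch below tries the example pattern against a few payload paths and computes a UTC timestamp in milliseconds. The helper names are hypothetical, the sample paths are made up, and it assumes the pattern has to match the whole payload path, so it is worth testing patterns on a lower environment first.

```python
import re
from datetime import datetime, timezone

# The example pattern from the Payload Patterns field.
PAYLOAD_PATTERN = re.compile(r"/content/dam/.*/.*\.mp4(/.*)?")


def payload_matches(path):
    """True if the whole payload path matches (full match is an assumption)."""
    return PAYLOAD_PATTERN.fullmatch(path) is not None


def utc_millis(year, month, day):
    """Milliseconds since the Unix epoch for midnight UTC on the given date."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


for path in (
    "/content/dam/videos/clip.mp4",
    "/content/dam/videos/clip.mp4/jcr:content/renditions/original",
    "/content/dam/clip.mp4",         # no sub-folder, so no match
    "/content/dam/videos/clip.png",  # not an mp4
):
    print(path, payload_matches(path))

print(utc_millis(2020, 10, 1))  # 1601510400000
```

Note that this pattern requires at least one folder between /content/dam/ and the file, so an mp4 sitting directly under /content/dam is not matched.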


Like the Workflow Purge Configuration, it can be made part of the codebase as an XML file in the config node under /apps.

So, there are two configurations and one UI in Tools with which we can purge workflow instances, keeping performance consistent and the repository clear of unwanted workflow instances. The two configurations do this automatically, without manual work.

Hope this helps you at some point in a project when you need to clear out multiple unwanted workflow instances.

Ref:
https://adobe-consulting-services.github.io/acs-aem-commons/features/workflow-remover/index.html

Source: https://www.argildx.com/technology/workflow-purge-scheduler/


By aem4beginner
