May 11, 2020

Batch Processing Content in AEM with Groovy Console

Problems?
One problem AEM developers regularly face is processing JCR content. Sometimes you need to update, modify, package, or analyze a very large amount of data, and it is doubly challenging when this has to happen in a live environment. Some typical cases:
  • - Change content structure (for example news/feb/event -> news/2015-2016/feb/event)
  • - Modify or fix content (for example, correcting the spelling of ‘color’ on every page)
  • - Analyze content (for example, on an MSM setup where editors use assets from another website, find all the wrong usages)
  • - Prepare and package content (for example, during a migration from CQ5.x to AEM 6.x you need to analyze the content first and create a valid package for its migration)
Solutions
There are several possible solutions and workarounds for each case:
  • - Servlets/JSP/Scriptlets
  • - AEM Util packages
  • - External tools
  • - Groovy Console
Let's focus on the Groovy Console; for these tasks it is a real silver bullet. Advantages of the Groovy Console:
  • - does not clutter the project code
  • - short and clear scripts
  • - predefined services/methods
Let’s start
The AEM Groovy Console is hosted at https://github.com/Citytechinc/cq-groovy-console and is available for AEM/CQ5 versions starting from CQ5.4. Once the Groovy Console is installed on an AEM/CQ5 instance, go to http://localhost:4502/etc/groovyconsole.html

New URL from AEM 6.5 onwards: http://localhost:4502/apps/groovyconsole.html

Every Groovy script has the following variables predefined:
  • - session - javax.jcr.Session
  • - pageManager - com.day.cq.wcm.api.PageManager
  • - resourceResolver - org.apache.sling.api.resource.ResourceResolver
  • - slingRequest - org.apache.sling.api.SlingHttpServletRequest
  • - queryBuilder - com.day.cq.search.QueryBuilder
  • - bundleContext - org.osgi.framework.BundleContext
  • - log - org.slf4j.Logger
Some of the predefined methods:
  • - getPage(String path) - Get the Page for the given path, or null if it does not exist.
  • - getNode(String path) - Get the Node for the given path. Throws javax.jcr.RepositoryException if it does not exist.
  • - activate(String path) - Activate the node at the given path.
  • - deactivate(String path) - Deactivate the node at the given path.
And these packages are imported by default:
  • - com.day.cq.search
  • - com.day.cq.tagging
  • - com.day.cq.wcm.api
  • - com.day.cq.replication
  • - javax.jcr
  • - org.apache.sling.api
  • - org.apache.sling.api.resource
A script history and a scripts archive are also available. Together, these features make it easy to solve the kinds of issues mentioned at the beginning of the post in an elegant way.
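To give a feel for how the predefined variables and methods fit together, here is a minimal sketch. The helper name describeChildren and the path /content/geometrixx/en are illustrative assumptions, not part of the console. The console-only call is guarded, so the helper can also be tried in plain Groovy.

```groovy
// Returns "path -> title" lines for the direct children of a page.
// Works with any object exposing listChildren(), path and title,
// such as com.day.cq.wcm.api.Page.
def describeChildren(page) {
    if (page == null) {
        return []
    }
    page.listChildren().collect { child ->
        "${child.path} -> ${child.title}".toString()
    }
}

// Inside the Groovy Console, 'session' is bound and getPage(path) is available.
// The path below is a hypothetical example; adjust it to your content tree.
if (binding.hasVariable('session')) {
    describeChildren(getPage('/content/geometrixx/en')).each { println it }
}
```

Because getPage returns null for a missing page, the helper simply returns an empty list in that case instead of failing.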
Example:
Suppose we need to find the pages that have “Baking” in the title and replace it with “Banking”.
import com.day.cq.commons.jcr.JcrConstants

def search = "Baking"
def replace = "Banking"
def path = "/content/geometrixx"
def property = JcrConstants.JCR_TITLE

// Find all page content nodes under the path whose title contains the search term
def query = createSQL2Query(path, search, property)
def result = query.execute()

result.nodes.each { node ->
    def title = node.get(property)
    // replace() treats the search term literally; replaceAll() would treat it as a regex
    node.set(property, title.replace(search, replace))
    println node.path
}

// Persist all changes made in the current session
save()

def createSQL2Query(path, term, property) {
    def queryManager = session.workspace.queryManager
    def statement = "SELECT * FROM [cq:PageContent] AS s WHERE ISDESCENDANTNODE([${path}]) AND s.[${property}] LIKE '%${term}%'"
    queryManager.createQuery(statement, "JCR-SQL2")
}
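The query in the example interpolates the search term directly into the statement, which breaks if the term contains a single quote. A possible hardening, sketched under the assumption that doubling single quotes is sufficient for your terms (the LIKE wildcards % and _ are not handled here). The helper names escapeSql2Literal and buildTitleQuery are hypothetical.

```groovy
// Escape a value for use inside a single-quoted JCR-SQL2 string literal
// by doubling every single quote.
String escapeSql2Literal(String value) {
    value.replace("'", "''")
}

// Build the same statement as in the example, with the term escaped.
String buildTitleQuery(String path, String property, String term) {
    "SELECT * FROM [cq:PageContent] AS s " +
    "WHERE ISDESCENDANTNODE([${path}]) " +
    "AND s.[${property}] LIKE '%${escapeSql2Literal(term)}%'"
}

println buildTitleQuery("/content/geometrixx", "jcr:title", "Baker's")
```

The resulting string can be passed to queryManager.createQuery(statement, "JCR-SQL2") exactly as before.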


As you can see, the solution using the Groovy Console is short and straightforward.


By aem4beginner
