March 20, 2020

AEM/Adobe CQ5 and Apache Solr integration

Solr is one of the major open-source search engines and is used in many applications.
Solr can also be used with Adobe AEM/CQ to index page data. Offloading search to Solr helps avoid burdening AEM's own query engine.

Some sites use Solr only for specific functionality. In such cases, Solr search for the application is restricted to a defined set of tasks, such as fetching a particular set of content or related items.

If Solr is implemented efficiently, search will be extremely fast.

Steps for indexing data to Solr
When Solr is used as part of an AEM website, not all page data is indexed to Solr. To index the required data, we can create custom transport handlers in CQ that insert the data into Solr.

In the transport handler we define which page properties are pushed to the index. Only these properties go to Solr. Because the index holds only the fields needed for search, queries against Solr return results faster.
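The property selection described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name, the JCR property names, and the Solr field names in the mapping are hypothetical and do not come from a real project, and a real transport handler would read the properties from the replicated page instead of a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IndexedPropertyFilter {
    // Hypothetical mapping from JCR property name to Solr field name.
    // Only properties listed here are pushed to the index.
    static final Map<String, String> FIELD_MAP = new LinkedHashMap<>();
    static {
        FIELD_MAP.put("jcr:title", "title");
        FIELD_MAP.put("jcr:description", "description");
        FIELD_MAP.put("cq:tags", "tags");
    }

    // Builds the field map for one page; the page path is used as the
    // document id (an assumption, any unique key would work).
    static Map<String, Object> select(String pagePath, Map<String, Object> pageProperties) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", pagePath);
        for (Map.Entry<String, String> e : FIELD_MAP.entrySet()) {
            Object value = pageProperties.get(e.getKey());
            if (value != null) {
                doc.put(e.getValue(), value);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("jcr:title", "Home");
        props.put("cq:lastModifiedBy", "admin"); // not in FIELD_MAP, so it is skipped
        System.out.println(select("/content/site/en/home", props));
    }
}
```

Keeping the allow-list in one place makes it easy to see exactly which fields reach Solr.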

When does the data get indexed?
Whenever an author modifies page content, the new data should be transferred to Solr too. For this we need a custom transport handler that is invoked during replication when a page is published. On each publish, the page goes through a custom replication agent, and its properties are collected and pushed to Solr. A Solr document is created from the collected properties using the 'Solrj' bundle, and this document is then committed to Solr with the help of Solrj.
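To show what is sent to Solr on publish, here is a dependency-free sketch. It does not use Solrj; it builds the JSON body that Solr's standard update handler accepts instead, so that it runs without extra libraries. The class name and field names are hypothetical, and all values are treated as strings for simplicity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class SolrUpdatePayload {
    // Escapes a string for use inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a JSON array holding a single document, the shape accepted
    // by a POST to /solr/<core>/update with Content-Type application/json.
    static String addDocument(Map<String, String> fields) {
        StringJoiner doc = new StringJoiner(",", "[{", "}]");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            doc.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
        }
        return doc.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "/content/site/en/home");
        fields.put("title", "Home");
        System.out.println(addDocument(fields));
    }
}
```

The body would be posted to the update handler with commit=true (or followed by an explicit commit) so that the change becomes visible to searches.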

How is the result retrieved?
Solr provides search results in many formats, such as text, XML, and JSON. Most applications now use JSON as the retrieval format, but the choice is up to the developer. When the system queries Solr, it returns the results in the format specified in the Solr configuration or requested in the query through the wt parameter. This data can then be parsed to display the results on an AEM (CQ) page.
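As a small illustration of requesting JSON results, the sketch below builds a query URL for Solr's select handler. The host, the core name "aem", and the field name "title" are assumptions made for the example; only the q, rows, and wt parameters are standard Solr.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SolrQueryUrl {
    // Builds a select URL that asks Solr to return the results as JSON.
    static String build(String baseUrl, String core, String query, int rows) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/" + core + "/select?q=" + q + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // Hypothetical local Solr instance and core name.
        System.out.println(build("http://localhost:8983/solr", "aem", "title:home page", 10));
    }
}
```

The query text is URL-encoded so that characters such as ':' and spaces survive the request.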

How is the data deleted on un-publish?
The custom handler described above also ensures that whenever a page is un-published, its data is deleted from Solr too, so the page no longer appears in search results.
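The delete on un-publish can be sketched the same way. The snippet builds the JSON delete command understood by Solr's update handler, assuming the page path was used as the document id at index time. The class name is hypothetical, and the escaping is deliberately minimal (quotes and backslashes only).

```java
final class SolrDeletePayload {
    // Builds a delete-by-id command for a POST to /solr/<core>/update.
    static String deleteById(String id) {
        // Minimal JSON escaping; page paths normally contain no control characters.
        String escaped = id.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"delete\":{\"id\":\"" + escaped + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(deleteById("/content/site/en/home"));
    }
}
```

As with adds, the delete only takes effect in search results after a commit.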

We will publish a detailed explanation of each of the above tasks.
Subscribe to our blog for more updates about AEM CQ - Solr integrations.

JSON tips for Solr
Below is a set of configuration tips for working with Solr output as JSON.

1) How to set the Solr output to JSON?
Set the media-type in the XSLT stylesheet as below.
<xsl:output method="text" indent="no" media-type="application/json"/>

Add the line below to solrconfig.xml (in a multi-core configuration, add it to the solrconfig.xml of each core).
<queryResponseWriter name="xslt" class="org.apache.solr.response.XSLTResponseWriter"/>

Save your JSON response writer stylesheet (json.xsl in this example) in your /core/conf/xslt folder.

The query should have ‘&wt=xslt&tr=json.xsl’ appended to invoke the required JSON format.

2) How to skip an attribute if its value is empty
<xsl:if test="str[@name='articleDate'] != ''">
<xsl:text>","articleDate":"</xsl:text><xsl:apply-templates select="str[@name='articleDate']"/>
</xsl:if>

3) How to get the total number of results returned by the search
If the returned XML has a <result> node, the expression below gives the count.
<xsl:variable name="count" select="result/@numFound"/>

Now print the value as
<xsl:value-of select="$count"/>

4) How to add a comma after each JSON element except the last
Check the position, then insert the comma using the logic below.
<xsl:if test="position()!=last()">
<xsl:text>,</xsl:text>
</xsl:if>

5) How to check whether a value is greater than a number
<xsl:if test="$count &gt; 8">
<xsl:value-of select="$count"/><xsl:text> is greater than 8</xsl:text>
</xsl:if>

6) How to check whether a value exists
<xsl:if test="str[@name='id']">
<xsl:text> id exists</xsl:text>
</xsl:if>

FAQ about Solr AEM integrations:
What are the differences between embedded and remote Solr usage in AEM?
  • Embedded Solr is recommended only for development purposes. Production search should be implemented using a remote/external Solr, which makes the solution more scalable.
  • If the site has a small amount of content, embedded Solr could be used, but for larger content external Solr is recommended.
  • With external Solr we have more control over the schema, index options, field boosting, and other direct configuration.
  • External Solr is recommended when content from third-party applications, not only from AEM, needs to be indexed.
Difference between Solr & Lucene
Lucene is the core search library, and Solr is a search server built on top of Lucene. Reading data from a Lucene index requires writing programs against its API, whereas Solr provides HTTP endpoints and an admin UI, making it easier to read indexed data.

Advantages of using Solr search
The major advantages of using Solr as an indexing/search engine with AEM are given below.
  • Quick learning curve
  • Horizontal & vertical scaling through SolrCloud
  • Clustering through Apache ZooKeeper
  • Rich full-text query syntax
  • Highly configurable relevance and indexing
  • Plugin architecture for query parsing, searching & ranking, indexing


By aem4beginner
