March 15, 2020

Comprehensive Guide on AEM Link Checker


The AEM link checker validates all internal and external links on a page. Its main purpose is to spare content authors from worrying about bad or broken links on the publish environment. It also lets authors view a list of all valid and invalid links on the website in a single place.

After completing this tutorial you will have a clear understanding of:
  • How the AEM external link checker works.
  • How the AEM internal link checker works.
  • How to fix broken links that the link checker is not able to validate.
  • The difference between the link checker and a link rewriter.
  • How to disable the link checker in AEM.
AEM External link checker:
The AEM Link Checker is based on an event handler that is triggered when /content or any of its child nodes is created or updated. All content under the selected root path is parsed and its links are validated. Link validation runs asynchronously in the background, and the HTML is updated based on the verification results.

Note: If you have a large repository (/content) whose links are updated frequently, using the link checker is not advised for performance reasons. It is triggered periodically and traverses the whole repository to validate links, which may slow down your author instance.

Now let's see how the AEM link checker works:
  • As soon as the author saves a link on the page, using either the RTE or a custom component, the link checker event handler is triggered.
  • The link checker event handler traverses the /content node and checks for new or updated links. Once it finds one, it stores that mapping under the /var/linkchecker cache folder.

  • Control then passes to the Day CQ Link Checker Service, which checks its scheduler.period configuration. Once that period has elapsed, the scheduler validates the syntax and structure of the link against the configured settings, such as the special prefixes to ignore during validation and the pattern the link checker should use to verify the syntax of the URL.

  • Once the syntax is validated, the results are pushed to /etc/linkchecker.html. The links remain in a pending state until the Day CQ Link Checker Task scheduler validates them by making an HTTP GET request.

  • The Day CQ Link Checker Task scheduler runs periodically to recheck the valid and invalid links listed under /etc/linkchecker.html.
An administrator can configure how often this scheduler runs by updating the Scheduler Period property; its default value is 3600 seconds.

Once triggered, it removes all invalid or unreachable links from /etc/linkchecker.html.
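
The flow described above (a saved link is recorded as pending, and a periodic task later decides whether it is valid) can be pictured with a small toy model. This is only a sketch for illustration and not how AEM stores or checks links: the LinkCache class, its record and run_task methods, and the is_reachable callback are all hypothetical names, and the real task would make an HTTP request where the callback stands in here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

PENDING, VALID, INVALID = "pending", "valid", "invalid"


@dataclass
class LinkCache:
    """Toy stand-in for the link cache (hypothetical, not AEM's data model)."""
    status: Dict[str, str] = field(default_factory=dict)

    def record(self, url: str) -> None:
        # A newly saved link starts as pending; a known link keeps its status.
        self.status.setdefault(url, PENDING)

    def run_task(self, is_reachable: Callable[[str], bool]) -> None:
        # One scheduler run: recheck every recorded link, whatever its status.
        for url in list(self.status):
            self.status[url] = VALID if is_reachable(url) else INVALID


cache = LinkCache()
cache.record("https://example.com/ok")
cache.record("https://example.com/missing")
# The lambda fakes the reachability check so the sketch runs offline.
cache.run_task(lambda url: url.endswith("/ok"))
```

Passing the reachability check in as a function keeps the sketch runnable without network access, and makes it clear that only the status bookkeeping is being illustrated.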

The screenshot of http://localhost:4502/etc/linkchecker.html below shows how the values are fetched from /var/linkchecker and how the link checker list is updated. You can also request revalidation and refresh the status of the links there.


After validation, invalid external links are displayed as shown below:


The AEM link checker is configured using the following four services:
  • Day CQ Link Checker Info Storage Service – configures the link cache size. The default is 500.
  • Day CQ Link Checker Service – configures the frequency of the background check. The default interval is 5 seconds.
  • Day CQ Link Checker Task – configures the frequency of the background check that validates links.
  • Day CQ Link Checker Transformer – configures all the elements that need to be transformed and rewritten by the link checker.
AEM internal link checker:
Internal links (repository links, e.g. /content/we-retail/ca) are validated as soon as the content author adds them to the page, using either the RTE or a custom component. If a URL is no longer valid after validation, the link is removed on the publisher or shown as a broken link on the author.


Fixing broken links that link checker is not able to validate:
Sometimes a link is not available on publish even though it is valid. This can happen because the AEM link checker checks links automatically and will not publish a link it considers broken. Usually this is helpful, since you have a self-monitoring system that prevents you from publishing broken links. It becomes a problem when you know the link is correct but AEM still treats it as broken and does not publish it.

There are two types of links for which the link checker requires configuration:
  • Links that have a special prefix (e.g. href="tel:123-123-1234" or href="*|something|*").
  • Links containing a query parameter that is filled in during post-processing, which you want to mark as always valid or exclude from validation.

Links having a special prefix:
  • Go to http://localhost:4502/system/console/configMgr.
  • Search for “Day CQ Link Checker Service” and update the Special Link Prefix property.
  • For example, when we add “tel:” as a prefix, the link checker will not check or rewrite such links during syntax and structure validation. By default, a few prefixes are already configured here: javascript:, data:, mailto:, #, <!--, ${
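
To make the effect of the special prefixes concrete, here is a minimal sketch of a prefix test, assuming the check is a plain "starts with" comparison. The function name is_skipped is hypothetical and this is not the link checker's actual code; the prefix list simply repeats the defaults mentioned above, plus the tel: prefix from the example.

```python
# Default prefixes listed in the article, plus "tel:" added as in the example.
SPECIAL_PREFIXES = ("javascript:", "data:", "mailto:", "#", "<!--", "${", "tel:")


def is_skipped(href: str, prefixes=SPECIAL_PREFIXES) -> bool:
    """Return True if href begins with one of the configured special prefixes."""
    # str.startswith accepts a tuple and matches if any entry is a prefix.
    return href.startswith(tuple(prefixes))


print(is_skipped("tel:123-123-1234"))
print(is_skipped("http://www.google.com/"))
```

With this list, the tel: link is left alone, while the regular http link would still go through validation.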


Links containing a variable that is updated during post-processing:
This change has to be made in code: add the x-cq-linkchecker attribute to the <a> tag markup. This attribute tells AEM how to process the anchor tag. Let's look at it in more detail:
  • Add x-cq-linkchecker="valid" to the <a> tag to make sure that the link is always marked as valid by CQ. In this case, the link checker will check the link but will mark it as valid (e.g. <a x-cq-linkchecker="valid" …>).
  • Alternatively, use x-cq-linkchecker="skip" in the <a> tag. In this case, the link checker will not check the link's validity at all (e.g. <a x-cq-linkchecker="skip" …>).
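
As an illustration of the two attribute values above, the sketch below reads the x-cq-linkchecker attribute from each anchor tag and reports which mode applies. It uses Python's standard html.parser and is only a way to inspect markup, not AEM's implementation. The names AnchorModeParser and classify_anchors are hypothetical, and "check" is just a label chosen here for anchors without the attribute.

```python
from html.parser import HTMLParser


class AnchorModeParser(HTMLParser):
    """Collects (href, mode) for every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # No attribute (or an empty one) is labeled "check" in this sketch.
        mode = attr_map.get("x-cq-linkchecker") or "check"
        self.anchors.append((attr_map.get("href"), mode))


def classify_anchors(html: str):
    """Return a list of (href, mode) tuples for all anchors in html."""
    parser = AnchorModeParser()
    parser.feed(html)
    parser.close()
    return parser.anchors


sample = (
    '<a x-cq-linkchecker="valid" href="/page?id=${id}">A</a>'
    '<a x-cq-linkchecker="skip" href="*|something|*">B</a>'
    '<a href="/content/we-retail/ca">C</a>'
)
for href, mode in classify_anchors(sample):
    print(mode, href)
```

A check like this can be handy when reviewing component markup to see which anchors opt out of validation.
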
Difference between aem link checker and link rewriter:
A link checker is built for checking the validity of URLs. The link checker scheduler runs periodically to validate the URLs found under /content in the repository and saves the results under the /var/linkchecker cache folder. All links that have been checked or are pending can be seen at /etc/linkchecker.html. Links that are no longer valid after validation are removed on the publisher or shown as broken links on the author.

A link rewriter is used when you want to rewrite URLs while the HTML is being rendered. It parses the HTML and rewrites the URLs inside it. If you need custom URL rewriting, you can write your own link rewriter by implementing the org.apache.sling.rewriter.Transformer interface.
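
The idea of rewriting, as opposed to checking, can be sketched in a few lines. A real AEM rewriter is a Java class plugged into the Sling rewriter pipeline; the Python below only illustrates the concept of transforming every href in rendered HTML. The rewrite_links and shorten functions and the /content/we-retail root are assumptions made for this example, and the regular expression handles double-quoted href values only.

```python
import re

# Simplistic matcher: double-quoted href attribute values only.
HREF_RE = re.compile(r'href="([^"]*)"')


def rewrite_links(html: str, rewrite) -> str:
    """Apply rewrite(url) to every double-quoted href value in html."""
    return HREF_RE.sub(lambda m: f'href="{rewrite(m.group(1))}"', html)


def shorten(url: str) -> str:
    # Hypothetical rule: drop an assumed /content/we-retail root from internal links.
    root = "/content/we-retail"
    return url[len(root):] if url.startswith(root + "/") else url


page = '<a href="/content/we-retail/ca.html">CA</a> <a href="http://www.google.com/">G</a>'
print(rewrite_links(page, shorten))
```

Note that nothing here tests whether a link is reachable. The rewriter only changes how the URL is written out, which is the key difference from the link checker.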



Disable link checker in AEM:
There are two ways to disable the link checker in AEM: through the Felix console, or by overriding the Day CQ Link Checker Service regular expression. Follow the steps below to disable the AEM link checker.

Disabling all link checking by Felix console configuration:
  • Go to http://localhost:4502/system/console/configMgr.
  • Find the “Day CQ Link Checker Transformer” configuration.
  • Check the “Disable Checking” box and save.
  • Go to /crx/explorer and log in as admin.
  • Open “Content Explorer”.
  • Once all the changes are made, browse to /var/linkchecker.
  • Right-click the node and select “Delete Recursively”.
  • Click “Save All”.

Note: This configuration lets you disable either link checking only, or both link checking and link rewriting.

Disabling link checking of URLs using regular expressions:
The AEM link checker can be configured to skip either all links or only the links matching a regular expression.

The following configuration is specific to the publish instance. To configure the author instance, change the configuration path from ../config.publish/.. to ../config.author/… . To configure both author and publish, change the configuration path to ../config/...


  • Log in to crx/de as admin.
  • Create a configuration node (with node type sling:OsgiConfig) in the project (/apps/<project-name>/config.publish/{OSGi service PID}).
  • Alternatively, copy the one from /libs/cq/linkchecker/config/com.day.cq.rewriter.linkchecker.impl.LinkCheckerImpl into the config folder of your choice (for example /apps/myapp/config.publish).
  • Change the property service.check_override_patterns from “^system/” to “^.”
^system/ :- This expression means ignore checking and rewriting of all external links that start with system/.

^. :- This expression means ignore checking and rewriting of all external links.

^http://www\.google\.com/ :- This expression means ignore checking and rewriting of links that start with http://www.google.com/.
  • Delete all nodes under /var/linkchecker to stop the link checker from periodically rechecking URLs.
  • If the configuration was done on the author, create a package and install it on your publish instances as well.
Note: If you use “^.”, all link checking and link rewriting will be disabled.
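
The three override patterns above can be tried out with a short regular expression sketch. It assumes that a link is excluded as soon as any pattern is found in it (search semantics), which is an assumption about how the patterns are applied rather than something confirmed here. The function name is_overridden is hypothetical, and Python's re module is used in place of Java's regex engine, which treats these particular patterns the same way.

```python
import re


def is_overridden(link: str, patterns) -> bool:
    """Return True if any override pattern is found in the link."""
    # re.search is used, so a pattern only anchors at the start if it begins with ^.
    return any(re.search(pattern, link) for pattern in patterns)


links = ["system/console", "http://www.google.com/search", "http://example.com/"]
for pattern in [r"^system/", r"^.", r"^http://www\.google\.com/"]:
    print(pattern, [link for link in links if is_overridden(link, [pattern])])
```

Running it shows why “^.” switches everything off: the pattern matches any link that has at least one character, while the other two patterns only exclude links with the given beginning.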



By aem4beginner
