May 15, 2020

AEM Dispatcher Cache Invalidation for Multiple Cache Farms

Imagine you have an Adobe Experience Manager (AEM) setup hosting multiple websites. This is where AEM really shines, and it is common practice at most companies that host their websites with AEM. Problems arise when each website also has a different content structure in AEM, along with different cache settings and other requirements.

Setting up the webserver
Setting up a web server with the AEM dispatcher module for all your websites is fairly simple when every virtual host shares the same configuration for statfileslevel, publish servers (renders), cache directory (docroot) and invalidation rules.

Here is what an example setup might look like:
2 or more publish instances
2 or more web servers with the dispatcher module sharing the same configuration
a load balancer in front of the dispatcher web servers
the flush request (to invalidate the cache with the flush replication agent) is triggered by the author instance

Figure: An example of a multiple dispatcher web server setup
And these are the requirements for the above-mentioned setup:
separate cache directories (docroot) for each website/dispatcher farm
For example, due to security requirements.
different statfileslevel settings for each website/dispatcher farm
Due to different content structures in AEM.
different invalidation and caching rules for each website/dispatcher farm
Your websites may have different cache settings for images, HTML or other resources.

Further restrictions for setting up the webserver

For the configuration of the flush agent, you cannot use the domain name of the load balancer.
With the load balancer in front of your dispatcher web servers, you cannot know which web server's cache a flush request will invalidate.
You could use custom headers so that the load balancer can determine which server should receive the request, but that can be a long road if you do not have full control of the load balancer yourself.

To meet these requirements, you need to split the dispatcher configuration into multiple farms. You can use the hostname globbing in the dispatcher module to determine how each request should be handled.

The setup and the solution I describe here may be a very special case regarding the setup and number of restrictions, but I may not be the only one running into it.

Solution for this setup
Luckily, the invalidation request carries the CQ-Path header, which holds the CRX path of the content that should be flushed. We can use it to determine which website's cache directory should be invalidated.
We configure one flush replication agent for every web server instance, each pointing directly at its dispatcher web server.
So now we know the content path and the website it belongs to.
Changing the Host header of the invalidation request in the web server does the rest: the request is routed to the matching farm, and the invalidation works properly.
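To make the shape of such a flush request concrete, here is a minimal Python sketch that builds (but does not send) one invalidation request per dispatcher web server. The server URLs and the content path are hypothetical placeholders, and the CQ-Action and CQ-Handle headers are included on the assumption that your flush agent sends them alongside CQ-Path, as a default flush agent typically does.

```python
from urllib.request import Request

# Hypothetical dispatcher web servers; a real setup would list its own hosts.
DISPATCHER_SERVERS = [
    "http://dispatcher-1.example.internal",
    "http://dispatcher-2.example.internal",
]


def build_flush_request(dispatcher_base_url, content_path):
    """Build (without sending) a flush request for one dispatcher web server."""
    url = dispatcher_base_url.rstrip("/") + "/dispatcher/invalidate.cache"
    headers = {
        "CQ-Action": "Activate",    # assumed: the action a flush agent sends on activation
        "CQ-Handle": content_path,  # assumed: same path as CQ-Path
        "CQ-Path": content_path,    # the header the web server rule inspects
    }
    return Request(url, headers=headers, method="POST")


def build_flush_requests(content_path):
    """One request per web server, bypassing the load balancer."""
    return [build_flush_request(server, content_path) for server in DISPATCHER_SERVERS]
```

Passing one of these requests to urllib.request.urlopen would send it. This can be handy for testing the web server rule by hand without activating a page on the author instance.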

The solution in technical details
The following config can easily be added to the webserver configuration as it is processed before the request hits the dispatcher module.

With LocationMatch we only handle invalidation requests, so we do not interfere with the regular requests that serve content.
With SetEnvIf we set the environment variable FLUSH_HOST, depending on the CQ-Path header, to the domain name of the matching website.
This can be easily extended for a large number of domains and will work as long as the content path is different for each of the domains.

<LocationMatch "^/dispatcher/invalidate\.cache$">
     # Requires mod_setenvif and mod_headers.
     # domain A
     SetEnvIf CQ-Path ".*/content-path-of-domain-A/.*" FLUSH_HOST=domain-A
     # domain B
     SetEnvIf CQ-Path ".*/content-path-of-domain-B/.*" FLUSH_HOST=domain-B
     # Rewrite the Host header only if one of the rules above matched
     RequestHeader set Host %{FLUSH_HOST}e env=FLUSH_HOST
 </LocationMatch>
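Before deploying rules like these, it can help to check the patterns against a few sample content paths. The following Python sketch models the mapping under the assumption that each SetEnvIf pattern is applied as an unanchored regular expression search and that a later matching rule overrides an earlier one. The rule table and function name are illustrative only and not part of any AEM or Apache API.

```python
import re

# Illustrative rule table mirroring the SetEnvIf lines: (CQ-Path pattern, Host value).
FLUSH_HOST_RULES = [
    (re.compile(r".*/content-path-of-domain-A/.*"), "domain-A"),
    (re.compile(r".*/content-path-of-domain-B/.*"), "domain-B"),
]


def resolve_flush_host(cq_path):
    """Return the Host value the rules would set, or None if no rule matches."""
    host = None
    for pattern, candidate in FLUSH_HOST_RULES:
        if pattern.search(cq_path):
            # A later matching rule overrides an earlier one,
            # like successive SetEnvIf directives setting the same variable.
            host = candidate
    return host


if __name__ == "__main__":
    for path in (
        "/content/content-path-of-domain-A/en/home",
        "/content/content-path-of-domain-B/de",
        "/content/other-site/en",
    ):
        print(path, "->", resolve_flush_host(path))
```

One detail this model makes visible: because the patterns end in "/.*", a flush of the site root node itself (a path ending in /content-path-of-domain-A with no trailing slash) does not match any rule. In that case the Host header is left unchanged.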

This solution eliminates the need to set up more web server instances than necessary to fulfil the requirements mentioned in this article. The AEM dispatcher setup is also described in detail on the Adobe website.


By aem4beginner
