May 11, 2020

AEM Dispatcher. Part 4: Cache invalidation



Many of us have probably run into a situation where the dispatcher serves an old version of the code. This article describes how to avoid this while still taking advantage of dispatcher caching.

Invalidation is a mechanism for marking cached resources as obsolete. There are tools for both automatic and manual invalidation. But first let's set up the initial configuration of the invalidation section in the dispatcher configuration file, then study how invalidation works at the low level, and finally return to the invalidation tools.

Invalidation section initial settings

Inside the /cache section there is an /invalidate block, which determines the cached files that may be automatically invalidated when content is updated. For example, the following configuration invalidates all HTML pages:

/cache
{
    /invalidate
    {
        /0000  { /glob "*" /type "deny" }
        /0001  { /glob "*.html" /type "allow" }
    }
}
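
To make the effect of the rule order above more concrete, here is a minimal Python sketch of how such allow/deny glob rules could be evaluated. It assumes that rules are checked in file order and that the last matching rule wins; the function name is_auto_invalidated and the sample paths are hypothetical, and Python's fnmatch is only an approximation of dispatcher glob matching.

```python
from fnmatch import fnmatchcase

# Rules mirror the /invalidate block above: (glob, type), in file order.
RULES = [
    ("*", "deny"),
    ("*.html", "allow"),
]

def is_auto_invalidated(path, rules=RULES):
    """Return True if the last rule whose glob matches `path` is an allow rule."""
    decision = "deny"  # assumption: nothing is auto-invalidated unless a rule allows it
    for glob, rule_type in rules:
        if fnmatchcase(path, glob):
            decision = rule_type  # a later matching rule overrides an earlier one
    return decision == "allow"

print(is_auto_invalidated("/content/geometrixx/en/products.html"))  # True
print(is_auto_invalidated("/content/dam/geometrixx/banner.png"))    # False
```

With these two rules only paths ending in .html are auto-invalidated; everything else stays in the cache until it is explicitly deleted.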

With automatic invalidation, dispatcher doesn't delete cached files after content is updated, but checks their validity the next time they are requested. Documents in the cache that are not auto-invalidated remain in the cache until a content update explicitly deletes them. For demonstration purposes, let's allow the whole cache to be invalidated automatically:

/cache
{
    /invalidate
    {
        /0000  { /glob "*" /type "allow" }
    }
}

Restart the httpd server after updating the /invalidate section so that the new settings take effect.

Invalidation in depth

At the low level, dispatcher uses special empty files which are named ".stat" by default. The default setting is /statfileslevel "0", which means that only one stat file is used, placed at the root of the htdocs directory. If the modification time of the stat file is newer than the modification time of a cached resource, dispatcher considers that resource obsolete (invalidated).
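
The comparison described above can be sketched in a few lines of Python. This is an illustration of the rule, not dispatcher's actual implementation: the function name is_invalidated, the temporary docroot, and the fixed timestamps are all assumptions made for the demo.

```python
import os
import tempfile
from pathlib import Path

def is_invalidated(cached_file, stat_file):
    """A cached file is treated as stale when the stat file is newer than it."""
    stat_path = Path(stat_file)
    if not stat_path.exists():
        return False  # assumption: with no stat file nothing is marked stale
    return stat_path.stat().st_mtime > Path(cached_file).stat().st_mtime

def demo():
    with tempfile.TemporaryDirectory() as docroot:
        page = Path(docroot, "products.html")
        stat = Path(docroot, ".stat")
        page.write_text("<html></html>")
        before = is_invalidated(page, stat)  # no .stat file yet
        stat.touch()
        # Set explicit timestamps so the demo does not depend on clock resolution.
        os.utime(page, (1000, 1000))
        os.utime(stat, (2000, 2000))
        after = is_invalidated(page, stat)   # .stat is now newer than the page
        return before, after

print(demo())  # (False, True)
```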

For example, request the page http://localhost/content/geometrixx/en/products.html so that the page and its resources are cached under htdocs.


Let's invalidate them with the low-level stat file mechanism. Create an empty file named ".stat" at the root of your htdocs directory (if the file already exists, update its modification time instead).


The stat file's modification time is now newer than the modification time of the cached resources, which tells dispatcher that all of them are obsolete. This is the invalidation mechanism at its lowest level. If we now visit the page http://localhost/content/geometrixx/en/products.html again, the requested cached resources will be updated.


This example demonstrates the default invalidation scheme with /statfileslevel "0". Let's look at how to configure invalidation in more detail with the /statfileslevel setting.

Setting /statfileslevel

You can use the /statfileslevel property of the dispatcher configuration file to invalidate cached files selectively, according to their path. The mechanism follows these rules:
  • Dispatcher creates .stat files in each folder from the docroot folder down to the level that you specify. The docroot folder is level 0.
  • When a file is updated, dispatcher locates the folder on the file's path that is at the statfileslevel and invalidates all files below that folder.
  • If the level of the updated file's folder is less than statfileslevel, all files in that folder are invalidated. Files in the folders below it are not invalidated.
  • When a file is updated, the .stat file in every folder from the file's folder up to the docroot inclusive is touched, so the files located directly in those folders are invalidated as well.
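
One way to read the rules above is as a function that, for an updated path and a given statfileslevel, lists the folders whose .stat file gets touched. The following Python sketch works under that reading; the function name touched_stat_dirs and the deeper sample path are hypothetical.

```python
from pathlib import PurePosixPath

def touched_stat_dirs(updated_path, statfileslevel):
    """List folders (relative to the docroot) whose .stat file would be touched."""
    folders = PurePosixPath(updated_path.lstrip("/")).parent.parts
    # Stat files exist only down to statfileslevel, so stop at the shallower
    # of the file's own folder level and the configured level.
    depth = min(len(folders), statfileslevel)
    dirs = ["/"]  # level 0 is the docroot itself
    for level in range(1, depth + 1):
        dirs.append("/" + "/".join(folders[:level]))
    return dirs

print(touched_stat_dirs("/content/geometrixx/en/products.html", 0))
# ['/']
print(touched_stat_dirs("/content/geometrixx/en/products.html", 4))
# ['/', '/content', '/content/geometrixx', '/content/geometrixx/en']
```

With level 0 every update touches the single root stat file, while with level 4 an update to a page in the en folder touches the stat files in en and in each of its ancestor folders only.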

To better understand the /statfileslevel rules, let's consider a couple of examples. Our default demonstration case uses /statfileslevel "0".


There is only one stat file, in the root folder of our docroot, and its scope of responsibility is the whole file tree under htdocs. If any file in this tree has a modification time older than that of the stat file, dispatcher considers the file invalidated.

If we set /statfileslevel "4", there are stat files at all levels from 0 (root) to 4 inclusive.

Stat files at levels lower than 4 are responsible only for the files located directly in their own directory. For example, if the stat file inside the content/geometrixx/en folder is newer than a file in that folder, that file is invalidated, but the validity of files in all other folders is determined by other stat files.

Only the stat files at the level equal to the statfileslevel value (level 4 in our case) are responsible for a whole subtree, which starts at the folder containing the stat file and extends down to the lower levels of the file tree. For example, if the stat file inside the content/geometrixx/en/products folder has a modification time newer than a file anywhere in that subtree, including the products folder itself, dispatcher considers that file invalidated. The validity of files outside this subtree is determined by other stat files.
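
The scope of responsibility described above can also be expressed from the point of view of a cached file: which stat file decides whether it is still valid? The sketch below is a hedged illustration of that lookup; the function name governing_stat_dir and the overview.html path are hypothetical.

```python
from pathlib import PurePosixPath

def governing_stat_dir(cached_path, statfileslevel):
    """Return the folder (relative to the docroot) whose .stat file governs a cached file."""
    folders = PurePosixPath(cached_path.lstrip("/")).parent.parts
    # Files deeper than statfileslevel fall under the stat file at that level;
    # shallower files are governed by the stat file in their own folder.
    depth = min(len(folders), statfileslevel)
    return "/" + "/".join(folders[:depth])

print(governing_stat_dir("/content/geometrixx/en/products.html", 4))
# /content/geometrixx/en
print(governing_stat_dir("/content/geometrixx/en/products/triangle/overview.html", 4))
# /content/geometrixx/en/products
print(governing_stat_dir("/content/geometrixx/en/products.html", 0))
# /
```
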
Automatic invalidation and flush agents

For automatic invalidation you can enable a flush agent on either the author or the publish instance. The publish flush agent is recommended for more robust auto-invalidation, because using the author flush agent may cause the following issues:
  • The Dispatcher must be reachable from the authoring instance. If your network (e.g. the firewall) is configured such that access between the two is restricted, this may not be the case.
  • Publication and cache invalidation take place at the same time. Depending on the timing, a user may request a page just after it was removed from the cache and just before the new page is published. AEM then returns the old page and the Dispatcher caches it again. This is more of an issue for large sites.

The publish flush agent is located at http://localhost:4503/etc/replication/agents.publish/flush.html

To enable your publish flush agent, click the "Edit" button and select the "Enabled" checkbox.

On the Transport tab, update the port in the URI and set its value to 80.

Save your changes and you will see that the publish flush agent has been enabled.

Manual invalidation requests

You can send the following requests manually:
  • to invalidate your cached resources:
    POST /dispatcher/invalidate.cache HTTP/1.1
    CQ-Action: Activate
    CQ-Handle: path-pattern
    Content-Length: 0

  • to delete and recache files:
    POST /dispatcher/invalidate.cache HTTP/1.1
    CQ-Action: Activate
    Content-Type: text/plain
    CQ-Handle: path-pattern
    Content-Length: numchars in body

    page_path0
    page_path1
    …
    page_pathn
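
As a rough illustration, the two requests above could be assembled and sent from Python with the standard http.client module. This is a sketch under assumptions: the helper names build_invalidation and send_invalidation are hypothetical, the paths to recache are assumed to be sent one per line, and host and port depend on your own setup.

```python
import http.client

INVALIDATE_PATH = "/dispatcher/invalidate.cache"

def build_invalidation(handle, recache_paths=()):
    """Build headers and body for an invalidation request.

    With no recache_paths this is the plain invalidation request;
    otherwise the body lists the pages to delete and recache.
    """
    body = "\n".join(recache_paths).encode("utf-8")  # assumption: one path per line
    headers = {
        "CQ-Action": "Activate",
        "CQ-Handle": handle,
        "Content-Length": str(len(body)),
    }
    if recache_paths:
        headers["Content-Type"] = "text/plain"
    return headers, body

def send_invalidation(host, port, handle, recache_paths=()):
    """Send the request to the dispatcher and return the HTTP status code."""
    headers, body = build_invalidation(handle, recache_paths)
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", INVALIDATE_PATH, body=body, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, send_invalidation("localhost", 80, "/content/geometrixx/en/products") would issue the first form of the request against a dispatcher listening on port 80.
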
Summary
We now know the invalidation mechanism in depth, the flush agents for auto-invalidation when publishing pages, and the requests for manual invalidation.

Detailed and useful documentation can be found on these pages:
http://docs.adobe.com/docs/en/dispatcher/disp-config.html
http://docs.adobe.com/docs/en/dispatcher/page-invalidate.html



By aem4beginner
