May 15, 2020

Analyzing Issues Properly in AEM

We have all seen it: an Adobe Experience Manager (AEM) instance under heavy load that is unusable, unresponsive, and possibly even sliding into a deadlock or deadlock-like state.

The question is where to start analyzing the data and how to look at it from a proper debugging perspective.

By default, AEM delivers a lot of information that often goes unused. To investigate and debug properly, it is crucial to understand the data AEM provides and to use tools that can interface with the Java Virtual Machine (JVM).

A good start when dealing with an unresponsive AEM instance is having a look at its log files, such as:
  1. Access log
  2. Request log
  3. Error log
  4. Garbage Collection log (if configured)
  5. Other logs you may have configured by OSGi configuration in the logging section
In this article, I would like to give a little more information on a few of the above-mentioned log files and on other tools that have helped me solve issues in the past and sped up my debugging process immensely.

Request log
The request log, for example, can provide very valuable information about the general behavior of the instance before and after an issue occurred.

Jörg Hoh from Adobe has written a simple, but great, tool that helps you visualize this information.
The graph-request-log.pl script leverages the OOTB rlog.jar, delivered with AEM, which can filter out and quickly produce a readable output from your request.log.

Stagnation in the responses indicates that AEM was not able to handle the requests at a given time. That points you toward a deeper analysis, for example using thread dumps.
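To illustrate what "stagnation in the responses" looks like in the raw data, here is a minimal Python sketch that pairs each request line with its response line and reports slow and unanswered requests. It assumes the default request.log layout, where a request is logged with "->" and its response with "<-" under the same bracketed ID, followed by the duration in milliseconds. The function name summarize and the 1000 ms threshold are illustrative choices, not part of AEM or of the script mentioned above.

```python
import re

# Assumed default request.log layout (verify against your own log):
#   15/May/2020:10:00:00 +0200 [42] -> GET /content/site/en.html HTTP/1.1
#   15/May/2020:10:00:03 +0200 [42] <- 200 text/html 3000ms
LINE = re.compile(r"^(\S+) ([+-]\d{4}) \[(\d+)\] (->|<-) (.*)$")
DURATION = re.compile(r"(\d+)ms$")


def summarize(lines, slow_ms=1000):
    """Pair requests with responses; report slow and unanswered requests."""
    pending = {}  # request ID -> request line ("GET /path HTTP/1.1")
    slow = []     # (duration in ms, request line)
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        _, _, request_id, direction, rest = match.groups()
        if direction == "->":
            pending[request_id] = rest
            continue
        request = pending.pop(request_id, None)
        duration = DURATION.search(rest)
        if request is not None and duration and int(duration.group(1)) >= slow_ms:
            slow.append((int(duration.group(1)), request))
    slow.sort(reverse=True)  # slowest first
    # Whatever is still pending never received a response in this log excerpt.
    return {"slow": slow, "unanswered": list(pending.values())}
```

A growing "unanswered" list around the time of the incident is a hint that request threads were stuck, which is the moment to take thread dumps.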

Error log
When looking at the error log of production instances, I often see logs flooded with stack traces. Some are helpful, but an awful lot of the time they are just leftovers from development.

Reading stack traces properly (not from top to bottom but from bottom to top) can greatly increase your chance of finding the root issue, because the innermost "Caused by" entry sits at the end of the trace. At the very least, it will help you point the development team in the right direction.
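As a small illustration of bottom-up reading, the following Python sketch returns the innermost exception of a Java stack trace, that is, the last "Caused by:" line, falling back to the first line when the trace has no nested cause. The function name root_cause is made up for this example.

```python
def root_cause(stack_trace):
    """Return the innermost exception line of a Java stack trace."""
    first_line = ""
    last_cause = ""
    for line in stack_trace.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not first_line:
            first_line = stripped  # the outermost exception
        if stripped.startswith("Caused by:"):
            # Later causes overwrite earlier ones, so the last one wins.
            last_cause = stripped[len("Caused by:"):].strip()
    return last_cause or first_line
```

Running this over the traces in error.log and grouping by the result is a quick way to see which root causes dominate the log.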

The error log also provides some information about the bundle cache and its cache hit/miss ratios. Tuning the bundle cache based on these numbers can improve performance. Adobe provides in-depth information about the CRX bundle cache on their website.

Nmon
Nmon was originally developed by IBM and was released to the public a few years ago. It is a very simple but very powerful monitoring tool that might be your lifesaver when analyzing issues on the system.

It provides you with all the standard OS-level data and even goes one step further, serving detailed information like IO statistics.
Relating this information to AEM's logs, whether you have visualized them as graphs or are just looking at them in raw format, can make your debugging life a lot easier.

Configuring nmon is simple and won’t put additional strain on existing resources. It can run in daemon mode, which saves data to disk for later analysis. Merging nmon files with the OOTB tool nmonmerge and loading them into the nmon visualizer will quickly give you an overview of the system behavior and even the historic trends you will heavily depend on when analyzing an issue.
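If you want to line up nmon data with AEM log timestamps yourself, a rough Python sketch could look like the one below. It assumes the usual comma-separated nmon capture file, in which "ZZZZ" rows map a snapshot label such as T0001 to a clock time and "CPU_ALL" rows carry the overall CPU figures, with an "Idle%" column named in the section's header row. Check these assumptions against your own capture file, since column sets differ between platforms and versions. The function name cpu_busy_by_time is hypothetical.

```python
import csv
import re

SNAPSHOT = re.compile(r"T\d+")  # snapshot labels such as T0001


def cpu_busy_by_time(lines):
    """Map each snapshot's clock time to CPU busy percent (100 - Idle%)."""
    times = {}       # snapshot label -> "HH:MM:SS"
    idle = {}        # snapshot label -> idle percentage
    idle_col = None  # position of the Idle% column, taken from the header row
    for row in csv.reader(lines):
        if len(row) < 3:
            continue
        if row[0] == "ZZZZ":
            times[row[1]] = row[2]
        elif row[0] == "CPU_ALL":
            if SNAPSHOT.fullmatch(row[1]):
                if idle_col is not None:
                    idle[row[1]] = float(row[idle_col])
            elif "Idle%" in row:
                idle_col = row.index("Idle%")  # header row of the section
    return {times[t]: round(100.0 - v, 1) for t, v in idle.items() if t in times}
```

The resulting time-to-load mapping can then be compared with the timestamps of slow requests in request.log.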

Java development kit (JDK) tools
Java provides many great tools to investigate issues with your AEM instance. To harness their full potential, JMX should be enabled. To avoid security risks, however, the JMX endpoint must be protected with authentication and other precautions. Otherwise, it might expose operations that can disrupt your website.

VisualVM
Tools such as VisualVM also help identify root causes. A VisualVM plugin everyone should have in their arsenal for proper debugging and analysis is the TDA (Thread Dump Analyzer). There are also many other plugins that are vital to investigations, so have a stroll through the plugins page. A few to look out for are:
  • MBeans Browser
  • VisualGC
  • Tracer
  • Thread Inspector
  • OSGi Plugin
Running VisualVM locally on your notebook or on a management service allows you to remotely attach to the AEM instance and start monitoring system behavior on the fly.

JConsole provides many features similar to VisualVM but is not as powerful. A short comparison and further details can be found on the JDK tools page.

There are also many application monitoring solutions on the market that will support you in your investigations. Having a toolbox and understanding how to leverage those tools to their full potential will lead you to a successful analysis.

Debugging in AEM - final words
  • Use the tools AEM provides you with for analysis.
  • Have a solid understanding of stack trace reading.
  • Create and interpret thread dumps; in many cases they will lead you to the root cause.
  • Use other tools such as nmon or application monitoring.
  • It is always advisable to have your QA team check the error log as well, to ensure that it gets cleaned up properly before going into production.
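As a starting point for interpreting thread dumps, here is a minimal Python sketch that counts thread states in a textual dump (for example one produced by jstack) and lists the top stack frame of each BLOCKED thread. It relies on the "java.lang.Thread.State:" line that HotSpot prints for each thread. The function name summarize_dump is illustrative, and this is no substitute for a full analyzer such as TDA.

```python
import re
from collections import Counter

STATE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def summarize_dump(dump):
    """Count thread states and the top frames of BLOCKED threads."""
    states = Counter()
    blocked_at = Counter()
    awaiting_top_frame = False  # True right after a BLOCKED state line
    for line in dump.splitlines():
        stripped = line.strip()
        match = STATE.search(stripped)
        if match:
            states[match.group(1)] += 1
            awaiting_top_frame = match.group(1) == "BLOCKED"
        elif stripped.startswith('"'):
            awaiting_top_frame = False  # header line of the next thread
        elif awaiting_top_frame and stripped.startswith("at "):
            blocked_at[stripped[3:]] += 1  # only the top frame is counted
            awaiting_top_frame = False
    return states, blocked_at
```

If most request threads are BLOCKED at the same frame across several consecutive dumps, that frame is a strong candidate for the root cause.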


By aem4beginner
