April 19, 2020

Consuming AEM Logs into New Relic with FluentD

This fall, New Relic rolled out a new log analysis product called New Relic Logs, a further step toward an all-in-one observability solution for your infrastructure. Traditionally, when instrumenting something like Adobe Experience Manager, you’ve needed separate platforms for log aggregation and for APM (application performance monitoring): something like Splunk, Elastic (ELK), or Loggly for logs, and something like New Relic, Dynatrace, or AppDynamics for APM. Now, New Relic’s product promises to move you one step closer to a single product for all of those observability needs.
New Relic Logs: The Basics
As a few things to know about New Relic’s logging product:
  • New Relic Logs is not really designed as a full-scale replacement for an all-in-one logging solution like Splunk. However, if you’re moving up from, say, not having a log aggregator at all – it may well suit your needs.
  • It allows Logging and APM alerting to be configured using the same Alerting profiles. So, if you have your alerts already set up in New Relic, you can use logging events to power the same set of alerts, if you want.
  • New Relic did not author a standalone logging collector (like Filebeat or the Splunk Forwarder) to go with the Logs product. Instead, for new deployments, New Relic suggests FluentD as the log collector: a lightweight, relatively easy-to-configure log shipper for which New Relic has written a plugin to simplify things.
Steps for Consuming AEM Logs with FluentD
Here are the basic steps for getting AEM logs into New Relic, using FluentD, which at this writing is New Relic’s recommended method for shipping logs.

Provision your New Relic Account with Logs Access:
At the time of writing, you’ll need to separately request to get your New Relic account set up for Logs access. Once that’s done, you’ll be able to open Logs right from the main New Relic One screen.

NOTE: If your logs are already going to Logstash, or are on AWS and already flowing into CloudWatch, you’ll want to go here: https://docs.newrelic.com/docs/logs/enable-logs/enable-logs/enable-new-relic-logs – the steps below assume you’ve got AEM logs on disk in the /crx-quickstart/logs/ directory that you need to aggregate and ship off to New Relic.

Install td-agent on your AEM Hosts
FluentD is packaged by an outfit called Treasure Data into an easily installable agent called td-agent, which works well on RHEL/CentOS, Ubuntu, macOS, etc. It comes with init scripts, config files, loggers, and so on that make setup easy. Go to https://www.fluentd.org/download and grab the version of td-agent for the OS your AEM systems are running.
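
For example, on a RHEL/CentOS 7 host, the install is a one-line bootstrap script (this is the td-agent 3 script at the time of writing; check the download page for the right one for your OS and version):

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh
sudo systemctl enable td-agent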

Install the New Relic Plugin on td-agent
Once you’ve got td-agent installed, you need the New Relic plugin, which knows how to take data from td-agent and forward it properly into New Relic. Install it with:

td-agent-gem install fluent-plugin-newrelic

Install the Grok plugin for FluentD
To extract fields out of your logs and report them properly to New Relic (i.e. if you want to pull things like a hostname or clientip or loglevel out of your logs), you’ll need to define these using a simple macro markup called Grok, which will be fairly familiar if you’ve done any Logstash work. Install it for td-agent with:

td-agent-gem install fluent-plugin-grok-parser
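
You can quickly confirm both plugins landed in td-agent’s bundled Ruby with td-agent-gem list | grep fluent-plugin.

As a quick illustration of what Grok buys you: given a (made-up) AEM error.log line like

14.04.2020 12:34:56.789 *ERROR* [qtp1234-56] com.example.MyComponent Something went wrong

the pattern used for the error-log sources in the config below, %{GREEDYDATA:date} %{TIME} \*%{LOGLEVEL:loglevel}\* %{GREEDYDATA:message}, captures loglevel=ERROR and puts the rest of the line into message – which is what lets you facet on log level in New Relic later.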

Configure td-agent to grab your AEM logs
The td-agent process is configured with /etc/td-agent/td-agent.conf.

Below is a sample configuration which (basically) works, with a few caveats that I’m still working through with the New Relic folks.

## PUBLISH INSTANCE
<source>
  @type tail
  path /opt/aem/publish1/crx-quickstart/logs/access.log
  pos_file /var/log/td-agent/aem.access_log.pos
  tag publish1.access
  <parse>
    @type grok
    <grok>
      pattern %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{HTTPDATE:timestamp} "(?:%{WORD:method} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}
    </grok>
  </parse>
</source>

<filter publish1.access>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/publish1/crx-quickstart/logs/error.log
  pos_file /var/log/td-agent/aem.error_log.pos
  tag publish1.error
  <parse>
    @type grok
    <grok>
      pattern %{GREEDYDATA:date} %{TIME} \*%{LOGLEVEL:loglevel}\* %{GREEDYDATA:message}
    </grok>
  </parse>
</source>

<filter publish1.error>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/publish1/crx-quickstart/logs/request.log
  pos_file /var/log/td-agent/aem.request_log.pos
  tag publish1.request
  <parse>
    @type none
  </parse>
</source>

<filter publish1.request>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/publish1/crx-quickstart/logs/stdout.log
  pos_file /var/log/td-agent/aem.stdout_log.pos
  tag publish1.stdout
  <parse>
    @type none
  </parse>
</source>

<filter publish1.stdout>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

## AUTHOR INSTANCE
<source>
  @type tail
  path /opt/aem/author1/crx-quickstart/logs/access.log
  pos_file /var/log/td-agent/author.access_log.pos
  tag author1.access
  <parse>
    @type grok
    <grok>
      pattern %{IPORHOST:clientip} %{USER:ident} %{USER:auth} %{HTTPDATE:timestamp} "(?:%{WORD:method} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}
    </grok>
  </parse>
</source>

<filter author1.access>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/author1/crx-quickstart/logs/error.log
  pos_file /var/log/td-agent/author.error_log.pos
  tag author1.error
  <parse>
    @type grok
    <grok>
      pattern %{GREEDYDATA:date} %{TIME} \*%{LOGLEVEL:loglevel}\* %{GREEDYDATA:message}
    </grok>
  </parse>
</source>

<filter author1.error>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/author1/crx-quickstart/logs/request.log
  pos_file /var/log/td-agent/author.request_log.pos
  tag author1.request
  <parse>
    @type none
  </parse>
</source>

<filter author1.request>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<source>
  @type tail
  path /opt/aem/author1/crx-quickstart/logs/stdout.log
  pos_file /var/log/td-agent/author.stdout_log.pos
  tag author1.stdout
  <parse>
    @type none
  </parse>
</source>

<filter author1.stdout>
  @type record_transformer
  <record>
    service_name ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<match **>
  @type newrelic
  license_key [INSERT YOUR NEW RELIC LICENSE KEY HERE]
</match>
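
Once the config is in place, you can syntax-check it and restart the agent, watching td-agent’s own log to make sure the tails start up cleanly (paths assume a stock td-agent install):

td-agent --dry-run -c /etc/td-agent/td-agent.conf
sudo systemctl restart td-agent
tail -f /var/log/td-agent/td-agent.log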


How it Works
Basic searches work exactly as expected in New Relic Logs using the above configuration.

You can facet searches on any fields you have coming through from FluentD. For example, I can easily click and constrain searches by hostname, or by service (i.e. the log file, which in this case is named for the author or publish instance).

Also, if you’ve taken the time to make sure your pre-parsing works with Grok filters as above, you’ll be able to facet your searches on extracted fields, like the loglevel field from the AEM error log:

Or break out all of the request stats from your AEM access logs, and search and create alerts based on clientip, response codes, etc.:

Alerting also works easily: clicking the [+] next to the “QUERY LOGS” button lets you create an alert based on the search you’ve built, and this works great. It basically translates your log search into NRQL (the New Relic Insights query language) and lets you assign it to any of your existing New Relic Alert Policies.
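
As an illustration (this is a hand-written approximation, not a query copied from the UI), a search for errors on the publish instance translates into NRQL along the lines of:

SELECT count(*) FROM Log WHERE service_name = 'publish1.error' AND loglevel = 'ERROR'

where service_name and loglevel are the fields we set up in the FluentD config above.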

That alert can then fire you an email, or alert you in Slack, or Jedi Mind Trick you into logging into the offending server, or whatever you want.

What Doesn’t Work Yet
Logging In-Context
The holy-grail feature of New Relic Logs that I haven’t yet gotten working with AEM is Logging In-Context, which would marry log statements from AEM up on the backend with APM and infrastructure data, so that in one view in New Relic you could see all the metrics and logs corresponding to a given application at a given point in time.

This would be huge. Right now, even though New Relic does aggregate log data and let you search it in Insights right alongside other data, you have to supply the mapping of which servers produce which logs yourself in order to correlate log data with data from your app.

Enabling Logging In-Context (presently a beta feature anyhow) would require annotating AEM logs with data that New Relic can use on the backend to tie everything together.

Better Log Parsing
There’s a lot more I intend to do with the FluentD config above, as there are a few things I WANT to be able to pick out of the logs: things like marrying up the request and response lines from the AEM request log by parsing out the request ID, and properly parsing multi-line stack traces in the error log (there IS Grok support for multiline stack traces; I just haven’t gotten it to work yet – see the sketch below).
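
For what it’s worth, the direction I’ve been experimenting with for stack traces is the multiline support that ships with fluent-plugin-grok-parser. Here’s an untested sketch, assuming AEM’s dd.MM.yyyy timestamp starts every new logical record:

<source>
  @type tail
  path /opt/aem/publish1/crx-quickstart/logs/error.log
  pos_file /var/log/td-agent/aem.error_multiline.pos
  tag publish1.error
  <parse>
    # multiline_grok comes with fluent-plugin-grok-parser; any line that
    # does NOT start with a date gets folded into the previous record
    @type multiline_grok
    grok_pattern %{GREEDYDATA:date} %{TIME} \*%{LOGLEVEL:loglevel}\* %{GREEDYDATA:message}
    multiline_start_regexp /^\d{2}\.\d{2}\.\d{4}/
  </parse>
</source>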

But all told, I’m very hopeful about this product and hope to get a lot of use out of it in the future!

By aem4beginner
