January 6, 2021

Issue
The log files contain the error, "Too many open files," and AEM doesn't respond.

Cause
The cause is one of two possibilities:
  1. The application isn't closing resources such as files or sockets after using them.
  2. The application requires more open files than the process is permitted to open.
Solution
To solve this problem:
  1. Find out what is causing the open file limit to be reached.
  2. Either increase the limit or fix the application bugs.
Find which files or sockets are left open
Note: open file limits apply to the total combined open files, pipes, and sockets, not just files.

On the Linux platform, the lsof command can be used to debug which resources are held open by the process.
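
As a quick first check, you can compare the process's current file descriptor count against the limits that apply to it via /proc (a sketch; 12345 is a hypothetical PID, substitute the actual AEM process ID):

# Count the file descriptors currently open by the process
ls /proc/12345/fd | wc -l

# Show the open file limits that apply to that process
grep "Max open files" /proc/12345/limits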

Here is a sample script to collect useful lsof output:
#!/bin/bash
# Collect lsof output for a Java process and group the open files by path.
if [ $# -eq 0 ]; then
    echo "No PID specified"
    echo "Run the command with a PID, for example:"
    echo "  lsof-script.sh 12345"
    exit 2
fi

JAVA_PROCESS_PID=$1

# Capture the raw lsof output for the process
lsof -p "$JAVA_PROCESS_PID" > "lsof-output-$JAVA_PROCESS_PID.txt"

echo "Files open by the process:"
wc -l < "lsof-output-$JAVA_PROCESS_PID.txt"

echo "Generated output file with counts of grouped open files: lsof-sorted-counts-$JAVA_PROCESS_PID.txt"
# Group paths under common AEM directories (segmentstore, repository/index,
# felix/bundle, lib, logs, ext) and sort the groups by count, highest first.
awk '{print $9}' "lsof-output-$JAVA_PROCESS_PID.txt" | sed -e "s/\(^.*\)\(segmentstore\).*$/\1\2/" -e "s/\(^.*\)\(repository[/]index\).*$/\1\2/" -e "s/\(^.*\)\(felix[/]bundle\).*$/\1\2/" -e "s/\(^.*\)\([/]lib\).*$/\1\2/" -e "s/\(^.*\)\([/]logs\).*$/\1\2/" -e "s/\(^.*\)\([/]ext\).*$/\1\2/" | sort | uniq -c | sort -rn -k1 > "lsof-sorted-counts-$JAVA_PROCESS_PID.txt"

echo "Total open files in OS:"
lsof | wc -l


Sample output:
$> ./lsof-script.sh 18070
Files open by the process:
1995
Generated output file with counts of grouped open files: lsof-sorted-counts-18070.txt
Total open files in OS:
18399

Inspect the output of the generated lsof-sorted-counts-*.txt file. It shows which files or sockets are currently held open by the process.

If you find open sockets or files listed that shouldn't still be open, then it is likely due to an application bug. Update your application code to close files and sockets after using them.

A common cause of lingering open sockets is custom code that makes web service calls. In many cases, a library such as Apache Commons HttpClient is used, but the connections are never closed by the developers. See the Apache Commons HttpClient documentation for details on releasing connections.
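
As a rough check for leaked sockets, you can count connections stuck in CLOSE_WAIT in the lsof output collected above (a sketch, using the output file from the sample run):

# Sockets in CLOSE_WAIT are typically connections the remote side has
# closed but the application has not; a large count suggests a leak.
grep -c CLOSE_WAIT lsof-output-18070.txt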

Increase the limit for the shell session
Check the user's limits for maximum open files by running the following as the same user that the AEM process runs as:

ulimit -Sn   # soft limit
ulimit -Hn   # hard limit

When using the default start script of AEM/CQ, do the following to increase the limit:
1. Open crx-quickstart/bin/start for editing
2. Add the variable CQ_MAX_OPEN_FILES to the top of the script:
CQ_MAX_OPEN_FILES=8192
export CQ_MAX_OPEN_FILES
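
After restarting AEM, you can confirm that the new limit was applied to the running process (a sketch; replace 12345 with the actual AEM process ID):

grep "Max open files" /proc/12345/limits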

If you see the error "-bash: ulimit: open files: cannot modify limit: Operation not permitted" when starting AEM, then the configuration above doesn't work.

Instead, it's necessary to increase the limit in /etc/security/limits.conf. See below for details on how to reconfigure the user limit.

If you are using a third-party application server such as JBoss or WebSphere, follow the sections below and verify with the vendor's documentation.

Note:
If none of the configurations in this article solve the problem, then see which files are open using the command lsof -p <pid>, where <pid> is the process ID of the problematic process. It could be that your application is leaving file handles open. If the handles are being held mostly by AEM and not your application, then contact support.

Increase the user's limits
To change the maximum number of open files for non-root users, edit the file /etc/security/limits.conf. You can set the limit on a per-user basis:

crx_process_username soft nofile 8092
crx_process_username hard nofile 20000


Note:
This configuration doesn't take effect until the next time the user logs in.
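
Once the user has logged in again, you can verify that the new limits are in effect (a sketch; crx_process_username is the placeholder user from the example above):

su - crx_process_username -c 'ulimit -Sn; ulimit -Hn'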

Increase the system limit
Sometimes, the user limit is high enough, but the system itself has hit its maximum number of open files. Run the following as the root user:

1. Check the max open file setting on the operating system (if it is below 20000, then it makes sense to increase this setting).
cat /proc/sys/fs/file-max

2. Add this line to /etc/sysctl.conf to increase the system open file maximum value:
fs.file-max = 300000

3. Run this command:
sysctl -p

4. Verify that the new value is displayed when you run this command:
cat /proc/sys/fs/file-max
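
You can also compare the current system-wide usage against the maximum; /proc/sys/fs/file-nr reports the number of allocated file handles, the number of free handles, and the maximum:

cat /proc/sys/fs/file-nr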

Note:
Adding the line to /etc/sysctl.conf makes the change persistent across reboots; running sysctl -p applies it immediately, so no re-login is required.


By aem4beginner
