We had a server that had been behaving oddly since at least April. I finally discovered why.
The first symptom is that jps returns nothing. Running it under strace shows that it can’t create its hsperfdata directory under /tmp (our setting for java.io.tmpdir), so we look at the permissions…
-bash-4.1$ strace -f jps 2>&1 | grep ACC
[pid 16055] lstat("/var/lib/samba/winbindd_privileged/pipe", 0x7f3e9da9e170) = -1 EACCES (Permission denied)
[pid 16055] mkdir("/tmp/hsperfdata_sa-jboss", 0755) = -1 EACCES (Permission denied)
-sh-4.1$ uname -n
lck01.ecomm.local
-sh-4.1$ ls -lart /tmp | awk '$NF == "."'
drwxr-xr-x. 9 root root 4096 Aug 12 14:32 .
…on lck02, by contrast, /tmp is world-writable and has the sticky bit set…
-sh-4.1$ uname -n
lck02.ecomm.local
-sh-4.1$ ls -lart /tmp | awk '$NF == "."'
drwxrwxrwt. 8 root root 4096 Aug 12 14:33 .
…so we mimic these on lck01…
root@lck01 Tue Aug 12 14:33:30 EDT 2014 [~] # chmod 777 /tmp
root@lck01 Tue Aug 12 14:33:30 EDT 2014 [~] # chmod o+t /tmp
-bash-4.1$ jps
16129 Jps
Before the fix we had stuck threads, among other problems. With the permissions corrected, the whole application works again and the stuck threads are gone.
-bash-4.1$ jps
16370 Main
16805 Jps
-bash-4.1$ jstack 16370 | grep State: | sort | uniq -c
35 java.lang.Thread.State: RUNNABLE
14 java.lang.Thread.State: TIMED_WAITING (on object monitor)
5 java.lang.Thread.State: TIMED_WAITING (parking)
8 java.lang.Thread.State: TIMED_WAITING (sleeping)
46 java.lang.Thread.State: WAITING (on object monitor)
83 java.lang.Thread.State: WAITING (parking)
-bash-4.1$
The moral of the story: if a Java application is behaving in ways you can't explain, check the permissions on /tmp.
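That check is easy to script. Here is a minimal sketch, not something we ran at the time: check_tmp_mode is a made-up helper name, and it assumes GNU coreutils stat (the -c option), which is what a Linux box like ours would have. It reports whether a directory has mode 1777, i.e. world-writable with the sticky bit, the same state lck02 showed as drwxrwxrwt.

```shell
#!/bin/sh
# check_tmp_mode is a hypothetical helper, not part of any standard tool.
# Assumes GNU coreutils stat; BSD/macOS stat uses different options.
check_tmp_mode() {
    dir="${1:-/tmp}"                         # default to /tmp when no argument is given
    mode=$(stat -c '%a' "$dir") || return 2  # octal mode, e.g. 1777 or 755
    if [ "$mode" = "1777" ]; then
        echo "OK: $dir has mode $mode"
    else
        echo "WARN: $dir has mode $mode, expected 1777"
        return 1
    fi
}
```

Calling check_tmp_mode with no argument checks /tmp; the non-zero return on a mismatch makes it usable from a cron job or monitoring check. If it warns, chmod 1777 /tmp as root sets the same bits as the two chmod commands above in one step.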