Flexibly instrumenting java with CPU utilization recordings

Each of these has their own set of strengths, and is appropriate for differing and sometimes similar requirements.

What each of them lack is the ability to flexibly, easily, and succinctly report on CPU utilization at not only an individual thread level, but also at different application code entry points as defined by the organization.

What we are proposing is the use of a custom operating system dependent library that makes system calls on demand to fetch CPU counters for the current thread. These counters can then be persistently written to disk for later, or even current analysis.

As such, our requirements for this are as follows:

1. Very little, if any, impact on the running system. We want to avoid the Heisenberg principle in which that which is being monitored is affected by the monitor itself
2. Flexible entry points that can be added with very little code
3. Ability to easily enable and disable recording
4. Ability to easily report on the output

There are two ways we can obtain CPU timings from within a running java process.

JNI

We begin by providing a definition of the Java Native Interface, or JNI as it is commonly known. This interface allows platform specific libraries to be used by java. These are normally used to use functionality available in java. In this case, we can make a system call on Linux using the getrusage() function. JNI provides a mapping interface to a C code shared library that exposes the return values from this call that we can then use in our java process.

Management Bean

Since Java 1.5.0, the management bean interface has exposed various thread statistics, including CPU time used. It does not expose memory statistics that Linux does provide, and would be accessible through a C code shared library described in the JNI section of this document.

Since the management bean is provided “out of the box”, it is less complex to implement.

To get started, EXPRESS will use the management bean. We have CPU utilization issues more often than we have memory utilization issues, so we can forego the loss of memory metrics in exchange for less complexity.

We start with a generic component that can be called from any other ATG component such as a droplet, form handler, or other backend generic service component. The source code for this component is shown below.

package com.express.util;
import atg.nucleus.*;
import java.lang.management.*;

public class ExpressCpuMonitor extends GenericService {
  public void writeIt(String operation, String component) {
    try {
      ThreadMXBean t = ManagementFactory.getThreadMXBean();
      logInfo(operation + "," + component + ","
                        + t.getCurrentThreadUserTime()/1000000 + ","
                        + (t.getCurrentThreadCpuTime() - t.getCurrentThreadUserTime()) / 1000000);
    }
    catch (Exception e) {
      e.printStackTrace();
    }
  }
}

To ensure we can access the component from a web page, we also provide a droplet as shown below.

package com.express.util;
import java.io.*;
import javax.servlet.*;
import atg.servlet.*;
import atg.nucleus.*;

public class ExpressCpuMonitorDroplet extends DynamoServlet {

  ExpressCpuMonitor expressCpuMonitor;

  public ExpressCpuMonitor getExpressCpuMonitor() {
    return this.expressCpuMonitor;
  }

  public void setExpressCpuMonitor(ExpressCpuMonitor expressCpuMonitor) {
    this.expressCpuMonitor = expressCpuMonitor;
  }

  public void service(DynamoHttpServletRequest request, 
                      DynamoHttpServletResponse response) 
    throws ServletException, IOException {

    try {
      expressCpuMonitor.writeIt(request.getParameter("operation"),
                                request.getParameter("component"));
    }
    catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Lastly, we show a sample web page with a call to our droplet.





  
  

<%
try {
  Thread.sleep(5000);
}
catch (Exception e) {
  //ignore
}
%>

  
  



When we request the web page above, we see the following output in the JBOSS server log.

2013-06-06 21:35:21,572 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-7) start,sleepingDropletTest,220,10
2013-06-06 21:35:26,574 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-7) end,sleepingDropletTest,220,10

Basket.jsp

Recent analysis has shown that basket.jsp scales the most poorly when compared with other pages. By scaling poorly, we mean that the more we do, the slower each request is on average. A system that scales well with have consistent response time when the workload increases.

We placed our droplet instrumentation calls in the basket.jsp, and found the following output was generated.

2013-06-06 22:00:23,242 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-1) start,basketTest,6450,710
2013-06-06 22:00:25,075 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-1) end,basketTest,7150,830
2013-06-06 22:00:34,636 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-2) start,basketTest,3610,290
2013-06-06 22:00:35,083 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-2) end,basketTest,3710,300

In the above output, we see that we spent 700 milliseconds (7150 – 6450) on our first request, and 100 milliseconds on the second request. We also see that we spent 80 milliseconds on system CPU time in the first execution, and 10 milliseconds in the second execution. For clarity, system CPU time is incremented by the Linux kernel. For example, a read() , memset(), or futex() function invocation are all system calls to which system CPU time would be attributed. User CPU time is attributed to things such as while and for loops, variable declaration, etc.

Reporting

Since we are proposing that we simply write the output to the standard server.log, we can use simple command line reporting to view the data. An example is shown below. Based on the operation passed at the command line to the script, a total will be printed for each thread that calls the given component.

#!/bin/awk -f
BEGIN {
  COMPONENT=ARGV[2]
  ARGC=2
}

{
  if ($4 == "[nucleusNamespace.com.express.util.ExpressCpuMonitor]" && $6 ~ COMPONENT) {
    split($6,t,",")
    if (t[1] == "start" && t[2] ~ COMPONENT) {
      vals[$5]=t[3]
    }
    else if (t[1] == "end" && t[2] ~ COMPONENT) {
      totals[$5]+=(t[3]-vals[$5])
    }
  }
}
END {
for (i in totals) {
  print i,totals[i]}
}

What is listed below will print the total CPU used by each component.

#!/bin/awk -f
BEGIN {
  COMPONENT=ARGV[2]
  ARGC=2
}

{
  if ($4 == "[nucleusNamespace.com.express.util.ExpressCpuMonitor]") {
    split($6,t,",")
    if (t[1] == "start") {
      vals[t[2]]=t[3]
    }
    else if (t[1] == "end") {
      totals[t[2]]+=(t[3]-vals[$5])
    }
  }
}
END {
for (i in totals) {
  print i,totals[i]/1000}
}

APPENDIX A

The full blown JNI implementation is provided below.

1. Download the open source C library from https://code.google.com/p/jnithreadcpuusage
2. Follow the instructions in the README. For the configure command, change the following line:
./configure –prefix=/opt/atg/ATG-Data
3. Place the generated threadCPUUsage.jar file in /var/lib/jbossas/lib
4. Change the ExpressCpuMonitor class to what is below

package com.express.util;
import atg.nucleus.*;
import net.appjax.jni.*;

public class ExpressCpuMonitor extends GenericService {

  public void writeIt(String operation, String component) {
    try {
      ThreadCPUUsage cpuUsage = ThreadCPUUsage.getCPUUsage();
      logInfo(operation + "," + component + "," + 
              cpuUsage.getTotalUserCPUTime() + "," +
              cpuUsage.getTotalSystemCPUTime());
    }
    catch (Exception e) {
      e.printStackTrace();
    }
  }
}

5. Add the following line to /var/lib/jbossas/bin/run.sh

JAVA_OPTS="$JAVA_OPTS -Djava.library.path=/opt/atg/ATG-Data/lib"

# Display our environment
echo "========================================================================="
echo ""
echo "  JBoss Bootstrap Environment"
echo ""
echo "  JBOSS_HOME: $JBOSS_HOME"
echo ""

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.