{"id":5814,"date":"2017-12-12T19:36:19","date_gmt":"2017-12-13T00:36:19","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=5814"},"modified":"2017-12-12T19:36:19","modified_gmt":"2017-12-13T00:36:19","slug":"flexibly-instrumenting-java-with-cpu-utilization-recordings","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2017\/12\/12\/flexibly-instrumenting-java-with-cpu-utilization-recordings\/","title":{"rendered":"Flexibly instrumenting java with CPU utilization recordings"},"content":{"rendered":"<p>Each of these has their own set of strengths, and is appropriate for differing and sometimes similar requirements.<\/p>\n<p>What each of them lack is the ability to flexibly, easily, and succinctly report on CPU utilization at not only an individual thread level, but also at different application code entry points as defined by the organization.<\/p>\n<p>What we are proposing is the use of a custom operating system dependent library that makes system calls on demand to fetch CPU counters for the current thread.  These counters can then be persistently written to disk for later, or even current analysis.<\/p>\n<p>As such, our requirements for this are as follows:<\/p>\n<p>1.\tVery little, if any, impact on the running system.  We want to avoid the Heisenberg principle in which that which is being monitored is affected by the monitor itself<br \/>\n2.\tFlexible entry points that can be added with very little code<br \/>\n3.\tAbility to easily enable and disable recording<br \/>\n4.\tAbility to easily report on the output<\/p>\n<p>There are two ways we can obtain CPU timings from within a running java process.<\/p>\n<p>JNI<\/p>\n<p>We begin by providing a definition of the Java Native Interface, or JNI as it is commonly known.  This interface allows platform specific libraries to be used by java.  These are normally used to use functionality available in java.  In this case, we can make a system call on Linux using the getrusage() function.  JNI provides a mapping interface to a C code shared library that exposes the return values from this call that we can then use in our java process.<\/p>\n<p>Management Bean<\/p>\n<p>Since Java 1.5.0, the management bean interface has exposed various thread statistics, including CPU time used.  It does not expose memory statistics that Linux does provide, and would be accessible through a C code shared library described in the JNI section of this document.<\/p>\n<p>Since the management bean is provided \u201cout of the box\u201d, it is less complex to implement.<\/p>\n<p>To get started, EXPRESS will use the management bean.  We have CPU utilization issues more often than we have memory utilization issues, so we can forego the loss of memory metrics in exchange for less complexity.<\/p>\n<p>We start with a generic component that can be called from any other ATG component such as a droplet, form handler, or other backend generic service component.  The source code for this component is shown below.<\/p>\n<pre>\r\npackage com.express.util;\r\nimport atg.nucleus.*;\r\nimport java.lang.management.*;\r\n\r\npublic class ExpressCpuMonitor extends GenericService {\r\n  public void writeIt(String operation, String component) {\r\n    try {\r\n      ThreadMXBean t = ManagementFactory.getThreadMXBean();\r\n      logInfo(operation + \",\" + component + \",\"\r\n                        + t.getCurrentThreadUserTime()\/1000000 + \",\"\r\n                        + (t.getCurrentThreadCpuTime() - t.getCurrentThreadUserTime()) \/ 1000000);\r\n    }\r\n    catch (Exception e) {\r\n      e.printStackTrace();\r\n    }\r\n  }\r\n}\r\n<\/pre>\n<p>To ensure we can access the component from a web page, we also provide a droplet as shown below.<\/p>\n<pre>\r\npackage com.express.util;\r\nimport java.io.*;\r\nimport javax.servlet.*;\r\nimport atg.servlet.*;\r\nimport atg.nucleus.*;\r\n\r\npublic class ExpressCpuMonitorDroplet extends DynamoServlet {\r\n\r\n  ExpressCpuMonitor expressCpuMonitor;\r\n\r\n  public ExpressCpuMonitor getExpressCpuMonitor() {\r\n    return this.expressCpuMonitor;\r\n  }\r\n\r\n  public void setExpressCpuMonitor(ExpressCpuMonitor expressCpuMonitor) {\r\n    this.expressCpuMonitor = expressCpuMonitor;\r\n  }\r\n\r\n  public void service(DynamoHttpServletRequest request, \r\n                      DynamoHttpServletResponse response) \r\n    throws ServletException, IOException {\r\n\r\n    try {\r\n      expressCpuMonitor.writeIt(request.getParameter(\"operation\"),\r\n                                request.getParameter(\"component\"));\r\n    }\r\n    catch (Exception e) {\r\n      e.printStackTrace();\r\n    }\r\n  }\r\n}\r\n<\/pre>\n<p>Lastly, we show a sample web page with a call to our droplet.<\/p>\n<pre>\r\n<dsp:importbean bean=\"\/com\/express\/util\/ExpressCpuMonitorDroplet\"\/>\r\n<dsp:page>\r\n\r\n<dsp:droplet name=\"ExpressCpuMonitorDroplet\">\r\n  <dsp:param name=\"operation\" value=\"start\"\/>\r\n  <dsp:param name=\"component\" value=\"sleepingDropletTest\"\/>\r\n<\/dsp:droplet>\r\n<%\r\ntry {\r\n  Thread.sleep(5000);\r\n}\r\ncatch (Exception e) {\r\n  \/\/ignore\r\n}\r\n%>\r\n<dsp:droplet name=\"ExpressCpuMonitorDroplet\">\r\n  <dsp:param name=\"operation\" value=\"end\"\/>\r\n  <dsp:param name=\"component\" value=\"sleepingDropletTest\"\/>\r\n<\/dsp:droplet>\r\n\r\n<\/dsp:page>\r\n<\/pre>\n<p>When we request the web page above, we see the following output in the JBOSS server log.<\/p>\n<pre>\r\n2013-06-06 21:35:21,572 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-7) start,sleepingDropletTest,220,10\r\n2013-06-06 21:35:26,574 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-7) end,sleepingDropletTest,220,10\r\n<\/pre>\n<p>Basket.jsp<\/p>\n<p>Recent analysis has shown that basket.jsp scales the most poorly when compared with other pages.  By scaling poorly, we mean that the more we do, the slower each request is on average.  A system that scales well with have consistent response time when the workload increases.<\/p>\n<p>We placed our droplet instrumentation calls in the basket.jsp, and found the following output was generated.<\/p>\n<pre>\r\n2013-06-06 22:00:23,242 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-1) start,basketTest,6450,710\r\n2013-06-06 22:00:25,075 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-1) end,basketTest,7150,830\r\n2013-06-06 22:00:34,636 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-2) start,basketTest,3610,290\r\n2013-06-06 22:00:35,083 INFO  [nucleusNamespace.com.express.util.ExpressCpuMonitor] (http-172.28.39.168-10180-2) end,basketTest,3710,300\r\n<\/pre>\n<p>In the above output, we see that we spent 700 milliseconds (7150 \u2013 6450) on our first request, and 100 milliseconds on the second request.  We also see that we spent 80 milliseconds on system CPU time in the first execution, and 10 milliseconds in the second execution.  For clarity, system CPU time is incremented by the Linux kernel.  For example, a read() ,  memset(), or futex()  function invocation are all system calls to which system CPU time would be attributed.  User CPU time is attributed to things such as while and for loops, variable declaration, etc.<\/p>\n<p>Reporting<\/p>\n<p>Since we are proposing that we simply write the output to the standard server.log, we can use simple command line reporting to view the data.  An example is shown below.  Based on the operation passed at the command line to the script, a total will be printed for each thread that calls the given component.<\/p>\n<pre>\r\n#!\/bin\/awk -f\r\nBEGIN {\r\n  COMPONENT=ARGV[2]\r\n  ARGC=2\r\n}\r\n\r\n{\r\n  if ($4 == \"[nucleusNamespace.com.express.util.ExpressCpuMonitor]\" && $6 ~ COMPONENT) {\r\n    split($6,t,\",\")\r\n    if (t[1] == \"start\" && t[2] ~ COMPONENT) {\r\n      vals[$5]=t[3]\r\n    }\r\n    else if (t[1] == \"end\" && t[2] ~ COMPONENT) {\r\n      totals[$5]+=(t[3]-vals[$5])\r\n    }\r\n  }\r\n}\r\nEND {\r\nfor (i in totals) {\r\n  print i,totals[i]}\r\n}\r\n<\/pre>\n<p>What is listed below will print the total CPU used by each component.<\/p>\n<pre>\r\n#!\/bin\/awk -f\r\nBEGIN {\r\n  COMPONENT=ARGV[2]\r\n  ARGC=2\r\n}\r\n\r\n{\r\n  if ($4 == \"[nucleusNamespace.com.express.util.ExpressCpuMonitor]\") {\r\n    split($6,t,\",\")\r\n    if (t[1] == \"start\") {\r\n      vals[t[2]]=t[3]\r\n    }\r\n    else if (t[1] == \"end\") {\r\n      totals[t[2]]+=(t[3]-vals[$5])\r\n    }\r\n  }\r\n}\r\nEND {\r\nfor (i in totals) {\r\n  print i,totals[i]\/1000}\r\n}\r\n<\/pre>\n<p>APPENDIX A<\/p>\n<p>The full blown JNI implementation is provided below.<\/p>\n<p>1.\tDownload the open source C library from https:\/\/code.google.com\/p\/jnithreadcpuusage<br \/>\n2.\tFollow the instructions in the README.  For the configure command, change the following line:<br \/>\n.\/configure \u2013prefix=\/opt\/atg\/ATG-Data<br \/>\n3.\tPlace the generated threadCPUUsage.jar file in \/var\/lib\/jbossas\/lib<br \/>\n4.\tChange the ExpressCpuMonitor class to what is below<\/p>\n<pre>\r\npackage com.express.util;\r\nimport atg.nucleus.*;\r\nimport net.appjax.jni.*;\r\n\r\npublic class ExpressCpuMonitor extends GenericService {\r\n\r\n  public void writeIt(String operation, String component) {\r\n    try {\r\n      ThreadCPUUsage cpuUsage = ThreadCPUUsage.getCPUUsage();\r\n      logInfo(operation + \",\" + component + \",\" + \r\n              cpuUsage.getTotalUserCPUTime() + \",\" +\r\n              cpuUsage.getTotalSystemCPUTime());\r\n    }\r\n    catch (Exception e) {\r\n      e.printStackTrace();\r\n    }\r\n  }\r\n}\r\n<\/pre>\n<p>5.\tAdd  the following line  to  \/var\/lib\/jbossas\/bin\/run.sh<\/p>\n<pre>\r\nJAVA_OPTS=\"$JAVA_OPTS -Djava.library.path=\/opt\/atg\/ATG-Data\/lib\"\r\n\r\n# Display our environment\r\necho \"=========================================================================\"\r\necho \"\"\r\necho \"  JBoss Bootstrap Environment\"\r\necho \"\"\r\necho \"  JBOSS_HOME: $JBOSS_HOME\"\r\necho \"\"\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Each of these has their own set of strengths, and is appropriate for differing and sometimes similar requirements. What each of them lack is the ability to flexibly, easily, and succinctly report on CPU utilization at not only an individual&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"http:\/\/appcrawler.com\/wordpress\/2017\/12\/12\/flexibly-instrumenting-java-with-cpu-utilization-recordings\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"footnotes":""},"categories":[25,77],"tags":[],"_links":{"self":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/5814"}],"collection":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/comments?post=5814"}],"version-history":[{"count":4,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/5814\/revisions"}],"predecessor-version":[{"id":5819,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/5814\/revisions\/5819"}],"wp:attachment":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/media?parent=5814"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/categories?post=5814"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/tags?post=5814"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}