Getting additional data on a slow Hive reducer

We have a query that has been running for almost two hours on the last of 1,009 reducers in the first reduce phase.

We want to gain some additional insight into what it is doing, so the first step is to find the node on which the reducer is running. There are probably more elegant ways to do this, but this works: we guess that it is the node using the most CPU.

[root@cmhlpdlkedat15 hive]# for i in {01..12}; do ssh cmhlpdlkedat${i} sar -s 08:00:00 | grep Average; done
Average:        all     53.15      0.00      1.49      0.03      0.00     45.34
Average:        all      1.68      0.00      1.07      0.05      0.00     97.20
Average:        all      1.60      0.00      0.94      0.05      0.00     97.41
Average:        all      1.68      0.00      1.00      0.05      0.00     97.27
Average:        all      1.33      0.00      0.82      0.04      0.00     97.80
Average:        all      1.66      0.00      0.94      0.05      0.00     97.35
Average:        all      1.79      0.00      0.90      0.05      0.00     97.27
Average:        all      2.89      0.00      1.29      0.06      0.00     95.76
Average:        all      1.82      0.00      1.07      0.05      0.00     97.06
Average:        all      1.82      0.00      0.93      0.71      0.00     96.54
Average:        all      1.48      0.00      0.99      0.04      0.00     97.50
Average:        all      1.56      0.00      0.86      0.04      0.00     97.54
[root@cmhlpdlkedat15 hive]#
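The columns are sar's usual %user, %nice, %system, %iowait, %steal and %idle, and the lines come back in host order, so the first host (cmhlpdlkedat01, at roughly 53% user) is the busy one. A slightly friendlier version of the same sweep, assuming the same hostnames, labels each line so you do not have to count:

# same sar sweep, but prefix each result with the host it came from
for i in {01..12}; do
  echo -n "cmhlpdlkedat${i}: "
  ssh cmhlpdlkedat${i} sar -s 08:00:00 | grep Average
done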

We then log into that server and find the thread using the most CPU, taking two ps samples five seconds apart (a top-based alternative is sketched after the output)…

[root@cmhlpdlkedat01 ~]# ps -eLf | awk '{print $2,$4,$9}' | sort -k +3 | tail -5;sleep 5;ps -eLf | awk '{print $2,$4,$9}' | sort -k +3 | tail -5
1795 1862 00:36:28
13649 13656 00:59:33
1795 1808 01:02:08
3393 3412 01:18:38
PID LWP TIME
1795 1862 00:36:28
13649 13656 00:59:33
1795 1808 01:02:08
3393 3412 01:18:42
PID LWP TIME
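As an aside, most Linux boxes can do the same ranking in one shot with top; a rough equivalent of the two ps samples above, assuming a top that supports batch and threads mode:

# threads mode (-H), batch output (-b), single snapshot (-n 1);
# the default sort is %CPU, so the hottest threads appear first
top -H -b -n 1 | head -15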

Comparing the two samples, thread (LWP) 3412 of PID 3393 is still accumulating CPU time (01:18:38 → 01:18:42 over five seconds of wall clock), so we pull up its full command line…

[root@cmhlpdlkedat01 ~]# ps -eLf | grep 3412
yarn      3393  3380  3412 79   26 06:58 ?        01:18:48 /usr/jdk64/jdk1.7.0_45/bin/java -server -Djava.net.preferIPv4Stack=true -Dhdp.version=2.2.0.0-2041 -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -server -Xmx410m -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/mnt/hdfs/hadoop/yarn/log/application_1428633698169_0025/container_1428633698169_0025_01_000004 -Dtez.root.logger=INFO,CLA -Djava.io.tmpdir=/mnt/hdfs/hadoop/yarn/local/usercache/root/appcache/application_1428633698169_0025/container_1428633698169_0025_01_000004/tmp org.apache.tez.runtime.task.TezChild 172.27.2.57 43015 container_1428633698169_0025_01_000004 application_1428633698169_0025 1
root     27426 26643 27426  0    1 08:37 pts/0    00:00:00 grep 3412
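That command line also hands us the container's log directory (-Dyarn.app.container.log.dir=…), so we could watch the task's own log while we poke at it. The file names under that directory vary with the setup, but syslog is the usual name for a container's main log; this is a sketch we did not actually run here:

# log directory lifted straight from the java command line above;
# check with ls first, then tail the main log
ls /mnt/hdfs/hadoop/yarn/log/application_1428633698169_0025/container_1428633698169_0025_01_000004/
tail -f /mnt/hdfs/hadoop/yarn/log/application_1428633698169_0025/container_1428633698169_0025_01_000004/syslog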

We then trace the system calls made by this thread, looking only at read calls…

[root@cmhlpdlkedat01 ~]# strace -p 3412 -o t.txt -e trace=read
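If we wanted timings as well as the data, strace can annotate each call; these variants are sketches we did not run here:

# add a wall-clock timestamp (-tt) and the time spent in each call (-T),
# to see whether the reads themselves are slow or just numerous
strace -p 3412 -e trace=read -tt -T -o t-timed.txt
# or skip the data entirely and just count syscalls and cumulative time
# (prints a summary table when detached with Ctrl-C)
strace -c -p 3412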

We see that file descriptor 2231 is being read, and we can even see dates in the input (2015-03-24)…

[root@cmhlpdlkedat01 ~]# head -20 t.txt
read(2231, "2015-03-24\0\0\0\27\0\0\0\4\0\0\0\0\0\0\0\17\3\0\1\0\n2"..., 131072) = 131072
read(2232, "\340&\17\350\243\240\371\271\352d\217\7\373\245zLW$z\25\263m\207\313\226u\342\16\\\242\377s"..., 4096) = 4096
read(2231, "\4\0\0\0\0\0\0\0\20\3\2\2\24\f\n2014-12-03\0\0\0\30\0\0\0"..., 131072) = 131072
read(2231, "15-03-11\0\0\0\30\0\0\0\4\0\0\0\0\0\0\0\20\3\2\2\31\242\n20"..., 131072) = 131072

…so we resolve the actual filename behind this descriptor (the socket line below only matches because its inode number happens to contain the string 2231) and tail it; the contents look like rows from the larger of the two tables in the join…

[root@cmhlpdlkedat01 ~]# ls -lrt /proc/3412/fd | grep 2231
lrwx------ 1 yarn hadoop 64 Apr 10 08:32 429 -> socket:[34622312]
lr-x------ 1 yarn hadoop 64 Apr 10 08:47 2231 -> /mnt/hdfs/hadoop/yarn/local/usercache/root/appcache/application_1428633698169_0025/container_1428633698169_0025_01_000004/tmp/hive-rowcontainer7779433895197233164/RowContainer4530514485187724697.tmp
[root@cmhlpdlkedat01 ~]# tail -f /mnt/hdfs/hadoop/yarn/local/usercache/root/appcache/application_1428633698169_0025/container_1428633698169_0025_01_000004/tmp/hive-rowcontainer7779433895197233164/RowContainer4530514485187724697.tmp
2014-05-16T
2014-05-16¦
2014-05-16¦?
2014-03-13¦
2014-03-13
2014-03-13¦
2014-03-13V
2014-03-13
2014-03-13
          q
2014-03-13
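For the record, grepping a directory listing is not the only way to resolve a descriptor; readlink on the /proc symlink, or lsof against the parent PID, gives the same answer:

# resolve one descriptor directly
readlink /proc/3412/fd/2231
# or list every open file for the process (3393 is the parent PID; its threads share the fd table)
lsof -p 3393 | grep 2231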

We then decide to get a thread dump of the JVM. First of all, notice that the process shows up in jps as TezChild, which is a tip-off that Hive code is running on that server. Again, there are other ways, this is just one of them.

[root@cmhlpdlkedat01 yarn]# su - yarn
[yarn@cmhlpdlkedat01 ~]$ jps
16082 NodeManager
663 Jps
3393 TezChild
[yarn@cmhlpdlkedat01 ~]$ printf "%x\n" 3412
d54
[yarn@cmhlpdlkedat01 ~]$ jstack 3393 | awk '{if ($0 ~ "0xd54") {i = 1;print $0} else if (i == 1 && $0 != "") {print $0} else if (i == 1 && $0 == "") {exit}}'
"TezChild" daemon prio=10 tid=0x00007f5e886c8000 nid=0xd54 runnable [0x00007f5e7d17c000]
   java.lang.Thread.State: RUNNABLE
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:180)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
        at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:2245)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2218)
        - locked <0x00000000f68a80f0> (a org.apache.hadoop.io.SequenceFile$Reader)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:109)
        - locked <0x00000000f68a80c8> (a org.apache.hadoop.mapred.SequenceFileRecordReader)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:84)
        - locked <0x00000000f68a80c8> (a org.apache.hadoop.mapred.SequenceFileRecordReader)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.nextBlock(RowContainer.java:360)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.next(RowContainer.java:267)
        at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.next(RowContainer.java:74)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:654)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:750)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:248)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:213)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processOp(CommonMergeJoinOperator.java:196)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:258)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
[yarn@cmhlpdlkedat01 ~]$
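The awk filter can be condensed if you do not mind printing a fixed number of context lines instead of stopping at the blank line; a rough equivalent, converting the LWP to hex inline:

# print the matching thread header plus the next 40 lines of its stack
jstack 3393 | grep -A 40 "nid=0x$(printf '%x' 3412)"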

If we run the thread dump again, we see a different call stack, with some very telling class and method names (UDFSubstr.evaluate, Text.encode, GroupByOperator.processOp)…

[yarn@cmhlpdlkedat01 ~]$ jstack 3393 | awk '{if ($0 ~ "0xd54") {i = 1;print $0} else if (i == 1 && $0 != "") {print $0} else if (i == 1 && $0 == "") {exit}}'
"TezChild" daemon prio=10 tid=0x00007f5e886c8000 nid=0xd54 runnable [0x00007f5e7d17b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.cs.UTF_8.updatePositions(UTF_8.java:78)
        at sun.nio.cs.UTF_8$Encoder.encodeArrayLoop(UTF_8.java:564)
        at sun.nio.cs.UTF_8$Encoder.encodeLoop(UTF_8.java:619)
        at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:561)
        at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:783)
        at org.apache.hadoop.io.Text.encode(Text.java:450)
        at org.apache.hadoop.io.Text.set(Text.java:198)
        at org.apache.hadoop.hive.ql.udf.UDFSubstr.evaluate(UDFSubstr.java:78)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1235)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
        at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:784)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:651)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:654)
        at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:750)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:248)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:213)
        at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processOp(CommonMergeJoinOperator.java:196)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:258)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
[yarn@cmhlpdlkedat01 ~]$
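Two snapshots already point at where the time is going; to make it a bit less anecdotal, we could take a handful of dumps in a loop and count the frames that show up most often, a crude sampling profiler. The sample count and output file name here are arbitrary, and we did not run this as part of the session above:

# take 20 stack samples of the hot thread, two seconds apart
NID=0x$(printf '%x' 3412)
for n in $(seq 1 20); do
  jstack 3393 | grep -A 15 "nid=$NID"
  sleep 2
done > samples.txt
# count the most frequent frames across the samples
grep -E '^[[:space:]]+at ' samples.txt | sort | uniq -c | sort -rn | head -15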

Not an answer, but it is additional info: the hot thread is spending its time reading spilled join rows back from a local RowContainer temp file and evaluating substr() inside a group-by over the joined rows, which at least tells us where to look next.
