We had an issue in which the secondary namenode was not checkpointing the edits file with updates from the primary namenode. In case you are unaware, the secondary namenode (not in an HA configuration) regularly transfers the edits file from…
Category: Hadoop
How often does YARN purge locally cached resources?

If you are stretched for space locally, this may come in handy. By default, localized resources are purged every ten minutes…

[root@cmhlddlkedat01 2.2.4.2-2]# grep -A1 yarn.nodemanager.localizer.cache.cleanup.interval-ms yarn-default.xml
yarn.nodemanager.localizer.cache.cleanup.interval-ms
600000
[root@cmhlddlkedat01 2.2.4.2-2]#

Of course, this can be overridden.
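As a rough sketch of what such an override might look like: values from yarn-default.xml are normally overridden in yarn-site.xml on each NodeManager host. The 300000 ms (five minute) value below is purely illustrative and is not taken from the original post. The NodeManagers would typically need a restart to pick up the change.

```xml
<!-- yarn-site.xml: hypothetical override of the cache cleanup interval -->
<property>
  <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
  <!-- illustrative value: 300000 ms = 5 minutes (the default is 600000 ms) -->
  <value>300000</value>
</property>
```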
Phoenix to a secure HBase cluster
This is just a simple example of using a custom JDBC class to connect to an HBase cluster that is secured by Kerberos. import java.sql.*; import java.util.*; public class phoenixTest { public static void main(String args[]) throws Exception { Connection…
Examples of connecting to kerberos hive in JDBC
We needed to authenticate user requests against AD in a Kerberos-enabled cluster, and to allow “local” Hive sessions to use only a keytab. Below are examples of each. First, we show how to connect over a binary…
Are files in HDFS immutable?
Call me cynical; I am just a bit of a doubting Thomas. Using our previous write test code, we simply run the exact same test twice.

[root@cmhlpdlkedat15 ~]# hadoop HDFSWriteTest foobar.txt hdfs://cmhlpdlkedat14.expressco.com:8020
[root@cmhlpdlkedat15 ~]# hdfs dfs…
Getting additional data on a slow hive reducer
We have a query that has been running for almost two hours on the last of 1,009 reducers in the first reduce phase. We wanted to gain additional insight into what it was doing, so we first find the node…
Hadoop HA namenode example
In earlier versions of Hadoop, the namenode was the Achilles heel. While there was the option of failing over to a secondary namenode, this required manual intervention, or heavy scripting at best. Even then, failover wasn’t instantaneous. With Hadoop 0.23…
Cloudera Impala simple command line test
I am a big believer in simple command line examples. Connecting the dots is so much easier when you do this. There will be much more on this, but this should get you started. CLASSPATH is shown below. The absolute…
Querying Hadoop from Tomcat
Below is a very simple example of how to print, from Tomcat, the contents of a file stored in HDFS. I am not entirely sure where this would be useful, for the following reasons: * Unless you have a list…
Pig script to group URL requests in JBOSS
As we move towards an enterprise data analytics platform, I take every opportunity I can to come up with simple jobs in Hadoop, Hive, and Pig. Below is one I ran in Pig that groups the top 50 URL requests…