{"id":4618,"date":"2015-01-09T13:52:01","date_gmt":"2015-01-09T18:52:01","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=4618"},"modified":"2015-01-12T12:06:01","modified_gmt":"2015-01-12T17:06:01","slug":"hadoop-ha-namenode-example","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2015\/01\/09\/hadoop-ha-namenode-example\/","title":{"rendered":"Hadoop HA namenode example"},"content":{"rendered":"<p>In earlier versions of hadoop, the namenode was the Achilles heel.  While there was the option of failing over to a secondary namenode, this required manual intervention, or heavy scripting at best.  Even then, failover wasn&#8217;t instantaneous.<\/p>\n<p>With Hadoop 0.23 and later releases, a primary and standby namenode configuration was introduced.  This required very little out of the box configuration, with tools such as ambari providing a next\/next\/next click setup.<\/p>\n<p>This post simply shows how it works, with an example that shows each node in the configuration.  We then start a job that will require access to the primary namenode, and terminate its process.  We see that our job runs to completion.<\/p>\n<p>First, we show our primary namenode..<\/p>\n<p><img alt='' class='alignnone size-full wp-image-4619 ' src='http:\/\/appcrawler.com\/wordpress\/wp-content\/uploads\/2015\/01\/img_54b01a1f76d8e.png' \/><\/p>\n<p>&#8230;and then, our secondary namenode&#8230;<\/p>\n<p><img alt='' class='alignnone size-full wp-image-4620 ' src='http:\/\/appcrawler.com\/wordpress\/wp-content\/uploads\/2015\/01\/img_54b01ac084e02.png' \/><\/p>\n<p>We then compile our simple class shown below&#8230;<\/p>\n<pre lang=\"java\">\r\nimport java.io.*;\r\nimport java.util.*;\r\nimport org.apache.hadoop.conf.*;\r\nimport org.apache.hadoop.fs.*;\r\n\r\npublic class HDFSFileBlocks {\r\n  public static void main(String[] args) throws IOException {\r\n    Configuration conf = new Configuration();\r\n\r\n    FileSystem fs = FileSystem.get(conf);\r\n    Path file = new Path(args[0]);\r\n    FileStatus fileStatus = fs.getFileStatus(file);\r\n    BlockLocation[] blocks = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());\r\n    for (int i = 0; i < blocks.length; i++) {\r\n      System.out.println(blocks[i].toString());\r\n    }\r\n    BufferedReader br=new BufferedReader(new InputStreamReader(fs.open(file)));\r\n    String line=br.readLine();\r\n    while (line != null){\r\n      System.out.println(line);\r\n      line=br.readLine();\r\n    }\r\n    fs.close();\r\n  }\r\n}\r\n<\/pre>\n<p>...and run the job, during which time we terminate the primary namenode on  node 14...<\/p>\n<pre>\r\n[root@cmhl******07 ~]# export CLASSPATH=$(yarn classpath)\r\n[root@cmhl******07 ~]# javac HDFSFileBlocks.java\r\n[root@cmhl******07 ~]# cat manifest.txt\r\nManifest-Version: 1.0\r\nMain-Class: HDFSFileBlocks\r\n\r\n[root@cmhl******07 ~]# jar cfm myjar.jar manifest.txt HDFSFileBlocks.class\r\n[root@cmhl******07 ~]# hadoop jar myjar.jar \/tmp\/splunk__106646.esw3c_S.201311290900-1000-0 > \/dev\/null & ssh cmhl*******14 \"kill -9 \\$(ps -ef | grep java | grep -v grep | egrep NameNode\\$ | awk '{print \\$2}')\"[4] 13071[3]\u00a0\u00a0 Done\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 hadoop jar myjar.jar \/tmp\/splunk__106646.esw3c_S.201311290900-1000-0 > \/dev\/null\r\n[root@cmhl******07 ~]#\r\n<\/pre>\n<p>We see that our standby namenode is stopped...<\/p>\n<p><img alt='' class='alignnone size-full wp-image-4622 ' 
We see that our standby NameNode (node 14, whose process we just killed) is stopped...

![Standby NameNode stopped](http://appcrawler.com/wordpress/wp-content/uploads/2015/01/img_54b021dc02ddb.png)

...and also see that the primary NameNode is now on node 13...

![Primary NameNode now on node 13](http://appcrawler.com/wordpress/wp-content/uploads/2015/01/img_54b0222dd1c73.png)
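For reference, the transparency our job enjoys comes from the HA settings Ambari writes into `hdfs-site.xml`. You can inspect them with `hdfs getconf`; a minimal sketch, where the nameservice name `mycluster` and the values shown are illustrative:

```
# Inspect the HA-related client configuration; "mycluster" is an assumed nameservice name
$ hdfs getconf -confKey dfs.nameservices
mycluster
$ hdfs getconf -confKey dfs.ha.namenodes.mycluster
nn1,nn2
$ hdfs getconf -confKey dfs.client.failover.proxy.provider.mycluster
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
$ hdfs getconf -confKey dfs.ha.automatic-failover.enabled
true
```

The failover proxy provider is what lets the HDFS client in our job retry against whichever NameNode is currently active, and automatic failover (driven by the ZKFC processes) is what promoted the standby when we killed the primary.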