{"id":2835,"date":"2013-05-17T07:33:14","date_gmt":"2013-05-17T12:33:14","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=2835"},"modified":"2013-05-17T07:33:14","modified_gmt":"2013-05-17T12:33:14","slug":"hdfs-where-are-my-file-blocks","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2013\/05\/17\/hdfs-where-are-my-file-blocks\/","title":{"rendered":"HDFS &#8211; Where are my file blocks?"},"content":{"rendered":"<p>At times I would like to know how a file is stored in HDFS. The program below lists the blocks that make up a given file, along with the datanodes on which each block is stored.<\/p>\n<pre lang=\"java\" line=\"1\">\r\nimport java.io.*;\r\nimport java.util.*;\r\nimport org.apache.hadoop.conf.*;\r\nimport org.apache.hadoop.fs.*;\r\n\r\npublic class HDFSFileBlocks {\r\n  public static void main(String[] args) throws IOException {\r\n    \/\/ Picks up the cluster settings from the Hadoop configuration files on the classpath\r\n    Configuration conf = new Configuration();\r\n\r\n    FileSystem fs = FileSystem.get(conf);\r\n    Path file = new Path(args[0]);\r\n    FileStatus fileStatus = fs.getFileStatus(file);\r\n    \/\/ Request the locations of every block between offset 0 and the end of the file\r\n    BlockLocation[] blocks = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());\r\n    for (BlockLocation block : blocks) {\r\n      System.out.println(block);\r\n    }\r\n    fs.close();\r\n  }\r\n}\r\n<\/pre>\n<p>For example, a 16MB file stored in 1MB blocks on a two-datanode cluster may produce output like the following. Each line gives a block's starting offset, its length in bytes, and the hosts that hold a replica of it.<\/p>\n<pre lang=\"text\">\r\n# hadoop-0.23.7\/bin\/hadoop HDFSFileBlocks 
i7.txt\r\n0,1048576,expressdb1,expressdb2\r\n1048576,1048576,expressdb1,expressdb2\r\n2097152,1048576,expressdb1,expressdb2\r\n3145728,1048576,expressdb1,expressdb2\r\n4194304,1048576,expressdb1,expressdb2\r\n5242880,1048576,expressdb1,expressdb2\r\n6291456,1048576,expressdb2,expressdb1\r\n7340032,1048576,expressdb2,expressdb1\r\n8388608,1048576,expressdb1,expressdb2\r\n9437184,1048576,expressdb1,expressdb2\r\n10485760,1048576,expressdb2,expressdb1\r\n11534336,1048576,expressdb2,expressdb1\r\n12582912,1048576,expressdb1,expressdb2\r\n13631488,1048576,expressdb1,expressdb2\r\n14680064,1048576,expressdb1,expressdb2\r\n15728640,960255,expressdb1,expressdb2\r\n#\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>At times I would like to know how a file is stored in HDFS. The program below lists the blocks that make up a given file, along with the datanodes on which each block is stored. import java.io.*; import java.util.*; import&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"http:\/\/appcrawler.com\/wordpress\/2013\/05\/17\/hdfs-where-are-my-file-blocks\/\">Read more 
&rarr;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"footnotes":""},"categories":[19,21,25],"tags":[],"_links":{"self":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/2835"}],"collection":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/comments?post=2835"}],"version-history":[{"count":8,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/2835\/revisions"}],"predecessor-version":[{"id":2906,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/2835\/revisions\/2906"}],"wp:attachment":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/media?parent=2835"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/categories?post=2835"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/tags?post=2835"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}