{"id":4852,"date":"2015-04-15T11:41:41","date_gmt":"2015-04-15T16:41:41","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=4852"},"modified":"2015-04-18T06:01:49","modified_gmt":"2015-04-18T11:01:49","slug":"are-files-in-hdfs-immutable","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2015\/04\/15\/are-files-in-hdfs-immutable\/","title":{"rendered":"Are files in HDFS immutable?"},"content":{"rendered":"<p>Call me cynical, but I am a bit of a doubting Thomas.<\/p>\n<p>Using our <a href=http:\/\/appcrawler.com\/wordpress\/2013\/05\/08\/does-hadoophdfs-distribute-writes-to-all-data-nodes-on-ingest\/ target=_blank>previous write test code<\/a>, we run the exact same test twice in a row.<\/p>\n<pre>\r\n[root@cmhlpdlkedat15 ~]# hadoop HDFSWriteTest foobar.txt\r\nhdfs:\/\/cmhlpdlkedat14.expressco.com:8020\r\n[root@cmhlpdlkedat15 ~]# hdfs dfs -ls \/user\/root\r\nFound 3 items\r\ndrwx------   - root hdfs          0 2015-04-03 02:00 \/user\/root\/.Trash\r\ndrwxr-xr-x   - root hdfs          0 2015-04-06 13:37 \/user\/root\/.hiveJars\r\n-rw-r--r--   3 root hdfs   16688895 2015-04-10 15:20 \/user\/root\/foobar.txt\r\n[root@cmhlpdlkedat15 ~]# hadoop HDFSWriteTest foobar.txt\r\nhdfs:\/\/cmhlpdlkedat14.expressco.com:8020\r\nException in thread \"main\" org.apache.hadoop.fs.FileAlreadyExistsException: \/user\/root\/foobar.txt for client 172.27.2.64 already exists\r\n<\/pre>\n<p>The second run fails because our test passes false for the overwrite argument of create, so HDFS refuses to replace the existing file.  However, there is an append method on the FileSystem object.  
If we change one line in our class&#8230;<\/p>\n<pre>   \r\n    \/\/comment out the create and change it to an append operation\r\n    \/\/FSDataOutputStream outStream = fs.create(file,false, 4096, (short)3, (long)1048576);\r\n    FSDataOutputStream outStream = fs.append(file);\r\n<\/pre>\n<p>&#8230;we find that it does in fact let us append to an existing file.<\/p>\n<pre>\r\n[root@cmhlpdlkedat15 ~]# hadoop HDFSWriteTest foobar.txt\r\nhdfs:\/\/cmhlpdlkedat14.expressco.com:8020\r\n[root@cmhlpdlkedat15 ~]# hdfs dfs -ls \/user\/root\r\nFound 3 items\r\ndrwx------   - root hdfs          0 2015-04-03 02:00 \/user\/root\/.Trash\r\ndrwxr-xr-x   - root hdfs          0 2015-04-06 13:37 \/user\/root\/.hiveJars\r\n-rw-r--r--   3 root hdfs   33377790 2015-04-10 15:28 \/user\/root\/foobar.txt\r\n[root@cmhlpdlkedat15 ~]#\r\n<\/pre>\n<p>The file has doubled in size, from 16688895 to 33377790 bytes, so the second run added to the existing data rather than replacing it.  As such, we need to be clear that while you can&#8217;t change existing content, you can add to it.  Appends have a long history in HDFS, which you can read about <a href=http:\/\/blog.cloudera.com\/blog\/2009\/07\/file-appends-in-hdfs\/ target=_blank>here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Call me cynical, but I am a bit of a doubting Thomas. Using our previous write test code, we run the exact same test twice in a row. 
[root@cmhlpdlkedat15 ~]# hadoop HDFSWriteTest foobar.txt hdfs:\/\/cmhlpdlkedat14.expressco.com:8020 [root@cmhlpdlkedat15 ~]# hdfs dfs&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"http:\/\/appcrawler.com\/wordpress\/2015\/04\/15\/are-files-in-hdfs-immutable\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"footnotes":""},"categories":[21],"tags":[],"_links":{"self":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/4852"}],"collection":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/comments?post=4852"}],"version-history":[{"count":7,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/4852\/revisions"}],"predecessor-version":[{"id":4861,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/4852\/revisions\/4861"}],"wp:attachment":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/media?parent=4852"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/categories?post=4852"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/tags?post=4852"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}