{"id":265,"date":"2009-06-25T13:11:11","date_gmt":"2009-06-25T18:11:11","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=265"},"modified":"2011-07-06T09:52:36","modified_gmt":"2011-07-06T14:52:36","slug":"do-we-actually-use-characters-that-require-more-than-one-byte","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2009\/06\/25\/do-we-actually-use-characters-that-require-more-than-one-byte\/","title":{"rendered":"Do we actually use characters that require more than one byte?"},"content":{"rendered":"<p>I couldn&#8217;t find a way to identify columns that actually contain a character with a code point value greater than 255. As a result, I ended up writing the following. If you have a better way, please reply.<\/p>\n<pre lang=\"java\" line=\"1\">\r\nimport java.sql.*;\r\n\r\npublic class checkUnicode {\r\n  public static void main(String args[]) {\r\n    try {\r\n      Class.forName(\"oracle.jdbc.driver.OracleDriver\");\r\n      Connection conn = DriverManager.getConnection(\"jdbc:oracle:thin:*****\/****@******:2484\/fake_db_service\");\r\n      \/\/ sample(1) asks Oracle for roughly 1% of the rows\r\n      PreparedStatement pst = conn.prepareStatement(\"select fake_col1, fake_col2 from large_fake_table sample(1)\");\r\n      ResultSet rst = pst.executeQuery();\r\n      long tot = 0;\r\n      long found = 0;\r\n      while (rst.next()) {\r\n        String val = rst.getString(2);\r\n        if (val == null) {\r\n          val = \"\";  \/\/ SQL NULL: nothing to scan\r\n        }\r\n        for (int i = 0; i < val.length(); i++) {\r\n          int cp = val.codePointAt(i);\r\n          if (cp > 255) {\r\n            found++;\r\n            System.out.println(rst.getInt(1) + \" has a code point value of \" + cp);\r\n            break;\r\n          }\r\n        }\r\n        tot++;\r\n        if (tot % 100 == 0) {\r\n          System.out.println(\"Have checked \" + tot + \" rows, \" + (found * 100.0 \/ tot) + \"% of which have unicode.\");\r\n        }\r\n      }\r\n      System.out.println(\"Checked \" + tot + \" rows\");\r\n      rst.close();\r\n      pst.close();\r\n      conn.close();\r\n    }\r\n    catch (Exception e) {\r\n      
e.printStackTrace();\r\n    }\r\n  }\r\n}\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>I couldn&#8217;t find a way to identify columns that actually contain a character with a code point value greater than 255. As a result, I ended up writing the following. If you have a better way, please reply. import java.sql.*;&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"http:\/\/appcrawler.com\/wordpress\/2009\/06\/25\/do-we-actually-use-characters-that-require-more-than-one-byte\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"footnotes":""},"categories":[19,24,25,22],"tags":[],"_links":{"self":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/265"}],"collection":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/comments?post=265"}],"version-history":[{"count":11,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/265\/revisions"}],"predecessor-version":[{"id":946,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/265\/revisions\/946"}],"wp:attachment":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/media?parent=265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/categories?post=265"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/tags?post=265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}