Below is a very simple example of how Hive can be queried from a web front end. The idea is to present the user with a CSV file MIME type that would be opened…
Category: Database
Querying Hadoop from Tomcat
Below is a very simple example of how to print the contents of a file stored in HDFS from Tomcat. I am not entirely sure where this would be useful, for the following reasons: * Unless you have a list…
Pig script to group URL requests in JBOSS
As we move towards an enterprise data analytics platform, I take every opportunity I can to come up with simple jobs in Hadoop, Hive, and Pig. Below is one I ran in Pig that groups the top 50 URL requests…
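For readers who want to check the result of such a job on a small sample, the same group, count, order, and limit steps can be sketched in plain Java. This is not the Pig script from the post; the class name `TopUrls`, the method `topUrls`, and the sample URLs are made up for illustration, and the input is assumed to be a list of request URLs already extracted from the log.

```java
import java.util.*;
import java.util.stream.*;

public class TopUrls {
    // Count how often each URL occurs and return the n most frequent,
    // highest count first; ties are broken alphabetically by URL.
    static List<Map.Entry<String, Long>> topUrls(List<String> urls, int n) {
        Map<String, Long> counts = urls.stream()
            .collect(Collectors.groupingBy(u -> u, Collectors.counting()));
        return counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Long>comparingByKey()))
            .limit(n)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for parsed access-log lines.
        List<String> urls = Arrays.asList(
            "/home", "/login", "/home", "/cart", "/home", "/login");
        for (Map.Entry<String, Long> e : topUrls(urls, 2)) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

On a real log the list would be far too large for one JVM, which is where Pig and Hadoop come in; the sketch is only useful for sanity-checking a handful of lines.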
HDFS – Where are my file blocks?
At times I would like to know how a file is stored in HDFS. The code below shows which blocks exist for a given file, as well as the nodes on which they are stored. import java.io.*; import java.util.*; import…
Does Hadoop/HDFS distribute writes to all data nodes on ingest?
I like simple, command-line test cases. Lather, rinse, repeat (do any shampoo bottles actually have that anymore 🙂 ?) I wanted to prove that ingests to Hadoop did not actually send everything through the name node, which…
Apache Hive inserts
This is a simple example for getting started with loading data into a Hadoop environment fronted by Hive. Notice how much longer simple operations take than they do in an RDBMS. This is attributable to the fact that Hadoop has…
Query AS400/iSeries from JDBC
We are migrating an in-house system from the AS400 to Oracle, and I needed to look at the data in its “raw” format to assist with a migration plan. I ended up using JDBC rather than the GUI provided…
Quick way to trace a session on a server
If you identify an Oracle database process consuming a high amount of CPU, or want to trace it for any reason, you can simply run the following script and pass the PID of the offending process when requested. begin for…
Schema compare command line tool
I use the code below to quickly compare the tables of two Oracle database schemas. It would be trivial to add indexes to the mix. import java.sql.*; import java.util.*; public class schemaDiffer { public static void main(String args[]) { try { Class.forName("oracle.jdbc.driver.OracleDriver");…
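The comparison step itself can be sketched as a set difference over table names. This is a minimal illustration, not the `schemaDiffer` class from the post: the class name `TableDiff` and the methods `tableNames` and `onlyIn` are hypothetical, and reading names through the standard JDBC `DatabaseMetaData.getTables` call is an assumption about how the names are obtained. The `main` method uses hard-coded sets so that it runs without a database.

```java
import java.sql.*;
import java.util.*;

public class TableDiff {
    // Read the table names in one schema using standard JDBC metadata.
    // Needs a live connection, so main() below does not call it.
    static Set<String> tableNames(Connection conn, String schema) throws SQLException {
        Set<String> names = new TreeSet<>();
        try (ResultSet rs = conn.getMetaData()
                .getTables(null, schema, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }

    // Names present in 'left' but absent from 'right', in sorted order.
    static Set<String> onlyIn(Set<String> left, Set<String> right) {
        Set<String> diff = new TreeSet<>(left);
        diff.removeAll(right);
        return diff;
    }

    public static void main(String[] args) {
        // Hypothetical table names standing in for two live schemas.
        Set<String> a = new TreeSet<>(Arrays.asList("CUSTOMERS", "ORDERS", "INVOICES"));
        Set<String> b = new TreeSet<>(Arrays.asList("CUSTOMERS", "ORDERS", "SHIPMENTS"));
        System.out.println("Only in first:  " + onlyIn(a, b));
        System.out.println("Only in second: " + onlyIn(b, a));
    }
}
```

Extending the comparison to indexes would follow the same pattern, with a second pair of name sets.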
Simple dbca command line creation
dbca -silent \ -createDatabase -templateName "General_Purpose.dbc" \ -nodelist "expressdb1,expressdb2" \ -gdbName express.home \ -sid express \ -sysPassword ExpressCMH1 \ -systemPassword ExpressCMH1 \ -emConfiguration NONE \ -datafileDestination +DATA \ -redoLogFileSize 50 \ -storageType ASM \ -asmsnmpPassword ExpressCMH1 \ -characterSet AL32UTF8 \…