Category: Hive

Hive performance parser

Given data in the hiveserver2.log file, this awk scriptlet prints the timestamp, the SQL statement, and the run time in seconds. One issue is that the parser thread hands off to the executor, and you can’t always tie the two together. However, at a…

Nastiness with Hive/ODBC

We ran into an undocumented issue, although the Hive user guide does indicate it should be set. Whenever we issued a statement with the Hive 64-bit ODBC driver through Knox, the Knox server would truncate anything after…

Accessing Hive via DbVisualizer

Download the most recent free version of DbVisualizer from http://www.dbvis.com/ and run the downloaded file. If you see the following, either Java is not installed or the installer can’t find where you installed it.…

Hive transactions

Below is a simple example of Hive transactions. These are very useful for type 1 slowly changing dimension tables, where you do not wish to retain history but only the most recent value of each row. The table…

Querying Hive from Excel

Install the 32-bit driver from https://downloads.cloudera.com/connectors/hive-2.5.5.1006/Windows/ClouderaHiveODBC32.msi, then start Excel and run the query wizard. Create a new data source. Enter the values shown below. Enter the values shown below. Click the Test button. Select the stores table from…

Querying Hive from Tomcat

Below is a very simple example of how Hive can be queried from a web front end. The idea is to serve the user a file with a CSV MIME type that would be opened…

Apache Hive inserts

This is a simple example for getting started with loading data into a Hadoop environment fronted by Hive. Notice how much longer simple operations take than they do in an RDBMS. This is attributable to the fact that Hadoop has…