Apache Hive inserts

This is a simple example for getting started with loading data into a Hadoop environment fronted by Hive.

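Notice how much longer simple operations take than they do in an RDBMS. This is largely fixed startup overhead: Hive has to initialize its session and metastore, and Hadoop has to create and schedule a job through the job and task trackers whenever a statement runs as one. In the session below, the first statement (CREATE TABLE) takes about 15 seconds and the load about 8 seconds, while a plain select * returns in under half a second.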

[root@expressdb1 hive-0.9.0-bin]# export HIVE_HOME=/root/hive-0.9.0-bin
[root@expressdb1 hive-0.9.0-bin]# export HADOOP_HOME=/root/hadoop-0.23.7
[root@expressdb1 hive-0.9.0-bin]# bin/hive
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/root/hive-0.9.0-bin/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201304301344_249377956.txt
hive> CREATE TABLE x (a INT);
OK
Time taken: 14.982 seconds
hive> select * from x;
OK
Time taken: 0.383 seconds
hive> exit;
[root@expressdb1 hive-0.9.0-bin]# for i in {1..100}; do
> echo $i >> /tmp/l.txt
> done
[root@expressdb1 hive-0.9.0-bin]# bin/hive
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/root/hive-0.9.0-bin/lib/hive-common-0.9.0.jar!/hive-log4j.properties
Hive history file=/tmp/root/hive_job_log_root_201304301351_1399235343.txt
hive> LOAD DATA LOCAL INPATH '/tmp/l.txt' OVERWRITE INTO TABLE x;
Copying data from file:/tmp/l.txt
Copying file: file:/tmp/l.txt
Loading data to table default.x
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted /user/hive/warehouse/x
OK
Time taken: 8.332 seconds
hive> select * from x;
OK
1
2
3
4
5
....
98
99
100
Time taken: 0.389 seconds
hive> exit;
[root@expressdb1 hive-0.9.0-bin]#
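
The session above can also be scripted. The following is a minimal sketch rather than part of the original run. It assumes HIVE_HOME is exported as shown earlier, and the DATA_FILE variable is a name introduced here for illustration. It writes the file with a truncating redirect, so re-running it does not append a second copy of the numbers the way the >> loop would. The final COUNT(*) is an addition: unlike select *, it should run as a Hadoop job, which makes the job startup cost visible.

```shell
#!/bin/sh
# Sketch: regenerate the sample data and, if Hive is installed, reload table x.
# DATA_FILE is an illustrative variable; the default matches the path used above.
DATA_FILE="${DATA_FILE:-/tmp/l.txt}"

# '>' truncates the file first, so a second run cannot duplicate rows.
seq 1 100 > "$DATA_FILE"

# Sanity-check the file before handing it to Hive.
LINE_COUNT=$(wc -l < "$DATA_FILE" | tr -d ' ')
echo "wrote $LINE_COUNT lines to $DATA_FILE"

# Only call Hive when HIVE_HOME points at an install (assumption: same layout as above).
if [ -n "${HIVE_HOME:-}" ] && [ -x "$HIVE_HOME/bin/hive" ]; then
    # -e runs the quoted statements and then exits.
    "$HIVE_HOME/bin/hive" -e "
        CREATE TABLE IF NOT EXISTS x (a INT);
        LOAD DATA LOCAL INPATH '$DATA_FILE' OVERWRITE INTO TABLE x;
        SELECT COUNT(*) FROM x;
    "
else
    echo "HIVE_HOME not set or bin/hive not found; skipping the Hive step"
fi
```

Because the load uses OVERWRITE, running the script repeatedly should leave the table with the same 100 rows each time.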
