往impala 分区表load中load数据,为什么分区条件也导入进去了

hadoop学习(8)
hadoop1.2.1集群Hive1.2.0mysql serverRedhat-6.6 Server.x86_641
导入作业数据集total.csv到Hive中,用日期做为分区表的分区ID。要求有描述型文字,Hive脚本和截图。
回购表数据:
tradedate:交易日期
tradetime:交易时间
securityid:股票ID
bidpx1:买一
bidsize1:买一交易量
offerpx1:卖一
bidsize1:卖一交易量
数据存储格式:
tradedate,tradetime,securityid,bidpx1,bidsize1,offerpx1,offersize1
数据准备:
localhost:homework11 apple$ scptotal.csv hadoop@hadoop0:/home/hadoop/Downloads/Data/
[hadoop@hadoop0Data]$ head total.csv
[hadoop@hadoop0 Data]$ head total.csv042,.415,,3.42,8960.0516,.421,,3.422,14950.0752,.431,,3.435,620.0001,.453,,3.46,7400.0901,.451,,3.452,31330.0449,.42,,3.421,22900.0254,.415,,3.416,2430.0713,.43,,3.435,2990.0540,.421,,3.422,1050.0931,.45,,3.451,13700.0
drop table if exists t_reverse_repurchase_
#创建临时表t_reverse_repurchase_tmp,分隔符','。
CREATE TABLE t_reverse_repurchase_tmp(tradedate STRING,tradetime STRING, securityid STRING, bidpx1 DOUBLE, bidsize1BIGINT,offerpx1 DOUBLE,offersize1
BIGINT) ROW FORMAT DELIMITED FIELDSTERMINATED BY ',';
#导入数据到t_reverse_repurchase_tmp。
LOAD DATA LOCAL INPATH'/home/hadoop/Downloads/Data/total.csv' OVERWRITE INTO TABLEt_reverse_repurchase_
#查看导入的数据:
hive& select * from t_reverse_repurchase_tmp limit 10;
hive& select * from t_reverse_repurchase_tmp limit 10;OK
13700Time taken: 0.056 seconds, Fetched: 10 row(s)hive&
#查看表描述
hive& DESCRIBE t_reverse_repurchase_
hive& DESCRIBE t_reverse_repurchase_OKtradedate
securityid
offersize1
Time taken: 0.095 seconds, Fetched: 7 row(s)
staticpartition常适用于使用处理时间作为partition key的例子。
但是这里需要向分区表中插入大量数据,并且插入前不清楚数据归属的partition,此时,我们需要dynamic partition。使用动态分区需要设置hive.exec.dynamic.partition参数值为true。
hive&use mydb
hive&drop table if exists t_reverse_
hive&CREATE TABLE t_reverse_repurchase (tradetime STRING, securityidSTRING, bidpx1 DOUBLE, bidsize1 BIGINT,offerpx1 DOUBLE,offersize1 BIGINT)PARTITIONED BY (tradedate STRING);
#设置参数开启dynamic partition
hive& sethive.exec.dynamic.partition.mode=
hive& sethive.exec.dynamic.partition=
#将数据从t_reverse_repurchase_tmp加载到t_reverse_repurchase中
INSERT OVERWRITE TABLEt_reverse_repurchase PARTITION (tradedate) SELECT a.tradetime astradetime,a.securityid as securityid, a.bidpx1 as bidpx1, a.bidsize1
asbidsize1,a.offerpx1 as fferpx1,a.offersize1 as offersize1, a.tradedate astradedate FROM t_reverse_repurchase_
#查看导入的数据:
hive& select * from t_reverse_repurchase wheretradedate=='' and securityid='204001' limit 10;
hive& select * from t_reverse_repurchase where tradedate=='' and securityid='204001' limit 10;OK
Time taken: 0.079 seconds, Fetched: 10 row(s)hive&
#查看表的描述信息
hive& DESCRIBE t_reverse_
hive& DESCRIBE t_reverse_OKtradetime
securityid
offersize1
# Partition Information
# col_name
Time taken: 0.08 seconds, Fetched: 12 row(s)
#查看表的分区存储情况
hive&show partitions t_reverse_
hive& show partitions t_reverse_OKtradedate=tradedate=tradedate=tradedate=tradedate=Time taken: 0.116 seconds, Fetched: 5 row(s)hive&DB2表分区数据清空维护_百度文库
两大类热门资源免费畅读
续费一年阅读会员,立省24元!
DB2表分区数据清空维护
阅读已结束,下载本文需要
想免费下载本文?
定制HR最喜欢的简历
下载文档到电脑,同时保存到云知识,更方便管理
还剩2页未读,继续阅读
定制HR最喜欢的简历
你可能喜欢}

我要回帖

更多关于 sqoop 导入hive分区表 的文章

更多推荐

版权声明:文章内容来源于网络,版权归原作者所有,如有侵权请点击这里与我们联系,我们将及时删除。

点击添加站长微信