Details
Bug
Status: Resolved
Major
Resolution: Fixed
Impala 1.4.1
None
None
Description
This was reported on the impyla issue tracker here:
https://github.com/cloudera/impyla/issues/31
Over the beeswax protocol, the partition identity information is returned; over hiveserver2, it is not (as described in IMPYLA-31).
To recreate this issue, first create a table:
USE whatever;
CREATE TABLE test_partitions (
column_1 STRING,
column_2 STRING,
column_3 STRING,
column_4 STRING
)
PARTITIONED BY (
year INT,
month INT,
day INT
)
STORED AS PARQUET
LOCATION '/tmp/test_partition_dir';
Then insert some data:
INSERT INTO TABLE test_partitions PARTITION (year, month, day) VALUES ('a', 'b', 'c', 'd', 2014, 11, 8), ('foo', 'bar', 'car', 'done', 2013, 9, 22);
Then use the master branch of impyla:
git clone https://github.com/cloudera/impyla.git
cd impyla
python setup.py install
Then open a python interpreter and run:
from impala.dbapi import connect
conn_hs2 = connect(host='impalad.host', port=21050, protocol='hiveserver2')
conn_bees = connect(host='impalad.host', port=21000, protocol='beeswax')
cur_hs2 = conn_hs2.cursor()
cur_bees = conn_bees.cursor()
cur_bees.execute('use whatever')
cur_bees.execute('show partitions test_partitions')
cur_bees.fetchall()
cur_hs2.execute('use whatever')
cur_hs2.execute('show partitions test_partitions')
cur_hs2.fetchall()
The two results should be essentially the same, but the HS2 result has None in place of the partition ids.
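To make the difference easier to spot, the rows returned by fetchall() could be checked for None in the partition columns. This is only a rough sketch: the helper name rows_missing_partition_keys is hypothetical, and it assumes the partition key columns (year, month, day) are the first three cells of each row, which this report does not confirm.

```python
def rows_missing_partition_keys(rows, num_partition_cols=3):
    """Return the rows whose leading partition-key cells contain None.

    Assumption (not confirmed by this report): the partition key columns
    come first in each row returned by 'show partitions'.
    """
    missing = []
    for row in rows:
        keys = tuple(row[:num_partition_cols])
        if any(cell is None for cell in keys):
            missing.append(row)
    return missing


# Possible usage with the cursors from the session above
# (re-run the query first, since fetchall() consumes the result set):
#   cur_hs2.execute('show partitions test_partitions')
#   print(rows_missing_partition_keys(cur_hs2.fetchall()))
```

If the bug is present, the HS2 cursor would be expected to yield a non-empty list here, while the beeswax cursor would not.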