Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.2.1
-
None
Description
Hi,
We are observing different behavior of Hive for the timestamp column values. When we apply clause like order by, distinct on same or other other column in the hive query it print different result for the timestamp value for year which start after 2300..
Steps:
1. Create a hive table
create table cutomer_sample(id int, arrival_time timestamp, dob date) stored as ORC;
2. Populate some data with future timestamp values
insert into table cutomer_sample values (1,'2015-01-01 00:00:00.0','2015-01-01'), (2,'2018-01-01 00:00:00.0','2018-01-01') , (3,'2099-01-01 00:00:00.0','2099-01-01'), (4,'2100-01-01 00:00:00.0','2100-01-01'),(5,'2500-01-01 00:00:00.0','2500-01-01'),(6,'2200-01-01 00:00:00.0','2200-01-01'),(7,'2300-01-01 00:00:00.0','2300-01-01'),(8,'2400-01-01 00:00:00.0','2400-01-01');
3. Select all data with any clause
select * from cutomer_sample;
Output:
select * from cutomer_sample;
-------------------------------------------------------------------+
cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob |
-------------------------------------------------------------------+
1 | 2015-01-01 00:00:00.0 | 2015-01-01 |
2 | 2018-01-01 00:00:00.0 | 2018-01-01 |
3 | 2099-01-01 00:00:00.0 | 2099-01-01 |
4 | 2100-01-01 00:00:00.0 | 2100-01-01 |
5 | 2500-01-01 00:00:00.0 | 2500-01-01 |
6 | 2200-01-01 00:00:00.0 | 2200-01-01 |
7 | 2300-01-01 00:00:00.0 | 2300-01-01 |
8 | 2400-01-01 00:00:00.0 | 2400-01-01 |
-------------------------------------------------------------------+
4. Apply order by on timestamp column
select * from cutomer_sample order by arrival_time ;
---------------------------------------------------------------------+
cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob |
---------------------------------------------------------------------+
7 | 1715-06-13 00:25:26.290448384 | 2300-01-01 |
8 | 1815-06-13 00:25:26.290448384 | 2400-01-01 |
5 | 1915-06-14 00:48:46.290448384 | 2500-01-01 |
1 | 2015-01-01 00:00:00.0 | 2015-01-01 |
2 | 2018-01-01 00:00:00.0 | 2018-01-01 |
3 | 2099-01-01 00:00:00.0 | 2099-01-01 |
4 | 2100-01-01 00:00:00.0 | 2100-01-01 |
6 | 2200-01-01 00:00:00.0 | 2200-01-01 |
---------------------------------------------------------------------+
you can see value of timestamp got changed after 2300 year..
5. Apply order by on some other column still same behavior
---------------------------------------------------------------------+
cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob |
---------------------------------------------------------------------+
1 | 2015-01-01 00:00:00.0 | 2015-01-01 |
2 | 2018-01-01 00:00:00.0 | 2018-01-01 |
3 | 2099-01-01 00:00:00.0 | 2099-01-01 |
4 | 2100-01-01 00:00:00.0 | 2100-01-01 |
6 | 2200-01-01 00:00:00.0 | 2200-01-01 |
7 | 1715-06-13 00:25:26.290448384 | 2300-01-01 |
8 | 1815-06-13 00:25:26.290448384 | 2400-01-01 |
5 | 1915-06-14 00:48:46.290448384 | 2500-01-01 |
---------------------------------------------------------------------+