Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1
Fix Versions: None
Environment: spark2.2.1-hadoop-2.6.0-chd-5.4.2, hive-1.2.1
Labels: Important
Description
We save a DataFrame as a Hive table in ORC/Parquet format from the spark-shell.
After we modified the column type (bigint to double) of this table through Hive JDBC, the column type queried in spark-shell did not change, while Hive JDBC showed the new type. Even after we restarted the spark-shell, the table's column type in spark-shell was still inconsistent with the one shown in Hive JDBC.
The reproduction steps are as follows:
spark-shell:
val df = spark.read.json("examples/src/main/resources/people.json")
df.write.format("orc").saveAsTable("people_test")
spark.sql("desc people_test").show()
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     age|   bigint|   null|
|    name|   string|   null|
+--------+---------+-------+
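A likely factor here (our assumption, not something confirmed in this report): a table created through df.write.format("orc").saveAsTable(...) is a data source table, and Spark keeps its own copy of the schema in the Hive table properties, which it reads back instead of the Hive column definitions. One way to check this from the spark-shell:

// Inspect the table properties Spark stored in the Hive metastore.
// For a data source table we would expect entries such as
// spark.sql.sources.provider and spark.sql.sources.schema.part.0,
// which hold the schema Spark actually reads back.
spark.sql("show tblproperties people_test").show(truncate = false)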
hive:
hive> desc people_test;
OK
age     bigint
name    string
Time taken: 0.454 seconds, Fetched: 2 row(s)
hive> alter table people_test change column age age double;
OK
Time taken: 0.68 seconds
hive> desc people_test;
OK
age     double
name    string
Time taken: 0.358 seconds, Fetched: 2 row(s)
spark-shell:
spark.catalog.refreshTable("people_test")
spark.sql("desc people_test").show()
+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     age|   bigint|   null|
|    name|   string|   null|
+--------+---------+-------+
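Since spark.catalog.refreshTable appears to re-read the same schema Spark stored at save time, a possible workaround (a sketch under our assumptions, not a verified fix; the table name people_test_double is hypothetical) is to apply the cast on the Spark side and write a new table:

// Sketch of a workaround: rewrite the data with the column cast to
// double, since Spark 2.2 does not let us change a column's type via
// ALTER TABLE on its side.
import org.apache.spark.sql.functions.col

spark.table("people_test")
  .withColumn("age", col("age").cast("double"))   // bigint -> double
  .write.format("orc")
  .saveAsTable("people_test_double")              // hypothetical name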
We also tested creating the table in the spark-shell with spark.sql("create table XXX(...)"); in that case, the column type modified through Hive JDBC is consistent between spark-shell and Hive.
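For reference, a minimal version of that consistent case (the table name people_hive and its columns are our own illustration): we believe a table created this way is a Hive-serde table whose schema Spark reads from the Hive metastore, so a Hive-side alter is picked up.

// Create a Hive-serde table from the spark-shell.
spark.sql("CREATE TABLE people_hive (age BIGINT, name STRING) STORED AS ORC")
// After running `alter table people_hive change column age age double`
// in Hive, spark.sql("desc people_hive").show() reflects the new type.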