[HIVE-14919] Improve the performance of Hive on Spark 2.0.0 - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

In ~~HIVE-14029~~, we updated the Spark dependency to 2.0.0. We used Intel BigBench [1] to benchmark Spark 2.0 against Spark 1.6 on a 1 TB data set. We see a performance improvement of about 5.4% in general and 45% in the best case. However, some queries do not show significant performance improvements. This JIRA is the umbrella ticket for addressing those performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

Issue Links

is related to: HIVE-14029 Update Spark version to 2.0.0 (Resolved)

Sub-Tasks

There are no Sub-Tasks for this issue.

People

Assignee: Ferdinand Xu

Reporter: Ferdinand Xu

Votes: 0

Watchers: 16

Dates

Created: 10/Oct/16 05:55

Updated: 22/Mar/17 02:33