[SYSTEMDS-2163] Performance large partitioned broadcasts - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: SystemML 1.1
Component/s: None
Labels: None

Description

Due to Spark's limitations with broadcasts larger than 2 GB, SystemML uses partitioned broadcasts, which split a large side input into potentially many broadcast variables. For historical reasons, the metadata is still maintained in the individual partitioned blocks. However, many operations read this metadata from the first partitioned block, which can trigger unnecessary broadcast fetches.
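
The cost described above can be illustrated with a small simulation in plain Python (no Spark needed). This is only a sketch under stated assumptions: `LazyBroadcast`, `PartitionBlock`, `MetaInBlocks`, `MetaInWrapper`, and `make_parts` are hypothetical names that do not come from SystemML, and keeping the dimensions on the wrapper is one possible remedy, not necessarily the fix applied for this issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionBlock:
    """One partition of the side input; also carries the overall dimensions."""
    total_rows: int
    total_cols: int
    data: List[List[float]]


class LazyBroadcast:
    """Hypothetical stand-in for a broadcast variable fetched on first access."""

    def __init__(self, payload: PartitionBlock):
        self._payload = payload
        self._fetched = False
        self.fetches = 0  # counts simulated transfers to the worker

    def value(self) -> PartitionBlock:
        if not self._fetched:
            self.fetches += 1  # simulated network fetch
            self._fetched = True
        return self._payload


class MetaInBlocks:
    """Layout described in the issue: metadata lives only inside the blocks."""

    def __init__(self, parts: List[LazyBroadcast]):
        self.parts = parts

    def num_rows(self) -> int:
        # Reading metadata forces a fetch of the first partition.
        return self.parts[0].value().total_rows


class MetaInWrapper:
    """Assumed alternative: dimensions are kept on the lightweight wrapper."""

    def __init__(self, total_rows: int, total_cols: int, parts: List[LazyBroadcast]):
        self.total_rows = total_rows
        self.total_cols = total_cols
        self.parts = parts

    def num_rows(self) -> int:
        # No partition is touched, so nothing is fetched.
        return self.total_rows


def make_parts(total_rows: int, total_cols: int, rows_per_part: int) -> List[LazyBroadcast]:
    """Split a zero-filled matrix into row-wise partitions."""
    parts = []
    for start in range(0, total_rows, rows_per_part):
        n = min(rows_per_part, total_rows - start)
        data = [[0.0] * total_cols for _ in range(n)]
        parts.append(LazyBroadcast(PartitionBlock(total_rows, total_cols, data)))
    return parts
```

In this sketch, calling `MetaInBlocks(parts).num_rows()` increments `parts[0].fetches`, while `MetaInWrapper(...).num_rows()` leaves every fetch counter at zero. An operation that only needs dimensions, or only a later partition, would therefore avoid pulling the first partition at all.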

People

Assignee: Matthias Boehm

Reporter: Matthias Boehm

Votes: 0

Watchers: 1

Dates

Created: 28/Feb/18 04:15

Updated: 01/Mar/18 05:30

Resolved: 01/Mar/18 05:30