[SYSTEMDS-1274] Unnecessary rdd computation for nnz maintenance on write - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: SystemML 0.13
Component/s: Runtime
Labels:
None

Description

Our primitive for writing binary block RDDs to HDFS (as used in guarded collect) first computes the number of non-zeros (nnz) and only then writes out the data. This leads to redundant RDD computation, which can be expensive for large DAGs of RDD operations. Computing the nnz explicitly is unnecessary: we could simply piggyback this computation onto the write via an accumulator, as is already done in multiple other places in SystemML.
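The idea can be illustrated with a minimal, Spark-free sketch in plain Java. It is not the SystemML code: the class and method names are hypothetical, a `LongAdder` stands in for the Spark accumulator, a list of `double[]` stands in for the binary block RDD, and a `DataOutputStream` stands in for the HDFS write. It only shows that the nnz count can be collected as a side effect of the write pass instead of in a separate pass over the data.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; none of these names exist in SystemML.
public class NnzPiggybackSketch {

    /** Baseline: a dedicated pass over all blocks just to count non-zeros. */
    static long countNnz(List<double[]> blocks) {
        long nnz = 0;
        for (double[] block : blocks) {
            for (double v : block) {
                if (v != 0) {
                    nnz++;
                }
            }
        }
        return nnz;
    }

    /**
     * Piggybacked variant: write every block and add its non-zero count to
     * the shared counter in the same pass (the counter plays the role of
     * the accumulator mentioned in the issue).
     */
    static void writeWithNnz(List<double[]> blocks, DataOutputStream out, LongAdder nnz)
            throws IOException {
        for (double[] block : blocks) {
            long blockNnz = 0;
            for (double v : block) {
                out.writeDouble(v);
                if (v != 0) {
                    blockNnz++;
                }
            }
            nnz.add(blockNnz);
        }
    }

    public static void main(String[] args) throws IOException {
        List<double[]> blocks = List.of(
                new double[] {1.0, 0.0, 2.0},
                new double[] {0.0, 0.0, 3.5});

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        LongAdder nnz = new LongAdder();
        writeWithNnz(blocks, new DataOutputStream(sink), nnz);

        // The single-pass count matches the dedicated counting pass.
        System.out.println("nnz (piggybacked) = " + nnz.sum());
        System.out.println("nnz (separate pass) = " + countNnz(blocks));
    }
}
```

In an actual Spark job the counter would presumably be a Spark accumulator updated inside the function applied just before the save action, so that its value is only read after the write has completed.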

People

Assignee: Matthias Boehm

Reporter: Matthias Boehm

Votes: 0

Watchers: 1

Dates

Created: 16/Feb/17 05:50

Updated: 16/Feb/17 20:51

Resolved: 16/Feb/17 20:51