Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
To ensure consistency across backends, we first determine the number of non-zeros (nnz) per block and then generate the random data accordingly. However, for ultra-sparse datasets, this temporary array of per-block counts can be almost as large as the dataset itself. Since this memory consumption is unaccounted for, and the array is required even for distributed operations, there are various scenarios in which it can cause OOMs.
This task aims to solve the issue for all backends by determining the nnz per block in a streaming manner, without materializing the array.
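As a rough illustration of the streaming idea (not the actual fix), the sketch below yields the nnz of each block lazily from a seeded generator, so only O(1) state is held instead of one array entry per block. The class name `StreamingNnzSketch`, the method `nnzPerBlock`, its parameters, and the randomized-rounding rule for picking each count are all assumptions made for this example; the real implementation may choose counts differently.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;
import java.util.Random;

// Hypothetical sketch: names and the sampling rule are illustrative assumptions.
final class StreamingNnzSketch {

    /**
     * Lazily yields the number of non-zeros for each block, in row-major
     * block order. The same seed always produces the same sequence, so every
     * backend that iterates this way sees identical per-block counts.
     */
    static PrimitiveIterator.OfLong nnzPerBlock(final long rows, final long cols,
            final int blen, final double sparsity, final long seed) {
        final long rowBlocks = (rows + blen - 1) / blen; // ceil(rows / blen)
        final long colBlocks = (cols + blen - 1) / blen; // ceil(cols / blen)
        final long numBlocks = rowBlocks * colBlocks;
        final Random rng = new Random(seed);

        return new PrimitiveIterator.OfLong() {
            private long pos = 0; // index of the next block to emit

            @Override
            public boolean hasNext() {
                return pos < numBlocks;
            }

            @Override
            public long nextLong() {
                if (!hasNext())
                    throw new NoSuchElementException();
                long bi = pos / colBlocks; // block row index
                long bj = pos % colBlocks; // block column index
                pos++;
                // Boundary blocks may be smaller than blen x blen.
                long brows = Math.min(blen, rows - bi * blen);
                long bcols = Math.min(blen, cols - bj * blen);
                long cells = brows * bcols;
                // Assumed rule: expected nnz with randomized rounding, so
                // ultra-sparse blocks receive 0 or 1 non-zeros.
                double expected = cells * sparsity;
                long base = (long) Math.floor(expected);
                long nnz = base + (rng.nextDouble() < expected - base ? 1 : 0);
                return Math.min(nnz, cells);
            }
        };
    }
}
```

A consumer would pull one count at a time, for example `PrimitiveIterator.OfLong it = StreamingNnzSketch.nnzPerBlock(rows, cols, 1000, 1e-7, seed);` followed by a `while (it.hasNext())` loop that generates each block from `it.nextLong()`. Because this sketch draws from a single sequential generator, blocks must be consumed in order; a distributed backend that handles an arbitrary subset of blocks would need a scheme that can skip ahead or derive each count independently.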
Issue Links
- depends upon: SYSTEMDS-1391 Drop java 6 and 7 support (Closed)