  SystemDS / SYSTEMDS-3432

FederationUtils.bindResponses causes out of memory because of sparse matrices.

Details

    Description

      FederationUtils.bindResponses(...) causes an out-of-memory exception when we get sparse matrices from the federated workers.

      This happens because we hardcode the MatrixObject to be dense. Copying the sparse data from the workers into that dense matrix allocates a cell for every entry, including the zeros, so we can run out of memory when the matrix is large.
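
To make the size difference concrete, here is a minimal, self-contained sketch of how a format choice could be made from the dimensions and the non-zero counts reported by the workers. It is not the SystemDS implementation: the class and method names (BindFormatSketch, denseBytes, csrBytes, totalNnz, preferSparse) are hypothetical, and the byte counts are rough assumptions (8-byte values, 4-byte indices, no object overhead).

```java
// Hypothetical sketch; none of these names are SystemDS APIs.
final class BindFormatSketch {

    // Assumed dense footprint: one 8-byte double per cell, zeros included.
    static long denseBytes(long rows, long cols) {
        return Math.multiplyExact(Math.multiplyExact(rows, cols), 8L);
    }

    // Assumed CSR-style footprint: per non-zero an 8-byte value and a
    // 4-byte column index, plus one 4-byte row pointer per row (+1).
    static long csrBytes(long rows, long nnz) {
        long valuesAndIndices = Math.multiplyExact(nnz, 12L);
        long rowPointers = Math.multiplyExact(rows + 1, 4L);
        return Math.addExact(valuesAndIndices, rowPointers);
    }

    // Sum the non-zero counts of the partial results from each worker.
    static long totalNnz(long[] partNnz) {
        long sum = 0;
        for (long n : partNnz) {
            sum = Math.addExact(sum, n);
        }
        return sum;
    }

    // Pick the sparse layout whenever its estimated footprint is smaller.
    static boolean preferSparse(long rows, long cols, long nnz) {
        return csrBytes(rows, nnz) < denseBytes(rows, cols);
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        long cols = 10_000L;
        // Assumed example: four workers, each returning 2.5M non-zeros.
        long nnz = totalNnz(new long[] {2_500_000L, 2_500_000L, 2_500_000L, 2_500_000L});
        System.out.println("dense bytes:  " + denseBytes(rows, cols));
        System.out.println("sparse bytes: " + csrBytes(rows, nnz));
        System.out.println("prefer sparse: " + preferSparse(rows, cols, nnz));
    }
}
```

Under these assumptions, a matrix with very few non-zeros needs orders of magnitude less memory in a sparse layout, which is why forcing the bound result to be dense can fail even when each worker's response fits comfortably in memory.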

      I encountered this when running FTBenchT15. I've attached the log output with the trace below.

      Attachments

        1. T15.fed.parallel.failed.out
          449 kB
          Andreas Botzner

            People

              Assignee: Unassigned
              Reporter: Andreas Botzner (botand)