ConcurrentModificationException in federated execution

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: SystemDS 3.1
    • Fix Version/s: SystemDS 3.1
    • Component/s: federated

    Description

      Some federated instructions do not wait for the worker's response. As a result, one worker thread can still be iterating over the LocalVariableMap while another worker thread is already modifying the map (adding or removing entries) for the next federated request, which causes a ConcurrentModificationException.
      An example of this situation occurs when executing FederatedAlsCGTest: the method LocalVariableMap.hasReferences() iterates over the HashMap (triggered by an 'rmvar' instruction) while the thread handling the next request puts a new entry into the map.
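
      The failure mode described above can be sketched in isolation. The snippet below is not SystemDS code: the class, the method names, and the variable names ("X", "Y", "Z", "newVar") are hypothetical, and the second thread is simulated by a put() issued from inside the loop so that the outcome is deterministic. It relies only on the documented fail-fast behavior of java.util.HashMap iterators.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the SystemDS LocalVariableMap.
class VarMapIterationSketch {

    // Scans all values for a given object (similar in spirit to a reference
    // check) and, during the first step, adds a new entry the way a
    // following request might.
    static boolean scanWhileAdding(Map<String, Object> vars, Object target) {
        boolean found = false;
        boolean added = false;
        for (Map.Entry<String, Object> e : vars.entrySet()) {
            if (e.getValue() == target) {
                found = true;
            }
            if (!added) {
                // Structural modification while the iterator is still in use.
                vars.put("newVar", new Object());
                added = true;
            }
        }
        return found;
    }

    // Returns true if the scan over a plain HashMap fails fast.
    static boolean failsOnHashMap() {
        Map<String, Object> vars = new HashMap<>();
        vars.put("X", new Object());
        vars.put("Y", new Object());
        vars.put("Z", new Object());
        try {
            scanWhileAdding(vars, new Object());
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap threw CME: " + failsOnHashMap());
    }
}
```

      With two real threads the same check fires only when the modification happens to land between two iterator steps, which would be consistent with the exception appearing intermittently rather than on every run.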

          Activity

            Commit a9943772cf28e82604dac88d340cae3e1e779569 in systemds's branch refs/heads/main from ywcb00
            [ https://gitbox.apache.org/repos/asf?p=systemds.git;h=a9943772cf ]

            SYSTEMDS-3396 LocalVarMap Concurrency in Federated Execution

            Changes the local variable map back to a ConcurrentHashMap to allow
            simultaneous modification and iteration of the map

            Closes #1647
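
            As a rough illustration of why the change named in this commit message helps, the sketch below runs a reader and a writer thread against a java.util.concurrent.ConcurrentHashMap. It is an assumption-laden stand-in, not the actual patch: the class name, the method name, and the "tmp" keys are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not the SystemDS implementation.
class ConcurrentVarMapSketch {

    // Runs one thread that adds and removes entries and one thread that
    // iterates over the values. Returns the first failure, or null if none.
    static Throwable runReaderAndWriter(Map<String, Object> vars, int rounds) {
        AtomicReference<Throwable> failure = new AtomicReference<>();

        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    vars.put("tmp" + i, i);   // create a variable
                    vars.remove("tmp" + i);   // remove it again
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t);
            }
        });

        Thread reader = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    for (Object v : vars.values()) {
                        if (v == null) {
                            throw new IllegalStateException("unexpected null value");
                        }
                    }
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t);
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
        return failure.get();
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new ConcurrentHashMap<>();
        vars.put("X", 1);
        vars.put("Y", 2);
        System.out.println("failure: " + runReaderAndWriter(vars, 10_000));
    }
}
```

            ConcurrentHashMap iterators are weakly consistent: they never throw ConcurrentModificationException, but they may or may not reflect entries added or removed after the iterator was created. ConcurrentHashMap also rejects null keys and null values, which is worth keeping in mind when swapping it in for a HashMap.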

            People

              Assignee: Unassigned
              Reporter: David Weissteiner (ywcb00)