Details
-
Bug
-
Status: Resolved
-
Urgent
-
Resolution: Fixed
-
None
-
None
-
Critical
Description
Streaming service assumes there will only be one stream from S to T at a time for any nodes S and T. For the original purpose of node movement, this was a reasonable assumption (any node T can only perform one move at a time) but AES throws off streaming tasks much more frequently than that given the right conditions, which will de-sync the fragile file ordering that Streaming assumes (that T knows which files S is going to send, in what order). Eventually T is expecting file F1 but S sends a smaller file F2, leading to an infinite loop on T while it waits for F1 to finish, and T waits for S to acknowledge F2, which it never will.
For 0.6 maybe the best solution is for AES to manually wait for one of its streaming tasks to finish, before it allows itself to create another. For 0.7 it would be nice to make Streaming more robust. The whole 4-stage-ack process seems very fragile, and poking around in parent objects via inetaddress keys makes reasoning about small pieces impossible b/c of encapsulation violations.