Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Reviewed
Description
Currently, ReplicationSourceManager only cleans up the queues for recovered sources when the queue is being closed. If the region server performing the failover also dies, WAL files that have already been read can be read again, so the same edits may be replicated a second time.
For example, suppose RS1 dies with 5 files in its queue and RS2 takes over the failover. Now suppose RS2 dies after going through 3 of those files and RS3 takes over. In this case, RS3 will read those 3 files again, because they were never removed from the queue. (It will, however, read the first file starting from the position recorded in ZK.)
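The scenario above can be sketched as a toy simulation. This is not HBase code: the class `RecoveredQueueSketch`, the method `filesSeenByNextServer`, and the `wal-N` file names are all hypothetical, and an in-memory list stands in for the queue stored in ZK. It only compares what the next failover server would find in the queue if finished files are removed only at close versus as soon as each one is done, and it ignores the per-file position tracking.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a recovered replication queue; NOT the HBase implementation.
final class RecoveredQueueSketch {

    // Returns the files still queued when the failover server crashes,
    // i.e. what the next failover server would pick up.
    // All names here are hypothetical and exist only for illustration.
    static List<String> filesSeenByNextServer(int totalFiles,
                                              int finishedBeforeCrash,
                                              boolean removeEachFinishedFile) {
        if (finishedBeforeCrash < 0 || finishedBeforeCrash > totalFiles) {
            throw new IllegalArgumentException("finishedBeforeCrash out of range");
        }

        // In-memory stand-in for the queue of WAL files kept in ZK.
        List<String> queue = new ArrayList<>();
        for (int i = 1; i <= totalFiles; i++) {
            queue.add("wal-" + i);
        }

        // The failover server works through the files in order, then dies.
        for (int done = 0; done < finishedBeforeCrash; done++) {
            if (removeEachFinishedFile) {
                queue.remove(0); // drop the file as soon as it is fully read
            }
            // Otherwise nothing is removed: cleanup would only have happened
            // when the whole queue was closed, which the crash prevents.
        }
        return queue;
    }

    public static void main(String[] args) {
        // RS1 left 5 files; RS2 finished 3 of them before dying.
        System.out.println("cleanup only on close:  "
                + filesSeenByNextServer(5, 3, false));
        System.out.println("cleanup after each file: "
                + filesSeenByNextServer(5, 3, true));
    }
}
```

With cleanup deferred to close, RS3 inherits all 5 files, including the 3 that RS2 already read; with per-file cleanup it inherits only the remaining 2.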