Details
Task
Status: Open
Major
Resolution: Unresolved
2.5.4
None
None
Description
While debugging HBASE-27707 in a unit test, I see behaviour that I cannot explain. My test uses a minicluster, enables read replica replication, writes some data, concurrently kills a region server thread hosting a primary region, and then verifies that all replicas eventually show all data. Inspecting the logs, I noticed that replication source threads seem to continue working even after their associated region server is killed. By interspersing some thread dumps and sleeps, I can see that the replication threads associated with the condemned region server are not removed after it is killed. I think this behaviour will make any replication test that relies on killing a source or sink region server unreliable. It also implies to me that the minicluster leaks replication threads and cannot be reliably recycled within a single JVM process.
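For anyone who wants to reproduce the thread-dump observation without reading full dumps by hand, here is a minimal sketch of the kind of check I mean. It uses only the JDK (`Thread.getAllStackTraces()`), not any HBase API. The class name `LingeringThreads`, the helper `liveThreadsMatching`, and the thread name `replicationSource-demo` are hypothetical and only illustrate the idea; the name fragment to match against real replication threads is an assumption and would have to be taken from an actual thread dump.

```java
import java.util.List;
import java.util.stream.Collectors;

public class LingeringThreads {

    /**
     * Returns the sorted names of all live threads in this JVM whose name
     * contains the given fragment. The fragment is supplied by the caller;
     * nothing here assumes a particular HBase thread naming scheme.
     */
    public static List<String> liveThreadsMatching(String nameFragment) {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(Thread::isAlive)
            .map(Thread::getName)
            .filter(name -> name.contains(nameFragment))
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a replication thread; the name is hypothetical.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread exit.
            }
        }, "replicationSource-demo");
        worker.setDaemon(true);
        worker.start();

        // While the worker is running it shows up in the listing.
        System.out.println(liveThreadsMatching("replicationSource"));

        // After it has been stopped and joined it should no longer be listed.
        worker.interrupt();
        worker.join();
        System.out.println(liveThreadsMatching("replicationSource"));
    }
}
```

In a test, the same helper could be called after killing the region server and waiting for a grace period; a non-empty result at that point would correspond to the lingering threads described above.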