Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
During a cache close, persistent regions may not clean up as much as they should. When the PersistenceAdvisor is closed, a CancelException is not handled, which causes other parts of the close to be skipped. I think the place to handle it is DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564). Here is an exception showing what it looks like when this happens:
org.apache.geode.distributed.DistributedSystemDisconnectedException: Distribution manager on rs-RunItNow-ZH1504a1i3xlarge-hydra-client-10(dataStoregemfire2_host1_421:421)<ec><v22>:41004 started at Wed Mar 23 17:11:48 PDT 2022: Member isn't responding to heartbeat requests, caused by org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
    at org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:2893)
    at org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:1177)
    at org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
    at org.apache.geode.distributed.internal.ClusterElderManager.getElderId(ClusterElderManager.java:76)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.getElderId(ClusterDistributionManager.java:2085)
    at org.apache.geode.distributed.internal.locks.DLockService.getElderId(DLockService.java:254)
    at org.apache.geode.distributed.internal.locks.DLockService.notLockGrantorId(DLockService.java:824)
    at org.apache.geode.distributed.internal.locks.DLockService.unlock(DLockService.java:1807)
    at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.releaseTieLock(PersistenceAdvisorImpl.java:1181)
    at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.close(PersistenceAdvisorImpl.java:1158)
    at org.apache.geode.internal.cache.DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564)
    at org.apache.geode.internal.cache.DistributedRegion.postDestroyRegion(DistributedRegion.java:2657)
    at org.apache.geode.internal.cache.LocalRegion.recursiveDestroyRegion(LocalRegion.java:2732)
    at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6241)
    at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1834)
    at org.apache.geode.internal.cache.LocalRegion.handleCacheClose(LocalRegion.java:7320)
    at org.apache.geode.internal.cache.DistributedRegion.handleCacheClose(DistributedRegion.java:2691)
    at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2308)
    at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2154)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1538)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2545)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
    at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1793)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2319)
    ... 3 more
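For illustration only, here is a rough sketch of the kind of handling suggested above: catch the cancel exception around the advisor close so that the remaining cleanup still runs. This is not the actual Geode change. The class `RegionCleanupSketch`, the `Advisor` interface, the `cleanup` method, and the step names are all hypothetical, and the nested `CancelException` is a local stand-in for `org.apache.geode.CancelException` so that the example compiles without Geode on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionCleanupSketch {

  // Local stand-in for org.apache.geode.CancelException (hypothetical copy,
  // used only so this sketch compiles without Geode).
  static class CancelException extends RuntimeException {
    CancelException(String message) {
      super(message);
    }
  }

  // Hypothetical stand-in for the persistence advisor that is closed
  // during region cleanup.
  interface Advisor {
    void close();
  }

  // Runs the cleanup steps in order and returns the ones that were reached.
  // A CancelException from the advisor close is recorded and swallowed so
  // that the later steps are not skipped.
  static List<String> cleanup(Advisor persistenceAdvisor) {
    List<String> reached = new ArrayList<>();
    try {
      persistenceAdvisor.close();
      reached.add("advisor closed");
    } catch (CancelException e) {
      // The system is already shutting down, so failing to release the
      // lock is expected here; carry on with the rest of the cleanup.
      reached.add("advisor close cancelled");
    }
    // Hypothetical placeholder for the cleanup that was previously skipped.
    reached.add("remaining cleanup");
    return reached;
  }

  public static void main(String[] args) {
    // Simulate the failure from the stack trace: close() throws because
    // the distribution manager is already shutting down.
    List<String> reached = cleanup(() -> {
      throw new CancelException("Member isn't responding to heartbeat requests");
    });
    System.out.println(reached);
  }
}
```

Without the try/catch, the exception thrown from the advisor close would propagate out of `cleanup` and the "remaining cleanup" step would never run, which is the behavior described in this issue.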