Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
3.0.0-alpha-1, 2.4.6
-
None
-
Reviewed
Description
DBB got from BucketCache would be freed unexpectedly before RPC completed for following two cases:
- Case 1: Replacing Block and getting Block execute concurrently.
Because of splitting or stream read,two or more read threads may try to cache the same Block concurrently and one thread read all of the next block header could replace the another one didn't(seeHBASE-20447). When replacing Block,the following BucketCache.WriterThread.putIntoBackingMap method invokes BucketCache.blockEvicted in line 916,which free the BucketEntry directly in line 544, not through the RefCount.912 private void putIntoBackingMap(BlockCacheKey key, BucketEntry bucketEntry) { 913 BucketEntry previousEntry = backingMap.put(key, bucketEntry); 914 if (previousEntry != null && previousEntry != bucketEntry) { 915 previousEntry.withWriteLock(offsetLock, () -> { 916 blockEvicted(key, previousEntry, false); 917 return null; 918 }); 919 } 920 } 543 void blockEvicted(BlockCacheKey cacheKey, BucketEntry bucketEntry, boolean decrementBlockNumber) { 544 bucketAllocator.freeBlock(bucketEntry.offset()); 545 realCacheSize.add(-1 * bucketEntry.getLength()); 546 blocksByHFile.remove(cacheKey); 547 if (decrementBlockNumber) { 548 this.blockNumber.decrement(); 549 } 550 }
Freeing the BucketEntry directly may cause HFileBlock returned from BucketCache.getBlock be freed unexpectedly before RPC completed.
HBASE-26155fixed this problem to some extent, but seems not completely. The BucketCache.getBlock might be invoked just before BucketCache.cacheBlockWithWaitInternal and after the BucketEntry.isRpcRef checking is done. Considering the following sequence:
1. Block1 was cached successfully,the RefCnt of Block1 is 1.
2. Thread1 caching the same BlockCacheKey with Block2 satisfied BlockCacheUtil.shouldReplaceExistingCacheBlock, so Block2 would replace Block1, but thread1 stopping before BucketCache.cacheBlockWithWaitInternal
3. Thread2 invoking BucketCache.getBlock with the same BlockCacheKey, which returned Block1, the RefCnt of Block1 is 2.
4. Thread1 continues caching Block2, in BucketCache.WriterThread.putIntoBackingMap , the old Block1 is freed directly which RefCnt is 2, but the Block1 is still used by Thread2 and the content of Block1 would be overwitten after it is freed, which may cause a serious error.
- Case 2: Evicting Block , caching Block and getting Block execute concurrently.
Considering the following sequence:
1. Thread1 caching Block1, stopping after WriterThread.putIntoBackingMap, the RefCnt of Block1 is 1.
2. Thread2 invoking BucketCache.evictBlock with the same BlockCacheKey , but stopping after BucketCache.removeFromRamCache.
3. Thread3 invoking BucketCache.getBlock with the same BlockCacheKey, which returned Block1, the RefCnt of Block1 is 2.
4. Thread1 continues caching block1,but finding that BucketCache.RAMCache.remove returning false, so invoking BucketCache.blockEvicted to free the the Block1 directly
which RefCnt is 2 and the Block1 is still used by Thread3.
Case 2 is caused by following line 1014 in WriteThread.doDrain, which also free the BucketEntry directly,just like the Case1.1004 boolean existed = ramCache.remove(key, re -> { 1005 if (re != null) { 1006 heapSize.add(-1 * re.getData().heapSize()); 1007 } 1008 }); 1009 if (!existed && bucketEntries[i] != null) { 1010 // Block should have already been evicted. Remove it and free space. 1011 final BucketEntry bucketEntry = bucketEntries[i]; 1012 bucketEntry.withWriteLock(offsetLock, () -> { 1013 if (backingMap.remove(key, bucketEntry)) { 1014 blockEvicted(key, bucketEntry, false); 1015 } 1016 return null; 1017 }); 1018 } 1019 }
My Fix in the PR is as following:
- We never free the BucketEntry directly, freeing the BucketEntry must through RefCnt.
- RefCnt#recycler just freeing the corresponding BucketEntry , not including removing the Block from BucketCache.backingMap or BucketCache.ramCache.
Removing the Block from BucketCache.backingMap or BucketCache.ramCache is immediately done in BucketCache.evictBlock, and in fact this behavior is the same as
BucketCache.evictBlock in branch-1, except that freeing BucketEntry is delayed until RefCnt becoming 0.
In current branch-2 implementation, removing the Block from BucketCache.backingMap or BucketCache.ramCache is done in RefCnt#recycler, which is something
unreasonable: we decreasing the RefCnt because we removing it from BucketCache.backingMap, not in reverse that because the RefCnt decreased to zero we
removing it from BucketCache.backingMap, which may causes the newly added Block with the same BlockCacheKey be incorrectly evicted from BucketCache.backingMap
or BucketCache.ramCache.
Attachments
Issue Links
- links to