Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Labels: None
Description
When there are two replicas online with the same replica index, e.g. due to decommission, over-replication, or maintenance, and the container is under-replicated due to another missing index, an IllegalStateException can be thrown when collecting the source indexes:
2022-08-02 10:54:26,939 WARN org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler: Exception while processing for creating the EC reconstruction container commands for #2.
java.lang.IllegalStateException: Duplicate key 3 (attempted merging values ContainerReplica{containerID=#2, state=CLOSED, datanodeDetails=fb63a3c8-2e5b-432e-be63-274c41aab79f{ip: 172.27.124.131, host: quasar-onjdpu-5.quasar-onjdpu.root.hwx.site, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, placeOfBirth=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad, sequenceId=0, keyCount=1, bytesUsed=34603008,replicaIndex= 3} and ContainerReplica{containerID=#2, state=CLOSED, datanodeDetails=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad{ip: 172.27.193.4, host: quasar-onjdpu-3.quasar-onjdpu.root.hwx.site, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default, certSerialId: null, persistedOpState: DECOMMISSIONING, persistedOpStateExpiryEpochSec: 0}, placeOfBirth=2ff0ed60-a461-41a4-8fff-9da6cb4e52ad, sequenceId=0, keyCount=1, bytesUsed=34603008,replicaIndex= 3})
    at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133)
    at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180)
    at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1603)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at org.apache.hadoop.hdds.scm.container.replication.ECUnderReplicationHandler.processAndCreateCommands(ECUnderReplicationHandler.java:151)
    at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processUnderReplicatedContainer(ReplicationManager.java:366)
    at org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processContainer(UnderReplicatedProcessor.java:92)
    at org.apache.hadoop.hdds.scm.container.replication.UnderReplicatedProcessor.processAll(UnderReplicatedProcessor.java:76)
    at org.apache.hadoop.hdds.scm.ha.BackgroundSCMService.run(BackgroundSCMService.java:101)
    at java.base/java.lang.Thread.run(Thread.java:834)
This exception then goes unhandled and causes the under-replication processing thread to exit.