Details
-
Bug
-
Status: Patch Available
-
Critical
-
Resolution: Unresolved
-
3.6.3, 3.7.0
Description
leader is zookeeper-0, it add zk-1and zk-2. when adding zk-3, zk-2 become leader, but it minCommittedLog=0. After adding zk-3, zk-3 just receive the DIFF message from zk-2, so resulting in inconsistent data.
Is there any error?
else if ((maxCommittedLog >= peerLastZxid) && (minCommittedLog <= peerLastZxid)) { // Follower is within commitLog range LOG.info("Using committedLog for peer sid: {}", getSid()); Iterator<Proposal> itr = db.getCommittedLog().iterator(); currentZxid = queueCommittedProposals(itr, peerLastZxid, null, maxCommittedLog); needSnap = false; }
there are logs:
###ZK-2
021-08-26 02:02:07,698 [myid:3] - INFO [NIOWorkerThread-1:NIOServerCnxn@507] - Processing ruok command from /127.0.0.1:56458
2021-08-26 02:02:08,046 [myid:3] - INFO [ListenerHandler-zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:3888:QuorumCnxManager$Listener$ListenerHandler@1070] - Received connection request from /10.233.64.36:50788
2021-08-26 02:02:08,072 [myid:3] - INFO [LearnerHandler-/10.233.64.36:60310:LearnerHandler@504] - Follower sid: 4 : info : zookeeper-3.zookeeper-headless.default.svc.cluster.local:2888:3888:observer;0.0.0.0:2181
2021-08-26 02:02:08,073 [myid:3] - INFO [LearnerHandler-/10.233.64.36:60310:ZKDatabase@345] - On disk txn sync enabled with snapshotSizeFactor 0.33
2021-08-26 02:02:08,074 [myid:3] - INFO [LearnerHandler-/10.233.64.36:60310:LearnerHandler@807] - Synchronizing with Learner sid: 4 maxCommittedLog=0x200000003 minCommittedLog=0x0 lastProcessedZxid=0x200000003 peerLastZxid=0x0
2021-08-26 02:02:08,074 [myid:3] - INFO [LearnerHandler-/10.233.64.36:60310:LearnerHandler@871] - Using committedLog for peer sid: 4
2021-08-26 02:02:08,074 [myid:3] - INFO [LearnerHandler-/10.233.64.36:60310:LearnerHandler@979] - Sending DIFF zxid=0x200000003 for peer sid: 4
2021-08-26 02:02:09,250 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@471] - WorkerReceiver is down
2021-08-26 02:02:09,250 [myid:3] - INFO [WorkerSender[myid=3]:FastLeaderElection$Messenger$WorkerSender@505] - WorkerSender is down
2021-08-26 02:02:12,898 [myid:3] - INFO [NIOWorkerThread-2:NIOServerCnxn@507] - Processing ruok command from /127.0.0.1:56516
###ZK-3
2021-08-26 02:02:08,064 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@1512] - The large request threshold is set to -1
2021-08-26 02:02:08,064 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ZooKeeperServer@339] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 clientPortListenBacklog -1 datadir /data/version-2 snapdir /data/version-2
2021-08-26 02:02:08,064 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):ObserverZooKeeperServer@55] - syncEnabled =true
2021-08-26 02:02:08,066 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer state changed: observing - discovery
2021-08-26 02:02:08,067 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Observer@163] - Observing new leader sid=3 addr=zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888
2021-08-26 02:02:08,068 [myid:4] - INFO [LeaderConnector-zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888:Learner$LeaderConnector@370] - Successfully connected to leader, using address: zookeeper-2.zookeeper-headless.default.svc.cluster.local/10.233.64.34:2888
2021-08-26 02:02:08,072 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer state changed: observing - synchronization
2021-08-26 02:02:08,073 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@551] - Getting a diff from the leader 0x200000003
2021-08-26 02:02:08,073 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@869] - Peer state changed: observing - synchronization - diff
2021-08-26 02:02:08,077 [myid:4] - WARN [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@617] - Got zxid 0x100000023 expected 0x1
2021-08-26 02:02:08,078 [myid:4] - ERROR [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@1836] - setLastSeenQuorumVerifier called with stale config 4294967332. Current version: 8589934595
2021-08-26 02:02:08,080 [myid:4] - WARN [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@617] - Got zxid 0x200000001 expected 0x10000002e
2021-08-26 02:02:08,082 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@717] - Learner received NEWLEADER message
2021-08-26 02:02:08,083 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@869] - Peer state changed: observing - synchronization
2021-08-26 02:02:08,086 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):CommitProcessor@476] - Configuring CommitProcessor with readBatchSize -1 commitBatchSize 1
2021-08-26 02:02:08,086 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):CommitProcessor@438] - Configuring CommitProcessor with 1 worker threads.
2021-08-26 02:02:08,088 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):RequestThrottler@74] - zookeeper.request_throttler.shutdownTimeout = 10000
2021-08-26 02:02:08,098 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):Learner@701] - Learner received UPTODATE message
2021-08-26 02:02:08,099 [myid:4] - INFO [SyncThread:4:FileTxnLog@284] - Creating new log file: log.100000023
2021-08-26 02:02:08,099 [myid:4] - INFO [CommitProcessor:4:LearnerSessionTracker@116] - Committing global session 0x100001170e3000d
2021-08-26 02:02:08,099 [myid:4] - INFO [QuorumPeer[myid=4](plain=0.0.0.0:2181)(secure=disabled):QuorumPeer@864] - Peer state changed: observing - broadcast
2021-08-26 02:02:08,100 [myid:4] - WARN [CommitProcessor:4:RateLogger@86] - Message:Digests are not matching. Value is Zxid. Value:4294967331
2021-08-26 02:02:08,101 [myid:4] - ERROR [CommitProcessor:4:DataTree@1832] - First digest mismatch on txn: 72057668946296845,0,4294967331,1629943304746,-10
, 4000
, expected digest is 2,10124071835
, actual digest is 1371985504,
2021-08-26 02:02:08,102 [myid:4] - INFO [CommitProcessor:4:LearnerSessionTracker@116] - Committing global session 0x200000b302e0000
Attachments
Attachments
Issue Links
- links to