Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
Description
I wasn't sure if check version (opcode 13) was permitted outside of a multi op, so I tried it. My server crashed with a NullPointerException and became unusable until restarted. I guess it's not allowed, but perhaps the server should handle this more gracefully?
Here are the server logs:
Accepted socket connection from /0:0:0:0:0:0:0:1:51737 Session establishment request from client /0:0:0:0:0:0:0:1:51737 client's lastZxid is 0x0 Connection request from old client /0:0:0:0:0:0:0:1:51737; will be dropped if server is in r-o mode Client attempting to establish new session at /0:0:0:0:0:0:0:1:51737 :Fsessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Processing request:: sessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Got zxid 0x60000065e expected 0x1 Creating new log file: log.60000065e Committing request:: sessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0x60000065e txntype:-10 reqpath:n/a Processing request:: sessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0x60000065e txntype:-10 reqpath:n/a :Esessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0x60000065e txntype:-10 reqpath:n/a sessionid:0x10025651faa0000 type:createSession cxid:0x0 zxid:0x60000065e txntype:-10 reqpath:n/a Add a buffer to outgoingBuffers, sk sun.nio.ch.SelectionKeyImpl@28e9f397 is valid: true Established session 0x10025651faa0000 with negotiated timeout 20000 for client /0:0:0:0:0:0:0:1:51737 :Fsessionid:0x10025651faa0000 type:check cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:/ Processing request:: sessionid:0x10025651faa0000 type:check cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:/ Processing request:: sessionid:0x10025651faa0000 type:check cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:/ Exception causing close of session 0x10025651faa0000: Connection reset by peer :Esessionid:0x10025651faa0000 type:check cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:/ IOException stack trace java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:320) at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:530) at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:162) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Unexpected exception java.lang.NullPointerException at org.apache.zookeeper.server.ZKDatabase.addCommittedProposal(ZKDatabase.java:252) at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:127) at org.apache.zookeeper.server.quorum.CommitProcessor$CommitWorkRequest.doWork(CommitProcessor.java:362) at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:162) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Committing request:: sessionid:0x10025651faa0000 type:error cxid:0x1 zxid:0x60000065f txntype:-1 reqpath:n/a Unregister MBean [org.apache.ZooKeeperService:name0=ReplicatedServer_id1,name1=replica.1,name2=Follower,name3=Connections,name4="0:0:0:0:0:0:0:1",name5=0x10025651faa0000] Exception thrown by downstream processor, unable to continue. CommitProcessor exited loop! Closed socket connection for client /0:0:0:0:0:0:0:1:51737 which had sessionid 0x10025651faa0000
And here's a one-liner to repro, which does a ConnectRequest followed by a CheckVersion(path="/", version=89235}:
echo AAAALAAAAAAAAAAAAAAAAAAAJxAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAEQAAAAEAAAANAAAAAS8AAVyT | base64 --decode | nc localhost 2181 >/dev/null
This is against master as of a couple of weeks ago (f78061a). I haven't checked to see which versions are affected.
Attachments
Issue Links
- is duplicated by
-
ZOOKEEPER-4680 OpCode.check is treated as a quorum write operation white it is not
- Resolved
- links to
(1 links to)