Details
Description
I am seeing following exception in Knox gateway logs:
2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection)_ java.lang.NullPointerException at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260) at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87) at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78) at org.eclipse.jetty.websocket.jsr356.messages.TextWholeMessage.messageComplete(TextWholeMessage.java:56) at org.eclipse.jetty.websocket.jsr356.endpoints.JsrEndpointEventDriver.onTextFrame(JsrEndpointEventDriver.java:218) at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:162) at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:476) at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:220) at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:219) at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:244) at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onConnectionStateChange(AbstractWebSocketConnection.java:337) at org.eclipse.jetty.websocket.common.io.IOState.notifyStateListeners(IOState.java:184) at org.eclipse.jetty.websocket.common.io.IOState.onOpened(IOState.java:447) at org.eclipse.jetty.websocket.common.WebSocketSession.open(WebSocketSession.java:619) at org.eclipse.jetty.websocket.common.WebSocketSession.onOpened(WebSocketSession.java:544) at org.eclipse.jetty.io.AbstractConnection.onOpened(AbstractConnection.java:209) at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:202) at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onOpen(AbstractWebSocketConnection.java:446) at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:440) at org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.upgrade(WebSocketUpgradeRequest.java:624) at org.eclipse.jetty.client.http.HttpChannelOverHTTP.exchangeTerminating(HttpChannelOverHTTP.java:118) at org.eclipse.jetty.client.HttpReceiver.terminateResponse(HttpReceiver.java:462) at org.eclipse.jetty.client.HttpReceiver.responseSuccess(HttpReceiver.java:416)
I added more debug logs in Knox to analyze the issue and observed following was happening:
- Knox generates proper URL for backend websocket connection
2019-08-23 12:43:46,068 DEBUG gateway.websockets (GatewayWebsocketHandler.java:createWebSocket(148)) - Message: Rajat .. backednUrl = ws://rafd001-mst-01.cloud.in.guavus.com:11011/_sock/411/anq2kfzp/websocket
- Websocket upgrade successful at HTTP protocol layer. Knox starts setting up its session management internal data structures for this connection
2019-08-23 12:43:46,313 DEBUG client.HttpReceiver (HttpReceiver.java:responseSuccess(403)) - Response success HttpResponse[HTTP/1.1 101 Switching Protocols]@708f1c60 _2019-08-23 12:43:46,324 DEBUG common.WebSocketSession (WebSocketSession.java:doStart(248)) - starting - WebSocketSession[websocket=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],behavior=CLIENT,connection=WebSocketClientConnection@1528d316::SocketChannelEndPoint@10d45033 {[rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011|http://rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011]<->/[192.168.133.69:50702|http://192.168.133.69:50702/],OPEN,fill=-,flush=-,to=23/300000} {io=0/0,kio=0,kro=1} > r:rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011,closed=false)=>HttpChannelOverHTTP@26de2594(exchange=HttpExchange@7e8b6268 req=TERMINATED/null@null res=TERMINATED/null@null)[send=HttpSenderOverHTTP@73b692d4(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@4e3bb764\{s=START}],recv=HttpReceiverOverHTTP@7d5a172b(rsp=IDLE,failure=null)[HttpParser\{s=START,0 of1}]],remote=null,incoming=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],outgoing=ExtensionStack[queueSize=0,extensions=[],incoming=org.eclipse.jetty.websocket.jsr356.JsrSession,outgoing=org.eclipse.jetty.websocket.client.io.WebSocketClientConnection]]
- A data frame arrives on the connection from CDAP backend and Knox starts processing it in a separate thread but internal data structures have not yet been established fully:
2019-08-23 12:43:46,346 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onMessageText(257)) - Message: Rajat .. In onMessageText .. remote is nulll 2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection) java.lang.NullPointerException at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260) at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87) at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78)
- As a result of this exception, Knox starts closing this connection as well as connection with front end.
2019-08-23 12:43:46,347 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onError(246)) - Message: Rajat .. inside onError 2019-08-23 12:43:46,348 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(183)) - Message: Rajat .. inside cleanupOnError 2019-08-23 12:43:46,348 ERROR gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(185)) - Error: java.lang.NullPointerException 2019-08-23 12:43:46,349 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:closeQuietly(294)) - Message: Rajat .. in closeQuietly
- Internal data structures are now setup in the other thread
2019-08-23 12:43:46,356 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(113)) - Message: Rajat ... frontEndSession = WebSocketSession[websocket=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],behavior=SERVER,connection=WebSocketServerConnection@151585ba::DecryptedEndPoint@4463419a {/[192.168.110.254:61755|http://192.168.110.254:61755/]<>/[192.168.133.69:8443|http://192.168.133.69:8443/],OPEN,fill=,flush=,to=423/300000} ,remote=WebSocketRemoteEndpoint@66632a26[batching=true],incoming=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],outgoing=ExtensionStack[queueSize=0,extensions=[permessage-deflate],incoming=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension,outgoing=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension]] 2019-08-23 12:43:46,358 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(119)) - Message: Rajat ...front remote = WebSocketRemoteEndpoint@66632a26[batching=true]
- Fronted UI sees connection closed and send a new connection request.
Looks like there is a race condition/improper synchronization between two Knox threads: one thread opening a websocket connection, setting up the connection session and the other thread processing a message/data packet sent by UI service backend.