Uploaded image for project: 'Apache Knox'
  1. Apache Knox
  2. KNOX-1997

For websocket connections, seeing java.lang.NullPointerException when a data frame arrives on websocket while connection is still being setup

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.0.0, 1.1.0, 1.2.0, 1.3.0
    • 1.4.0
    • Server
    • None
    • HDP 3.1.0

    Description

      I am seeing following exception in Knox gateway logs:

      2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection)_
       java.lang.NullPointerException
       at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260)
       at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87)
       at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78)
       at org.eclipse.jetty.websocket.jsr356.messages.TextWholeMessage.messageComplete(TextWholeMessage.java:56)
       at org.eclipse.jetty.websocket.jsr356.endpoints.JsrEndpointEventDriver.onTextFrame(JsrEndpointEventDriver.java:218)
       at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:162)
       at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:476)
       at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:220)
       at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:219)
       at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:244)
       at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onConnectionStateChange(AbstractWebSocketConnection.java:337)
       at org.eclipse.jetty.websocket.common.io.IOState.notifyStateListeners(IOState.java:184)
       at org.eclipse.jetty.websocket.common.io.IOState.onOpened(IOState.java:447)
       at org.eclipse.jetty.websocket.common.WebSocketSession.open(WebSocketSession.java:619)
       at org.eclipse.jetty.websocket.common.WebSocketSession.onOpened(WebSocketSession.java:544)
       at org.eclipse.jetty.io.AbstractConnection.onOpened(AbstractConnection.java:209)
       at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:202)
       at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onOpen(AbstractWebSocketConnection.java:446)
       at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:440)
       at org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.upgrade(WebSocketUpgradeRequest.java:624)
       at org.eclipse.jetty.client.http.HttpChannelOverHTTP.exchangeTerminating(HttpChannelOverHTTP.java:118)
       at org.eclipse.jetty.client.HttpReceiver.terminateResponse(HttpReceiver.java:462)
       at org.eclipse.jetty.client.HttpReceiver.responseSuccess(HttpReceiver.java:416)  

      I added more debug logs in Knox to analyze the issue and observed following was happening:

      • Knox generates proper URL for backend websocket connection
      2019-08-23 12:43:46,068 DEBUG gateway.websockets (GatewayWebsocketHandler.java:createWebSocket(148)) - Message: Rajat .. backednUrl = ws://rafd001-mst-01.cloud.in.guavus.com:11011/_sock/411/anq2kfzp/websocket
      • Websocket upgrade successful at HTTP protocol layer. Knox starts setting up its session management internal data structures for this connection 
      2019-08-23 12:43:46,313 DEBUG client.HttpReceiver (HttpReceiver.java:responseSuccess(403)) - Response success HttpResponse[HTTP/1.1 101 Switching Protocols]@708f1c60
       _2019-08-23 12:43:46,324 DEBUG common.WebSocketSession (WebSocketSession.java:doStart(248)) - starting - WebSocketSession[websocket=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],behavior=CLIENT,connection=WebSocketClientConnection@1528d316::SocketChannelEndPoint@10d45033
      {[rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011|http://rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011]<->/[192.168.133.69:50702|http://192.168.133.69:50702/],OPEN,fill=-,flush=-,to=23/300000}
      {io=0/0,kio=0,kro=1} > r:rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011,closed=false)=>HttpChannelOverHTTP@26de2594(exchange=HttpExchange@7e8b6268 req=TERMINATED/null@null res=TERMINATED/null@null)[send=HttpSenderOverHTTP@73b692d4(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@4e3bb764\{s=START}],recv=HttpReceiverOverHTTP@7d5a172b(rsp=IDLE,failure=null)[HttpParser\{s=START,0 of1}]],remote=null,incoming=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],outgoing=ExtensionStack[queueSize=0,extensions=[],incoming=org.eclipse.jetty.websocket.jsr356.JsrSession,outgoing=org.eclipse.jetty.websocket.client.io.WebSocketClientConnection]]
      • A data frame arrives on the connection from CDAP backend and Knox starts processing it in a separate thread but internal data structures have not yet been established fully:
      2019-08-23 12:43:46,346 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onMessageText(257)) - Message: Rajat .. In onMessageText .. remote is nulll
       2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection)
       java.lang.NullPointerException
       at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260)
       at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87)
       at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78)
      • As a result of this exception, Knox starts closing this connection as well as connection with front end.
      2019-08-23 12:43:46,347 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onError(246)) - Message: Rajat .. inside onError
       2019-08-23 12:43:46,348 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(183)) - Message: Rajat .. inside cleanupOnError
       2019-08-23 12:43:46,348 ERROR gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(185)) - Error: java.lang.NullPointerException
       2019-08-23 12:43:46,349 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:closeQuietly(294)) - Message: Rajat .. in closeQuietly
      • Internal data structures are now setup in the other thread
      2019-08-23 12:43:46,356 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(113)) - Message: Rajat ... frontEndSession = WebSocketSession[websocket=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],behavior=SERVER,connection=WebSocketServerConnection@151585ba::DecryptedEndPoint@4463419a {/[192.168.110.254:61755|http://192.168.110.254:61755/]<>/[192.168.133.69:8443|http://192.168.133.69:8443/],OPEN,fill=,flush=,to=423/300000}
      ,remote=WebSocketRemoteEndpoint@66632a26[batching=true],incoming=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],outgoing=ExtensionStack[queueSize=0,extensions=[permessage-deflate],incoming=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension,outgoing=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension]]
       2019-08-23 12:43:46,358 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(119)) - Message: Rajat ...front remote = WebSocketRemoteEndpoint@66632a26[batching=true]
      • Fronted UI sees connection closed and send a new connection request.

      Looks like there is a race condition/improper synchronization between two Knox threads: one thread opening a websocket connection, setting up the connection session and the other thread processing a message/data packet sent by UI service backend.

      Attachments

        1. KNOX-1997.patch
          24 kB
          Rajat Goel
        2. gateway.log_26Aug.gz
          23 kB
          Rajat Goel

        Issue Links

          Activity

            People

              rajatg Rajat Goel
              rajatg Rajat Goel
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 11h 10m
                  11h 10m