HBASE-19287

Master hangs forever if RecoverMetaProcedure fails to send the assign-meta request to the target server

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0-beta-1, 2.0.0
    • Component/s: proc-v2
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      2017-11-10 19:26:56,019 INFO [ProcExecWrkr-1] procedure.RecoverMetaProcedure: pid=138, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=hadoop-slave1.hadoop,16020,1510341981454
      2017-11-10 19:26:56,029 INFO [ProcExecWrkr-1] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}]
      2017-11-10 19:26:56,067 INFO [ProcExecWrkr-2] procedure.MasterProcedureScheduler: pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta hbase:meta,,1.1588230740
      2017-11-10 19:26:56,071 INFO [ProcExecWrkr-2] assignment.AssignProcedure: Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, retain=false
      2017-11-10 19:26:56,224 INFO [ProcExecWrkr-4] zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as hadoop-slave2.hadoop,16020,1510341988652
      2017-11-10 19:26:56,230 INFO [ProcExecWrkr-4] assignment.RegionTransitionProcedure: Dispatch pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652
      2017-11-10 19:26:56,382 INFO [ProcedureDispatcherTimeoutThread] procedure.RSProcedureDispatcher: Using procedure batch rpc execution for serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152
      2017-11-10 19:26:57,542 INFO [main-EventThread] zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [hadoop-slave2.hadoop,16020,1510341988652]
      2017-11-10 19:26:57,543 INFO [main-EventThread] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
      2017-11-10 19:26:58,875 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave1.hadoop,16020,1510342016106
      2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave2.hadoop,16020,1510342023184
      2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Triggering server recovery; existingServer hadoop-slave2.hadoop,16020,1510341988652 looks stale, new server:hadoop-slave2.hadoop,16020,1510342023184
      2017-11-10 19:27:05,832 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
      2017-11-10 19:27:49,815 INFO [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] client.RpcRetryingCallerImpl: started=38594 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-slave2.hadoop,16020,1510342023184
      at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3290)
      at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1370)
      at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2401)
      at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41544)
      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
      at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:258)
      row 'hbase:namespace' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-slave2.hadoop,16020,1510341988652, seqNum=0
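
      Reading the log above: the hbase:meta location written to ZooKeeper is hadoop-slave2.hadoop,16020,1510341988652, that server's ephemeral node is deleted about a second later, and the instance that re-registers on the same host and port carries a new start code (1510342023184). The later read of hbase:meta is still addressed to the old server name and fails with NotServingRegionException. The snippet below is only a reader's sketch of that comparison, assuming the usual host,port,startcode layout of a server name; the Python class and helper names are hypothetical and are not HBase APIs.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    """Illustrative stand-in for a 'host,port,startcode' server name string."""
    host: str
    port: int
    startcode: int  # epoch milliseconds at which the region server started


def parse_server_name(text: str) -> ServerName:
    # Split "hadoop-slave2.hadoop,16020,1510341988652" into its three fields.
    host, port, startcode = text.strip().split(",")
    return ServerName(host, int(port), int(startcode))


def is_stale(recorded: ServerName, live: ServerName) -> bool:
    # Same host and port, but the live instance started later than the
    # recorded one: the recorded name refers to a dead server instance.
    same_endpoint = (recorded.host, recorded.port) == (live.host, live.port)
    return same_endpoint and recorded.startcode < live.startcode


# Values copied from the log lines above.
meta_location = parse_server_name("hadoop-slave2.hadoop,16020,1510341988652")
registered = parse_server_name("hadoop-slave2.hadoop,16020,1510342023184")

print(is_stale(meta_location, registered))
print(registered.startcode - meta_location.startcode, "ms between the two instances")
```

      If the check reports a stale name, the recorded meta location can never serve the region, which is consistent with the retries in the last log entry never succeeding.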

      Attachments

        1. master.patch (4 kB, Yi Liang)
        2. hbase-19287-master-v2.patch (11 kB, Yi Liang)
        3. HBASE-19287-master-v3.patch (12 kB, Yi Liang)
        4. HBASE-19287-master-v3.patch (12 kB, Yi Liang)
        5. HBASE-19287-master-v4.patch (12 kB, Yi Liang)

            People

              Assignee: Yi Liang (easyliangjob)
              Reporter: Yi Liang (easyliangjob)
              Votes: 0
              Watchers: 6
