Details

    • Sub-task
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • 2.0.0-beta-2, 2.0.0
    • None
    • None

    Description

      It hangs in TEST_UTIL.shutdownMiniCluster() for me locally.

      Will upload the test output and jstack result for further digging.

      Attachments

        1. 0001-HBASE-19791-do-nothing.patch
          1 kB
          Michael Stack
        2. jstack
          192 kB
          Duo Zhang
        3. output
          237 kB
          Duo Zhang

        Issue Links

          Activity

            stack Michael Stack added a comment - - edited

            Ugh. It passes for me on mac on master and branch-2 but I see it excluded in the branch-2 nightly run: https://builds.apache.org/job/HBase%20Nightly/job/branch-2/183/artifact/output-jdk8-hadoop2/patch-unit-root.txt

            I tried it on a clean gce and it passes there too...

            [INFO] -------------------------------------------------------
            [INFO] T E S T S
            [INFO] -------------------------------------------------------
            [INFO] Running org.apache.hadoop.hbase.client.TestZKAsyncRegistry
            [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 312.345 s - in org.apache.hadoop.hbase.client.TestZKAsyncRegistry
            [INFO]
            [INFO] Results:
            [INFO]
            [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
            [INFO]
            [INFO]
            [INFO] --- maven-surefire-plugin:2.20.1:test (secondPartTestsExecution) @ hbase-server ---
            [INFO] Tests are skipped.
            [INFO] ------------------------------------------------------------------------

            It takes a while but passes.

            stack Michael Stack added a comment -

            The time this takes on gce varies wildly, from 21 seconds to 05:21 min, with everything in between. Let me look into this. Thanks for filing, Apache9.

            stack Michael Stack added a comment -

            Do nothing patch ... just to see how this does up on jenkins.

            hadoopqa Hadoop QA added a comment -
            +1 overall



            Vote | Subsystem | Runtime | Comment
            0 | reexec | 0m 8s | Docker mode activated.
                  Prechecks
            0 | findbugs | 0m 0s | Findbugs executables are not available.
            +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns.
            +1 | @author | 0m 0s | The patch does not contain any @author tags.
            +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files.
                  master Compile Tests
            +1 | mvninstall | 4m 31s | master passed
            +1 | compile | 0m 41s | master passed
            +1 | checkstyle | 1m 5s | master passed
            +1 | shadedjars | 5m 39s | branch has no errors when building our shaded downstream artifacts.
            +1 | javadoc | 0m 27s | master passed
                  Patch Compile Tests
            +1 | mvninstall | 4m 33s | the patch passed
            +1 | compile | 0m 41s | the patch passed
            +1 | javac | 0m 41s | the patch passed
            +1 | checkstyle | 1m 2s | the patch passed
            +1 | whitespace | 0m 0s | The patch has no whitespace issues.
            +1 | shadedjars | 4m 38s | patch has no errors when building our shaded downstream artifacts.
            +1 | hadoopcheck | 20m 6s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0.
            +1 | javadoc | 0m 27s | the patch passed
                  Other Tests
            +1 | unit | 92m 26s | hbase-server in the patch passed.
            +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings.
             |  | 131m 9s |



            Subsystem | Report/Notes
            Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01
            JIRA Issue | HBASE-19791
            JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906002/0001-HBASE-19791-do-nothing.patch
            Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
            uname | Linux 1d6781aec3f7 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux
            Build tool | maven
            Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
            git revision | master / 4ddfecac56
            maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z)
            Default Java | 1.8.0_151
            Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11062/testReport/
            modules | C: hbase-server U: hbase-server
            Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11062/console
            Powered by | Apache Yetus 0.6.0 http://yetus.apache.org

            This message was automatically generated.

            zhangduo Duo Zhang added a comment -

            It is stuck in assignment manager when shutting down.

            2018-01-13 19:32:25,613 WARN  [AssignmentThread] assignment.AssignmentManager(1735): no server available, unable to find a location for 1 unassigned regions. waiting
            2018-01-13 19:32:26,616 WARN  [AssignmentThread] assignment.AssignmentManager(1735): no server available, unable to find a location for 1 unassigned regions. waiting
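
The repeated "no server available ... waiting" warnings are what a wait loop looks like when nothing tells it that shutdown has begun. The following is a minimal sketch of a wait that also checks a stop flag. It is not HBase's AssignmentManager code; the class name, method name, and parameters are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

public class StopAwareWaitSketch {
  /**
   * Polls until at least one server is online or a stop is requested.
   * Returns true if a server showed up, false if the wait was abandoned.
   * All names here are hypothetical, not taken from HBase.
   */
  public static boolean waitForServer(IntSupplier onlineServers, AtomicBoolean stopped,
      long pollMillis) {
    while (onlineServers.getAsInt() == 0) {
      if (stopped.get()) {
        return false; // shutdown requested: give up instead of waiting forever
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) throws Exception {
    AtomicBoolean stopped = new AtomicBoolean(false);
    // Simulate a shutdown request arriving while no server is online.
    Thread stopper = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      stopped.set(true);
    });
    stopper.start();
    boolean found = waitForServer(() -> 0, stopped, 10);
    stopper.join();
    System.out.println("server found: " + found); // prints "server found: false"
  }
}
```

Without the `stopped` check, the loop in this sketch would spin for as long as no server is online, which matches the symptom in the log above.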
            
            hudson Hudson added a comment -

            FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4414 (See https://builds.apache.org/job/HBase-Trunk_matrix/4414/)
            HBASE-19791 TestZKAsyncRegistry hangs (stack: rev d3a306d81d3f087696fc6d45dd8d6bda939378b2)

            • (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestZKAsyncRegistry.java
            stack Michael Stack added a comment -

            The above integration message comes because I pushed the do-nothing attached patch. Reverted it from master.

            hudson Hudson added a comment -

            FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4421 (See https://builds.apache.org/job/HBase-Trunk_matrix/4421/)
            Revert "HBASE-19791 TestZKAsyncRegistry hangs" Premature push (stack: rev eeb40ff66c7d5b148fd693780be64358e5f7385e)

            • (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestZKAsyncRegistry.java
            stack Michael Stack added a comment -

            Another variant on the HBASE-19794 theme, we are stuck here in startup:

            Thread 425 (M:0;asf903:55398):
            State: TIMED_WAITING
            Blocked count: 168
            Waited count: 2679
            Stack:
            java.lang.Thread.sleep(Native Method)
            org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:181)
            org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:168)
            org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:142)
            org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:130)
            org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:122)
            org.apache.hadoop.hbase.master.assignment.AssignmentManager.assignMeta(AssignmentManager.java:470)
            org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMeta(MasterMetaBootstrap.java:133)
            org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:82)
            org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:945)
            org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2033)
            org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
            java.lang.Thread.run(Thread.java:748)
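
The trace shows the master thread sleeping inside a `waitFor` call, in TIMED_WAITING state, while it waits for the meta assignment procedure to complete. As a rough illustration of that sleep-and-poll pattern, here is a generic sketch with a deadline. It is not the ProcedureSyncWait implementation; the class name, signature, and timeout handling are assumptions made for the example.

```java
import java.util.function.BooleanSupplier;

public class PollingWaitSketch {
  /**
   * Polls a condition, sleeping between checks, until it holds or the
   * deadline passes. Returns true if the condition became true in time.
   * Hypothetical helper, not an HBase API.
   */
  public static boolean waitFor(BooleanSupplier done, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!done.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out; the caller decides whether to fail or retry
      }
      try {
        // A thread dump taken here shows Thread.sleep at the top of the stack.
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Condition becomes true roughly 30 ms after start.
    boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 30, 1000, 5);
    // Condition never becomes true, so the wait ends at the deadline.
    boolean never = waitFor(() -> false, 50, 5);
    System.out.println(ok + " " + never); // prints "true false"
  }
}
```

If the condition can never become true, for example because no server is left to host the region, a wait like this only ends when its timeout expires.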
            zhangduo Duo Zhang added a comment -

            Tried after HBASE-19794 goes in. We are still stuck at the same place...

            I guess the problem is meta region replica...

            Will dig more later.

            Thanks.

            stack Michael Stack added a comment -

            This test doesn't show in flakies anymore, but I was able to manufacture the same stack trace in shutdown over in HBASE-19840. Let me see what we can do here.

            stack Michael Stack added a comment -

            Looks like our messing with shutdown has removed this test from the flakies list. Resolving as "Cannot Reproduce".

            zhangduo Duo Zhang added a comment -

            It shows up again... But this time we limit the running time, so it is reported as an error.

            https://builds.apache.org/job/HBASE-Flaky-Tests/25237/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestZKAsyncRegistry-output.txt

            Still, hangs in assign meta when shutting down.
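
Limiting the running time turns a hang into a reported error rather than a run that never finishes. The sketch below shows the general idea with a plain `Future` and a timeout. It is not how the HBase test harness enforces its limit; the class name, method name, and result strings are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeLimitSketch {
  /**
   * Runs the body on a daemon worker thread and reports a timeout as an
   * error instead of blocking forever. Hypothetical helper for illustration.
   */
  public static String runWithLimit(Runnable body, long limitMillis) {
    ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "time-limited-body");
      t.setDaemon(true); // a stuck body must not keep the JVM alive
      return t;
    });
    Future<?> result = pool.submit(body);
    try {
      result.get(limitMillis, TimeUnit.MILLISECONDS);
      return "passed";
    } catch (TimeoutException e) {
      result.cancel(true); // interrupt the stuck body
      return "error: timed out";
    } catch (ExecutionException e) {
      return "error: " + e.getCause();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "error: interrupted";
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // A body that finishes immediately.
    System.out.println(runWithLimit(() -> { }, 1000)); // prints "passed"
    // A body that sleeps far longer than the limit, standing in for a hang.
    System.out.println(runWithLimit(() -> {
      try {
        Thread.sleep(5000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, 50)); // prints "error: timed out"
  }
}
```

The time limit only changes how the hang is reported. The underlying wait in assign meta still has to be fixed separately.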

            stack Michael Stack added a comment -

            Let me look...

            stack Michael Stack added a comment -

            Re-resolving. It no longer shows in flakies list.


            People

              stack Michael Stack
              zhangduo Duo Zhang
              Votes: 0
              Watchers: 3
