Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Looks like it is stuck and can't flush; bad math?

      See the dashboard where it hung here: https://builds.apache.org/job/HBASE-Flaky-Tests-branch2.0/2428/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestAvoidCellReferencesIntoShippedBlocks-output.txt

      ... the provocation could be this?

      2018-02-21 04:18:44,973 DEBUG [Thread-178] bucket.BucketCache(629): This block eefaa9a7b10e437b9fc2b55a67d63191_4356 is still referred by 1 readers. Can not be freed now. Hence will mark this for evicting at a later point
      Exception in thread "Thread-178" java.lang.AssertionError: old blocks should still be found expected:<6> but was:<5>
      at org.junit.Assert.fail(Assert.java:88)
      at org.junit.Assert.failNotEquals(Assert.java:834)
      at org.junit.Assert.assertEquals(Assert.java:645)

      ... Then we get stuck doing this:

      2018-02-21 04:23:34,661 DEBUG [master/asf903:0.Chore.1] master.HMaster(1524): Skipping normalization for table: testHBase16372InCompactionWritePath, as it's either system table or doesn't have auto normalization turned on
      2018-02-21 04:23:35,695 INFO [regionserver/asf903:0.Chore.1] regionserver.HRegionServer$PeriodicMemStoreFlusher(1752): MemstoreFlusherChore requesting flush of hbase:meta,,1.1588230740 because info has an old edit so flush to free WALs after random delay 286820ms
      2018-02-21 04:23:36,009 DEBUG [ReadOnlyZKClient-localhost:61855@0x227ea3ed] zookeeper.ReadOnlyZKClient(316): 0x227ea3ed to localhost:61855 inactive for 60000ms; closing (Will reconnect when new requests)

      It also failed a recent nightly for same reason:

      https://builds.apache.org/job/HBase%20Nightly/job/branch-2/355/

      Any chance you'd take a look, ram_krish? You're the best at this stuff.

      Attachments

        1. HBASE-20036_1.patch
          2 kB
          ramkrishna.s.vasudevan
        2. HBASE-20036.patch
          2 kB
          ramkrishna.s.vasudevan

        Activity

          ram_krish ramkrishna.s.vasudevan added a comment - I will take a look. I ran it about 10 times but it did not fail. I think I can compare against the logs to see whether some other case occurs in the flow.
          stack Michael Stack added a comment - It failed one of the nightlies in the hadoop3 run: https://builds.apache.org/job/HBase%20Nightly/job/branch-2/371/artifact/output-jdk8-hadoop3/patch-unit-root.txt Any luck, ram_krish?
          chia7712 Chia-Ping Tsai added a comment -

          BucketCache caches blocks asynchronously, so there is a chance that the block has not been cached yet when the assert check runs. ram_krish, could I take it over if you are busy?


          ram_krish ramkrishna.s.vasudevan added a comment - chia7712, sorry for not spending time on this. Last week I was busy with other things internally. I will check this today. Also, the report page expired last week and I had not saved a copy of it. Now that I have it, I will take a look. If I don't find the reason, I will let you know.

          ram_krish ramkrishna.s.vasudevan added a comment - Thanks for the heads up BTW.

          ram_krish ramkrishna.s.vasudevan added a comment - The failure happened because the 6th block may take some time to appear and was not always present when the assertion ran. The patch now waits for the sixth block and then does the assert. Hopefully this test is not flaky anymore.
          chia7712 Chia-Ping Tsai added a comment -

          LGTM. Will loop the test with the patch locally. It seems the assert check is useless, as the value must be 6 when leaving the while loop.

           


          ram_krish ramkrishna.s.vasudevan added a comment - "Seems the assert check is useless as the value must be 6 when leaving the while loop." Yes. I can remove it in the next patch.
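
          For reference, the wait-then-check idea discussed in this thread can be sketched roughly as follows. This is not the committed patch: the class name WaitForBlockCount, the helper waitForCount, and the AtomicInteger standing in for the cache's block count are all hypothetical, and the timeout is an added assumption so that a block which never arrives fails the test instead of hanging it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class WaitForBlockCount {

  /**
   * Polls the supplier until it reports the expected count.
   * Throws AssertionError if the count is not reached within timeoutMs
   * (the timeout is an assumption of this sketch, not part of the thread).
   */
  static void waitForCount(IntSupplier current, int expected, long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (current.getAsInt() != expected) {
      if (System.nanoTime() - deadline > 0) {
        throw new AssertionError(
            "expected:<" + expected + "> but was:<" + current.getAsInt() + ">");
      }
      Thread.sleep(10); // back off briefly before polling again
    }
    // No further assert is needed here: the loop only exits when the
    // count equals the expected value.
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a cache whose 6th block is written by a background thread.
    AtomicInteger cachedBlocks = new AtomicInteger(5);
    Thread writer = new Thread(() -> {
      try {
        Thread.sleep(100); // simulated asynchronous caching delay
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      cachedBlocks.incrementAndGet();
    });
    writer.start();

    // Asserting immediately could observe 5; waiting first removes the race.
    waitForCount(cachedBlocks::get, 6, 5_000);
    writer.join();
    System.out.println("cached blocks: " + cachedBlocks.get());
  }
}
```

          Because the loop exits only once the count matches, a separate equality assert after it adds nothing, which is the point raised in the review above.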
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
                Prechecks
          +1 hbaseanti 0m 0s Patch does not have any anti-patterns.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                master Compile Tests
          +1 mvninstall 4m 20s master passed
          +1 compile 0m 42s master passed
          +1 checkstyle 1m 4s master passed
          +1 shadedjars 5m 47s branch has no errors when building our shaded downstream artifacts.
          -1 findbugs 2m 10s hbase-server in master has 24 extant Findbugs warnings.
          +1 javadoc 0m 28s master passed
                Patch Compile Tests
          +1 mvninstall 4m 41s the patch passed
          +1 compile 0m 48s the patch passed
          +1 javac 0m 48s the patch passed
          +1 checkstyle 1m 5s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedjars 4m 47s patch has no errors when building our shaded downstream artifacts.
          +1 hadoopcheck 19m 26s Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0.
          +1 findbugs 2m 16s the patch passed
          +1 javadoc 0m 29s the patch passed
                Other Tests
          +1 unit 106m 28s hbase-server in the patch passed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          149m 38s



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01
          JIRA Issue HBASE-20036
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12911997/HBASE-20036.patch
          Optional Tests asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
          uname Linux cfb79ed910c6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux
          Build tool maven
          Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
          git revision master / a34f129aff
          maven version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z)
          Default Java 1.8.0_151
          findbugs v3.1.0-RC3
          findbugs https://builds.apache.org/job/PreCommit-HBASE-Build/11675/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
          Test Results https://builds.apache.org/job/PreCommit-HBASE-Build/11675/testReport/
          Max. process+thread count 4944 (vs. ulimit of 10000)
          modules C: hbase-server U: hbase-server
          Console output https://builds.apache.org/job/PreCommit-HBASE-Build/11675/console
          Powered by Apache Yetus 0.7.0 http://yetus.apache.org

          This message was automatically generated.

          chia7712 Chia-Ping Tsai added a comment -

          Looped the test with the patch 100 times. All runs passed. +1


          ram_krish ramkrishna.s.vasudevan added a comment - Pushed to branch-2 and master. Thanks for the verification and review, chia7712.

          ram_krish ramkrishna.s.vasudevan added a comment - This is what I committed. Just for reference. It has the assert removed.
          hudson Hudson added a comment -

          FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4654 (See https://builds.apache.org/job/HBase-Trunk_matrix/4654/)
          HBASE-20036 TestAvoidCellReferencesIntoShippedBlocks timed out (Ram) (ramkrishna.s.vasudevan: rev 7cfb46432fbdf9b53592be11efc8a7d79d1a9455)

          • (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAvoidCellReferencesIntoShippedBlocks.java

          People

            Assignee: ram_krish ramkrishna.s.vasudevan
            Reporter: stack Michael Stack
            Votes: 0
            Watchers: 4
