Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-10704

test_retry_query_result_cacheing_failed and test_retry_query_set_query_in_flight_failed are flaky

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 4.0.0
    • Impala 4.0.0, Impala 4.1.0
    • Backend
    • None
    • ghx-label-7

    Description

      These two tests are added in IMPALA-10413. Saw the falures in

      The failures are

      tests/custom_cluster/test_query_retries.py:761: in test_retry_query_result_cacheing_failed
          lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
      tests/common/impala_test_suite.py:1128: in assert_eventually
          count, timeout_s, error_msg_str))
      E   Timeout: Check failed to return True after 0 tries and 60 seconds
      
      tests/custom_cluster/test_query_retries.py:775: in test_retry_query_set_query_in_flight_failed
          lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
      tests/common/impala_test_suite.py:1128: in assert_eventually
          count, timeout_s, error_msg_str))
      E   Timeout: Check failed to return True after 0 tries and 60 seconds
      

      Another problem is, when manually ran them locally, found that the original query is hanging in RETRYING state. See attached screenshot.
      The test codes are problematic, it only expects one query running but not expecting its finish:

        @pytest.mark.execute_serially
        @CustomClusterTestSuite.with_args(
            impalad_args="--debug_actions=QUERY_RETRY_SET_RESULT_CACHE:FAIL",
            statestored_args="--statestore_heartbeat_frequency_ms=60000")
        def test_retry_query_result_cacheing_failed(self):
          """Test setting up results cacheing failed."""
      
          self.cluster.impalads[1].kill()
          query = "select count(*) from tpch_parquet.lineitem"
          self.hs2_client.set_configuration({'retry_failed_queries': 'true'})
          self.hs2_client.set_configuration_option('impala.resultset.cache.size', '1024')
          self.hs2_client.execute_async(query)
          self.assert_eventually(60, 0.1, 
              lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
      

      Attachments

        Issue Links

          Activity

            People

              stigahuang Quanlong Huang
              stigahuang Quanlong Huang
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: