IMPALA / IMPALA-6326

segfault during impyla HiveServer2Cursor.cancel_operation() over SSL


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.10.0, Impala 2.11.0
    • Fix Version/s: Impala 3.3.0
    • Component/s: Clients
    • Labels: None

    Description

      During a stress test on a secure cluster, one of the clients segfaulted in HiveServer2Cursor.cancel_operation().

      The stress test debug log shows:

      2017-12-13 16:50:52,624 21607 Query Consumer DEBUG:concurrent_select[579]:Requesting memory reservation
      2017-12-13 16:50:52,624 21607 Query Consumer DEBUG:concurrent_select[245]:Reserved 102 MB; 1455 MB available; 95180 MB overcommitted
      2017-12-13 16:50:52,625 21607 Query Consumer DEBUG:concurrent_select[581]:Received memory reservation
      2017-12-13 16:50:52,658 21607 Query Consumer DEBUG:concurrent_select[865]:Using tpcds_300_decimal_parquet database
      2017-12-13 16:50:52,658 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
      2017-12-13 16:50:52,825 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: SET ABORT_ON_ERROR=1
      2017-12-13 16:50:53,060 21607 Query Consumer DEBUG:concurrent_select[877]:Setting mem limit to 102 MB
      2017-12-13 16:50:53,060 21607 Query Consumer DEBUG:db_connection[203]:IMPALA: SET MEM_LIMIT=102M
      2017-12-13 16:50:53,370 21607 Query Consumer DEBUG:concurrent_select[881]:Running query with 102 MB mem limit at vc0704.test with timeout secs 52:
      select
        dt.d_year,
        item.i_category_id,
        item.i_category,
        sum(ss_ext_sales_price)
      from
        date_dim dt,
        store_sales,
        item
      where
        dt.d_date_sk = store_sales.ss_sold_date_sk
        and store_sales.ss_item_sk = item.i_item_sk
        and item.i_manager_id = 1
        and dt.d_moy = 11
        and dt.d_year = 2000
      group by
        dt.d_year,
        item.i_category_id,
        item.i_category
      order by
        sum(ss_ext_sales_price) desc,
        dt.d_year,
        item.i_category_id,
        item.i_category
      limit 100;
      
      2017-12-13 16:51:08,491 21607 Query Consumer DEBUG:concurrent_select[889]:Query id is b6425b84aa45f633:9ce7cad900000000
      2017-12-13 16:51:15,337 21607 Query Consumer DEBUG:concurrent_select[900]:Waiting for query to execute
      2017-12-13 16:51:22,316 21607 Query Consumer DEBUG:concurrent_select[900]:Waiting for query to execute
      2017-12-13 16:51:27,266 21607 Fetch Results b6425b84aa45f633:9ce7cad900000000 DEBUG:concurrent_select[1009]:Fetching result for query with id b6425b84aa45f633:9ce7cad900000000
      2017-12-13 16:51:44,625 21607 Query Consumer DEBUG:concurrent_select[940]:Attempting cancellation of query with id b6425b84aa45f633:9ce7cad900000000
      2017-12-13 16:51:44,627 21607 Query Consumer INFO:hiveserver2[259]:Canceling active operation

      The impalad log shows:

      I1213 16:50:54.287511 136399 admission-controller.cc:510] Schedule for id=b6425b84aa45f633:9ce7cad900000000 in pool_name=root.systest cluster_mem_needed=816.00 MB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
      I1213 16:50:54.289767 136399 admission-controller.cc:515] Stats: agg_num_running=184, agg_num_queued=0, agg_mem_reserved=1529.63 GB,  local_host(local_mem_admitted=132.02 GB, num_admitted_running=21, num_queued=0, backend_mem_reserved=194.58 GB)
      I1213 16:50:54.291550 136399 admission-controller.cc:531] Admitted query id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:50:54.296922 136399 coordinator.cc:99] Exec() query_id=b6425b84aa45f633:9ce7cad900000000 stmt=/* Mem: 102 MB. Coordinator: vc0704.test. */
      select
        dt.d_year,
        item.i_category_id,
        item.i_category,
        sum(ss_ext_sales_price)
      from
        date_dim dt,
        store_sales,
        item
      where
        dt.d_date_sk = store_sales.ss_sold_date_sk
        and store_sales.ss_item_sk = item.i_item_sk
        and item.i_manager_id = 1
        and dt.d_moy = 11
        and dt.d_year = 2000
      group by
        dt.d_year,
        item.i_category_id,
        item.i_category
      order by
        sum(ss_ext_sales_price) desc,
        dt.d_year,
        item.i_category_id,
        item.i_category
      limit 100;
      I1213 16:50:59.263310 136399 query-state.cc:151] Using query memory limit from query options: 102.00 MB
      I1213 16:50:59.267033 136399 mem-tracker.cc:189] Using query memory limit: 102.00 MB
      I1213 16:50:59.272271 136399 coordinator.cc:357] starting execution on 8 backends for query b6425b84aa45f633:9ce7cad900000000
      I1213 16:51:07.525143 136399 coordinator.cc:370] started execution on 8 backends for query b6425b84aa45f633:9ce7cad900000000
      I1213 16:51:08.358772 136399 impala-hs2-server.cc:490] ExecuteStatement(): return_val=TExecuteStatementResp {
        01: status (struct) = TStatus {
          01: statusCode (i32) = 0,
        },
        02: operationHandle (struct) = TOperationHandle {
          01: operationId (struct) = THandleIdentifier {
            01: guid (string) = "3\xf6E\xaa\x84[B\xb6\x00\x00\x00\x00\xd9\xca\xe7\x9c",
            02: secret (string) = "3\xf6E\xaa\x84[B\xb6\x00\x00\x00\x00\xd9\xca\xe7\x9c",
          },
          02: operationType (i32) = 0,
          03: hasResultSet (bool) = true,
        },
      }
      I1213 16:51:27.310685 136399 impala-hs2-server.cc:725] GetResultSetMetadata(): query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:51:27.316963 136399 impala-hs2-server.cc:759] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {
        01: status (struct) = TStatus {
          01: statusCode (i32) = 0,
        },
        02: schema (struct) = TTableSchema {
          01: columns (list) = list<struct>[4] {
            [0] = TColumnDesc {
              01: columnName (string) = "d_year",
              02: typeDesc (struct) = TTypeDesc {
                01: types (list) = list<struct>[1] {
                  [0] = TTypeEntry {
                    01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                      01: type (i32) = 3,
                    },
                  },
                },
              },
              03: position (i32) = 0,
            },
            [1] = TColumnDesc {
              01: columnName (string) = "i_category_id",
              02: typeDesc (struct) = TTypeDesc {
                01: types (list) = list<struct>[1] {
                  [0] = TTypeEntry {
                    01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                      01: type (i32) = 3,
                    },
                  },
                },
              },
              03: position (i32) = 1,
            },
            [2] = TColumnDesc {
              01: columnName (string) = "i_category",
              02: typeDesc (struct) = TTypeDesc {
                01: types (list) = list<struct>[1] {
                  [0] = TTypeEntry {
                    01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                      01: type (i32) = 7,
                    },
                  },
                },
              },
              03: position (i32) = 2,
            },
            [3] = TColumnDesc {
              01: columnName (string) = "sum(ss_ext_sales_price)",
              02: typeDesc (struct) = TTypeDesc {
                01: types (list) = list<struct>[1] {
                  [0] = TTypeEntry {
                    01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                      01: type (i32) = 15,
                      02: typeQualifiers (struct) = TTypeQualifiers {
                        01: qualifiers (map) = map<string,struct>[2] {
                          "precision" -> TTypeQualifierValue {
                            01: i32Value (i32) = 38,
                          },
                          "scale" -> TTypeQualifierValue {
                            01: i32Value (i32) = 2,
                          },
                        },
                      },
                    },
                  },
                },
              },
              03: position (i32) = 3,
            },
          },
        },
      }
      I1213 16:51:27.426277 136399 coordinator.cc:789] Coordinator waiting for backends to finish, 2 remaining
      I1213 16:52:42.215745 136399 coordinator.cc:794] All backends finished successfully.
      I1213 16:52:42.223140 136399 coordinator.cc:1090] Release admssion control resources for query b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.230653 136399 impala-hs2-server.cc:677] CancelOperation(): query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.230882 136399 impala-server.cc:1075] Cancel(): query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.233353 136399 coordinator.cc:895] Cancel() query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.234686 136399 coordinator.cc:905] CancelBackends() query_id=b6425b84aa45f633:9ce7cad900000000, tried to cancel 0 backends
      I1213 16:52:42.276818 136399 impala-server.cc:1796] Connection from client 172.28.194.153:60218 closed, closing 2 associated session(s)
      I1213 16:52:42.277026 136399 impala-server.cc:992] UnregisterQuery(): query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.279270 136399 impala-server.cc:1075] Cancel(): query_id=b6425b84aa45f633:9ce7cad900000000
      I1213 16:52:42.924029 136399 query-exec-mgr.cc:149] ReleaseQueryState(): query_id=b6425b84aa45f633:9ce7cad900000000 refcnt=1
      I1213 16:52:43.094966 136399 thrift-util.cc:123] SSL_shutdown: tlsv1 alert record overflow

      The test host kernel log shows:

      Dec 14 00:52:42 ip-172-28-194-153 kernel: [ 6295.010585] show_signal_msg: 36 callbacks suppressed
      Dec 14 00:52:42 ip-172-28-194-153 kernel: [ 6295.010592] python[22107]: segfault at 7f35001d4000 ip 00007f3557110794 sp 00007f350dfe4308 error 4 in libc-2.19.so[7f3557076000+1be000]
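
      The faulting instruction pointer lies inside the libc-2.19.so mapping reported on the same line; subtracting the mapping base gives a file offset that can be fed to a tool such as addr2line or objdump against the matching libc binary to locate the crashing routine. A quick sketch using the values from the kernel log above:

      ```python
      # Decode the show_signal_msg line: "ip" minus the start of the
      # libc-2.19.so mapping gives the offset of the crashing instruction.
      ip = 0x00007F3557110794     # faulting instruction pointer ("ip")
      libc_base = 0x7F3557076000  # mapping start from "libc-2.19.so[base+size]"
      libc_size = 0x1BE000        # mapping length

      offset = ip - libc_base
      assert 0 <= offset < libc_size  # the ip really is inside libc
      print(hex(offset))              # offset to resolve with addr2line/objdump
      ```

      The printed offset is 0x9a794, i.e. the crash is roughly 0x9a794 bytes into libc's text mapping.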

      Although the stress test debug log shows that result fetching had started, no results for that query were saved to the output file.
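
      The debug log above shows two stress-test threads driving the same query: a "Fetch Results b6425b84aa45f633:9ce7cad900000000" thread and the "Query Consumer" thread that issues cancel_operation(). If both end up using the same SSL-wrapped Thrift transport concurrently, OpenSSL's per-connection state can be corrupted, which is one way to arrive at a segfault inside libc. A minimal sketch of a client-side mitigation, written against a hypothetical connection object (this is not impyla's API), serializes all RPCs on one connection behind a lock:

      ```python
      import threading

      # Hypothetical wrapper (stand-in, not impyla's API): at most one RPC
      # (fetch, cancel, close) may be in flight on a connection at a time.
      class SerializedConnection:
          def __init__(self, conn):
              self._conn = conn
              self._lock = threading.Lock()

          def fetch(self):
              with self._lock:
                  return self._conn.fetch()

          def cancel_operation(self):
              with self._lock:
                  return self._conn.cancel_operation()

      # Stand-in for the real connection, used only to exercise the wrapper.
      class FakeConn:
          def __init__(self):
              self.calls = []

          def fetch(self):
              self.calls.append("fetch")
              return ["row"]

          def cancel_operation(self):
              self.calls.append("cancel")

      fake = FakeConn()
      conn = SerializedConnection(fake)
      fetcher = threading.Thread(target=conn.fetch)  # "Fetch Results" thread
      fetcher.start()
      conn.cancel_operation()                        # "Query Consumer" thread
      fetcher.join()
      print(sorted(fake.calls))                      # ['cancel', 'fetch']
      ```

      With the lock in place the cancel either runs entirely before or entirely after the fetch, never interleaved with it on the shared transport; whether that interleaving is the actual root cause here is not established by the logs alone.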

      Attachments

        1. test_fork_crash.py
          0.5 kB
          Sailesh Mukil

            People

              Assignee: tarmstrong (Tim Armstrong)
              Reporter: mmulder (Matthew Mulder)
