IMPALA-11729

Investigate and improve impalad startup time

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      impalad startup takes several seconds, including a few seconds before it even tries to connect to statestored. The following is from a test run (release mode) with catalogd starting up in parallel:

      I1113 21:02:17.334743  4363 logging.cc:247] stdout will be logged to this file.
      I1113 21:02:18.968991  4363 JniFrontend.java:141] Java Input arguments:
      I1113 21:02:19.887519  4363 exec-env.cc:467] Starting statestore subscriber service
      

      After connecting to statestore, coordinators need to wait for the initial catalog update, and processing it takes time depending on the number of catalog objects:

      I1113 21:02:19.888423  4363 Frontend.java:1618] Waiting for local catalog to be initialized, attempt: 0
      I1113 21:02:21.888621  4363 Frontend.java:1618] Waiting for local catalog to be initialized, attempt: 1
      I1113 21:02:23.888849  4363 Frontend.java:1614] Local catalog initialized after: 4000 ms.
      I1113 21:02:23.890105  4363 impala-server.cc:3103] Impala has started.
      

      Meanwhile, catalogd takes about 2 seconds before it even tries to connect to HMS:

      I1113 21:02:17.289606  4281 logging.cc:247] stdout will be logged to this file.
      I1113 21:02:19.023339  4281 HiveMetaStoreClient.java:720] Trying to connect to metastore with URI (thrift://localhost:9083) in binary transport mode
      I1113 21:02:21.671665  5028 catalog-server.cc:400] A catalog update with 1647 entries is assembled. Catalog version: 1649 Last sent catalog version: 0
      

      Statestore starts up quickly, well before the other components try to connect to it:

      I1113 21:02:17.263167  4262 logging.cc:247] stdout will be logged to this file.
      I1113 21:02:17.268682  4262 thrift-server.cc:419] ThriftServer 'StatestoreService' started on port: 24000
      I1113 21:02:19.670817  4285 TAcceptQueueServer.cpp:355] New connection to server StatestoreService from client <Host: 127.0.0.1 Port: 44156>
      

      These 6 seconds at impalad, including ~2 seconds of waiting for the initial catalog update, are not too bad, but a quicker startup would show up in test run times (custom cluster tests restart the cluster a lot) and in autoscaling scenarios. Finding out what takes the time during startup would also be a nice ramp-up task.
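
One way to start finding out where the time goes is to measure the gaps between consecutive log lines. The following is a minimal Python sketch, not part of Impala. The names `parse_line` and `startup_gaps` are hypothetical, and it assumes the glog prefix format seen in the excerpts above (severity letter, MMDD, time with microseconds, thread id, file:line).

```python
import re
from datetime import datetime

# Assumed glog prefix: severity letter, MMDD, HH:MM:SS.uuuuuu, thread id, file:line]
GLOG_RE = re.compile(
    r"^\s*[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^\]]+)\]\s?(.*)$"
)


def parse_line(line):
    """Return (timestamp, source_location, message), or None for non-glog lines."""
    m = GLOG_RE.match(line)
    if m is None:
        return None
    mmdd, clock, _thread_id, source, message = m.groups()
    # glog omits the year; a placeholder year is enough to compute deltas within one run.
    ts = datetime.strptime("2000" + mmdd + " " + clock, "%Y%m%d %H:%M:%S.%f")
    return ts, source, message


def startup_gaps(lines):
    """Return (gap_seconds, from_source, to_source) tuples, largest gap first."""
    events = [e for e in (parse_line(line) for line in lines) if e is not None]
    gaps = []
    for (prev_ts, prev_src, _), (ts, src, _) in zip(events, events[1:]):
        gaps.append(((ts - prev_ts).total_seconds(), prev_src, src))
    return sorted(gaps, reverse=True)


if __name__ == "__main__":
    # Lines copied from the impalad excerpt in this issue.
    sample = [
        "I1113 21:02:17.334743  4363 logging.cc:247] stdout will be logged to this file.",
        "I1113 21:02:18.968991  4363 JniFrontend.java:141] Java Input arguments:",
        "I1113 21:02:19.887519  4363 exec-env.cc:467] Starting statestore subscriber service",
    ]
    for seconds, src_from, src_to in startup_gaps(sample):
        print(f"{seconds:8.3f}s  {src_from} -> {src_to}")
```

This only shows where the log is silent for a long time. It does not say what ran in that interval, so the largest gaps are candidates for adding more detailed timing or profiling.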

      The startup logic is single-threaded. I see the most potential in moving some independent tasks to separate threads. It is also possible that we are doing some completely unnecessary tasks in some scenarios (e.g. executor-only impalad), or that some tasks could be safely deferred to a later point when they are actually needed.
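
The two ideas here, running independent startup tasks on separate threads and deferring others until first use, can be sketched as follows. This is an illustration in Python only, not Impala's actual startup code (which is C++ and Java). `run_startup`, `lazy`, and the task names are hypothetical.

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def run_startup(independent_tasks, dependent_task):
    """Run independent init tasks concurrently, then a step that needs all their results."""
    workers = max(1, len(independent_tasks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in independent_tasks.items()}
        # result() re-raises an exception from the task, so a failed step still aborts startup.
        results = {name: fut.result() for name, fut in futures.items()}
    return dependent_task(results)


def lazy(init_fn):
    """Defer a zero-argument init step until first use; later calls reuse the cached result.

    Note: if several threads make the first call at once, init_fn may run more than once.
    """
    return functools.lru_cache(maxsize=None)(init_fn)


if __name__ == "__main__":
    # Hypothetical placeholder tasks; the real startup steps are not listed in this issue.
    tasks = {
        "task_a": lambda: "a ready",
        "task_b": lambda: "b ready",
    }
    print(run_startup(tasks, lambda results: sorted(results.values())))

    @lazy
    def coordinator_only_state():
        # Runs only on the first call, so a process that never needs it never pays for it.
        return {"initialized": True}

    print(coordinator_only_state())
```

Which startup steps are truly independent, and which can be deferred safely, is exactly what the investigation would need to establish first.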

      Initialization is driven mainly from here:
      https://github.com/apache/impala/blob/master/be/src/service/impalad-main.cc
      https://github.com/apache/impala/blob/master/be/src/catalog/catalogd-main.cc
      but most of the time is probably spent in Java code.

            People

              Assignee: Unassigned
              Reporter: Csaba Ringhofer (csringhofer)