Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-12561

Event-processor shouldn't go into ERROR state for failures in fetching events

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • Impala 4.0.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1, Impala 4.1.2, Impala 4.3.0
    • Impala 4.4.0
    • Catalog
    • None

    Description

      Since IMPALA-8240, we allow event-processor to retry for MetastoreNotificationFetchExceptions. However, there are several places that we haven't converted HMS failures in fetching events into MetastoreNotificationFetchExceptions:

      1. getNextMetastoreEvents() throws IllegalStateException if it fails to create a MetaStoreClient.

      E1024 05:00:58.458434   258 MetastoreEventsProcessor.java:888] Unexpected exception received while processing event
      Java exception follows:
      java.lang.IllegalStateException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:105)   at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:78)    at org.apache.impala.catalog.MetaStoreClientPool.getClient(MetaStoreClientPool.java:205)        at org.apache.impala.catalog.Catalog.getMetaStoreClient(Catalog.java:397)       at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:802)  at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:848)  at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:869)   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)      at java.lang.Thread.run(Thread.java:750)Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient       at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86)      at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:98)     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:151)  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:122)  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:115)  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)    ... 13 moreCaused by: java.lang.reflect.InvocationTargetException       at sun.reflect.GeneratedConstructorAccessor948.newInstance(Unknown Source)      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423)      at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84)      ... 18 moreCaused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: Peer indicated failure: Failure to initialize security context        at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:171)       at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:244)     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)  at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48) at java.security.AccessController.doPrivileged(Native Method)   at javax.security.auth.Subject.doAs(Subject.java:422)   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48)  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:758)      at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:271)    at sun.reflect.GeneratedConstructorAccessor948.newInstance(Unknown Source)      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423)      at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84)      at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:98)     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:151)  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:122)  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:115)  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)    at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:78)    at org.apache.impala.catalog.MetaStoreClientPool.getClient(MetaStoreClientPool.java:205)        at org.apache.impala.catalog.Catalog.getMetaStoreClient(Catalog.java:397)       at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:802)  at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:848)  at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:869)   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)      at java.lang.Thread.run(Thread.java:750)
      )
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:829)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:271)
              ... 22 more
      

      2. processEvents() doesn't handle the failures of getCurrentEventId() as MetastoreNotificationFetchExceptions. Instead, getCurrentEventId() throws CatalogException:

      E1114 16:01:11.121475 28921 MetastoreEventsProcessor.java:942] Unexpected exception received while processing event
      Java exception follows:
      org.apache.impala.catalog.CatalogException: Unable to fetch the current notification event id. Check if metastore service is accessible
              at org.apache.impala.catalog.events.MetastoreEventsProcessor.getCurrentEventId(MetastoreEventsProcessor.java:744)
              at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:922)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:750)
      E1114 16:01:11.136973 28921 MetastoreEventsProcessor.java:1190] Notification event is null 
      W1114 16:01:11.137122 28921 MetastoreEventsProcessor.java:913] Event processing is skipped since status is ERROR. Last synced event id is 8406252
      

      Event-processor should distinguish these HMS errors and don't go into the ERROR state. So it can retry until the connection to HMS is back to normal.

      Attachments

        Issue Links

          Activity

            People

              stigahuang Quanlong Huang
              stigahuang Quanlong Huang
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: