Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 4.0.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1, Impala 4.1.2, Impala 4.3.0
-
None
-
ghx-label-1
Description
Since IMPALA-8240, we allow event-processor to retry for MetastoreNotificationFetchExceptions. However, there are several places that we haven't converted HMS failures in fetching events into MetastoreNotificationFetchExceptions:
1. getNextMetastoreEvents() throws IllegalStateException if it fails to create a MetaStoreClient.
E1024 05:00:58.458434 258 MetastoreEventsProcessor.java:888] Unexpected exception received while processing event Java exception follows: java.lang.IllegalStateException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:105) at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:78) at org.apache.impala.catalog.MetaStoreClientPool.getClient(MetaStoreClientPool.java:205) at org.apache.impala.catalog.Catalog.getMetaStoreClient(Catalog.java:397) at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:802) at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:848) at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:869) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750)Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:98) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:151) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:122) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:115) at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99) ... 13 moreCaused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor948.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84) ... 18 moreCaused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: Peer indicated failure: Failure to initialize security context at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:171) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:244) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:758) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:271) at sun.reflect.GeneratedConstructorAccessor948.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:98) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:151) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:122) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:115) at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99) at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:78) at org.apache.impala.catalog.MetaStoreClientPool.getClient(MetaStoreClientPool.java:205) at org.apache.impala.catalog.Catalog.getMetaStoreClient(Catalog.java:397) at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:802) at org.apache.impala.catalog.events.MetastoreEventsProcessor.getNextMetastoreEvents(MetastoreEventsProcessor.java:848) at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:869) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) ) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:829) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:271) ... 22 more
2. processEvents() doesn't handle the failures of getCurrentEventId() as MetastoreNotificationFetchExceptions. Instead, getCurrentEventId() throws CatalogException:
E1114 16:01:11.121475 28921 MetastoreEventsProcessor.java:942] Unexpected exception received while processing event Java exception follows: org.apache.impala.catalog.CatalogException: Unable to fetch the current notification event id. Check if metastore service is accessible at org.apache.impala.catalog.events.MetastoreEventsProcessor.getCurrentEventId(MetastoreEventsProcessor.java:744) at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:922) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) E1114 16:01:11.136973 28921 MetastoreEventsProcessor.java:1190] Notification event is null W1114 16:01:11.137122 28921 MetastoreEventsProcessor.java:913] Event processing is skipped since status is ERROR. Last synced event id is 8406252
Event-processor should distinguish these HMS errors and don't go into the ERROR state. So it can retry until the connection to HMS is back to normal.
Attachments
Issue Links
- relates to
-
IMPALA-8240 Event processor should keep trying if metastore is unavailable
- Resolved