Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 1.1.0
- None
- None
ozone: 1.1.0
hadoop: 3.1.1
hive: 3.1.0
tez: 0.10.0 (this version is needed because of TEZ-4032)
Both the Hadoop cluster and Ozone are secured using Kerberos.
Description
Integrating Ozone with Hive produces a thread leak in HiveServer2 (HS2). In this sample, 12 open connections to Hive produced 149 threads, and the count kept increasing until HS2 had to be restarted.
SETTINGS:
HDFS integration uses the following settings:
- viewfs-mount-table:
fs.viewfs.mounttable.clusters.link./cluster1=hdfs://cluster1
fs.viewfs.mounttable.clusters.link./ozfs1=ofs://ozfs1
- core-site.xml:
fs.ofs.impl=org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
fs.AbstractFileSystem.o3fs.impl=org.apache.hadoop.fs.ozone.OzFs
- hdfs-site.xml:
ozone.om.service.ids=ozfs1
ozone.om.nodes.ozfs1=om1,om2
ozone.om.address.ozfs1.om1=ozone1.domain.com:9862
ozone.om.address.ozfs1.om2=ozone2.domain.com:9862
dfs.nameservices=cluster1,ozfs1
ozone.om.kerberos.keytab.file=/etc/security/keytabs/om.service.keytab
ozone.om.kerberos.principal=om/_HOST@DOMAIN.COM
Hive integration uses the following settings:
- hive-site.xml:
tez.job.fs-servers=hdfs://cluster1,ofs://ozfs1
mapreduce.job.hdfs-servers=hdfs://cluster1,ofs://ozfs1
In Hive's thread dump we see many threads like these:
Thread 4958 (Truststore reloader thread):
State: TIMED_WAITING
Blocked count: 0
Waited count: 221
Stack:
java.lang.Thread.$$YJP$$sleep(Native Method)
java.lang.Thread.sleep(Thread.java)
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)
java.lang.Thread.run(Thread.java:748)
Thread 4948 (Truststore reloader thread):
State: TIMED_WAITING
Blocked count: 79
Waited count: 221
Stack:
java.lang.Thread.$$YJP$$sleep(Native Method)
java.lang.Thread.sleep(Thread.java)
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)
java.lang.Thread.run(Thread.java:748)
Thread 4777 (Truststore reloader thread):
State: TIMED_WAITING
Blocked count: 0
Waited count: 252
Stack:
java.lang.Thread.$$YJP$$sleep(Native Method)
java.lang.Thread.sleep(Thread.java)
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)
java.lang.Thread.run(Thread.java:748)
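To quantify how many such threads exist, a dump like the one above can be tallied by thread name. The following is a minimal sketch, assuming the dump is plain text with header lines of the form `Thread <id> (<name>):` as shown above. The class and method names (`ThreadDumpCounter`, `countByName`) are hypothetical and not part of Hive, Hadoop, or Ozone.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: tallies threads by name from a plain-text thread dump.
class ThreadDumpCounter {
    // Matches header lines such as "Thread 4958 (Truststore reloader thread):".
    private static final Pattern HEADER =
            Pattern.compile("Thread \\d+ \\((.*)\\):");

    /** Returns a sorted map of thread name to number of header lines with that name. */
    static Map<String, Integer> countByName(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            Matcher m = HEADER.matcher(line.trim());
            if (m.matches()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input: the three headers quoted in this report, bodies shortened.
        String sample = String.join("\n",
                "Thread 4958 (Truststore reloader thread):",
                "State: TIMED_WAITING",
                "Thread 4948 (Truststore reloader thread):",
                "State: TIMED_WAITING",
                "Thread 4777 (Truststore reloader thread):",
                "State: TIMED_WAITING");
        countByName(sample).forEach((name, n) -> System.out.println(n + "  " + name));
    }
}
```

Running the tally on dumps taken a few minutes apart would show whether the count for `Truststore reloader thread` keeps growing while the number of open connections stays constant.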
Using YourKit, we identified the following stack traces where these threads are created:
{{java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.getSSLFactory(Configuration) TimelineConnector.java:181
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.serviceInit(Configuration) TimelineConnector.java:108
org.apache.hadoop.service.AbstractService.init(Configuration) AbstractService.java:164
org.apache.hadoop.service.CompositeService.serviceInit(Configuration) CompositeService.java:108
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(Configuration) TimelineClientImpl.java:130
org.apache.hadoop.service.AbstractService.init(Configuration) AbstractService.java:164
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken() YarnClientImpl.java:405
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(ContainerLaunchContext) YarnClientImpl.java:381
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext) YarnClientImpl.java:300
org.apache.tez.client.TezYarnClient.submitApplication(ApplicationSubmissionContext) TezYarnClient.java:77
org.apache.tez.client.TezClient.start() TezClient.java:402
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient, HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState, String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748

java.lang.Thread.<init>(Runnable, String) Thread.java
org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() ReloadingX509TrustManager.java:95
org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode) FileBasedKeyStoresFactory.java:223
org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
org.apache.hadoop.crypto.key.kms.KMSClientProvider.<init>(URI, Configuration) KMSClientProvider.java:390
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProviders(Configuration, URL, int, String) KMSClientProvider.java:318
org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProvider(URI, Configuration) KMSClientProvider.java:303
org.apache.hadoop.crypto.key.KeyProviderFactory.get(URI, Configuration) KeyProviderFactory.java:96
org.apache.hadoop.util.KMSUtil.createKeyProviderFromUri(Configuration, URI) KMSUtil.java:83
org.apache.hadoop.ozone.client.rpc.OzoneKMSUtil.getKeyProvider(ConfigurationSource, URI) OzoneKMSUtil.java:138
org.apache.hadoop.ozone.client.rpc.RpcClient.getKeyProvider() RpcClient.java:1310
org.apache.hadoop.ozone.client.ObjectStore.getKeyProvider() ObjectStore.java:222
org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.getKeyProvider() BasicRootedOzoneClientAdapterImpl.java:785
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getKeyProvider() RootedOzoneFileSystem.java:54
org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getAdditionalTokenIssuers() RootedOzoneFileSystem.java:67
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer, String, Credentials, List) DelegationTokenIssuer.java:104
org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(String, Credentials) DelegationTokenIssuer.java:76
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(FileSystem, Credentials, Configuration) TokenCache.java:140
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(Credentials, Path[], Configuration) TokenCache.java:101
org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(Credentials, Path[], Configuration) TokenCache.java:77
org.apache.tez.client.TezClientUtils.populateTokenCache(TezConfiguration, Credentials) TezClientUtils.java:746
org.apache.tez.client.TezClientUtils.prepareAmLaunchCredentials(AMConfiguration, Credentials, TezConfiguration, Path) TezClientUtils.java:722
org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(ApplicationId, DAG, String, AMConfiguration, Map, Credentials, boolean, TezApiVersionInfo, ServicePluginsDescriptor, JavaOptsChecker) TezClientUtils.java:487
org.apache.tez.client.TezClient.setupApplicationContext() TezClient.java:501
org.apache.tez.client.TezClient.start() TezClient.java:401
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient, HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionState.java:451
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[], boolean, SessionState$LogHelper, TezSessionState$HiveResources) TezSessionPoolSession.java:124
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) TezSessionState.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState, String[]) TezTask.java:373
org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) TezTask.java:200
org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, int, DriverContext) Driver.java:2712
org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
org.apache.hive.service.cli.operation.SQLOperation.runQuery() SQLOperation.java:226
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) SQLOperation.java:87
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() SQLOperation.java:324
javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) Subject.java
org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction) UserGroupInformation.java:1729
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() SQLOperation.java:342
java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
java.util.concurrent.FutureTask.run() FutureTask.java:266
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1149
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:624
java.lang.Thread.run() Thread.java:748}}
This looks similar to HDFS-14037.
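Both traces above pass through SSLFactory.init() and ReloadingX509TrustManager.init(), which is where the reloader thread is created. The second trace reaches it through a KeyProvider obtained in RootedOzoneFileSystem.getAdditionalTokenIssuers(). The pattern this suggests can be sketched as follows. This is a simplified model and not Hadoop or Ozone code: `SketchProvider` is a hypothetical stand-in for any client object that starts a background thread when constructed and only stops it in `close()`. It is assumed here, not confirmed by this report, that an unclosed provider is what leaves each thread running.

```java
// Hypothetical model of the suspected leak; none of these classes exist in Hadoop or Ozone.
class ReloaderLeakSketch {
    static final String THREAD_NAME = "Sketch reloader thread";

    /** Stand-in for a client that starts a background thread on construction. */
    static final class SketchProvider implements AutoCloseable {
        private volatile boolean running = true;
        private final Thread reloader;

        SketchProvider() {
            reloader = new Thread(() -> {
                while (running) {
                    try {
                        Thread.sleep(10_000); // periodic "reload" wait
                    } catch (InterruptedException e) {
                        // Woken by close(); the loop condition is re-checked.
                    }
                }
            }, THREAD_NAME);
            reloader.setDaemon(true);
            reloader.start();
        }

        @Override
        public void close() {
            running = false;
            reloader.interrupt();
            try {
                reloader.join(); // wait until the background thread has exited
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Number of live threads in this JVM that carry the sketch thread name. */
    static long liveReloaders() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> THREAD_NAME.equals(t.getName()))
                .count();
    }

    /** Creates n providers and never closes them; returns how many threads were added. */
    static long leakyCalls(int n) {
        long before = liveReloaders();
        for (int i = 0; i < n; i++) {
            new SketchProvider(); // reference dropped, thread keeps running
        }
        return liveReloaders() - before;
    }

    /** Creates n providers and closes each one; returns how many threads were added. */
    static long closedCalls(int n) {
        long before = liveReloaders();
        for (int i = 0; i < n; i++) {
            try (SketchProvider p = new SketchProvider()) {
                // use p here
            }
        }
        return liveReloaders() - before;
    }

    public static void main(String[] args) {
        System.out.println("threads added without close(): " + leakyCalls(5));
        System.out.println("threads added with close():    " + closedCalls(5));
    }
}
```

In this model the thread count grows by one per unclosed provider, which matches the shape of the growth reported for HS2. Closing each provider, or creating one and reusing it, keeps the count flat.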
Issue Links
- is fixed by HDDS-5087 Ozone RPC client leaks KeyProvider instances (Resolved)