Description
Additional query details wanted for TEZ-3530. The additional details discussed include the following:
Publish the following info (in addition to the existing bits published today):
Application Id to which the query was submitted (primary filter)
DAG Id (primary filter)
Hive query name (primary filter)
Hive Configs (everything a set command would provide except for sensitive credential info)
Potentially publish the source of each config, i.e. set in the Hive query script vs. hive-site.xml, etc.
Which HiveServer2 the query was submitted to
Which IP/host the query was submitted from - not sure what filter support will be available.
Which execution mode the query is running in (primary filter)
What submission mode was used (CLI, Beeline, JDBC, etc.)
User info (running as, actual end user, etc.) - not sure if already present
Perf logger events. The data published should be sufficient to build a timeline view of the query, i.e. actual submission time, query compile timestamps, execution timestamps, post-exec data moves, etc.
Explain plan with enough details for visualizing.
Databases and tables being queried (primary filter)
YARN queue info (primary filter)
Caller context (primary filter)
Original source i.e. submitter
Thread info in HS2 if needed (I believe Vikram may have added this earlier)
Query time taken (with filter support)
Additional context info, e.g. LLAP instance name and appId if required.
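
To make the shape of the published data more concrete, here is a minimal sketch of how the items above could be grouped into a single record: fields marked "(primary filter)" go into a filterable section, everything else into a general info section, sensitive configs are dropped, and perf logger events are ordered into a timeline. All names here (build_query_entity, redact_configs, the field keys, the sensitive-key pattern, the layout of the resulting dictionary) are hypothetical and chosen for illustration only; this is not the actual Hive hook or the ATS client API.

```python
import re

# Assumption: a config key is treated as sensitive if it contains one of these
# words. This is an illustrative pattern, not Hive's actual hidden-config list.
SENSITIVE_KEY = re.compile(r"(password|secret|credential|token)", re.IGNORECASE)

# Hypothetical keys for the items marked "(primary filter)" in the list above.
PRIMARY_FILTER_FIELDS = (
    "application_id", "dag_id", "query_name", "execution_mode",
    "databases", "tables", "queue", "caller_context",
)


def redact_configs(configs):
    """Return a copy of the configs without entries whose key looks sensitive."""
    return {k: v for k, v in configs.items() if not SENSITIVE_KEY.search(k)}


def build_query_entity(query_id, details, configs, perf_events):
    """Group query details into primary filters, other info, and a timeline."""
    primary_filters = {}
    for field in PRIMARY_FILTER_FIELDS:
        value = details.get(field)
        if value is None:
            continue
        # Store every filter value as a sorted list so that single-valued
        # fields and multi-valued fields (e.g. tables) have the same shape.
        if isinstance(value, (list, set, tuple)):
            primary_filters[field] = sorted(value)
        else:
            primary_filters[field] = [value]

    # Everything that is not a primary filter is published as plain info.
    other_info = {k: v for k, v in details.items()
                  if k not in PRIMARY_FILTER_FIELDS}
    other_info["configs"] = redact_configs(configs)

    # Order perf logger events by timestamp so they can be drawn as a timeline.
    events = sorted(perf_events, key=lambda e: e["timestamp"])

    return {
        "entity": query_id,
        "primaryfilters": primary_filters,
        "otherinfo": other_info,
        "events": events,
    }
```

A call might pass the submission mode and HiveServer2 host in details alongside the filterable fields; they would then end up under "otherinfo", while the application ID, DAG ID, queue, and so on would end up under "primaryfilters". Whether query time taken and the submitting host should also be filterable is left open here, as it is in the list above.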