[DRILL-6879] Indicate a warning in the WebUI when a query makes little to no progress for a while - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.14.0
Fix Version/s: 1.16.0
Component/s: Execution - Monitoring, Web Server
Labels:
- doc-complete
- ready-to-commit

Description

When running a very large query on a cluster with limited resources, we noticed that a VM thread on one of the nodes froze the fragment threads while it did some work (GC, perhaps?). This is a clear indication that the query is stuck in a state from which it might not recover.
Under such circumstances, it makes sense to cancel the query, or at least to warn the user in the WebUI once the time without progress exceeds a certain threshold.
To detect this, the user can look at the Last Progress column in the Fragments Overview section, which will show large times.
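
As a rough illustration of the threshold check described above, here is a minimal sketch that classifies a fragment by how long ago it last reported progress. The class name, method name, level names, and threshold values are all hypothetical; they are not taken from Drill's code base or from the actual fix.

```java
// Hypothetical sketch: none of these names come from Drill itself.
class ProgressWarning {

    // Severity to show next to a fragment's "Last Progress" value.
    enum Level { OK, WARN, STALLED }

    /**
     * Classifies a fragment by the time elapsed since its last progress.
     *
     * @param lastProgressMillis timestamp of the last reported progress
     * @param nowMillis          current timestamp
     * @param warnAfterSec       assumed idle time (seconds) before warning
     * @param stalledAfterSec    assumed idle time (seconds) before treating as stalled
     */
    static Level classify(long lastProgressMillis, long nowMillis,
                          long warnAfterSec, long stalledAfterSec) {
        // Guard against clock skew between nodes producing a negative idle time.
        long idleMillis = Math.max(0L, nowMillis - lastProgressMillis);
        if (idleMillis >= stalledAfterSec * 1000L) {
            return Level.STALLED;
        }
        if (idleMillis >= warnAfterSec * 1000L) {
            return Level.WARN;
        }
        return Level.OK;
    }
}
```

With assumed thresholds of 30 and 300 seconds, a fragment idle for 45 seconds would be classified as WARN and one idle for 10 minutes as STALLED. Suitable threshold values would have to be chosen (or made configurable) for the cluster in question.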

In addition, there are instances where a query has buffered operators spilling to disk, which also hurts performance (and, consequently, leads to longer run times). Calling out this skew can be very useful.