Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 1.0.0
- None
- None
Description
Compaction should generate statistics about the files it reads: the number of files, their min/max/avg size, etc. It should also generate alerts if the system looks misconfigured.
For example, if there are many delta directories containing very small files, that is a good sign that the Streaming API is configured with batches that are too small.
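As a rough illustration of the statistics and the small-file alert described above, here is a minimal sketch. It is not Hive code: the names FileStats, summarize and looks_misconfigured, and all three thresholds, are assumptions made for the example, and the input is simply a list of file sizes in bytes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical thresholds; real values would need tuning per deployment.
SMALL_FILE_BYTES = 1024 * 1024   # files below 1 MiB count as "small"
SMALL_FILE_FRACTION = 0.8        # alert when at least 80% of files are small
MIN_FILES_FOR_ALERT = 10         # ignore inputs with too few files to judge


@dataclass
class FileStats:
    count: int
    min_size: int
    max_size: int
    avg_size: float
    small_count: int


def summarize(sizes: Iterable[int]) -> Optional[FileStats]:
    """Compute count/min/max/avg over file sizes; None if there are no files."""
    sizes = list(sizes)
    if not sizes:
        return None
    small = sum(1 for s in sizes if s < SMALL_FILE_BYTES)
    return FileStats(
        count=len(sizes),
        min_size=min(sizes),
        max_size=max(sizes),
        avg_size=sum(sizes) / len(sizes),
        small_count=small,
    )


def looks_misconfigured(stats: Optional[FileStats]) -> bool:
    """True when most files read by a compaction are small (assumed heuristic)."""
    if stats is None or stats.count < MIN_FILES_FOR_ALERT:
        return False
    return stats.small_count / stats.count >= SMALL_FILE_FRACTION
```

A compaction run could log the FileStats at info level and log at warn level when looks_misconfigured returns True.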
The simplest idea is to add another periodic task to AcidHouseKeeperService that periodically runs
select count(*), min(txnid), max(txnid), type from txns group by type
and then:
1. dumps the result to the log file at INFO level;
2. could also keep counts for the last 10 minutes, hour, 6 hours, 24 hours, etc.;
2.2. if a large increase is detected, issues an alert (at least to the log for now) at WARN/ERROR level.
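The windowed counts and the "large increase" check could be sketched roughly as follows. This is an illustration only, not part of AcidHouseKeeperService: the class name TxnCountHistory, the method check_and_record, and the SPIKE_FACTOR threshold are all assumptions, and one instance is meant to track the count for a single transaction type.

```python
from collections import deque
from typing import Deque, List, Tuple

# Windows named in the description, in seconds.
WINDOWS_SECONDS = {"10min": 600, "1h": 3600, "6h": 21600, "24h": 86400}
# Hypothetical threshold: a count more than 3x a window's average is a spike.
SPIKE_FACTOR = 3.0


class TxnCountHistory:
    """Keeps (timestamp, count) samples for one transaction type."""

    def __init__(self) -> None:
        self._samples: Deque[Tuple[float, int]] = deque()

    def _window_average(self, now: float, seconds: int):
        values = [c for t, c in self._samples if now - seconds <= t <= now]
        return sum(values) / len(values) if values else None

    def check_and_record(self, now: float, count: int) -> List[str]:
        """Compare count with each window's prior average, then store it.

        Returns the names of the windows in which the new count is a spike.
        """
        spiking = []
        for name, seconds in WINDOWS_SECONDS.items():
            avg = self._window_average(now, seconds)
            if avg is not None and avg > 0 and count > SPIKE_FACTOR * avg:
                spiking.append(name)
        self._samples.append((now, count))
        # Drop samples older than the largest window.
        horizon = now - max(WINDOWS_SECONDS.values())
        while self._samples and self._samples[0][0] < horizon:
            self._samples.popleft()
        return spiking
```

The periodic task would call check_and_record once per run with the count returned by the query, and log at warn/error level whenever the returned list is non-empty.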
It should also alert if there is ACID activity but no compactions are running.
One way to do this is to add logic to TxnHandler that periodically checks the contents of the COMPACTION_QUEUE table and keeps a simple histogram of compactions over the last few hours.
Similarly, a periodic check of transactions started (or committed/aborted) can maintain a second simple histogram. The two histograms can then be used to detect that there is ACID write activity but no compaction activity.
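A minimal sketch of the two-histogram comparison, under stated assumptions: ActivityHistogram, writes_without_compaction, the one-hour bucket size, and the min_txns threshold are hypothetical names and values chosen for the example, and nothing here reflects the actual TxnHandler code or the COMPACTION_QUEUE schema.

```python
from collections import Counter

# Hypothetical bucket width: one histogram bucket per hour.
BUCKET_SECONDS = 3600


class ActivityHistogram:
    """Counts events (transactions or compactions) per fixed-width time bucket."""

    def __init__(self, bucket_seconds: int = BUCKET_SECONDS) -> None:
        self.bucket_seconds = bucket_seconds
        self.buckets = Counter()

    def add(self, timestamp: float, n: int = 1) -> None:
        self.buckets[int(timestamp // self.bucket_seconds)] += n

    def total_last(self, now: float, num_buckets: int) -> int:
        """Sum of events in the most recent num_buckets buckets, current included."""
        current = int(now // self.bucket_seconds)
        return sum(
            self.buckets[b] for b in range(current - num_buckets + 1, current + 1)
        )


def writes_without_compaction(
    txns: ActivityHistogram,
    compactions: ActivityHistogram,
    now: float,
    num_buckets: int = 3,
    min_txns: int = 100,  # hypothetical: below this, activity is too low to matter
) -> bool:
    """True when recent write activity is significant but no compaction was seen."""
    return (
        txns.total_last(now, num_buckets) >= min_txns
        and compactions.total_last(now, num_buckets) == 0
    )
```

Each periodic check would add the newly observed transactions and compactions to the respective histogram and log an alert when writes_without_compaction returns True.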
Attachments
Issue Links
- depends upon
  - HIVE-12832 RDBMS schema changes for HIVE-11388 (Closed)
- relates to
  - HIVE-12353 When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not. (Closed)
  - HIVE-16361 Automatically kill runaway client processes (Open)