Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
The metrics API returns the actual partition of a tablet as one of its attributes:
{
  "type": "tablet",
  "id": "527d053abe3b450fac2e23c1e58b29f7",
  "attributes": {
    "partition": "HASH (hash_key_1) PARTITION 1, HASH (hash_key_2, hash_key_3) PARTITION 7, RANGE (timestamp) PARTITION 1649980800000 <= VALUES < 1651363200000",
    "table_name": "impala::dbname.tablename",
    "table_id": "63655530f5e743b1a710ba2c857b52b7"
  },
  "metrics": [ (...) ]
}
With this "partition" attribute, we can identify which actual partition the metrics belong to, and we could use this for metrics analytics.
However, while the textual format of this attribute is good for human interpretation, it is much harder to parse with code. I'd recommend adding a new attribute where the same information can be retrieved in JSON format, for example:
"attributes": {
  "partition": "HASH (hash_key_1) PARTITION 1, HASH (hash_key_2, hash_key_3) PARTITION 7, RANGE (timestamp) PARTITION 1649980800000 <= VALUES < 1651363200000",
  "partition_json": [
    { "type": "HASH", "keys": [ "hash_key_1" ], "number": 1 },
    { "type": "HASH", "keys": [ "hash_key_2", "hash_key_3" ], "number": 7 },
    {
      "type": "RANGE",
      "keys": [ "timestamp" ],
      "start_value": 1649980800000,
      "start_op": "<=",
      "end_op": "<",
      "end_value": 1651363200000
    }
  ]
}