Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
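There is another bug in the original tableSkew cost function, in how it aggregates the cost per table:
If we have 10 tables with one region each, evenly distributed across 10 nodes, the cost scales to 1.0, the maximum, even though the cluster is perfectly balanced. Here min = 10 / 10 = 1, max = 10, and the sum of the per-table maximums is also 10, so scale(min, max, value) returns 1.0.
The more tables we have, the closer the value gets to 1.0, regardless of how well the regions are distributed. The cost function becomes useless.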
All the balancer tests were set up with large numbers of tables and minimal regions per table. This artificially inflates the total cost and triggers balancer runs. With this fix to TableSkewFunction, we need to overhaul the tests too. We also need to add tests that reflect more diverse table-distribution scenarios, such as large tables with many regions.
  protected double cost() {
    double max = cluster.numRegions;
    double min = ((double) cluster.numRegions) / cluster.numServers;
    double value = 0;
    for (int i = 0; i < cluster.numMaxRegionsPerTable.length; i++) {
      value += cluster.numMaxRegionsPerTable[i];
    }
    LOG.info("min = {}, max = {}, cost= {}", min, max, value);
    return scale(min, max, value);
  }
}
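The arithmetic behind the description can be reproduced outside the balancer with a small standalone sketch. The class name TableSkewCostSketch and its scale() helper are hypothetical stand-ins, not the HBase code: scale() is assumed here to map value linearly from [min, max] onto [0, 1] and to return 0 when value is at or below min. The cost() method repeats the aggregation shown above, with the cluster fields passed in as plain parameters.

```java
import java.util.Arrays;

final class TableSkewCostSketch {

    // Assumed, simplified stand-in for the balancer's scale():
    // linear mapping of value from [min, max] onto [0, 1], clamped.
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // Same aggregation as the cost() method in the description: the per-table
    // maximums are summed and scaled against cluster-wide totals.
    static double cost(int numRegions, int numServers, int[] numMaxRegionsPerTable) {
        double max = numRegions;
        double min = ((double) numRegions) / numServers;
        double value = 0;
        for (int perTableMax : numMaxRegionsPerTable) {
            value += perTableMax;
        }
        return scale(min, max, value);
    }

    // Builds the per-table maximums for one evenly spread 10-region table
    // (at most 1 region on any server) plus n single-region tables.
    static int[] oneBigTablePlusSmallTables(int n) {
        int[] maxPerTable = new int[n + 1];
        Arrays.fill(maxPerTable, 1);
        return maxPerTable;
    }

    public static void main(String[] args) {
        int servers = 10;

        // 10 single-region tables on 10 servers: balanced, yet maximum cost.
        int[] tenSmallTables = new int[10];
        Arrays.fill(tenSmallTables, 1);
        System.out.println("10 small tables: " + cost(10, servers, tenSmallTables));

        // One 10-region table alone, then with 10 and 90 single-region tables
        // added. Every layout is balanced, but the cost climbs with table count.
        for (int n : new int[] {0, 10, 90}) {
            System.out.println("1 big + " + n + " small tables: "
                + cost(10 + n, servers, oneBigTablePlusSmallTables(n)));
        }
    }
}
```

Under these assumptions, the single 10-region table alone costs 0, adding 10 single-region tables raises the cost to 0.5, and adding 90 raises it to roughly 0.9, although every one of these layouts is evenly distributed.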