Details
Sub-task
Status: Resolved
Blocker
Resolution: Fixed
None
Reviewed
Description
Generated 3 million keys in Ozone, then ran the listBucket command to list the buckets under a volume:
bin/hdfs oz -listBucket http://15oz1.fyre.ibm.com:9864/vol-0-15143 -user wwei
This call took more than 15 seconds to finish. The problem is caused by the inflexible structure of the KSM DB. Right now ksm.db stores keys as follows:
/v1/b1
/v1/b1/k1
/v1/b1/k2
/v1/b1/k3
/v1/b2
/v1/b2/k1
/v1/b2/k2
/v1/b2/k3
/v1/b3
/v1/b4
Keys are sorted in natural (lexicographic) order, so to list the buckets under a volume, e.g. /v1, we need to seek to the /v1 position and then iterate and filter keys, which ends up scanning every key under volume /v1. The problem with this design is that we have no efficient way to locate all buckets without scanning the keys.
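
To make the cost concrete, here is a minimal sketch of the seek-and-filter pattern described above. It is not the KSM code: a java.util.TreeMap stands in for the sorted ksm.db, and the class name KsmScanSketch, the method listBuckets, and the visited counter are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class KsmScanSketch {
    // Number of entries visited by the most recent listBuckets call (illustration only).
    static int visited = 0;

    // Seek to the volume prefix, then iterate and filter: an entry is treated as a
    // bucket when nothing but the bucket name follows the volume prefix.
    static List<String> listBuckets(TreeMap<String, String> db, String volume) {
        String prefix = volume + "/";
        List<String> buckets = new ArrayList<>();
        visited = 0;
        // tailMap plays the role of "seek": it starts at the first key >= prefix.
        for (String key : db.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // left the volume's key range
            }
            visited++;
            if (key.indexOf('/', prefix.length()) < 0) {
                buckets.add(key); // no further '/', so this is a bucket entry
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Same layout as the ksm.db example in the description (values omitted).
        TreeMap<String, String> db = new TreeMap<>();
        String[] keys = {
            "/v1/b1", "/v1/b1/k1", "/v1/b1/k2", "/v1/b1/k3",
            "/v1/b2", "/v1/b2/k1", "/v1/b2/k2", "/v1/b2/k3",
            "/v1/b3", "/v1/b4"
        };
        for (String k : keys) {
            db.put(k, "");
        }
        List<String> buckets = listBuckets(db, "/v1");
        System.out.println(buckets + " visited=" + visited);
    }
}
```

With the ten sample entries, the loop visits all ten to return four buckets. The work grows with the number of keys in the volume rather than with the number of buckets, which is consistent with the slow listing observed at 3 million keys.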