Description
We have groups of people that each have their own set of HDFS directories.
For example, they have an HDFS staging area for new files:
/datascience
/analysts
...
but at the same time they have Hive warehouse directories:
/hivewarehouse/datascience
/hivewarehouse/analysts
...
On top of that, they also have some files stored under /user/${username}/.
It has always been a challenge to maintain a combined quota across all the HDFS locations a particular group of people owns, because we are currently forced to set a separate quota on each directory independently.
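To illustrate the bookkeeping this currently pushes onto us, here is a rough sketch of an external check that sums usage over a set of directories and compares it with one combined budget. It assumes the text comes from `hdfs dfs -count`, whose columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The sample numbers, the user `alice`, the 10 GiB budget and the function names are all made up for the example. Note that CONTENT_SIZE is, as far as I know, the logical size before replication, while HDFS space quotas count replicated bytes, so this only approximates what a real quota would enforce.

```python
def parse_count_output(text):
    """Parse `hdfs dfs -count` output into {path: content_size_in_bytes}.

    Expected columns per line: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
    Lines that do not have four fields are skipped.
    """
    usage = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue
        _dir_count, _file_count, size, path = parts
        usage[path.strip()] = int(size)
    return usage


def combined_usage(usage, paths):
    """Total bytes used across all directories owned by one team."""
    return sum(usage[p] for p in paths)


def over_budget(usage, paths, budget_bytes):
    """True if the team's directories together exceed the combined budget."""
    return combined_usage(usage, paths) > budget_bytes


# Hypothetical sample output, not taken from a real cluster.
SAMPLE = """\
          12          340        5000000000 /datascience
           8          120        2000000000 /hivewarehouse/datascience
           3           15         500000000 /user/alice
"""

DATASCIENCE_PATHS = ["/datascience", "/hivewarehouse/datascience", "/user/alice"]
BUDGET_BYTES = 10 * 1024**3  # hypothetical combined budget of 10 GiB

if __name__ == "__main__":
    usage = parse_count_output(SAMPLE)
    print(combined_usage(usage, DATASCIENCE_PATHS))
    print(over_budget(usage, DATASCIENCE_PATHS, BUDGET_BYTES))
```

A script like this can only report after the fact. It cannot reject a write the way a native quota would, which is why built-in support is being asked for.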
It would be great if HDFS had a quota tied either
- to a set of HDFS locations;
- or to a group of people (where `group` is defined as the HDFS group that a particular file/directory belongs to).
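As a rough sketch of what the second option could mean, the snippet below charges each file's size to its owning HDFS group, wherever the file lives, and checks the totals against per-group limits. This is only an illustration of the accounting. It is not an existing HDFS feature or API, and the entries, group names and limits are hypothetical.

```python
from collections import defaultdict


def usage_by_group(entries):
    """Sum file sizes per owning group.

    `entries` is an iterable of (path, group, size_in_bytes) tuples.
    """
    totals = defaultdict(int)
    for _path, group, size in entries:
        totals[group] += size
    return dict(totals)


def groups_over_quota(entries, quotas):
    """Return the sorted names of groups whose total usage exceeds their quota.

    Groups without an entry in `quotas` are treated as unlimited.
    """
    totals = usage_by_group(entries)
    return sorted(
        group
        for group, used in totals.items()
        if group in quotas and used > quotas[group]
    )


# Hypothetical file listing: (path, owning group, size in bytes).
ENTRIES = [
    ("/datascience/raw.csv", "datascience", 400),
    ("/hivewarehouse/datascience/t1/part-0", "datascience", 700),
    ("/analysts/report.parquet", "analysts", 300),
    ("/user/alice/notes.txt", "datascience", 50),
]

QUOTAS = {"datascience": 1000, "analysts": 1000}  # hypothetical limits in bytes

if __name__ == "__main__":
    print(usage_by_group(ENTRIES))
    print(groups_over_quota(ENTRIES, QUOTAS))
```

With this model a file is charged to its group no matter which directory tree it sits in, so the staging, Hive warehouse and /user locations would all count against the same limit.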
Linux allows quotas to be defined at the group level, e.g. `edquota -g devel`. It would be great to have the same at the HDFS level.
Other thoughts and ideas?