Description
The Multipart Uploaders are created via service loaders. This is troublesome:
- HADOOP-12636, HADOOP-13323 and HADOOP-13625 highlight how the load process forces the transitive loading of dependencies. If a dependent class cannot be loaded (e.g. aws-sdk is not on the classpath), that service won't load. Without error handling around the load process, this stops any uploader from loading. Even with that error handling, the cost of that load, especially with reshaded dependencies, hurts performance (HADOOP-13138).
- It makes wrapping the load with any filter impossible, and stops transitive binding through viewFS, mocking, etc.
- It complicates security in a Kerberized world. If you have an FS instance of user A, then you should be able to create an MPU instance with that user's permissions. Currently, if a service were to try to create one, you'd be looking at doAs() games around the service loading, and a more complex bind process.
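For illustration, the "error handling around the load process" mentioned above could look roughly like the sketch below. It uses only java.util.ServiceLoader; the UploaderFactory interface and the loadAvailable() helper are hypothetical names invented for this example, not Hadoop classes. It only catches ServiceConfigurationError, so it does nothing about the classloading cost itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class TolerantServiceLoading {

    /** Hypothetical SPI used only for this sketch; not a Hadoop interface. */
    public interface UploaderFactory {
        String scheme();
    }

    /**
     * Load every provider that can be instantiated and skip the ones that
     * fail, so that one broken provider does not hide all the others.
     */
    public static <T> List<T> loadAvailable(Class<T> spi) {
        List<T> loaded = new ArrayList<>();
        Iterator<T> providers = ServiceLoader.load(spi).iterator();
        while (true) {
            try {
                if (!providers.hasNext()) {
                    break;
                }
                loaded.add(providers.next());
            } catch (ServiceConfigurationError e) {
                // Typically a provider whose dependencies are missing from
                // the classpath; log it and move on to the next provider.
                System.err.println("Skipping provider: " + e.getMessage());
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<UploaderFactory> factories = loadAvailable(UploaderFactory.class);
        System.out.println("Loaded " + factories.size() + " factories");
    }
}
```

Even with this in place, every provider on the classpath is still probed on each load, which is the performance concern raised above.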
Proposed
- Remove the service loader mechanism entirely.
- Add a createMultipartUploader(path) call to FS & FC, which will create an uploader bound to the current FS, with its permissions, DTs, etc.
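A minimal sketch of the shape this proposal implies, using stand-in types rather than the real Hadoop classes: SketchFS, UserFS, CountingFilterFS and Uploader are all hypothetical names, and paths and users are plain strings for brevity. The point is only that the uploader is created by the filesystem instance, so it inherits that instance's identity, and that a wrapping filesystem can intercept the call, which service loading does not allow.

```java
import java.util.Objects;

public class BoundUploaderSketch {

    /** Hypothetical stand-in for a multipart uploader. */
    public interface Uploader {
        String owner();
        String basePath();
    }

    /** Hypothetical stand-in for the FS API carrying the proposed factory method. */
    public interface SketchFS {
        Uploader createMultipartUploader(String basePath);
    }

    /** A filesystem instance tied to one user; uploaders it creates inherit that user. */
    public static class UserFS implements SketchFS {
        private final String user;

        public UserFS(String user) {
            this.user = Objects.requireNonNull(user);
        }

        @Override
        public Uploader createMultipartUploader(String basePath) {
            final String owner = user;
            final String path = Objects.requireNonNull(basePath);
            return new Uploader() {
                @Override public String owner() { return owner; }
                @Override public String basePath() { return path; }
            };
        }
    }

    /** A filter that wraps another filesystem and can intercept uploader creation. */
    public static class CountingFilterFS implements SketchFS {
        private final SketchFS inner;
        private int created;

        public CountingFilterFS(SketchFS inner) {
            this.inner = Objects.requireNonNull(inner);
        }

        @Override
        public Uploader createMultipartUploader(String basePath) {
            created++;                       // the filter sees every creation
            return inner.createMultipartUploader(basePath);
        }

        public int created() {
            return created;
        }
    }

    public static void main(String[] args) {
        SketchFS fs = new CountingFilterFS(new UserFS("alice"));
        Uploader uploader = fs.createMultipartUploader("/data/out");
        System.out.println(uploader.owner() + " " + uploader.basePath());
    }
}
```

Because the caller never touches a service loader, no doAs() wrapping is needed in this sketch: whoever holds the filesystem instance gets an uploader with that instance's identity.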
Issue Links
- breaks
  - HADOOP-17233 Fix unit test of HDFS-13934 (Resolved)
  - HDFS-15471 TestHDFSContractMultipartUploader fails on trunk (Resolved)
- relates to
  - HDFS-15466 remove META-INF/services/org.apache.hadoop.fs.MultipartUploader file (Open)
  - HADOOP-16150 checksumFS doesn't wrap concat(): concatenated files don't have checksums (Resolved)
- links to