Details
Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Use case: Suppose the source file (f1.txt) is 1 GB and the block size is 128 MB. I want to copy the file to the destination as a set of part files, named as follows:
f1.txt.part1
f1.txt.part2
....
By default, each part file is 128 MB (one block), except the last part, which may be smaller.
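The arithmetic behind this use case can be sketched in plain Java. This is only an illustration: the class PartFilePlan and its methods are hypothetical names, not part of the operator library, and the ".partN" suffix with a 1-based index is assumed from the file names listed above.

```java
public class PartFilePlan {

    /** Number of part files needed for a file of fileSize bytes (ceiling division). */
    static long partCount(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Size in bytes of the part with the given 1-based index; only the last part can be short. */
    static long partSize(long fileSize, long blockSize, long partIndex) {
        if (partIndex < 1 || partIndex > partCount(fileSize, blockSize)) {
            throw new IllegalArgumentException("partIndex out of range: " + partIndex);
        }
        long offset = (partIndex - 1) * blockSize;
        return Math.min(blockSize, fileSize - offset);
    }

    /** Assumed naming scheme: relative path of the source file plus ".part" and the 1-based index. */
    static String partName(String relativePath, long partIndex) {
        return relativePath + ".part" + partIndex;
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        long fileSize = 1024 * mb;  // 1 GB source file
        long blockSize = 128 * mb;  // 128 MB block size

        long parts = partCount(fileSize, blockSize);
        for (long i = 1; i <= parts; i++) {
            System.out.println(partName("f1.txt", i) + " "
                + partSize(fileSize, blockSize, i) / mb + " MB");
        }
    }
}
```

With the sizes from the use case, the file divides evenly into eight parts of 128 MB each. A file whose size is not a multiple of the block size would end with a shorter last part.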
Design: Currently, the BlockWriter is restricted to writing part files to the HDFS on which the app is running. To support the use case above, the operator needs the block index and the relative path of the source file. BlockMetadata, which is the type of the BlockWriter input port, does not carry this information.
So I am creating a new operator (PartFileWriter) that extends BlockWriter and has an input port of type FileMetadata.
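The sketch below shows why file-level information is needed to name part files. It is not the PartFileWriter implementation and does not use the library's classes: PartFileNameResolver, registerFile, and resolve are hypothetical names, and it assumes that file metadata supplies a relative path plus an ordered list of block IDs.

```java
import java.util.HashMap;
import java.util.Map;

public class PartFileNameResolver {

    // Block ID -> destination part file name, filled in when file metadata arrives.
    private final Map<Long, String> partNameByBlockId = new HashMap<>();

    /**
     * Records the part file name of every block in a file.
     * The position of a block ID in blockIds gives its block index (assumption).
     */
    void registerFile(String relativePath, long[] blockIds) {
        for (int i = 0; i < blockIds.length; i++) {
            partNameByBlockId.put(blockIds[i], relativePath + ".part" + (i + 1));
        }
    }

    /** Looks up the part file name for a block; fails if its file metadata was never seen. */
    String resolve(long blockId) {
        String name = partNameByBlockId.get(blockId);
        if (name == null) {
            throw new IllegalStateException("No file metadata seen for block " + blockId);
        }
        return name;
    }

    public static void main(String[] args) {
        PartFileNameResolver resolver = new PartFileNameResolver();
        resolver.registerFile("dir/f1.txt", new long[] {101L, 102L, 103L});
        System.out.println(resolver.resolve(102L));
    }
}
```

A block ID alone says nothing about which file the block belongs to or where it falls in that file. This is the gap that an input port carrying file metadata is meant to close.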