Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 0.18.0
- None
- None
- Reviewed
Description
The problem reported here is that when the default port number (8020) is specified in the output path, the job succeeds but no output is created. The cause is that the "listStatus" call drops the port number, because NameNode.getUri removes the default port.
Assume that a map/reduce output directory is set to "hdfs://localhost:8020/out". A "listStatus" call on any of its subdirectories, for example "hdfs://localhost:8020/out/tempXX", returns results like the one below:
hdfs://localhost/out/tempXX/part-00005
Because of this, the following code in Task.java (lines 574-575)
  private Path getFinalPath(Path jobOutputDir, Path taskOutput) {
    URI relativePath = taskOutputPath.toUri().relativize(taskOutput.toUri());
does not get the correct relativePath, because taskOutputPath contains the port but taskOutput does not.
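The mismatch can be reproduced with plain java.net.URI, without any Hadoop classes. The sketch below is illustrative only: the class name RelativizeDemo and its helper are hypothetical, and it assumes that the URIs behave like the ones returned by Path.toUri() in the description. URI.relativize returns the given URI unchanged when the two authorities differ, which is what happens once the port is dropped from one side.

```java
import java.net.URI;

public class RelativizeDemo {
    // Hypothetical helper that mirrors the relativize call quoted from Task.java.
    static URI relativize(String base, String child) {
        return URI.create(base).relativize(URI.create(child));
    }

    public static void main(String[] args) {
        String base = "hdfs://localhost:8020/out/tempXX";

        // Same authority on both sides: the result is a relative URI.
        System.out.println(relativize(base, "hdfs://localhost:8020/out/tempXX/part-00005"));

        // Port missing on the child: authorities differ, so the child
        // comes back unchanged as an absolute URI.
        System.out.println(relativize(base, "hdfs://localhost/out/tempXX/part-00005"));
    }
}
```

In the second case the returned URI is still absolute, so any code that expects a path relative to the task output directory receives the full path instead.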
It seems to me that the problem could be fixed if we make Path.makeQualified() return the same path no matter whether the input path contains the default port or not.
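One way to picture that kind of normalization is a helper that removes the port from a URI when it equals the default. This is only a sketch of the idea using java.net.URI. It is not the change that was committed for this issue, and the class DefaultPortNormalizer, the method stripDefaultPort, and the DEFAULT_PORT constant are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DefaultPortNormalizer {
    // Assumed default NameNode port, taken from the issue description.
    static final int DEFAULT_PORT = 8020;

    // Returns a URI without an explicit port when the port equals defaultPort;
    // any other URI is returned unchanged.
    static URI stripDefaultPort(URI uri, int defaultPort) throws URISyntaxException {
        if (uri.getPort() != defaultPort) {
            return uri;
        }
        // A port of -1 means "no port" in the java.net.URI constructor.
        return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), -1,
                       uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        URI withPort = URI.create("hdfs://localhost:8020/out");
        URI withoutPort = URI.create("hdfs://localhost/out");
        // Both spellings normalize to the same URI.
        System.out.println(stripDefaultPort(withPort, DEFAULT_PORT));
        System.out.println(stripDefaultPort(withoutPort, DEFAULT_PORT));
    }
}
```

If both the job output directory and the listed paths went through the same normalization, the relativize call would compare identical authorities.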
Attachments
Issue Links
- is blocked by
  HADOOP-4746 Job output directory should be normalized (Closed)
- relates to
  MAPREDUCE-837 harchive fail when output directory has URI with default port of 8020 (Resolved)