Description
When running Oozie jobs through Knox on a cluster with HDFS HA, one can use the logical HA service name for the NameNode. However, the workflow configuration file is not rewritten properly when the logical name is used, because the logical name does not include a port.
For example, given the following workflow configuration file:
<configuration>
  <property>
    <name>jobTracker</name>
    <value>JOBTRACKER</value>
    <!-- Example: <value>localhost:50300</value> -->
  </property>
  <property>
    <name>nameNode</name>
    <value>NAMENODE</value>
    <!-- Example: <value>hdfs://localhost:8020</value> -->
  </property>
  <property>
    <name>oozie.wf.application.path</name>
    <value>/user/guest/example</value>
    <!-- Example: <value>hdfs://localhost:8020/tmp/test</value> -->
  </property>
  <property>
    <name>user.name</name>
    <value>mapred</value>
  </property>
  <property>
    <name>inputDir</name>
    <value>/user/guest/example/input</value>
  </property>
  <property>
    <name>outputDir</name>
    <value>/user/guest/example/output</value>
  </property>
</configuration>
and a topology file containing the following NAMENODE service:
<service>
  <role>NAMENODE</role>
  <url>hdfs://ha-service</url>
</service>
the command:
curl -i -k -u guest:guest-password -H Content-Type:application/xml -T workflow-configuration.xml -X POST 'https://localhost:8443/gateway/sandbox/oozie/v1/jobs?action=start'
results in the following Oozie error:
E0902: Exception occured: [Incomplete HDFS URI, no host: hdfs://ha-service:NAMENODE/user/guest/example]
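The "no host" part of the message can be reproduced outside Knox and Oozie. The sketch below is only an illustration, not Knox's rewrite code: it assumes the HDFS client rejects any URI for which java.net.URI reports a null host, and the class name HaUriCheck and its describe helper are hypothetical. It parses the rewritten URI from the error alongside a port-less logical-name URI and a host:port URI to show how each is interpreted.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Knox, Oozie, or Hadoop.
class HaUriCheck {

    // Summarize how java.net.URI interprets the authority part of a URI string.
    static String describe(String s) {
        URI uri = URI.create(s);
        return "host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " authority=" + uri.getAuthority();
    }

    public static void main(String[] args) {
        // URI from the Oozie error: "NAMENODE" sits where a numeric port is expected,
        // so the authority is not parsed as host:port and the host is null.
        System.out.println(describe("hdfs://ha-service:NAMENODE/user/guest/example"));

        // Logical HA service name without a port: the host is parsed normally.
        System.out.println(describe("hdfs://ha-service/user/guest/example"));

        // Conventional host:port form, as in the non-HA examples in the workflow file.
        System.out.println(describe("hdfs://localhost:8020/user/guest/example"));
    }
}
```

Under that assumption, a port-less logical name such as hdfs://ha-service is itself a usable URI; the failure comes from the rewritten value carrying the literal NAMENODE in the port position.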