Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
JobClient cannot use a non-default JobTracker server: it always uses the JobTracker specified in conf/hadoop-default.xml or conf/hadoop-site.xml.
For users who run multiple Hadoop systems, it is useful to be able to specify the JobTracker on the command line.
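For reference, the JobTracker address comes out of the loaded configuration. A minimal sketch of reading it (the mapred.job.tracker key matches this era's hadoop-default.xml; the class and output are illustrative only):

import org.apache.hadoop.conf.Configuration;

public class ShowJobTracker {
  public static void main(String[] args) {
    // Loads hadoop-default.xml and hadoop-site.xml from the CLASSPATH.
    Configuration conf = new Configuration();
    // The JobTracker address comes from this property; "local" means in-process execution.
    String tracker = conf.get("mapred.job.tracker", "local");
    System.out.println("JobTracker: " + tracker);
  }
}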
Other Hadoop command-line tools, like DFSShell, already support this:
>bin/hadoop dfs
Usage: java DFSShell [-local | -dfs <namenode:port>] ...
Similarly, I propose adding a -jt parameter:
>bin/hadoop job
JobClient -submit <job> | -status <id> | -kill <id> [-jt <jobtracker:port>|<config>]
Where: -jt <jobtracker:port> is analogous to -dfs <namenode:port>
And: -jt <config> will load hadoop-<config>.xml as a final resource
The latter syntax is discoverable by users because on failure the tool will say:
>bin/hadoop job -kill m7n6pi -jt unknown
Exception in thread "main" java.lang.RuntimeException: hadoop-unknown.xml not found on CLASSPATH
Or in case of success:
>bin/hadoop job -kill job_m7n6pi -jt myconfig
060221 221911 parsing file:/trunk/conf/hadoop-default.xml
060221 221911 parsing file:/trunk/conf/hadoop-myconfig.xml
060221 221911 parsing file:/trunk/conf/hadoop-site.xml
060221 221911 Client connection to 66.196.91.10:7020: starting
And with a machine:port spec:
>bin/hadoop job -kill job_m7n6pi -jt machine:8020
060221 222109 parsing file:/trunk/conf/hadoop-default.xml
060221 222109 parsing file:/trunk/conf/hadoop-site.xml
060221 222109 Client connection to 66.196.91.10:8020: starting
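A minimal sketch of how the -jt value could be applied to the configuration (the helper name is hypothetical and the calls are illustrative, not the attached patch):

import org.apache.hadoop.conf.Configuration;

public class JobTrackerArg {
  // Apply a -jt value: either a jobtracker:port address or a named config.
  static void applyJobTrackerArg(Configuration conf, String jt) {
    if (jt.indexOf(':') >= 0) {
      // -jt <jobtracker:port>: point mapred.job.tracker at the given address.
      conf.set("mapred.job.tracker", jt);
    } else {
      // -jt <config>: load hadoop-<config>.xml from the CLASSPATH
      // (the proposal loads it as a final resource so it overrides site settings).
      conf.addResource("hadoop-" + jt + ".xml");
    }
  }
}

If hadoop-<config>.xml is not on the CLASSPATH, loading the configuration fails, which is what produces the "not found on CLASSPATH" error shown above.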
Patch attached.