YARN-8569: Create an interface to provide cluster information to application

Parent: YARN-8472 YARN Container Phase 2 (Hadoop YARN)

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: None

    Description

      Some programs require container hostnames to be known before the application can run. For example, distributed TensorFlow requires a launch_command that looks like:

      # On ps0.example.com:
      $ python trainer.py \
           --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
           --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
           --job_name=ps --task_index=0
      # On ps1.example.com:
      $ python trainer.py \
           --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
           --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
           --job_name=ps --task_index=1
      # On worker0.example.com:
      $ python trainer.py \
           --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
           --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
           --job_name=worker --task_index=0
      # On worker1.example.com:
      $ python trainer.py \
           --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
           --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
           --job_name=worker --task_index=1
      

      This is a bit cumbersome to orchestrate via Distributed Shell or the YARN Services launch_command. In addition, the dynamic parameters do not work with the YARN flex command. This is a classic pain point for application developers attempting to pass system environment settings as parameters to the end-user application.

      It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of a YARN service via a mounted file. The file content gets updated whenever a flex command is performed, which lets application developers consume system environment settings through a standard interface. It is like /proc/devices on Linux, but for Hadoop. This may involve updating a file in the distributed cache and allowing the file to be mounted into the container via container-executor.
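      To illustrate how an application inside the container might consume such a mounted cluster-information file, here is a minimal sketch. The file path and JSON layout (a list of components, each with its container hostnames) are assumptions for illustration only; the patches attached to this issue define the actual interface.

      ```python
      import json

      # Hypothetical sketch: read a cluster-spec file mounted into the container,
      # e.g. spec = json.load(open("/path/to/mounted/cluster-spec.json")),
      # and derive the comma-separated host lists that distributed TensorFlow
      # expects for --ps_hosts and --worker_hosts.

      def build_host_lists(spec, port=2222):
          """Group component instance hostnames into comma-separated host lists."""
          hosts = {}
          for comp in spec.get("components", []):
              members = ["%s:%d" % (c["hostname"], port)
                         for c in comp.get("containers", [])]
              hosts[comp["name"]] = ",".join(sorted(members))
          return hosts

      # Sample spec matching the trainer example above.
      spec = {
          "components": [
              {"name": "ps", "containers": [
                  {"hostname": "ps0.example.com"},
                  {"hostname": "ps1.example.com"}]},
              {"name": "worker", "containers": [
                  {"hostname": "worker0.example.com"},
                  {"hostname": "worker1.example.com"}]},
          ],
      }
      hosts = build_host_lists(spec)
      # hosts["ps"] == "ps0.example.com:2222,ps1.example.com:2222"
      ```

      Because the mounted file is refreshed on flex, the application can re-read it at startup (or periodically) instead of having the host lists baked into the launch_command.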

      Attachments

        1. YARN-8569.001.patch (5 kB, Eric Yang)
        2. YARN-8569.002.patch (45 kB, Eric Yang)
        3. YARN-8569.003.patch (53 kB, Eric Yang)
        4. YARN-8569.004.patch (52 kB, Eric Yang)
        5. YARN-8569 YARN sysfs interface to provide cluster information to application.pdf (77 kB, Eric Yang)
        6. YARN-8569.005.patch (54 kB, Eric Yang)
        7. YARN-8569.006.patch (65 kB, Eric Yang)
        8. YARN-8569.007.patch (65 kB, Eric Yang)
        9. YARN-8569.008.patch (64 kB, Eric Yang)
        10. YARN-8569.009.patch (63 kB, Eric Yang)
        11. YARN-8569.010.patch (62 kB, Eric Yang)
        12. YARN-8569.011.patch (58 kB, Eric Yang)
        13. YARN-8569.012.patch (58 kB, Eric Yang)
        14. YARN-8569.013.patch (59 kB, Eric Yang)
        15. YARN-8569.014.patch (59 kB, Eric Yang)
        16. YARN-8569.015.patch (59 kB, Eric Yang)
        17. YARN-8569.016.patch (59 kB, Eric Yang)


        People

          Assignee: Eric Yang (eyang)
          Reporter: Eric Yang (eyang)
          Votes: 0
          Watchers: 11
