Description
When starting a task with an s3a:// URI, the fetcher aborts while trying to bind to the slave's port 5051. The URI itself is downloaded successfully, but the bind error is fatal. If the URI is changed to http://, the error does not occur. The root cause appears to be that the mesos-fetcher process has LIBPROCESS_PORT=5051 in its environment, which I found with cat "/proc/`pgrep mesos-fetcher`/environ".
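The environment check above, and one possible way to keep a child process from inheriting the port, can be sketched in Python. This is only an illustration, not how Mesos handles it: the helper names (`parse_environ`, `without_libprocess_port`, `child_sees_port`) are hypothetical, and dropping only LIBPROCESS_PORT (rather than every LIBPROCESS_* variable) is an assumption.

```python
import os
import subprocess
import sys


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE entries of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty chunk after the trailing NUL
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def without_libprocess_port(env: dict) -> dict:
    """Return a copy of env with LIBPROCESS_PORT removed (assumed mitigation)."""
    return {k: v for k, v in env.items() if k != "LIBPROCESS_PORT"}


def child_sees_port(env: dict) -> str:
    """Spawn a child process with env and report the LIBPROCESS_PORT it sees."""
    code = "import os; print(os.environ.get('LIBPROCESS_PORT', 'unset'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For a live process, `parse_environ(open("/proc/<pid>/environ", "rb").read())` would give the same information as the `cat` command, one variable per dictionary key.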
stderr from a failing task:
I0203 00:11:55.815500 4964 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"s3a:\/\/strava.mesos\/foo"}}],"sandbox_directory":"\/mnt\/mesos\/slaves\/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0\/frameworks\/fe927665-1516-46cf-94dd-6d2ca84007f1-0000\/executors\/uris-test.bc047306-ca0a-11e5-b742-e2162bf6108e\/runs\/24ebd807-b065-4776-a0bf-84bda4a82f01"}
I0203 00:11:55.816830 4964 fetcher.cpp:379] Fetching URI 's3a://strava.mesos/foo'
I0203 00:11:55.816846 4964 fetcher.cpp:250] Fetching directly into the sandbox directory
I0203 00:11:55.816864 4964 fetcher.cpp:187] Fetching URI 's3a://strava.mesos/foo'
I0203 00:11:56.191640 4964 fetcher.cpp:109] Downloading resource with Hadoop client from 's3a://strava.mesos/foo' to '/mnt/mesos/slaves/ede0e5bc-d7ac-4b9a-8d35-b210fa785db0-S0/frameworks/fe927665-1516-46cf-94dd-6d2ca84007f1-0000/executors/uris-test.bc047306-ca0a-11e5-b742-e2162bf6108e/runs/24ebd807-b065-4776-a0bf-84bda4a82f01/foo'
F0203 00:11:56.192503 4964 process.cpp:892] Failed to initialize: Failed to bind on 0.0.0.0:5051: Address already in use: Address already in use [98]
*** Check failure stack trace: ***
@ 0x7f229ce50e7d google::LogMessage::Fail()
@ 0x7f229ce52c10 google::LogMessage::SendToLog()
@ 0x7f229ce50a42 google::LogMessage::Flush()
@ 0x7f229ce50c89 google::LogMessage::~LogMessage()
@ 0x7f229ce51c32 google::ErrnoLogMessage::~ErrnoLogMessage()
@ 0x7f229cdf16b9 process::initialize()
@ 0x7f229cdf2f36 process::ProcessBase::ProcessBase()
@ 0x7f229ce22875 process::reap()
@ 0x7f229ce2ced7 process::subprocess()
@ 0x7f229c50ab7b HDFS::copyToLocal()
@ 0x40f03e download()
@ 0x40b69f main
@ 0x7f229adc8a40 (unknown)
@ 0x40cf59 _start
Aborted (core dumped)
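The trace shows process::initialize() being reached from process::subprocess() inside HDFS::copyToLocal(), and failing to bind a port that another process already holds. The underlying socket behaviour can be reproduced in isolation with a minimal Python sketch. It does not involve Mesos or libprocess; the function name `second_bind_errno` is hypothetical, and it uses an ephemeral loopback port rather than 5051.

```python
import errno
import socket


def second_bind_errno() -> int:
    """Bind one listening socket, then try to bind a second to the same port.

    Returns the errno raised by the second bind, or 0 if it unexpectedly
    succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the kernel for any free port; this stands in for the
        # slave already listening on its own port.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # The second socket plays the role of the fetcher that inherited
            # the same port number through its environment.
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    code = second_bind_errno()
    print(code, errno.errorcode.get(code, "no error"))
```

On Linux the returned value is `errno.EADDRINUSE`, which is the `[98]` shown in the log line above.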
Issue Links
- relates to
  - MESOS-4598 Logrotate ContainerLogger should not remove IP from environment. (Resolved)
  - MESOS-4609 Subprocess should be more intelligent about setting/inheriting libprocess environment variables (Accepted)