Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
REEF Evaluators and Drivers allocate all of their host container's memory to the JVM heap. This has bitten us in the past when Netty's buffers, which live outside the heap, pushed us over the memory limit, which (on YARN) results in killed containers.
As a remedy, we added the JVM heap "slack" option, to leave some memory unallocated to the heap. This "solved" the issue at hand, but is hardly a principled solution.
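To make the "slack" arithmetic concrete, here is a minimal sketch of how a heap limit could be derived from the container size. The class and method names are hypothetical and are not part of the REEF API; only the `-Xmx` flag is a real JVM option.

```java
// Hypothetical helper for illustration only; not part of the REEF API.
final class HeapSlack {
    private HeapSlack() {
    }

    /**
     * Returns the heap size in MB after leaving the given fraction
     * ("slack") of the container memory unallocated to the heap.
     */
    static long heapMegabytes(final long containerMegabytes, final double slack) {
        if (containerMegabytes <= 0) {
            throw new IllegalArgumentException("Container memory must be positive");
        }
        if (slack < 0.0 || slack >= 1.0) {
            throw new IllegalArgumentException("Slack must be in [0, 1)");
        }
        // Round down so the heap never exceeds its share of the container.
        return (long) Math.floor(containerMegabytes * (1.0 - slack));
    }

    /** Formats the result as a standard JVM maximum heap flag. */
    static String xmxFlag(final long containerMegabytes, final double slack) {
        return "-Xmx" + heapMegabytes(containerMegabytes, slack) + "m";
    }
}
```

With a 4096 MB container and a slack of 0.25, `HeapSlack.xmxFlag(4096, 0.25)` yields `-Xmx3072m`, leaving 1024 MB for everything outside the heap. The single fraction is the weakness: it cannot say what the remaining memory is for.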
Also, this is fundamentally an application concern, not one that REEF can generically solve. Consider Tasks that spawn processes as an example: In that case, the majority of the container memory should be used for that process, not the JVM hosting the Task.
Hence, we should add fine-grained control over the JVM startup options to the REEF API.
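As an illustration of what such control might look like, the sketch below builds a JVM command line from explicitly chosen options. `JvmLaunchCommand` and its methods are assumptions made for this example, not the API that was added to REEF; `-Xmx` and `-XX:MaxDirectMemorySize` are real HotSpot options.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical builder for illustration only; not the actual REEF API.
final class JvmLaunchCommand {
    private final List<String> options = new ArrayList<>();
    private final String mainClass;

    JvmLaunchCommand(final String mainClass) {
        this.mainClass = mainClass;
    }

    /** Caps the JVM heap, independent of the container size. */
    JvmLaunchCommand maxHeapMegabytes(final long megabytes) {
        options.add("-Xmx" + megabytes + "m");
        return this;
    }

    /** Caps direct (off-heap) buffer memory, e.g. for network buffers. */
    JvmLaunchCommand maxDirectMemoryMegabytes(final long megabytes) {
        options.add("-XX:MaxDirectMemorySize=" + megabytes + "m");
        return this;
    }

    /** Appends any other JVM option verbatim. */
    JvmLaunchCommand option(final String rawOption) {
        options.add(rawOption);
        return this;
    }

    /** Returns the command line as an unmodifiable list of tokens. */
    List<String> build() {
        final List<String> command = new ArrayList<>();
        command.add("java");
        command.addAll(options);
        command.add(mainClass);
        return Collections.unmodifiableList(command);
    }
}
```

For a Task that spawns a memory-hungry process inside a 4096 MB container, the application could request a small heap, for example `new JvmLaunchCommand("com.example.TaskLauncher").maxHeapMegabytes(512).maxDirectMemoryMegabytes(128).build()`, and leave the rest of the container to the child process. The main class name here is a placeholder.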
Issue Links
- is blocked by REEF-128 Replace the protocol buffer use in the runtime API with POJOs (Resolved)