Details
- Type: Sub-task
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
- Hadoop Flags: Reviewed
Description
We currently use int32 for memory. If a cluster has 10k nodes and each node has 210 GB of memory, the total cluster memory goes negative.
Another case that overflows int32 even more easily: we add the pending resources of all running apps into the cluster's total pending resources. If a problematic app requests too many resources (say 1M+ containers, each asking for 3 GB of memory), int32 is not enough.
Even if we cap each app's pending request, we cannot handle the case where there are many running apps, each with a capped but still significant amount of pending resources.
So we may need to add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource. A minimal sketch of the overflow and the long-based fix is shown below.
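To make the overflow concrete, here is a minimal, self-contained Java sketch. It is not Hadoop code: SimpleResource and its accessors are hypothetical stand-ins for o.a.h.y.api.records.Resource, used only to show how an int total wraps negative at 10k nodes with 210 GB each, while a long-returning accessor like the proposed getMemoryLong keeps the sum correct.

```java
// Sketch only; SimpleResource is a hypothetical stand-in, not the real Resource class.
public class ResourceOverflowSketch {

    static class SimpleResource {
        private final int memoryMb;              // existing int-based memory field
        SimpleResource(int memoryMb) { this.memoryMb = memoryMb; }

        int getMemory()      { return memoryMb; }   // legacy int accessor
        long getMemoryLong() { return memoryMb; }   // widened accessor (as proposed)
    }

    public static void main(String[] args) {
        SimpleResource perNode = new SimpleResource(210 * 1024); // 210 GB per node, in MB
        int nodes = 10_000;

        // Total is 10,000 * 215,040 MB = 2,150,400,000 MB, which exceeds Integer.MAX_VALUE.
        int totalInt = 0;
        long totalLong = 0;
        for (int i = 0; i < nodes; i++) {
            totalInt += perNode.getMemory();       // wraps around and turns negative
            totalLong += perNode.getMemoryLong();  // stays correct in 64 bits
        }

        System.out.println("int32 cluster memory: " + totalInt);   // negative value
        System.out.println("int64 cluster memory: " + totalLong);  // 2150400000
    }
}
```

The same wrap-around happens when pending requests are aggregated across apps (1M+ containers at 3 GB each already exceeds int32), which is why widening the accessors rather than capping individual requests is the more robust fix.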
Attachments
Issue Links
- breaks
  - SLIDER-1145 More Hadoop 2.8 changes break Slider mocking (Resolved)
- causes
  - YARN-5270 Solve miscellaneous issues caused by YARN-4844 (Resolved)
  - YARN-7270 Fix unsafe casting from long to int for class Resource and its sub-classes (Resolved)
- duplicates
  - YARN-4618 RM Stops allocating containers if large number of pending containers (Resolved)
- relates to
  - YARN-4618 RM Stops allocating containers if large number of pending containers (Resolved)
  - MAPREDUCE-6689 MapReduce job can infinitely increase number of reducer resource requests (Closed)