| Key | Parent | Summary | Assignee | Reporter | Status | Resolution |
|---|---|---|---|---|---|---|
| YARN-2841 | YARN-128 | RMProxy should retry EOFException | Jian He | Jian He | Closed | Fixed |
| YARN-2765 | YARN-128 | Add leveldb-based implementation for RMStateStore | Jason Darrell Lowe | Jason Darrell Lowe | Closed | Fixed |
| YARN-2406 | YARN-128 | Move RM recovery related proto to yarn_server_resourcemanager_recovery.proto | Tsuyoshi Ozawa | Jian He | Closed | Fixed |
| YARN-2404 | YARN-128 | Remove ApplicationAttemptState and ApplicationState class in RMStateStore class | Tsuyoshi Ozawa | Jian He | Closed | Fixed |
| YARN-2047 | YARN-128 | RM should honor NM heartbeat expiry after RM restart | Unassigned | Bikas Saha | Open | Unresolved |
| YARN-2039 | YARN-128 | Better reporting of finished containers to AMs | Unassigned | Karthik Kambatla | Resolved | Duplicate |
| YARN-1821 | YARN-128 | NPE on registerNodeManager if the request has containers for UnmanagedAMs | Karthik Kambatla | Karthik Kambatla | Closed | Fixed |
| YARN-1816 | YARN-128 | Succeeded application remains in accepted after RM restart | Jian He | Arpit Gupta | Closed | Fixed |
| YARN-1815 | YARN-128 | Work preserving recovery of Unmanged AMs | Subramaniam Krishnan | Karthik Kambatla | Resolved | Fixed |
| YARN-1812 | YARN-128 | Job stays in PREP state for long time after RM Restarts | Jian He | Yesha Vora | Closed | Fixed |
| YARN-1770 | YARN-128 | Execessive logging for app and attempts on RM recovery | Unassigned | Jian He | Open | Unresolved |
| YARN-1671 | YARN-128 | Revisit RMApp transitions from NEW on RECOVER | Unassigned | Karthik Kambatla | Resolved | Not A Problem |
| YARN-1618 | YARN-128 | Fix invalid RMApp transition from NEW to FINAL_SAVING | Karthik Kambatla | Karthik Kambatla | Closed | Fixed |
| YARN-1507 | YARN-128 | Apps should be saved after it's accepted by the scheduler | Jian He | Jian He | Open | Unresolved |
| YARN-1446 | YARN-128 | Change killing application to wait until state store is done | Jian He | Jian He | Closed | Fixed |
| YARN-1406 | YARN-128 | Check time cost for recovering max-app-limit applications | Jian He | Jian He | Resolved | Fixed |
| YARN-1405 | YARN-128 | RM hangs on shutdown if calling system.exit in serviceInit or serviceStart | Jian He | Yesha Vora | Closed | Fixed |
| YARN-1378 | YARN-128 | Implement a RMStateStore cleaner for deleting application/attempt info | Jian He | Jian He | Closed | Fixed |
| YARN-1348 | YARN-128 | Batching optimization for ZKRMStateStore | Tsuyoshi Ozawa | Tsuyoshi Ozawa | Resolved | Fixed |
| YARN-1307 | YARN-128 | Rethink znode structure for RM HA | Tsuyoshi Ozawa | Tsuyoshi Ozawa | Closed | Fixed |
| YARN-1239 | YARN-128 | Save version information in the state store | Jian He | Bikas Saha | Closed | Fixed |
| YARN-1214 | YARN-128 | Register ClientToken MasterKey in SecretManager after it is saved | Jian He | Jian He | Closed | Fixed |
| YARN-1210 | YARN-128 | During RM restart, RM should start a new attempt only when previous attempt exits for real | Omkar Vinit Joshi | Vinod Kumar Vavilapalli | Closed | Fixed |
| YARN-1207 | YARN-128 | AM fails to register if RM restarts within 5s of job submission | Unassigned | Arpit Gupta | Resolved | Duplicate |
| YARN-1195 | YARN-128 | RM may relaunch already KILLED / FAILED jobs after RM restarts | Jian He | Jian He | Resolved | Duplicate |
| YARN-1185 | YARN-128 | FileSystemRMStateStore can leave partial files that prevent subsequent recovery | Omkar Vinit Joshi | Jason Darrell Lowe | Closed | Fixed |
| YARN-1121 | YARN-128 | RMStateStore should flush all pending store events before closing | Jian He | Bikas Saha | Closed | Fixed |
| YARN-1116 | YARN-128 | Populate AMRMTokens back to AMRMTokenSecretManager after RM restarts | Jian He | Jian He | Closed | Fixed |
| YARN-1058 | YARN-128 | Recovery issues on RM Restart with FileSystemRMStateStore | Karthik Kambatla | Karthik Kambatla | Resolved | Fixed |
| YARN-1055 | YARN-128 | Handle app recovery differently for AM failures and RM restart | Unassigned | Karthik Kambatla | Resolved | Invalid |
| YARN-1017 | YARN-128 | Document RM Restart feature | Jian He | Jian He | Closed | Fixed |
| YARN-922 | YARN-128 | Change FileSystemRMStateStore to use directories | Jian He | Jian He | Closed | Fixed |
| YARN-920 | YARN-128 | List of applications at NM web UI is inconsistent with applications at RM UI after RM restart | Jian He | Jian He | Resolved | Cannot Reproduce |
| YARN-915 | YARN-128 | Apps Completed metrics on web UI is not correct after RM restart | Jian He | Jian He | Resolved | Duplicate |
| YARN-901 | YARN-128 | "Active users" field in Resourcemanager scheduler UI gives negative values | Unassigned | Nishan Shetty | Resolved | Cannot Reproduce |
| YARN-895 | YARN-128 | RM crashes if it restarts while the state-store is down | Jian He | Jian He | Closed | Fixed |
| YARN-891 | YARN-128 | Store completed application information in RM state store | Jian He | Bikas Saha | Closed | Fixed |
| YARN-709 | YARN-128 | verify that new jobs submitted with old RM delegation tokens after RM restart are accepted | Jian He | Jian He | Closed | Fixed |
| YARN-674 | YARN-128 | Slow or failing DelegationToken renewals on submission itself make RM unavailable | Omkar Vinit Joshi | Vinod Kumar Vavilapalli | Closed | Fixed |
| YARN-659 | YARN-128 | RMStateStore's removeApplication APIs should just take an applicationId | Tsuyoshi Ozawa | Vinod Kumar Vavilapalli | Resolved | Invalid |
| YARN-638 | YARN-128 | Restore RMDelegationTokens after RM Restart | Jian He | Jian He | Closed | Fixed |
| YARN-636 | YARN-128 | Restore clientToken for app attempt after RM restart | Jian He | Jian He | Resolved | Duplicate |
| YARN-582 | YARN-128 | Restore appToken and clientToken for app attempt after RM restart | Jian He | Bikas Saha | Closed | Fixed |
| YARN-581 | YARN-128 | Test and verify that app delegation tokens are added to tokenRenewer after RM restart | Jian He | Bikas Saha | Closed | Fixed |
| YARN-562 | YARN-128 | NM should reject containers allocated by previous RM | Jian He | Jian He | Closed | Fixed |
| YARN-540 | YARN-128 | Race condition causing RM to potentially relaunch already unregistered AMs on RM restart | Jian He | Jian He | Closed | Fixed |
| YARN-534 | YARN-128 | AM max attempts is not checked when RM restart and try to recover attempts | Jian He | Jian He | Closed | Fixed |
| YARN-514 | YARN-128 | Delayed store operations should not result in RM unavailability for app submission | Zhijie Shen | Bikas Saha | Closed | Fixed |
| YARN-513 | YARN-128 | Create common proxy client for communicating with RM | Jian He | Bikas Saha | Closed | Fixed |
| YARN-430 | YARN-128 | Add HDFS based store for RM which manages the store using directories | Jian He | Bikas Saha | Resolved | Not A Problem |
| YARN-353 | YARN-128 | Add Zookeeper-based store implementation for RMStateStore | Karthik Kambatla | Hitesh Shah | Closed | Fixed |
| YARN-248 | YARN-128 | Security related work for RM restart | Bikas Saha | Thomas White | Resolved | Duplicate |
| YARN-232 | YARN-128 | Add FileSystem based store for RM | Bikas Saha | Bikas Saha | Resolved | Duplicate |
| YARN-231 | YARN-128 | Add FS-based persistent store implementation for RMStateStore | Bikas Saha | Bikas Saha | Closed | Fixed |
| YARN-230 | YARN-128 | Make changes for RM restart phase 1 | Bikas Saha | Bikas Saha | Closed | Fixed |
| YARN-229 | YARN-128 | Remove old code for restart | Bikas Saha | Bikas Saha | Closed | Fixed |