Details
Bug
Status: Open
Critical
Resolution: Unresolved
3.0.0
None
None
Description
Suppose the NameNode is in safe mode and two JournalNodes are restarted for maintenance.
In this case, the restarted JournalNodes do not finalize the last edit segment, which is still in progress.
That segment is finalized or recovered only when the edit log is rolled, or when the epoch changes because of a NameNode failover.
In this scenario there is no failover; the NameNode is simply in safe mode. If we then leave safe mode, the active NameNode crashes.
For example, the currently open segment is edits_inprogress_0000000010356376710, but it is not recovered or finalized after JN2 restarts. I think we need to recover the edits after a JN restart.
Journal node log:
2020-06-20 16:11:53,458 INFO server.Journal (Journal.java:scanStorageForLatestEdits(193)) - Latest log is EditLogFile(file=/hadoop/hdfs/journal/xxx/current/edits_inprogress_0000000010356376710,first=0000000010356376710,last=0000000010356376710,inProgress=true,hasCorruptHeader=false)
2020-06-20 16:19:06,397 INFO ipc.Server (Server.java:logException(2435)) - IPC Server handler 3 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.journal from 10.x.x.x:28444 Call#49083225 Retry#0
org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write, no segment open
	at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:484)
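To make the failure mode in the description easier to follow, here is a minimal toy model of it. It is not Hadoop code: `QuorumSketch`, `ToyJournalNode`, `restart`, `startSegment`, and `writeWithQuorum` are hypothetical names made up for illustration. The sketch assumes that a restarted node rejects writes until a new segment is started, and that a write needs a majority of nodes to succeed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reported scenario. Hypothetical names; not the Hadoop implementation. */
public class QuorumSketch {

    /** Stand-in for a JournalNode: it accepts writes only while a segment is open. */
    static class ToyJournalNode {
        private final String name;
        private boolean segmentOpen;

        ToyJournalNode(String name, boolean segmentOpen) {
            this.name = name;
            this.segmentOpen = segmentOpen;
        }

        /** Assumption: after a restart the in-progress file is still on disk, but no segment is open. */
        void restart() {
            segmentOpen = false;
        }

        /** Assumption: an edit log roll or a new epoch makes the node open a segment again. */
        void startSegment() {
            segmentOpen = true;
        }

        /** Rejects the write when no segment is open, mirroring the message in the log above. */
        void journal(long txid) {
            if (!segmentOpen) {
                throw new IllegalStateException("Can't write, no segment open");
            }
        }
    }

    /** Sends one write to every node and reports whether a majority accepted it. */
    static boolean writeWithQuorum(List<ToyJournalNode> nodes, long txid) {
        int majority = nodes.size() / 2 + 1;
        int successes = 0;
        for (ToyJournalNode jn : nodes) {
            try {
                jn.journal(txid);
                successes++;
            } catch (IllegalStateException e) {
                System.out.println(jn.name + ": " + e.getMessage());
            }
        }
        return successes >= majority;
    }

    public static void main(String[] args) {
        List<ToyJournalNode> nodes = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            nodes.add(new ToyJournalNode("jn" + i, true));
        }

        // Two of the three nodes are restarted while no roll or failover happens.
        nodes.get(1).restart();
        nodes.get(2).restart();
        System.out.println("quorum after restarts: " + writeWithQuorum(nodes, 1L));

        // Once one restarted node opens a segment again, a majority is available.
        nodes.get(1).startSegment();
        System.out.println("quorum after new segment: " + writeWithQuorum(nodes, 2L));
    }
}
```

In this model a single restarted node is tolerated, because two of three nodes still accept the write. The problem only shows up when a second node is restarted before anything causes a new segment to be opened, which matches the two-JN maintenance case described above.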
{code:java}
Namenode log:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
10.x.x.x:8485: null [success]
2 exceptions thrown:
10.y.y.y:8485: Can't write, no segment open