  HBase / HBASE-25924

Seeing a spike in uncleanlyClosedWALs metric.

Details

    • Reviewed

    Description

      We see the following log line in all of our production clusters whenever WALEntryStream dequeues a WAL file.

       2021-05-02 04:01:30,437 DEBUG [04901996] regionserver.WALEntryStream - Reached the end of WAL file hdfs://<wal-file-name>. It was not closed cleanly, so we did not parse 8 bytes of data. This is normally ok.
      

      The 8 unparsed bytes are usually the end of the trailer: the serialized trailer size (SIZE_OF_INT, 4 bytes) followed by the magic string "LAWP" (4 bytes), for a total of 8 bytes.
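The arithmetic above can be checked with a short sketch. The class and method names here are hypothetical and not part of HBase; only the 4-byte int and the 4-byte "LAWP" magic come from the description.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not HBase code: models the bytes at the very end of a
// cleanly closed WAL file as described above.
class TrailerSuffix {
    // Magic string quoted in the description.
    static final byte[] MAGIC = "LAWP".getBytes(StandardCharsets.US_ASCII);

    // A 4-byte int holding the serialized trailer size, then the magic bytes.
    static int suffixLength() {
        return Integer.BYTES + MAGIC.length;
    }

    public static void main(String[] args) {
        System.out.println(suffixLength()); // prints 8
    }
}
```

This matches the "did not parse 8 bytes of data" figure in the log line.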

      Before dequeuing the WAL file, WALEntryStream resets the reader at the line marked HERE below.
      WALEntryStream

        private void tryAdvanceEntry() throws IOException {
          if (checkReader()) {
            readNextEntryAndSetPosition();
            if (currentEntry == null) { // no more entries in this log file - see if log was rolled
              if (logQueue.getQueue(walGroupId).size() > 1) { // log was rolled
                // Before dequeueing, we should always get one more attempt at reading.
                // This is in case more entries came in after we opened the reader,
                // and a new log was enqueued while we were reading. See HBASE-6758
                resetReader(); // <--- HERE
                readNextEntryAndSetPosition();
                if (currentEntry == null) {
                  if (checkAllBytesParsed()) { // now we're certain we're done with this log file
                    dequeueCurrentLog();
                    if (openNextLog()) {
                      readNextEntryAndSetPosition();
                    }
                  }
                }
              } // no other logs, we've simply hit the end of the current open log. Do nothing
            }
          }
          // do nothing if we don't have a WAL Reader (e.g. if there's no logs in queue)
        }
      

      resetReader triggers the following call chain: WALEntryStream#resetReader ---> ProtobufLogReader#reset ---> ProtobufLogReader#initInternal.
      ProtobufLogReader#initInternal rebuilds the reader state from scratch to check whether any new data has been written.
      It resets every field of ProtobufLogReader except ReaderBase#fileLength.
      Whether a trailer is present is then decided from fileLength. Since fileLength is carried over from when the reader was first opened, that check can run against a stale length.
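To illustrate why a carried-over length matters, here is a minimal model. It is not HBase code and does not reproduce ProtobufLogReader's actual trailer check. The class StaleLengthReader, its methods, and the byte layout are assumptions made for this sketch, with cachedLength standing in for ReaderBase#fileLength.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified model, not HBase code: a reader that records the file length
// when it is opened and keeps using that value after a reset.
class StaleLengthReader {
    static final byte[] MAGIC = "LAWP".getBytes(StandardCharsets.US_ASCII);

    private final long cachedLength; // stands in for ReaderBase#fileLength

    StaleLengthReader(byte[] fileAtOpen) {
        this.cachedLength = fileAtOpen.length;
    }

    // Hypothetical layout: body, then a 4-byte trailer size (0 here), then the magic.
    static byte[] withTrailer(byte[] body) {
        return ByteBuffer.allocate(body.length + Integer.BYTES + MAGIC.length)
            .put(body).putInt(0).put(MAGIC).array();
    }

    // Checks whether the first `length` bytes of the file end with the magic.
    static boolean endsWithMagic(byte[] file, long length) {
        if (length < MAGIC.length || length > file.length) {
            return false;
        }
        int to = (int) length;
        return Arrays.equals(Arrays.copyOfRange(file, to - MAGIC.length, to), MAGIC);
    }

    // Trailer check using the length recorded at open time.
    boolean trailerPresentUsingCachedLength(byte[] currentFile) {
        return endsWithMagic(currentFile, cachedLength);
    }

    // Trailer check using the file's length at the time of the check.
    static boolean trailerPresentUsingCurrentLength(byte[] currentFile) {
        return endsWithMagic(currentFile, currentFile.length);
    }

    public static void main(String[] args) {
        byte[] open = {1, 2, 3, 4, 5};     // file while still being written
        byte[] closed = withTrailer(open); // same file after a clean close
        StaleLengthReader reader = new StaleLengthReader(open);
        System.out.println(reader.trailerPresentUsingCachedLength(closed));
        System.out.println(trailerPresentUsingCurrentLength(closed));
    }
}
```

In this model, the reader opened before the file was closed looks for the magic at the old end of file and misses the trailer, while a check against the current length finds it.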

            People

              shahrs87 Rushabh Shah