Kafka / KAFKA-5413

Log cleaner fails due to large offset in segment file


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.10.2.0, 0.10.2.1
    • Fix Version/s: 0.10.2.2, 0.11.0.0
    • Component/s: core
    • Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0

    Description

      The log cleaner thread in our brokers is failing with the trace below:

      [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
      [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
      [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
      java.lang.IllegalArgumentException: requirement failed: largest offset in message set can not be safely converted to relative offset.
              at scala.Predef$.require(Predef.scala:224)
              at kafka.log.LogSegment.append(LogSegment.scala:109)
              at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
              at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
              at kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
              at scala.collection.immutable.List.foreach(List.scala:381)
              at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
              at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
              at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
              at scala.collection.immutable.List.foreach(List.scala:381)
              at kafka.log.Cleaner.clean(LogCleaner.scala:362)
              at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
              at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
              at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
      [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
      

      This seems to point at the require check in kafka.log.LogSegment.append (LogSegment.scala:109 in the trace above). Both baseOffset and offset are of type long, and in this case their difference is actually larger than MAXINT. The check was introduced in this PR.
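To make the arithmetic concrete, here is a minimal sketch of the kind of check the error message describes. The class name, method name, and exact comparison are assumptions for illustration, not a copy of the Kafka source; the offsets are the ones reported in the log lines and segment dumps in this report.

```java
final class RelativeOffsetCheck {
    // Assumed form of the check: an offset is "safely convertible" when its
    // distance from the segment's base offset fits in a signed 32-bit int.
    static boolean canConvertToRelativeOffset(long offset, long baseOffset) {
        return offset - baseOffset <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long cleanedBaseOffset = 0L;           // both segments are cleaned "into 0"
        long firstOffsetInSegment0 = 1810054758L;
        long secondSegmentBase = 2147343575L;
        long firstOffsetInSecondSegment = 2147539884L;

        // Fits: 1810054758 is below Integer.MAX_VALUE (2147483647).
        System.out.println(canConvertToRelativeOffset(firstOffsetInSegment0, cleanedBaseOffset)); // true

        // Does not fit once the second segment is merged into base offset 0.
        System.out.println(canConvertToRelativeOffset(firstOffsetInSecondSegment, cleanedBaseOffset)); // false

        // Fits in its original segment, where the distance is only 196309.
        System.out.println(canConvertToRelativeOffset(firstOffsetInSecondSegment, secondSegmentBase)); // true
    }
}
```

Under these assumptions the offset is only a problem after the cleaner merges the second segment into the one with base offset 0.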

      These were the outputs of dumping the first two log segments

      :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --files /kafka-logs/__consumer_offsets-12/00000000000000000000.log
      Dumping /kafka-logs/__consumer_offsets-12/00000000000000000000.log
      Starting offset: 0
      offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
      
      :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --files /kafka-logs/__consumer_offsets-12/00000000002147343575.log
      Dumping /kafka-logs/__consumer_offsets-12/00000000002147343575.log
      Starting offset: 2147343575
      offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true payloadsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
      

      My guess is that we are hitting this exception because 2147539884 is larger than MAXINT. Was there a specific reason this check was added in 0.10.2?

      E.g. if the first message has key "key 0" and is followed by MAXINT + 1 messages with key "key 1", wouldn't we run into this situation whenever the log cleaner runs?
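The hypothetical scenario above can be sketched as follows. This is an illustration under stated assumptions, not the cleaner's actual logic: compaction is modelled as "keep the highest offset per key", every retained record is assumed to land in one cleaned segment with base offset 0, and the class and method names are invented for the example. Only the first and last "key 1" records are listed, since compaction would discard the ones in between anyway.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CompactionScenario {
    // Simplified model of compaction: keep only the highest offset seen per key.
    static Map<String, Long> latestOffsetPerKey(String[] keys, long[] offsets) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            latest.merge(keys[i], offsets[i], Long::max);
        }
        return latest;
    }

    // True when the offset's distance from the base offset fits in a signed 32-bit int.
    static boolean fitsRelative(long offset, long baseOffset) {
        return offset - baseOffset <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long lastKey1Offset = (long) Integer.MAX_VALUE + 1; // 2147483648
        String[] keys = {"key 0", "key 1", "key 1"};
        long[] offsets = {0L, 1L, lastKey1Offset};

        long cleanedBaseOffset = 0L; // assumption: everything is cleaned into segment 0
        for (Map.Entry<String, Long> e : latestOffsetPerKey(keys, offsets).entrySet()) {
            long relative = e.getValue() - cleanedBaseOffset;
            // The (int) cast shows what a 4-byte relative offset would hold without a check.
            System.out.println(e.getKey() + " offset=" + e.getValue()
                    + " fits=" + fitsRelative(e.getValue(), cleanedBaseOffset)
                    + " truncated=" + (int) relative);
        }
        // key 0 offset=0 fits=true truncated=0
        // key 1 offset=2147483648 fits=false truncated=-2147483648
    }
}
```

In this model the surviving "key 1" record cannot share a segment with "key 0" at base offset 0, and truncating its relative offset to 32 bits would wrap to a negative value.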

      Attachments

        1. kafka-5413.patch
          2 kB
          Kelvin Rutt
        2. 00000000000000000000.index.cleaned
          10.00 MB
          Nicholas Ngorok
        3. 00000000000000000000.timeindex.cleaned
          10.00 MB
          Nicholas Ngorok
        4. 00000000000000000000.log.cleaned
          48 kB
          Nicholas Ngorok
        5. 00000000002147422683.log
          0.4 kB
          Nicholas Ngorok
        6. 00000000000000000000.log
          48 kB
          Nicholas Ngorok

          People

            Assignee: Kelvin Rutt (Kelvinrutt)
            Reporter: Nicholas Ngorok (ny2ko)
            Votes: 0
            Watchers: 21
