  1. Kafka
  2. KAFKA-1194

The kafka broker cannot delete the old log files after the configured time

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.10.0.0, 0.11.0.0, 1.0.0
    • Fix Version/s: None
    • Component/s: log
    • window

    Description

      We tested on Windows with log.retention.hours set to 24:

      # The minimum age of a log file to be eligible for deletion
      log.retention.hours=24
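
As a rough illustration of what we expect this setting to mean, the sketch below treats a segment as eligible for deletion once its age exceeds the retention window. Measuring age from the file's last-modified time is an assumption made for illustration, and `RetentionCheck` and `isExpired` are hypothetical names, not Kafka code.

```java
import java.util.concurrent.TimeUnit;

public class RetentionCheck {
    /**
     * Returns true when the file is older than the retention window.
     * Assumption: age is measured from the last-modified timestamp.
     */
    static boolean isExpired(long lastModifiedMs, long nowMs, long retentionHours) {
        long retentionMs = TimeUnit.HOURS.toMillis(retentionHours);
        return nowMs - lastModifiedMs > retentionMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long threeDaysAgo = now - TimeUnit.DAYS.toMillis(3);
        // A file last modified three days ago, checked against log.retention.hours=24.
        System.out.println(isExpired(threeDaysAgo, now, 24));
    }
}
```

Under this reading, a segment that has been idle for several days should long since have been picked up by the retention task.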

      After several days, the Kafka broker still had not deleted the old log files, and we got the following exception:

      [2013-12-19 01:57:38,528] ERROR Uncaught exception in scheduled task 'kafka-log-retention' (kafka.utils.KafkaScheduler)
      kafka.common.KafkaStorageException: Failed to change the log file suffix from to .deleted for log segment 1516723
      at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:249)
      at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:638)
      at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:629)
      at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
      at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
      at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
      at scala.collection.immutable.List.foreach(List.scala:76)
      at kafka.log.Log.deleteOldSegments(Log.scala:418)
      at kafka.log.LogManager.kafka$log$LogManager$$cleanupExpiredSegments(LogManager.scala:284)
      at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:316)
      at kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:314)
      at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:743)
      at scala.collection.Iterator$class.foreach(Iterator.scala:772)
      at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:573)
      at scala.collection.IterableLike$class.foreach(IterableLike.scala:73)
      at scala.collection.JavaConversions$JListWrapper.foreach(JavaConversions.scala:615)
      at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:742)
      at kafka.log.LogManager.cleanupLogs(LogManager.scala:314)
      at kafka.log.LogManager$$anonfun$startup$1.apply$mcV$sp(LogManager.scala:143)
      at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:724)

      I think this error happens because Kafka tries to rename the log file while it is still open. We should close the file before renaming it.
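
A minimal sketch of the close-before-rename ordering suggested above, using only standard `java.nio` calls. `CloseBeforeRename`, `closeThenRename`, and the `segment.log` file name are hypothetical and only illustrate the ordering; this is not the actual `LogSegment` code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class CloseBeforeRename {
    /** Closes the channel first, then renames the file by appending the suffix. */
    static Path closeThenRename(FileChannel channel, Path file, String suffix) throws IOException {
        // Release the handle before asking the file system to rename the file.
        channel.close();
        Path target = file.resolveSibling(file.getFileName() + suffix);
        return Files.move(file, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("segment-demo");
        Path segment = dir.resolve("segment.log"); // hypothetical file name
        Files.write(segment, new byte[] {1, 2, 3});

        FileChannel channel = FileChannel.open(segment,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        Path renamed = closeThenRename(channel, segment, ".deleted");
        System.out.println("Renamed to " + renamed.getFileName());
    }
}
```

This ordering only covers handles that the code opened explicitly; a memory-mapped index file needs separate handling.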

      The index file is backed by a MappedByteBuffer, so closing the file may not be enough to release it. The Javadoc describes it as:
      "A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected."
      Fortunately, I found a forceUnmap function in the Kafka code; perhaps it can be used to release the MappedByteBuffer.
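
To illustrate the Javadoc statement quoted above, the sketch below maps a file, closes the channel, and then still reads through the mapping. It uses only standard `java.nio` calls; `MappingOutlivesChannel` and `firstByteAfterClose` are hypothetical names, and the sketch deliberately does not attempt to reproduce forceUnmap.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappingOutlivesChannel {
    /** Maps a non-empty file read-only, closes the channel, then reads byte 0 via the mapping. */
    static byte firstByteAfterClose(Path file) throws IOException {
        MappedByteBuffer buffer;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
        // The channel is closed here, but the mapping is still valid and readable.
        return buffer.get(0);
    }

    public static void main(String[] args) throws IOException {
        Path index = Files.createTempFile("index-demo", ".index"); // hypothetical file
        Files.write(index, new byte[] {42, 0, 0, 0});
        System.out.println(firstByteAfterClose(index));
    }
}
```

Because the mapping survives the channel, closing the index file alone would leave the file mapped until the buffer is collected, which is why an explicit unmap step looks necessary before the rename.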

      Attachments

        1. KAFKA-1194.patch (0.7 kB, Tao Qin)
        2. kafka-1194-v1.patch (2 kB, Tao Qin)
        3. kafka-1194-v2.patch (33 kB, Ravi Peri)
        4. Untitled.jpg (126 kB, Abhi)
        5. screenshot-1.png (88 kB, Abhi)
        6. kafka-bombarder.7z (50 kB, Kobi Hikri)
        7. RetentionExpiredWindows.txt (166 kB, Kobi Hikri)
        8. image-2018-09-12-14-25-52-632.png (175 kB, Kobi Hikri)
        9. image-2018-11-26-10-18-59-381.png (63 kB, Kobi Hikri)
        10. SendFor1194StreamCrash.zip (4 kB, lkgen)
        11. kafka370fixwin.patch (13 kB, lkgen)

            People

              Assignee: Unassigned
              Reporter: Tao Qin (tqin)
              Votes: 38
              Watchers: 68

              Dates

                Created:
                Updated:

                Time Tracking

                  Original Estimate: 72h
                  Remaining Estimate: 72h
                  Time Spent: Not Specified