  1. Apache Drill
  2. DRILL-5435

Using Limit causes Memory Leaked Error since 1.10

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.0
    • Fix Version/s: None
    • Component/s: Storage - Parquet
    • Labels: None

    Description

      Here are the details I can provide:

      We migrated our production system (a 220-node cluster) from Drill 1.9 to 1.10 just 5 days ago.

      Our logs show that 900+ queries ran without problems in the first 4 days. (These were similar queries that never use the `limit` clause.)

      Yesterday we started running simple ad hoc select * ... limit 10 queries (as we often do; this was our first use of limit with 1.10)
      and we got the `Memory was leaked` exception below.

      Also, once we get the error, most subsequent user queries fail with a Channel Closed exception. We need to restart Drill to bring it back to normal.

      A day later, I ran a similar select * limit 10 query and the same thing happened; we had to restart Drill again.

      The exception referred to a file (1_0_0.parquet).
      I moved that file to a smaller test cluster (12 nodes) and got the error on the first attempt, but I am no longer able to reproduce the issue with that file. The error on the 12-node cluster listed a different Column name and Row Group Start than the error on the 220-node cluster.
      The parquet file was generated by Drill 1.10.

      I tried the same file with a local drill-embedded 1.9 and 1.10 and had no issue.

      Here is the error (manually typed). If you think of anything obvious, let us know.

      AsyncPageReader - User Error Occurred: Exception Occurred while reading from disk (can not read class o.a.parquet.format.PageHeader: java.io.IOException: input stream is closed.)

      File:..../1_0_0.parquet
      Column: StringColXYZ
      Row Group Start: 115215476

      [Error Id: ....]
      at UserException.java:544)
      at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:199)
      at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.access(AsyncPageReader.java:81)
      at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.AsyncPageReaderTask.call(AsyncPageReader.java:483)
      at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.AsyncPageReaderTask.call(AsyncPageReader.java:392)
      at o.a.d.exec.store.parquet.columnreaders.AsyncPageReader.AsyncPageReaderTask.call(AsyncPageReader.java:392)
      ...
      Caused by: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: java.io.IOException: Input Stream is closed.
      at o.a.parquet.format.Util.read(Util.java:216)
      at o.a.parquet.format.Util.readPageHeader(Util.java:65)
      at o.a.drill.exec.store.parquet.columnreaders.AsyncPageReader(AsyncPageReaderTask:430)
      Caused by: parquet.org.apache.thrift.transport.TTransportException: Input stream is closed
      at ...read(TIOStreamTransport.java:129)
      at ....TTransport.readAll(TTransport.java:84)
      at ....TCompactProtocol.readByte(TCompactProtocol.java:474)
      at ....TCompactProtocol.readFieldBegin(TCompactProtocol.java:481)
      at ....InterningProtocol.readFieldBegin(InterningProtocol.java:158)
      at ....o.a.parquet.format.PageHeader.read(PageHeader.java:828)
      at ....o.a.parquet.format.Util.read(Util.java:213)

      Fragment 0:0
      [Error id: ...]
      o.a.drill.common.exception.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (524288)
      Allocator(op:0:0:4:ParquetRowGroupScan) 1000000/524288/39919616/10000000000
      at o.a.d.common.exceptions.UserException(UserException.java:544)
      at o.a.d.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
      at o.a.d.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
      at o.a.d.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262)
      ...
      Caused by: IllegalStateException: Memory was leaked by query. Memory leaked: (524288)
      at o.a.d.exec.memory.BaseAllocator.close(BaseAllocator.java:502)
      at o.a.d.exec.ops.OperatorContextImpl(OperatorContextImpl.java:149)
      at o.a.d.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:422)
      at o.a.d.exec.ops.FragmentContext.close(FragmentContext.java:411)
      at o.a.d.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:318)
      at o.a.d.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)

      This fixed the problem:
      alter <session|system> set `store.parquet.reader.pagereader.async`=false;
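
      Spelled out, the workaround above might look like the following. This is a sketch: the option name is taken from this report, while the sys.options lookup is only a suggested way to confirm the change, and the columns it returns vary by Drill version.

      ```sql
      -- Disable the asynchronous Parquet page reader for the current session only.
      ALTER SESSION SET `store.parquet.reader.pagereader.async` = false;

      -- Or disable it for every session (persists until changed back).
      ALTER SYSTEM SET `store.parquet.reader.pagereader.async` = false;

      -- Suggested check: inspect the option's current value.
      SELECT * FROM sys.options
      WHERE name = 'store.parquet.reader.pagereader.async';
      ```

      The session form is enough to test whether a single failing limit query is affected; the system form applies the workaround to all users.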

            People

              Assignee: Parth Chandra (parthc)
              Reporter: F Méthot (fmethot)
              Votes: 2
              Watchers: 5