  1. Apache IoTDB
  2. IOTDB-4873

Multi-user concurrent write and query + [ select into ] : ERROR o.a.i.c.m.t.MultiLeaderConsensusIService$AsyncProcessor$syncLog$1:903 - Exception inside handler

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0-SNAPSHOT
    • Fix Version/s: 0.14.0
    • Component/s: mpp-cluster
    • Labels: None
    • Sprint: 2022-11-Cluster

    Description

      master_1107_523e82a
      1. Start a 3-replica cluster with 3 ConfigNodes and 3 DataNodes (3C3D).
      2. Start the benchmark with concurrent writes and queries.
      3. After 16 hours, execute "select into" on ip62.
      About 1000 SQL statements are executed by a single user, each of the form:
      "select s_0,s_1,s_2,s_3,s_4,s_5,s_6,s_7,s_8,s_9,s_10 into root.test.g_1.:: from root.test.g_1.d_ip62_660"
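
The statement above follows a fixed pattern, so a batch of about 1000 statements can be generated rather than written by hand. The following is only a sketch of how such a batch might be built; the attached select_into.sh remains the authoritative script. The helper name build_select_into and the device index range (0 to 999, inferred from DEVICE_NUMBER=1000 in the benchmark configuration) are assumptions, not taken from the report.

```python
# Hypothetical helper: builds SELECT INTO statements shaped like the one in this report.
SENSORS = [f"s_{i}" for i in range(11)]  # s_0 .. s_10, as in the reported statement


def build_select_into(device_index, group="root.test.g_1", prefix="d_ip62_"):
    """Return one SELECT INTO statement for the given device index."""
    columns = ",".join(SENSORS)
    return f"select {columns} into {group}.:: from {group}.{prefix}{device_index}"


# Assumed range: one statement per benchmark device (DEVICE_NUMBER=1000).
statements = [build_select_into(i) for i in range(1000)]
print(statements[660])
```

Entry 660 of the generated list reproduces the statement quoted in this report; the statements would then be submitted one after another by a single user.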

      The ip62 DataNode logs the following errors:
      2022-11-08 09:27:31,366 [pool-20-IoTDB-MultiLeaderConsensusRPC-Processor-72] ERROR o.a.i.c.m.t.MultiLeaderConsensusIService$AsyncProcessor$syncLog$1:903 - Exception inside handler
      java.lang.NullPointerException: null
      at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.mergeInsertNodes(DataRegionStateMachine.java:376)
      at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.grabInsertNode(DataRegionStateMachine.java:295)
      at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.deserializeAndWrap(DataRegionStateMachine.java:272)
      at org.apache.iotdb.db.consensus.statemachine.DataRegionStateMachine.write(DataRegionStateMachine.java:325)
      at org.apache.iotdb.consensus.multileader.service.MultiLeaderRPCServiceProcessor.syncLog(MultiLeaderRPCServiceProcessor.java:132)
      at org.apache.iotdb.consensus.multileader.thrift.MultiLeaderConsensusIService$AsyncProcessor$syncLog.start(MultiLeaderConsensusIService.java:922)
      at org.apache.iotdb.consensus.multileader.thrift.MultiLeaderConsensusIService$AsyncProcessor$syncLog.start(MultiLeaderConsensusIService.java:865)
      at org.apache.thrift.TBaseAsyncProcessor.process(TBaseAsyncProcessor.java:103)
      at org.apache.thrift.server.AbstractNonblockingServer$AsyncFrameBuffer.invoke(AbstractNonblockingServer.java:603)
      at org.apache.thrift.server.Invocation.run(Invocation.java:18)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

      2022-11-08 09:27:50,962 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] ERROR o.a.i.d.m.e.o.p.AbstractIntoOperator:123 - Error occurred while inserting tablets in SELECT INTO: can't connect to node {}TEndPoint(ip:192.168.10.68, port:9003)
      2022-11-08 09:27:50,962 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] ERROR o.a.i.d.m.e.s.AbstractDriverThread:80 - [ExecuteFailed]
      org.apache.iotdb.db.exception.IntoProcessException: Error occurred while inserting tablets in SELECT INTO: can't connect to node {}TEndPoint(ip:192.168.10.68, port:9003)
      at org.apache.iotdb.db.mpp.execution.operator.process.AbstractIntoOperator.insertMultiTabletsInternally(AbstractIntoOperator.java:124)
      at org.apache.iotdb.db.mpp.execution.operator.process.IntoOperator.next(IntoOperator.java:73)
      at org.apache.iotdb.db.mpp.execution.driver.Driver.processInternal(Driver.java:186)
      at org.apache.iotdb.db.mpp.execution.driver.Driver.lambda$processFor$1(Driver.java:125)
      at org.apache.iotdb.db.mpp.execution.driver.Driver.tryWithLock(Driver.java:270)
      at org.apache.iotdb.db.mpp.execution.driver.Driver.processFor(Driver.java:118)
      at org.apache.iotdb.db.mpp.execution.schedule.DriverTaskThread.execute(DriverTaskThread.java:64)
      at org.apache.iotdb.db.mpp.execution.schedule.AbstractDriverThread.run(AbstractDriverThread.java:74)
      2022-11-08 09:27:50,966 [Query-Worker-Thread-48$20221108_012730_15774_3.1.0] WARN o.a.i.d.m.e.s.DriverScheduler$Scheduler:387 - The task 20221108_012730_15774_3.1.0 is aborted. All other tasks in the same query will be cancelled

      TEST ENV:
      1. 192.168.10.62 66 64 72CPU 256GB

      ConfigNode :
      MAX_HEAP_SIZE="12G"
      MAX_DIRECT_MEMORY_SIZE="6G"

      Common :
      config_node_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
      schema_replication_factor=3
      schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
      data_replication_factor=3
      data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
      query_timeout_threshold=36000000
      multi_leader_throttle_threshold_in_byte=536870912000

      DataNode :
      MAX_HEAP_SIZE="192G"
      MAX_DIRECT_MEMORY_SIZE="32G"
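
The two large numeric values in the Common settings are easier to read in human units. The short calculation below converts them, assuming that query_timeout_threshold is expressed in milliseconds (an assumption; the report does not state the unit) and taking the byte unit of multi_leader_throttle_threshold_in_byte from the parameter name.

```python
# Values copied from the Common section of this report.
query_timeout_threshold = 36_000_000  # assumed unit: milliseconds
multi_leader_throttle_threshold_in_byte = 536_870_912_000  # bytes, per the parameter name

timeout_hours = query_timeout_threshold / 1000 / 3600
throttle_gib = multi_leader_throttle_threshold_in_byte / 1024**3

print(f"query timeout: {timeout_hours:g} h")
print(f"throttle threshold: {throttle_gib:g} GiB")
```

Under that assumption the query timeout is 10 hours and the throttle threshold is 500 GiB.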

      2. benchmark configuration
      192.168.10.64 : /data/liuzhen_test/weektest/benchmark_tool

      DEVICE_NUMBER=1000
      SENSOR_NUMBER=3000
      CLIENT_NUMBER=100
      DEVICE_NAME_PREFIX=d_ip62_
      SG_STRATEGY=mod
      GROUP_NUMBER=1
      OPERATION_PROPORTION=70:1:1:1:1:0:1:1:1:1:1
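
To give a sense of the load these settings describe, the sketch below derives the total number of time series and the write/query mix. It assumes that the first weight in OPERATION_PROPORTION is the write (ingestion) operation and that the remaining weights are query types; that interpretation is an assumption and is not stated in the report.

```python
# Values copied from the benchmark configuration in this report.
DEVICE_NUMBER = 1000
SENSOR_NUMBER = 3000
OPERATION_PROPORTION = "70:1:1:1:1:0:1:1:1:1:1"

weights = [int(w) for w in OPERATION_PROPORTION.split(":")]
total = sum(weights)

# Assumption: the first weight is the write operation, the rest are queries.
write_share = weights[0] / total
query_share = sum(weights[1:]) / total
total_series = DEVICE_NUMBER * SENSOR_NUMBER

print(f"time series: {total_series}")
print(f"write share: {write_share:.1%}, query share: {query_share:.1%}")
```

With these numbers the workload covers 3,000,000 time series, and under the stated assumption roughly 89% of operations are writes.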

      3. "select into" is executed after the benchmark has been running for 16 hours (the benchmark is still running).
      The file is attached.

      Attachments

        1. image-2022-12-01-11-31-39-759.png
          376 kB
          刘珍
        2. image-2022-11-14-09-19-47-544.png
          112 kB
          Jinrui Zhang
        3. image-2022-11-14-09-18-10-120.png
          112 kB
          Jinrui Zhang
        4. image-2022-11-14-09-17-48-992.png
          112 kB
          Jinrui Zhang
        5. select_into.sh
          148 kB
          刘珍
        6. 4873.conf
          14 kB
          刘珍
        7. screenshot-1.png
          27 kB
          刘珍

            People

              Assignee: Haiming Zhu (HeimingZ)
              Reporter: 刘珍