FLINK-25395

FileNotFoundException during recovery caused by Incremental shared state being discarded by TM


Details

    Description

      Extracted from the FLINK-25185 discussion.

      When a checkpoint is aborted, or AsyncCheckpointRunnable fails for any reason,
      the runnable discards the state it produced, including shared (incremental) state.

      Since FLINK-24611, this creates a problem because shared state can be re-used by future checkpoints, which may then reference files that have already been deleted.
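
      The failure mode can be illustrated with a small standalone sketch. It is not Flink code: the class SharedStateDiscardSketch and its methods are hypothetical, temporary files stand in for shared SST files, and it assumes that the backend keeps treating an uploaded file as re-usable even after the checkpoint that uploaded it was aborted.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Flink.
final class SharedStateDiscardSketch {

    /** Simulates uploading one shared (incremental) file, e.g. an SST file. */
    static Path uploadSharedFile(Path dir) throws IOException {
        return Files.write(Files.createTempFile(dir, "shared-", ".sst"), new byte[] {1, 2, 3});
    }

    /** Simulates the abort path: the task deletes everything the checkpoint uploaded. */
    static void discardOnAbort(List<Path> uploaded) throws IOException {
        for (Path p : uploaded) {
            Files.deleteIfExists(p);
        }
    }

    /** Simulates recovery: every file referenced by the checkpoint must be readable. */
    static void restore(List<Path> referenced) throws IOException {
        for (Path p : referenced) {
            try (InputStream in = new FileInputStream(p.toFile())) {
                in.read();
            }
        }
    }

    /** Returns true if recovery fails because a re-used shared file is missing. */
    static boolean restoreFailsAfterAbort() throws IOException {
        Path dir = Files.createTempDirectory("shared-state-sketch");
        // Assumption: the backend remembers uploaded files for later re-use.
        List<Path> knownToBackend = new ArrayList<>();

        // Checkpoint 1 uploads a shared file and is then aborted.
        Path a = uploadSharedFile(dir);
        knownToBackend.add(a);
        discardOnAbort(Collections.singletonList(a)); // the backend is not notified

        // Checkpoint 2 is incremental: it re-uses the known file and adds a new one.
        List<Path> checkpoint2 = new ArrayList<>(knownToBackend);
        checkpoint2.add(uploadSharedFile(dir));

        try {
            restore(checkpoint2);
            return false;
        } catch (FileNotFoundException e) {
            return true; // the re-used file was deleted by the abort path
        } finally {
            // Clean up whatever is left of the simulated checkpoint.
            for (Path p : checkpoint2) {
                Files.deleteIfExists(p);
            }
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("restore failed: " + restoreFailsAfterAbort());
    }
}
```

      In this sketch the second checkpoint completes without error, and the missing file is only noticed at restore time, which mirrors why the problem surfaces during recovery rather than when the state is discarded.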

       

      A similar case is in PeriodicMaterializationManager (uploaded SST files will be deleted on failure without notifying the wrapped RocksDB state backend).

       

      The symptom of this failure is the following exception during recovery:

      Caused by: java.io.FileNotFoundException: /tmp/junit3146957979516280339/junit1602669867129285236/d6a6dbdd-3fd7-4786-9dc1-9ccc161740da (No such file or directory)
              at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
              at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
              at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_292]
              at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50) ~[flink-core-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
              at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:134) ~[flink-core-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
              at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:87) ~[flink-core-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
              at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68) ~[flink-runtime-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
              at org.apache.flink.changelog.fs.StateChangeFormat.read(StateChangeFormat.java:92) ~[flink-dstl-dfs-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
              at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) ~[flink-runtime-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
      

            People

              roman Roman Khachatryan