Details
- Improvement
- Status: Closed
- Not a Priority
- Resolution: Won't Fix
- 1.10.0
- None
Description
This is an alternative approach to FLINK-11499 for solving the problem of StreamingFileSink creating many small files with bulk formats, which have to be rolled on checkpoint.
A merge-based approach would require converting StreamingFileSink from a sink into an operator that works exactly as it does right now, with the same limitations (no support for arbitrary rolling policies for bulk formats), followed by a second operator tasked with merging small files in the background.
In the long term, we would probably like to have both the merge operator and the write-ahead log (WAL) solution described in FLINK-11499 as alternatives: the WAL would behave better if small files are common, and the merge operator could behave better if small files are rare (because of data skew, for example).
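The background merge step described above could look roughly like the following. This is a minimal sketch using only the Java standard library, not Flink code and not the design that was eventually adopted: the class `SmallFileMerger`, the method `mergeSmallFiles`, the `.inprogress` suffix, and the size threshold are all hypothetical names chosen for illustration. It also assumes that part files can be merged by plain byte concatenation, which holds for line-oriented text but not for bulk formats such as Parquet, where a real merge operator would have to re-read and re-encode the records.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Flink): merges small, already committed part files. */
final class SmallFileMerger {

    private SmallFileMerger() {}

    /**
     * Concatenates every regular file in {@code dir} that is smaller than
     * {@code thresholdBytes} into {@code target}, then deletes the merged inputs.
     * Returns the merged input files, or an empty list if fewer than two qualify.
     */
    static List<Path> mergeSmallFiles(Path dir, Path target, long thresholdBytes)
            throws IOException {
        List<Path> candidates;
        try (Stream<Path> files = Files.list(dir)) {
            // Sort by path so the merged output has a deterministic order.
            candidates = files.sorted().collect(Collectors.toList());
        }

        List<Path> small = new ArrayList<>();
        for (Path p : candidates) {
            if (Files.isRegularFile(p) && !p.equals(target) && Files.size(p) < thresholdBytes) {
                small.add(p);
            }
        }
        if (small.size() < 2) {
            return new ArrayList<>(); // nothing worth merging
        }

        // Write to a temporary sibling first so readers never observe a partial target.
        Path tmp = target.resolveSibling(target.getFileName() + ".inprogress");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            for (Path p : small) {
                Files.copy(p, out); // assumption: the format tolerates byte concatenation
            }
        }
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);

        // Remove the inputs only after the merged file is in place.
        for (Path p : small) {
            Files.delete(p);
        }
        return small;
    }
}
```

In an actual pipeline the writing operator would forward the paths of committed part files downstream, and the merge operator would have to coordinate deletion with checkpoints to keep exactly-once guarantees; the sketch leaves both out.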
Issue Links
- is related to: FLINK-11499 Extend StreamingFileSink BulkFormats to support arbitrary roll policies (Open)
- is superseded by: FLINK-25555 FLIP-191: Extend unified Sink interface to support small file compaction (Closed)
- relates to: FLINK-19345 In Table File Sink, introduce streaming sink compaction (Closed)