Description
When running a simple pipeline with a text file input and a text file output on Spark, the pipeline does not write the complete output to the output file.
How to reproduce:
1) Create a simple pipeline with two transforms: a Text File Input and a Text File Output.
2) Use any simple CSV/TXT file in the Text File Input transform.
3) Write the data to a text/CSV file using the Text File Output transform.
If we read x lines in step 2, we get only y lines in step 3, where x > y.
As the pipeline has no intermediate transforms, the output should be unchanged, i.e. x should equal y.
The output still does not match if we use zipped input, zipped output, or any other option in the input/output/execution window.
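The x versus y comparison above can be checked outside the pipeline with a short script. The following is a minimal sketch in Python, assuming both files are plain, unzipped text; the helper names count_lines and compare_line_counts are hypothetical and not part of Hop. If either file carries a header row, the counts would need to be adjusted accordingly.

```python
import sys
from pathlib import Path


def count_lines(path):
    """Count the lines in a file, streaming it so large files are fine."""
    # Binary mode avoids encoding errors; lines are split on b"\n".
    with Path(path).open("rb") as handle:
        return sum(1 for _ in handle)


def compare_line_counts(input_path, output_path):
    """Return (x, y, x - y): input lines, output lines, lines missing."""
    x = count_lines(input_path)
    y = count_lines(output_path)
    return x, y, x - y


if __name__ == "__main__":
    # Hypothetical usage: python compare_counts.py <input file> <output file>
    if len(sys.argv) == 3:
        x, y, missing = compare_line_counts(sys.argv[1], sys.argv[2])
        print(f"input lines x={x}, output lines y={y}, missing={missing}")
```

A non-zero "missing" value for a pipeline without intermediate transforms would confirm the behaviour described here.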
Attaching:
1) Simple pipeline - hop_pipeline_simple.hpl
2) Pipeline with different scenarios - hop_pipeline_multiple_scenarios.hpl
3) Input files - names.txt, names.zip and simple_mapping_output.txt_20220608_122959.txt
4) Output file - simple_mapping_output_2_20220608_172947.txt