Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.3.2
Fix Version/s: None
Description
Get the status (start/stop) of a Spark Structured Streaming job that has multiple sink actions
We are trying to get the status of a Structured Streaming job. The requirement is described below.
We want to push data to a Kafka topic that the job reads with the offset value set to latest, and we use Spark listeners to get the job status. However, we observed that the listener is invoked as soon as one of the Spark queries has started, even though the complete Spark job has not actually started, because the other queries are still initializing. This results in data loss: we push the data to the Kafka topic when the listener fires, and the Kafka server sets the offset value to the latest position while the job is still starting. Once the whole Spark job has started, it does not consume that data from Kafka, because the offset on the Kafka server has already been set to latest.
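One possible workaround, sketched here under stated assumptions rather than taken from this ticket, is to hold back the producer until every expected query has reported in, instead of reacting to the first listener callback. The class `QueryStartTracker` and the query names `sink_a` and `sink_b` are hypothetical. The sketch contains no Spark code; it only shows the coordination logic. In a real job, `mark_started` would be called from a `StreamingQueryListener` callback such as `onQueryStarted`, using the name assigned to each query with `queryName`. Whether a "started" event is late enough for the Kafka source to have fixed its starting offsets is itself an assumption; calling `mark_started` on the first progress event of each query may be the safer signal.

```python
import threading


class QueryStartTracker:
    """Hypothetical helper: records which expected streaming queries have reported in."""

    def __init__(self, expected_names):
        # Names of all queries that must be running before data is produced.
        self._pending = set(expected_names)
        self._lock = threading.Lock()
        self._all_started = threading.Event()
        if not self._pending:
            self._all_started.set()

    def mark_started(self, name):
        # Intended to be called from the listener callback, once per query.
        with self._lock:
            self._pending.discard(name)
            if not self._pending:
                self._all_started.set()

    def pending(self):
        # Queries that have not reported in yet, in sorted order.
        with self._lock:
            return sorted(self._pending)

    def wait_until_all_started(self, timeout=None):
        # Returns True once every expected query has reported in,
        # or False if the timeout expires first.
        return self._all_started.wait(timeout)


# Usage: two sinks are expected; the producer must wait for both.
tracker = QueryStartTracker(["sink_a", "sink_b"])
tracker.mark_started("sink_a")
ready_after_one = tracker.wait_until_all_started(timeout=0)
tracker.mark_started("sink_b")
ready_after_all = tracker.wait_until_all_started(timeout=0)
```

The producer side would call `wait_until_all_started` with a timeout before writing to the topic, and could log `pending()` if the wait fails, to show which queries are still initializing.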