Details
- Type: Sub-task
- Status: To Do
- Priority: Minor
- Resolution: Unresolved
Description
Spark Structured Streaming exposes simple metrics for each micro-batch, which make it possible to monitor a streaming job.
We can already report metrics through the Dropwizard library via the metricsEnabled setting, but I think it would be more convenient to implement a StreamingQueryListener, so that only the relevant events are delivered to the callback.
Personally, if we implement a KafkaStreamingListener that sends metrics to Kafka, it should then be easy to persist them in other storage, or to build dashboards and alerts on top of them.
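A minimal sketch of what such a listener could look like, assuming a plain Kafka producer and illustrative broker/topic names (the class name `KafkaStreamingListener` is the one proposed above, not an existing API):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Sketch: forward each micro-batch's progress report to a Kafka topic.
// bootstrapServers and topic are illustrative assumptions.
class KafkaStreamingListener(bootstrapServers: String, topic: String)
    extends StreamingQueryListener {

  private lazy val producer: KafkaProducer[String, String] = {
    val props = new Properties()
    props.put("bootstrap.servers", bootstrapServers)
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    new KafkaProducer[String, String](props)
  }

  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  // Invoked once per completed micro-batch; progress.json carries the
  // built-in metrics (input rate, processing rate, batch duration, ...),
  // keyed here by the query id so consumers can partition per query.
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val record = new ProducerRecord[String, String](
      topic, event.progress.id.toString, event.progress.json)
    producer.send(record)
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
    producer.close()
}
```

It would be registered on the session with something like `spark.streams.addListener(new KafkaStreamingListener("broker:9092", "streaming-metrics"))`; downstream consumers can then route the JSON into any storage or alerting pipeline.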