Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Motivation
The current implementation of the SQS streaming connector handles the following "route" for S3 notification events:
1. S3 -> SQS -> Spark
This approach works fine until multiple listeners (consumers) are needed for the same S3 path. When multiple applications need to listen to and process the same S3 path, the following approach is recommended:
2. S3 -> SNS -> SQS -> Spark
In this case, messages from one SNS topic can be routed to multiple SQS queues, which lets multiple applications listen to the same S3 path. With approach #2, the original S3 notification is wrapped in an SNS message and then delivered to the SQS queue. (link to the AWS docs describing SNS message format)
To extract the original S3 event from the SNS message, one needs to read the "Message" field of the JSON document.
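As a rough illustration of the extraction step, the sketch below parses an SQS message body that contains an SNS envelope and returns the S3 event stored in its "Message" field. The function name `extract_s3_event` and the sample payloads are hypothetical and trimmed down for readability; the sketch assumes the standard SNS envelope, in which "Message" holds the original notification as a JSON-encoded string.

```python
import json


def extract_s3_event(sqs_body: str) -> dict:
    """Return the S3 event carried in the "Message" field of an SNS envelope."""
    envelope = json.loads(sqs_body)
    # SNS stores the original payload as a JSON-encoded string,
    # so the "Message" value needs a second parse.
    return json.loads(envelope["Message"])


# Hypothetical, trimmed-down payloads for illustration only.
s3_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "data/part-0000.json"},
            },
        }
    ]
}
sns_body = json.dumps({"Type": "Notification", "Message": json.dumps(s3_event)})

record = extract_s3_event(sns_body)["Records"][0]
print(record["s3"]["bucket"]["name"], record["s3"]["object"]["key"])
```

The double parse is the important detail: reading "Message" directly yields a string, not the nested S3 event.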
Proposed approach
- Add an option to the s3-sqs connector: "messageWrapper"
- It can be 'None' or 'SNS'
- Default value is 'None'
If 'SNS' is specified, "unwrap" the original S3 notification event from the SNS message and continue processing.
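The proposed option handling could look roughly like the sketch below. It is written in Python purely for illustration and is not the connector's code: the names `unwrap_notification` and `ALLOWED_WRAPPERS` are hypothetical, and rejecting unknown values with an error is an assumption, since the proposal only lists 'None' and 'SNS'.

```python
import json

# Values listed in the proposal; 'None' is the default.
ALLOWED_WRAPPERS = ("None", "SNS")


def unwrap_notification(sqs_body: str, message_wrapper: str = "None") -> dict:
    """Return the S3 notification event, unwrapping an SNS envelope if requested."""
    if message_wrapper not in ALLOWED_WRAPPERS:
        # Assumption: unsupported values are rejected rather than ignored.
        raise ValueError(f"Unsupported messageWrapper: {message_wrapper!r}")
    payload = json.loads(sqs_body)
    if message_wrapper == "SNS":
        # The original S3 event is the JSON string stored under "Message".
        payload = json.loads(payload["Message"])
    return payload


# Hypothetical sample bodies: one delivered directly by S3, one wrapped by SNS.
s3_event = {"Records": [{"s3": {"object": {"key": "data/part-0000.json"}}}]}
direct_body = json.dumps(s3_event)
wrapped_body = json.dumps({"Type": "Notification", "Message": direct_body})

from_direct = unwrap_notification(direct_body)
from_wrapped = unwrap_notification(wrapped_body, message_wrapper="SNS")
print(from_direct == from_wrapped)
```

Keeping 'None' as the default means existing S3 -> SQS -> Spark pipelines continue to work without configuration changes.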