  1. Spark
  2. SPARK-1363

Add streaming support for Spark SQL module

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Some projects, such as Pig on Storm and SQL on Storm (Squall, SQLstream), can already query streaming data, but Spark Streaming has nothing comparable. Adding SQL support for streams to Spark SQL would be a valuable feature.

      From a semantic perspective, a DStream is much like an RDD: both provide join, filter, groupBy, and similar operators. A DStream is also backed by RDDs, so much of the existing Spark plan can be ported and reused.

      Catalyst also separates its steps cleanly, so we can fully reuse its parsing and logical plan analysis steps and change only the physical plan.

      We therefore propose adding streaming support to Catalyst.
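
The idea of sharing the logical plan and swapping only the physical execution can be sketched with a toy model. This is only an illustration under stated assumptions: `Scan`, `Filter`, `Project`, `run_batch`, and `run_stream` are hypothetical names, not Catalyst or Spark Streaming APIs, and a stream is modelled as a plain sequence of micro-batches, each standing in for the RDD that backs one DStream interval.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Union

Row = Dict[str, object]


# Hypothetical logical plan nodes (illustrative only, not Catalyst classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: "Plan"
    predicate: Callable[[Row], bool]


@dataclass(frozen=True)
class Project:
    child: "Plan"
    columns: tuple


Plan = Union[Scan, Filter, Project]


def run_batch(plan: Plan, tables: Dict[str, List[Row]]) -> List[Row]:
    """Batch execution: evaluate the logical plan over in-memory tables."""
    if isinstance(plan, Scan):
        return list(tables[plan.table])
    if isinstance(plan, Filter):
        return [r for r in run_batch(plan.child, tables) if plan.predicate(r)]
    if isinstance(plan, Project):
        return [{c: r[c] for c in plan.columns} for r in run_batch(plan.child, tables)]
    raise TypeError(f"unknown plan node: {plan!r}")


def run_stream(
    plan: Plan, stream_table: str, batches: Iterable[List[Row]]
) -> Iterator[List[Row]]:
    """Streaming execution: rebind the stream table to each micro-batch
    and reuse the same logical plan unchanged."""
    for batch in batches:
        yield run_batch(plan, {stream_table: batch})


# Roughly: SELECT word FROM words WHERE count > 1
plan = Project(Filter(Scan("words"), lambda r: r["count"] > 1), ("word",))

# The same plan object runs over a static table...
static_result = run_batch(
    plan, {"words": [{"word": "a", "count": 2}, {"word": "b", "count": 1}]}
)

# ...and over a sequence of micro-batches.
stream_result = list(
    run_stream(
        plan,
        "words",
        [
            [{"word": "x", "count": 3}],
            [{"word": "y", "count": 1}, {"word": "z", "count": 5}],
        ],
    )
)
```

Only `run_stream` knows about micro-batches in this sketch; the plan nodes are shared between both modes, which mirrors the proposal to keep the parsing and analysis steps and replace only the physical plan. A real design would also need windowing and state across batches, which this toy model leaves out.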

      Attachments

        1. StreamSQLDesignDoc.pdf
          157 kB
          Saisai Shao

            People

              Assignee: jerryshao Saisai Shao
              Reporter: jerryshao Saisai Shao
              Votes: 1
              Watchers: 14

              Dates

                Created:
                Updated:
                Resolved: