  1. Calcite
  2. CALCITE-4438

Calcite SqlParser: Improve support for parsing Spark SQL - INSERT OVERWRITE, RLIKE, DATE grammar

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • spark

    Description

      Calcite's SqlParser throws an error when parsing the following Spark SQL syntax: INSERT OVERWRITE, RLIKE, DATE, DAY, YEAR, MONTH. I am using version 1.26.0 of calcite-core and calcite-server.

      It also throws an error if year, day, month, identity, value, or date is used as an alias in a SQL query.

      For example -

      1) DATE QUERIES

      val query1 = "select DATE(gns_date) as dt"

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Incorrect syntax near the keyword 'DATE' at line 1, column 10.

      val query2 = "select date('2020-11-07') as date"

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "date (" at line 1, column 8.

      val query3 = "select loan_id as year, as_of_date as date"

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "year" at line 1, column 19.

       

      val query4 = "select {D'1990-01-01'} as day, a as month from table"

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "day" at line 1, column 27.

      2) INSERT OVERWRITE

      INSERT OVERWRITE TABLE sbg_schema.tableA
      select distinct SURV_MARK_CUST_IDEN
      ,latest_surv_oci
      ,tran_mark_cust_iden
      ,surv_tran_mci
      ,orgz_mark_cust_iden from tableB

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "OVERWRITE" at line 2, column 8.

      3) RLIKE

      val query1 = "select cola from tableA where MAX(realm_email) rlike '.@.\\\\..+'"

      StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "rlike" at line 1, column 48.

       

      Spark SQL parses all of these queries correctly, but Calcite's parser grammar file doesn't support these tokens.

       

      This is the code for creating sqlParser instance:

      object Spark3SqlDialect {
        val DefaultContext: SqlDialect.Context = SqlDialect.EMPTY_CONTEXT
          .withDatabaseProduct(SqlDialect.DatabaseProduct.SPARK)
          .withNullCollation(NullCollation.LOW)
          .withCaseSensitive(false)
          .withConformance(SqlConformanceEnum.BABEL)
          .withIdentifierQuoteString("`")
          .withQuotedCasing(Casing.UNCHANGED)
          .withUnquotedCasing(Casing.UNCHANGED)

        val DEFAULT = new SparkSqlDialect(DefaultContext)
      }


      val sqlParser: SqlParser = SqlParser.create(sqlQuery, createSqlParserConfig())


      private def createSqlParserConfig() = {
        Spark3SqlDialect.DEFAULT
          .configureParser(SqlParser.config().withParserFactory(SqlDdlParserImpl.FACTORY))
          .withConformance(SqlConformanceEnum.BABEL)
      }
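
      To make the failures easier to reproduce outside the superglue pipeline, the queries above could be run through a small standalone harness. This is only a sketch, assuming calcite-core and calcite-server 1.26.0 are on the classpath. The object name ParseRepro is made up for illustration, and the dialect step is left out so that only the parser factory and conformance from the report are exercised.

```scala
import org.apache.calcite.sql.parser.{SqlParseException, SqlParser}
import org.apache.calcite.sql.parser.ddl.SqlDdlParserImpl
import org.apache.calcite.sql.validate.SqlConformanceEnum

import scala.util.{Failure, Success, Try}

// Hypothetical repro harness; not part of Calcite or of the reporter's code.
object ParseRepro {

  // Same parser factory and conformance as in the report.
  // Assumption: the Spark3SqlDialect.configureParser step is skipped here.
  private val config: SqlParser.Config = SqlParser.config()
    .withParserFactory(SqlDdlParserImpl.FACTORY)
    .withConformance(SqlConformanceEnum.BABEL)

  // Queries copied from the report.
  private val queries = Seq(
    "select DATE(gns_date) as dt",
    "select date('2020-11-07') as date",
    "select loan_id as year, as_of_date as date",
    "select cola from tableA where MAX(realm_email) rlike '.@.\\\\..+'"
  )

  def main(args: Array[String]): Unit =
    queries.foreach { sql =>
      // parseStmt() throws SqlParseException when the grammar rejects the input.
      Try(SqlParser.create(sql, config).parseStmt()) match {
        case Success(_) =>
          println(s"OK:   $sql")
        case Failure(e: SqlParseException) =>
          println(s"FAIL: $sql")
          println(s"      ${e.getMessage}")
        case Failure(other) =>
          throw other
      }
    }
}
```

      Each query is parsed independently, so one failure does not hide the others, and the printed message can be compared with the SqlParseException text quoted above.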

       

            People

              Assignee: Unassigned
              Reporter: sambekar shradha
              Votes: 0
              Watchers: 4


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h