Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      There are 2 kinds of SQL keywords: reserved and non-reserved. Reserved keywords can't be used as identifiers.

      In Spark SQL, we are too tolerant about non-reserved keywords. A lot of keywords are non-reserved, and sometimes this causes ambiguity (IIRC we hit a problem when improving the INTERVAL syntax).

      I think it would be better to just follow other databases or the SQL standard when defining reserved keywords, so that we don't need to think very hard about how to avoid ambiguity.
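
To make the reserved/non-reserved distinction concrete, here is a minimal sketch of the check a parser effectively applies when it sees a word in identifier position. The two keyword sets and the function names are hypothetical, tiny illustrative samples; they are not Spark's actual keyword lists, and INTERVAL is placed in the non-reserved set purely for illustration.

```python
# Illustrative samples only: NOT Spark's (or any standard's) real keyword lists.
RESERVED = {"SELECT", "FROM", "WHERE"}
# Assumed non-reserved here purely for the sake of the example.
NON_RESERVED = {"INTERVAL", "COMMENT"}


def classify(word: str) -> str:
    """Classify a word as 'reserved', 'non-reserved', or 'identifier'."""
    upper = word.upper()
    if upper in RESERVED:
        return "reserved"
    if upper in NON_RESERVED:
        return "non-reserved"
    return "identifier"


def can_be_identifier(word: str) -> bool:
    """Only reserved keywords are rejected as unquoted identifiers."""
    return classify(word) != "reserved"


if __name__ == "__main__":
    for w in ("select", "interval", "my_col"):
        print(w, classify(w), can_be_identifier(w))
```

Under these assumptions, a non-reserved word such as `interval` is accepted as a column name, which is exactly the tolerance that can make a grammar ambiguous; moving it to the reserved set would reject it.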

      For reference: https://www.postgresql.org/docs/8.1/sql-keywords-appendix.html

          Activity

            cloud_fan Wenchen Fan added a comment - cc maropu LI,Xiao viirya mgaido
            viirya L. C. Hsieh added a comment -

            Thanks for pinging me.

            Is "In Spark SQL, we are too tolerant about non-reserved keywords" meaning that we have too many non-reserved keywords which should be defined as reserved keywords?

            mgaido Marco Gaido added a comment -

            cloud_fan thanks for pinging me. I agree on setting a rule. And I think if we want to do this, since it is a breaking change, 3.0 is the right version for it. I am wondering if we should create an umbrella JIRA for SQL standard compliance in 3.0: I also have some PRs which we can now revisit (e.g. failing on overflow) in order to achieve full (or at least better) SQL standard compliance. What do you think? Moreover, I think we should also decide which SQL standard we want to follow: SQL2011 maybe?

            cloud_fan Wenchen Fan added a comment -

            > Is "In Spark SQL, we are too tolerant about non-reserved keywords" meaning that we have too many non-reserved keywords which should be defined as reserved keywords?

            Yes

            > I am wondering if we should create an umbrella JIRA for SQL standard compliance in 3.0

            Sure, feel free to create one. BTW maybe SQL2003 is good enough, but we should follow the latest standard if there is a conflict: e.g. if 2003 says a keyword is non-reserved but 2011 says it is reserved, we should follow 2011.
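
The "follow the latest standard on conflict" rule above can be sketched as a lookup that walks the standards from newest to oldest. This is only an illustration: the table, the `KEYWORD_A`/`KEYWORD_B` entries, and the `resolve` helper are hypothetical placeholders, not the real contents of SQL:2003 or SQL:2011.

```python
from typing import Optional

# Hypothetical classification tables keyed by standard year.
# The entries are placeholders, not taken from the actual standards.
CLASSIFICATION = {
    2003: {"KEYWORD_A": "non-reserved", "KEYWORD_B": "reserved"},
    2011: {"KEYWORD_A": "reserved"},
}


def resolve(keyword: str) -> Optional[str]:
    """Return the status from the most recent standard that lists the keyword."""
    upper = keyword.upper()
    # Newest standard first, so a later classification overrides an older one.
    for year in sorted(CLASSIFICATION, reverse=True):
        status = CLASSIFICATION[year].get(upper)
        if status is not None:
            return status
    return None  # not a keyword in any listed standard


if __name__ == "__main__":
    print(resolve("keyword_a"))
    print(resolve("keyword_b"))
```

Here `KEYWORD_A` is classified differently by the two placeholder tables, so the 2011 entry wins; `KEYWORD_B` appears only in the 2003 table, so that classification is used as the fallback.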

            maropu Takeshi Yamamuro added a comment -

            Should these reserved words be handled inside SqlBase.g4? It seems PostgreSQL does so: https://github.com/postgres/postgres/blob/ee2b37ae044f34851baba69e9ba737077326414e/src/backend/parser/gram.y#L15366

            maropu Takeshi Yamamuro added a comment -

            I found some useful documents about the reserved words:
            https://developer.mimer.com/mimer-sql-standard-compliance/
            https://developer.mimer.com/wp-content/uploads/2018/05/Standard-SQL-Reserved-Words-Summary.pdf

            apachespark Apache Spark added a comment -

            User 'maropu' has created a pull request for this issue:
            https://github.com/apache/spark/pull/23259

            maropu Takeshi Yamamuro added a comment - Resolved by https://github.com/apache/spark/pull/23259

            People

              Assignee:
              maropu Takeshi Yamamuro
              Reporter:
              cloud_fan Wenchen Fan