Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Incomplete
Affects Version/s: 2.0.0
Fix Version/s: None
Description
Some databases allow you to specify a list of column names in the target of an INSERT INTO statement. For example, in SQLite:
sqlite> CREATE TABLE twocolumn (x INT, y INT);
sqlite> INSERT INTO twocolumn(x, y) VALUES (44,51), (NULL,52), (42,53), (45,45)
   ...> ;
sqlite> select * from twocolumn;
44|51
|52
42|53
45|45
I have a corpus of existing queries of this form which I would like to run on Spark SQL, so I think we should extend our dialect to support this syntax.
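For illustration, a minimal sketch of how the same statements might look once Spark SQL supports this; the column-list syntax here is the proposal itself, not existing behavior, and the USING clause is only an assumption to make the example self-contained:

-- Proposed (not yet supported): name the target columns explicitly.
CREATE TABLE twocolumn (x INT, y INT) USING parquet;
INSERT INTO twocolumn (x, y) VALUES (44, 51), (NULL, 52), (42, 53), (45, 45);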
When implementing this, we should make sure to test the following behaviors and corner cases; a SQL sketch of each follows below:
- Number of columns specified is greater than or less than the number of columns in the table.
- Specification of repeated columns.
- Specification of columns which do not exist in the target table.
- Specification of columns in a permuted order rather than the table's default order.
For each of these, we should check how SQLite behaves and should also compare against another database. It looks like T-SQL supports this; see https://technet.microsoft.com/en-us/library/dd776381(v=sql.105).aspx under the "Inserting data that is not in the same order as the table columns" header.
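A sketch of the corner cases above as SQLite-style statements; the expected outcomes in the comments are assumptions to be verified against SQLite and T-SQL, not confirmed behavior:

CREATE TABLE twocolumn (x INT, y INT);

-- Fewer columns specified than the table declares: presumably the
-- unnamed column gets its default value (NULL here).
INSERT INTO twocolumn (x) VALUES (1);

-- More values than named columns: expect an error.
INSERT INTO twocolumn (x) VALUES (1, 2);

-- Repeated column: expect an error.
INSERT INTO twocolumn (x, x) VALUES (1, 2);

-- Column that does not exist in the target table: expect an error.
INSERT INTO twocolumn (x, z) VALUES (1, 2);

-- Permuted column order: values should bind by name, so this stores x=44, y=51.
INSERT INTO twocolumn (y, x) VALUES (51, 44);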
Issue Links
- duplicates
  - SPARK-21548 Support insert into serial columns of table (Resolved)
- is duplicated by
  - SPARK-26234 Column list specification in INSERT statement (Closed)
  - SPARK-23193 Insert into Spark Table statement cannot specify column names (Closed)