Details
Bug
Status: Open
Major
Resolution: Unresolved
1.21.0
None
Enabled extractHeader in the csv config of dfs plugin.
No. of drillbits: Single
OS: Windows
Description
As per the documentation, Drill adds the prefix col_ to column names that start with a number or a special character.
/**
 * Prefix used to replace non-alphabetic characters at the start of
 * a column name. For example, $foo becomes col_foo. Used
 * because SQL does not allow _foo.
 */
public static final String COLUMN_PREFIX = "col_";
But in my case the prefix is added even to a column name that is entirely alphabetic.
I have the following data in the CSV file:
PRODUCTID | PRODUCTNAME | SUPPLIERID | CATEGORYID | UNIT | PRICE |
---|---|---|---|---|---|
1 | Chais | 1 | 1 | 10 boxes x 20 bags | 18 |
2 | Chang | 1 | 1 | 24 - 12 oz bottles | 19 |
3 | Aniseed Syrup | 1 | 2 | 12 - 550 ml bottles | 10 |
4 | Chef Anton's Cajun Seasoning | 2 | 2 | 48 - 6 oz jars | 22 |
5 | Chef Anton's Gumbo Mix | 2 | 2 | 36 boxes | 21.35 |
I query the CSV file with the following query:
SELECT * FROM dfs.`/var/lib/PRODUCT.csv`
The output is
I know about the other renaming rules, for example:
#UNITS is changed to col_UNITS
FINANCIAL$RECORD is changed to FINANCIAL_RECORD
But what about PRODUCTID? Why is it changed to col__PRODUCTID_? In this case extra underscores have been added as well.
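For reference, here is a minimal sketch of the renaming rules as I understand them from the comment and the examples above. It is not Drill's actual code: the class name `ColumnNameSketch` and the method `rewrite` are hypothetical, and keeping a leading digit after the prefix is my assumption.

```java
// Hypothetical sketch of the documented renaming rules; NOT Drill's implementation.
class ColumnNameSketch {

    // Same value as the COLUMN_PREFIX constant quoted above.
    static final String COLUMN_PREFIX = "col_";

    // Rewrites a raw header according to the rules described in this report.
    static String rewrite(String header) {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < header.length(); i++) {
            char ch = header.charAt(i);
            if (i == 0 && !Character.isAlphabetic(ch)) {
                // A leading non-alphabetic character is replaced by the prefix.
                buf.append(COLUMN_PREFIX);
                if (Character.isDigit(ch)) {
                    // Assumption: a leading digit is kept after the prefix.
                    buf.append(ch);
                }
            } else if (Character.isLetterOrDigit(ch)) {
                // Letters and digits are copied unchanged.
                buf.append(ch);
            } else {
                // Any other character becomes an underscore.
                buf.append('_');
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        String[] samples = {"#UNITS", "FINANCIAL$RECORD", "$foo", "PRODUCTID"};
        for (String s : samples) {
            System.out.println(s + " -> " + rewrite(s));
        }
    }
}
```

Under these rules an all-alphabetic header such as PRODUCTID passes through unchanged, so the underscores in col__PRODUCTID_ are not explained by the documented behaviour.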