Description
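It would be great to have an option in Spark's schema inference that keeps a column as a string, instead of converting it to an int/long datatype, when its values have leading zeros. Think zip codes, for example: inferred as an integer, "02134" becomes 2134 and the leading zero is lost.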
df = (sqlc.read.format('csv')
      .option('inferSchema', True)
      .option('header', True)
      .option('delimiter', '|')
      .option('leadingZeros', 'KEEP')  # this is the new proposed option
      .option('mode', 'FAILFAST')
      .load('csvfile_withzipcodes_to_ingest.csv')
      )
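To make the requested behavior concrete, here is a minimal pure-Python sketch of the inference rule that the proposed option would change. It is not Spark's implementation: the names `has_leading_zero` and `infer_column_type`, and the default value 'CONVERT', are hypothetical and exist only for illustration. Only 'KEEP' comes from the proposal above.

```python
import re

# Hypothetical helpers for illustration only; none of this is Spark code.
_INT_RE = re.compile(r"[+-]?\d+")


def has_leading_zero(value: str) -> bool:
    """True for digit strings such as '02134' or '-007'; False for '0' or '10'."""
    digits = value.lstrip("+-")
    return len(digits) > 1 and digits.startswith("0") and digits.isdigit()


def infer_column_type(values, leading_zeros="CONVERT"):
    """Return 'long' or 'string' for one column of raw CSV strings.

    leading_zeros='KEEP' mirrors the proposed option: a single value with a
    leading zero forces the whole column to 'string'. 'CONVERT' is an assumed
    name for today's behavior, where such columns are inferred as integers.
    """
    non_empty = [v.strip() for v in values if v and v.strip()]
    # Empty columns and columns with any non-integer text stay strings.
    if not non_empty or not all(_INT_RE.fullmatch(v) for v in non_empty):
        return "string"
    if leading_zeros == "KEEP" and any(has_leading_zero(v) for v in non_empty):
        return "string"
    return "long"


zip_codes = ["02134", "10001", "90210"]
print(infer_column_type(zip_codes))          # integer-like column today
print(infer_column_type(zip_codes, "KEEP"))  # preserved as text under the proposal
```

The sketch decides per column rather than per value, since a schema assigns one type to the whole column. A lone "0" is deliberately not treated as having a leading zero, so ordinary numeric columns that contain zero are still inferred as long.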
Issue Links
- is cloned by SPARK-29316 CLONE - schemaInference option not to convert strings with leading zeros to int/long (Resolved)