Details
- Type: Bug
- Status: Triage Needed
- Priority: P2
- Resolution: Fixed
Description
When writing to Snowflake in batch mode, the load fails if the number of staged files to import exceeds 1,000.
From the Snowflake docs:

"Of the three options for identifying/specifying data files to load from a stage, providing a discrete list of files is generally the fastest; however, the FILES parameter supports a maximum of 1,000 files, meaning a COPY command executed with the FILES parameter can only load up to 1,000 files."
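For concreteness, the failing statement has roughly this shape (a sketch; the table, stage, and file names are made up, not the exact SQL SnowflakeIO emits):

{code:java}
// Illustrative only: Snowflake rejects this statement once the
// FILES list grows past 1,000 entries.
String copy =
    "COPY INTO my_table FROM @my_stage"
        + " FILES = ('part-00000.csv', 'part-00001.csv' /* ... one entry per staged file */)";
{code}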
I noticed that the Snowflake Write in batch mode ignores the number of shards set by the user, and I think the first step should be to get the number of shards before writing so the staged-file count stays bounded (see the sketch below).
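A minimal sketch of that first step, assuming each shard produces one staged file (the class and method names here are hypothetical, not the SnowflakeIO API):

{code:java}
// Illustrative only: cap the shard count before writing so the
// resulting staged-file list fits in a single COPY FILES clause.
final class ShardPlanner {
  // Snowflake's documented cap on entries in COPY's FILES parameter.
  static final int MAX_FILES_PER_COPY = 1_000;

  // One shard produces one staged file, so capping shards caps files.
  static int effectiveShards(int userRequestedShards) {
    return Math.min(userRequestedShards, MAX_FILES_PER_COPY);
  }
}
{code}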
Longer term, should Beam issue multiple COPY statements, each with a distinct list of at most 1,000 files, when the number of files exceeds 1,000? Perhaps inside the same transaction (a BEGIN; ... COMMIT; block), as in the sketch below.
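A rough JDBC sketch of that longer-term idea, chunking the staged files into groups of at most 1,000 and running one COPY per chunk inside a single transaction so the load is all-or-nothing (table, stage, and connection details are hypothetical):

{code:java}
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import java.util.stream.Collectors;

final class ChunkedCopy {
  private static final int MAX_FILES_PER_COPY = 1_000;

  // Issues one COPY per chunk of <= 1,000 files, all in one
  // transaction, so either every file loads or none do.
  static void copyInChunks(Connection conn, String table, String stage,
                           List<String> files) throws SQLException {
    conn.setAutoCommit(false);
    try (Statement stmt = conn.createStatement()) {
      for (int i = 0; i < files.size(); i += MAX_FILES_PER_COPY) {
        List<String> chunk =
            files.subList(i, Math.min(i + MAX_FILES_PER_COPY, files.size()));
        String fileList = chunk.stream()
            .map(f -> "'" + f + "'")
            .collect(Collectors.joining(", "));
        stmt.execute(
            "COPY INTO " + table + " FROM @" + stage
                + " FILES = (" + fileList + ")");
      }
      conn.commit();
    } catch (SQLException e) {
      conn.rollback();
      throw e;
    }
  }
}
{code}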
Also, I wanted to set the Jira issue component to io-java-snowflake, but that component does not exist.