Details
Improvement
Status: Open
Minor
Resolution: Unresolved
10.0.1
None
None
Description
Casting a nullable field to a non-nullable field works, provided the column contains no nulls. For example, this is a valid cast:
    import pyarrow as pa

    table = pa.table({'column_1': pa.array([1, 2, 3])})
    # Mark every top-level field as non-nullable.
    table.cast(pa.schema([f.with_nullable(False) for f in table.schema]))
But the same cast doesn't work for a nested field, even when no values are null. Here's an example:
    import pyarrow as pa

    record = {"nested_int": 1}
    data_type = pa.struct(
        [
            pa.field("nested_int", pa.int32(), nullable=True),
        ]
    )
    data_type_after = pa.struct(
        [
            pa.field("nested_int", pa.int32(), nullable=False),
        ]
    )
    table = pa.table({"column_1": pa.array([record], data_type)})
    table.cast(pa.schema([pa.field("column_1", data_type_after)]))
Throws:
pyarrow.lib.ArrowTypeError: cannot cast nullable field to non-nullable field: struct<nested_int: int32> struct<nested_int: int32 not null>
This is somewhat related to https://github.com/apache/arrow/issues/13177 and https://issues.apache.org/jira/browse/ARROW-16603