Flink / FLINK-25856

Fix use of UserDefinedType in from_elements

Details

    Description

      If we define a new UserDefinedType and use it as a field type in `from_elements`, the call fails.

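      # DenseVector, SparseVector and Vectors come from pyflink.ml.core.linalg,
      # the module this class itself is declared in (see module() below).
      from pyflink.table.types import DataTypes, UserDefinedType

      class VectorUDT(UserDefinedType):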
          @classmethod
          def sql_type(cls):
              return DataTypes.ROW(
                  [
                      DataTypes.FIELD("type", DataTypes.TINYINT()),
                      DataTypes.FIELD("size", DataTypes.INT()),
                      DataTypes.FIELD("indices", DataTypes.ARRAY(DataTypes.INT())),
                      DataTypes.FIELD("values", DataTypes.ARRAY(DataTypes.DOUBLE())),
                  ]
              )
      
          @classmethod
          def module(cls):
              return "pyflink.ml.core.linalg"
      
          def serialize(self, obj):
              if isinstance(obj, SparseVector):
                  indices = [int(i) for i in obj._indices]
                  values = [float(v) for v in obj._values]
                  return 0, obj.size(), indices, values
              elif isinstance(obj, DenseVector):
                  values = [float(v) for v in obj._values]
                  return 1, None, None, values
              else:
                  # %r placeholders need the % operator; str.format() would leave them unfilled.
                  raise TypeError("Cannot serialize %r of type %r" % (obj, type(obj)))
      
      self.t_env.from_elements(
          [
              (Vectors.dense([1, 2, 3, 4]), 0., 1.),
              (Vectors.dense([2, 2, 3, 4]), 0., 2.),
              (Vectors.dense([3, 2, 3, 4]), 0., 3.),
              (Vectors.dense([4, 2, 3, 4]), 0., 4.),
              (Vectors.dense([5, 2, 3, 4]), 0., 5.),
              (Vectors.dense([11, 2, 3, 4]), 1., 1.),
              (Vectors.dense([12, 2, 3, 4]), 1., 2.),
              (Vectors.dense([13, 2, 3, 4]), 1., 3.),
              (Vectors.dense([14, 2, 3, 4]), 1., 4.),
              (Vectors.dense([15, 2, 3, 4]), 1., 5.),
          ],
          DataTypes.ROW([
              DataTypes.FIELD("features", VectorUDT()),
              DataTypes.FIELD("label", DataTypes.DOUBLE()),
              DataTypes.FIELD("weight", DataTypes.DOUBLE()),
          ]))
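
The class above shows only `serialize`. A UDT generally also needs the inverse `deserialize` to turn the stored row back into a vector. The following is a minimal sketch of that round trip in plain Python, outside of PyFlink. The `DenseVector` and `SparseVector` classes here are hypothetical stand-ins for the real ones in `pyflink.ml.core.linalg`, and the `deserialize` logic is an assumption inferred from the `(type, size, indices, values)` layout in `sql_type`, not code taken from the issue or its fix.

```python
# Hypothetical stand-ins for the vector classes in pyflink.ml.core.linalg.
class DenseVector:
    def __init__(self, values):
        self.values = [float(v) for v in values]


class SparseVector:
    def __init__(self, size, indices, values):
        self.size = int(size)
        self.indices = [int(i) for i in indices]
        self.values = [float(v) for v in values]


def serialize(obj):
    """Flatten a vector into the (type, size, indices, values) row layout."""
    if isinstance(obj, SparseVector):
        return 0, obj.size, obj.indices, obj.values
    if isinstance(obj, DenseVector):
        return 1, None, None, obj.values
    raise TypeError("Cannot serialize %r of type %r" % (obj, type(obj)))


def deserialize(datum):
    """Assumed inverse of serialize: rebuild the vector from the row."""
    tag, size, indices, values = datum
    if tag == 0:
        return SparseVector(size, indices, values)
    if tag == 1:
        return DenseVector(values)
    raise ValueError("Unknown vector type tag: %r" % (tag,))


# Round trip for one of the dense vectors used in the example data.
row = serialize(DenseVector([1, 2, 3, 4]))
restored = deserialize(row)
```

Checking that `deserialize(serialize(v))` reproduces `v` for both vector kinds is a quick way to validate the UDT logic before passing it to `from_elements`.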
      

          People

            hxbks2ks Huang Xingbo