Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect
    • Labels: None

    Description

      df = self.spark.createDataFrame([Row(a=i, b=(i % 3)) for i in range(100)])
      sampled = df.stat.sampleBy("b", fractions={0: 0.5, 1: 0.5}, seed=0)
      self.assertTrue(sampled.count() == 35)

      Traceback (most recent call last):
        File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/tests/test_functions.py", line 202, in test_sampleby
          self.assertTrue(sampled.count() == 35)
      AssertionError: False is not true
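
For context, here is a minimal pure-Python sketch of what stratified sampling of this kind does conceptually. It is not Spark's implementation: `sample_by` is a hypothetical helper, and the per-row Bernoulli draw from Python's `random` module is an assumption made only for illustration. It shows that strata without a fraction are dropped, and that the exact sample size depends on the random generator and the order in which rows are visited. The issue itself does not state why the count differs from 35.

```python
import random


def sample_by(rows, key, fractions, seed):
    """Hypothetical stand-in for stratified sampling: keep each row with
    the probability assigned to its stratum (0.0 if the stratum is not listed)."""
    rng = random.Random(seed)
    kept = []
    for row in rows:
        fraction = fractions.get(row[key], 0.0)
        # One independent Bernoulli draw per row.
        if rng.random() < fraction:
            kept.append(row)
    return kept


# Same shape as the data in the failing test: 100 rows, b = i % 3.
rows = [{"a": i, "b": i % 3} for i in range(100)]
sampled = sample_by(rows, "b", {0: 0.5, 1: 0.5}, seed=0)

# Stratum 2 has no fraction, so it never appears in the sample.
assert all(row["b"] in (0, 1) for row in sampled)
# Strata 0 and 1 hold 34 + 33 = 67 rows, so the sample cannot be larger than that.
assert len(sampled) <= 67
```

Under these assumptions, a test that checks the sampled strata and a bound on the count is less sensitive to the random generator than one that asserts a single exact value.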

          People

            Assignee: beliefer Jiaan Geng
            Reporter: techaddict Sandeep Singh