Spark / SPARK-39184

ArrayIndexOutOfBoundsException for some date/time sequences in some time-zones


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.3, 3.2.1, 3.3.0, 3.4.0
    • Fix Version/s: 3.1.4, 3.3.1, 3.2.3, 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      The following query throws an ArrayIndexOutOfBoundsException when run in the America/Los_Angeles time zone:

      spark-sql> select sequence(timestamp'2022-03-13 00:00:00', timestamp'2022-03-16 03:00:00', interval 1 day 1 hour) as x;
      22/05/13 14:47:27 ERROR SparkSQLDriver: Failed in [select sequence(timestamp'2022-03-13 00:00:00', timestamp'2022-03-16 03:00:00', interval 1 day 1 hour) as x]
      java.lang.ArrayIndexOutOfBoundsException: 3
      

      More generally, any such query throws an ArrayIndexOutOfBoundsException when two conditions both hold: the period from start to stop, as seen in your time zone, contains more "spring forward" transitions than "fall back" transitions, and that period is evenly divisible by the interval.
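
      The arithmetic behind the first query can be checked outside Spark. The sketch below is an illustration only, not Spark's code. It assumes (this report does not confirm it) that the result is sized from elapsed time divided by a step in which every day counts as 24 hours, while the elements themselves are produced by adding calendar days on the local wall clock and the sub-day remainder as elapsed time. The helper names elapsed, add_steps, estimated_length and actual_sequence are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs the system tz database or the tzdata package

LA = ZoneInfo("America/Los_Angeles")
UTC = timezone.utc


def elapsed(start, stop):
    """Real elapsed time. Converting to UTC first makes DST shifts count."""
    return stop.astimezone(UTC) - start.astimezone(UTC)


def add_steps(start, i, days, sub_day):
    """Return start + i * (days, sub_day).

    Assumption: days move the local wall clock, the remainder is elapsed time.
    """
    shifted = start + timedelta(days=i * days)  # wall-clock arithmetic
    return (shifted.astimezone(UTC) + i * sub_day).astimezone(start.tzinfo)


def estimated_length(start, stop, days, sub_day):
    """Assumed naive estimate: every calendar day is treated as exactly 24 hours."""
    fixed_step = timedelta(days=days) + sub_day
    return elapsed(start, stop) // fixed_step + 1


def actual_sequence(start, stop, days, sub_day):
    """Generate elements one by one until the next one would pass stop."""
    result, i = [], 0
    while True:
        t = add_steps(start, i, days, sub_day)
        if elapsed(t, stop) < timedelta(0):  # t is later than stop
            break
        result.append(t)
        i += 1
    return result


# First query from this report: step of 1 day 1 hour across 2022-03-13 (spring forward).
start = datetime(2022, 3, 13, 0, 0, tzinfo=LA)
stop = datetime(2022, 3, 16, 3, 0, tzinfo=LA)
step_days, step_rest = 1, timedelta(hours=1)

print(elapsed(start, stop))                                     # 74 hours of real time
print(estimated_length(start, stop, step_days, step_rest))      # 74 // 25 + 1
print(len(actual_sequence(start, stop, step_days, step_rest)))  # elements actually produced
```

      Under these assumptions the estimate is 3 slots while stepping produces 4 timestamps (the last one equal to stop), which is consistent with the reported failure at index 3.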

      In the America/Los_Angeles time zone, examples include:

      -- This query encompasses 2 instances of "spring forward" but only one
      -- instance of "fall back".
      select sequence(
        timestamp'2022-03-13',
        timestamp'2022-03-13' + (interval '42' hours * 209),
        interval '42' hours) as x;
      
      select sequence(
        timestamp'2022-03-13',
        timestamp'2022-03-13' + (interval '31' hours * 11),
        interval '31' hours) as x;
      

          People

            Assignee: Bruce Robbins (bersprockets)
            Reporter: Bruce Robbins (bersprockets)
            Votes: 0
            Watchers: 3
