Hadoop Common > HADOOP-18886 S3A: AWS SDK V2 Migration: stabilization and S3Express > HADOOP-19221

S3A: Unable to recover from failure of multipart block upload attempt "Status Code: 400; Error Code: RequestTimeout"


Details

    Description

      If a multipart PUT request fails for some reason (e.g. a network error), then all subsequent retry attempts fail with a 400 response and error code RequestTimeout.

      Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended Request ID:
      

      The list of suppressed exceptions contains the root cause (the initial failure was a 500); all retries failed to upload properly from the source input stream built with RequestBody.fromInputStream(fileStream, size).

      Hypothesis: the SDK's mark/reset handling does not work for these input streams. With the v1 SDK we built a multipart block upload request by passing in (file, offset, length); the way we now pass a stream does not recover.
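      For illustration only, a minimal sketch of the failing pattern (the class name, method and parameters are assumptions, not the actual S3A code): the part body wraps a live InputStream, so when the SDK retries the PUT it re-reads a stream whose position has already advanced past the block start.

        import java.io.InputStream;

        import software.amazon.awssdk.core.sync.RequestBody;
        import software.amazon.awssdk.services.s3.S3Client;
        import software.amazon.awssdk.services.s3.model.UploadPartRequest;
        import software.amazon.awssdk.services.s3.model.UploadPartResponse;

        // Hypothetical sketch of the failing pattern described above.
        final class PartUploadSketch {
          static UploadPartResponse uploadPart(S3Client s3, String bucket, String key,
              String uploadId, int partNumber, InputStream fileStream, long size) {
            UploadPartRequest request = UploadPartRequest.builder()
                .bucket(bucket)
                .key(key)
                .uploadId(uploadId)
                .partNumber(partNumber)
                .build();
            // The body is bound to a single InputStream; per the hypothesis above,
            // replaying that stream via mark/reset does not recover after a failed
            // attempt, so every retry stalls and gets the 400 RequestTimeout response.
            return s3.uploadPart(request, RequestBody.fromInputStream(fileStream, size));
          }
        }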

      Probably fixable by providing our own ContentStreamProvider implementations for:

      1. file + offset + length
      2. bytebuffer
      3. byte array

      The SDK does have explicit support for the in-memory cases, but it copies the data blocks first; we don't want that, as it would double the memory requirements of active blocks. A sketch of the file-based provider follows.
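      A minimal sketch of option 1 under those assumptions (the class name is hypothetical; BoundedInputStream is the commons-io class, and the interfaces to match are ContentStreamProvider.newStream() and RequestBody.fromContentProvider()): every call to newStream() reopens the file and seeks to the block offset, so each retry streams exactly the block's bytes from disk with no extra buffering.

        import java.io.File;
        import java.io.FileInputStream;
        import java.io.IOException;
        import java.io.InputStream;
        import java.io.UncheckedIOException;

        import org.apache.commons.io.input.BoundedInputStream;

        import software.amazon.awssdk.http.ContentStreamProvider;

        // Hypothetical file + offset + length provider: a fresh, correctly
        // positioned stream is handed to the SDK on every upload attempt.
        final class FileBlockContentStreamProvider implements ContentStreamProvider {
          private final File file;
          private final long offset;
          private final long length;

          FileBlockContentStreamProvider(File file, long offset, long length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
          }

          @Override
          public InputStream newStream() {
            try {
              FileInputStream in = new FileInputStream(file);
              long skipped = in.skip(offset);             // position at the block start
              if (skipped != offset) {
                in.close();
                throw new IOException("Could not skip to offset " + offset + " in " + file);
              }
              return new BoundedInputStream(in, length);  // cap at the block length
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          }
        }

      The part body would then be built with something like RequestBody.fromContentProvider(new FileBlockContentStreamProvider(file, offset, length), length, "application/octet-stream"); a production version should also close any stream returned by an earlier newStream() call, and the byte buffer and byte array variants can wrap their blocks the same way without copying.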


              People

                Assignee: Steve Loughran (stevel@apache.org)
                Reporter: Steve Loughran (stevel@apache.org)
