Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
- 2.8.0
- None
- Reviewed
Description
Currently, the file's "contentLength" is used as the "requestedStreamLen" when invoking S3AInputStream::reopen(). As part of lazySeek(), the stream sometimes has to be closed and reopened. In many cases the stream is closed with abort(), which leaves the underlying HTTP connection unusable. This incurs significant connection establishment cost in some jobs. Setting the stream length to the correct value would avoid these connection aborts.
I will post the patch once the AWS tests pass on my machine.
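To illustrate the idea in the description, here is a minimal sketch, not the actual patch. It assumes the stream is aborted on close whenever more than some threshold of bytes remains unread in the requested range; the class `StreamRangeSketch`, the constant `DRAIN_THRESHOLD`, and both method names are hypothetical and are not part of S3AInputStream.

```java
/**
 * Illustrative sketch only: shows why bounding the requested range can
 * avoid abort() on close. All names and the threshold are assumptions.
 */
public final class StreamRangeSketch {

    /** Hypothetical limit: drain the stream if at most this many bytes remain. */
    public static final long DRAIN_THRESHOLD = 64 * 1024;

    private StreamRangeSketch() {
    }

    /**
     * End offset to request when (re)opening the stream at targetPos.
     * Requests at least the readahead distance, but never past the end
     * of the object.
     */
    public static long requestedStreamLen(long targetPos, long readLength,
                                          long contentLength, long readahead) {
        long wanted = Math.max(readLength, readahead);
        return Math.min(contentLength, targetPos + wanted);
    }

    /**
     * Decide whether closing at currentPos would need an abort: draining is
     * only cheap when few bytes remain in the requested range.
     */
    public static boolean shouldAbort(long currentPos, long requestedStreamLen) {
        long remaining = requestedStreamLen - currentPos;
        return remaining > DRAIN_THRESHOLD;
    }

    public static void main(String[] args) {
        long contentLength = 1L << 30; // 1 GiB object
        long readLength = 4096;        // bytes the caller actually reads
        long readahead = 64 * 1024;

        // Requesting up to contentLength leaves almost the whole object unread.
        System.out.println("abort with full-length request: "
            + shouldAbort(readLength, contentLength));

        // A bounded request leaves only a small remainder to drain.
        long bounded = requestedStreamLen(0, readLength, contentLength, readahead);
        System.out.println("abort with bounded request: "
            + shouldAbort(readLength, bounded));
    }
}
```

Under these assumptions, a request bounded by the read length and readahead leaves few enough bytes that the stream can be drained and the connection returned to the pool, whereas a request that runs to "contentLength" forces an abort after a short read.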
Attachments
Issue Links
- incorporates
  - HADOOP-13286 add a S3A scale test to do gunzip and linecount (Resolved)
- is related to
  - HADOOP-16241 S3AInputStream PositionReadable should perform ranged read on dedicated stream (Open)
  - HDFS-2744 Extend FSDataInputStream to allow fadvise (Open)
- relates to
  - HADOOP-13028 add low level counter metrics for S3A; use in read performance tests (Resolved)
  - HADOOP-14965 s3a input stream "normal" fadvise mode to be adaptive (Resolved)