Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.3.6
None
None
Description
For the config option fs.s3a.retry.throttle.interval, the default value in the source code is 500ms:
public static final String RETRY_THROTTLE_INTERVAL_DEFAULT = "500ms";
In core-default.xml the value is 100ms, although the description there still names 500ms as the default:
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>100ms</value>
  <description>
    Initial between retry attempts on throttled requests, +/- 50%. chosen at random.
    i.e. for an intial value of 3000ms, the initial delay would be in the range 1500ms to 4500ms.
    Backoffs are exponential; again randomness is used to avoid the thundering heard problem.
    500ms is the default value used by the AWS S3 Retry policy.
  </description>
</property>
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml#L1750
This change was introduced in HADOOP-16823.
In the Hadoop-AWS module documentation it has the value 1000ms:
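To make the practical difference between a 100ms and a 500ms base interval concrete, here is a minimal sketch of the delay ranges that the quoted description implies. It assumes the nominal delay doubles on every retry and that the jitter is exactly +/- 50% of that nominal delay; the class and method names are hypothetical and this is not the actual Hadoop retry policy code.

```java
public class ThrottleBackoffSketch {

    /**
     * Returns {min, max} in milliseconds for the delay before the given
     * retry (0-based), assuming the nominal delay doubles per retry and
     * jitter of +/- 50% is applied to it.
     */
    public static long[] delayRangeMs(long baseIntervalMs, int retry) {
        long nominal = baseIntervalMs * (1L << retry); // assumed: doubling backoff
        return new long[] {nominal / 2, nominal + nominal / 2};
    }

    public static void main(String[] args) {
        // Compare the value in core-default.xml (100ms) with the source default (500ms).
        for (long base : new long[] {100, 500}) {
            for (int retry = 0; retry < 4; retry++) {
                long[] range = delayRangeMs(base, retry);
                System.out.printf("base=%dms retry=%d delay in [%d, %d] ms%n",
                        base, retry, range[0], range[1]);
            }
        }
    }
}
```

Under these assumptions, a base of 3000ms gives a first delay between 1500ms and 4500ms, which matches the example in the description, so the effective default decides whether the first throttled retry waits roughly 50-150ms or 250-750ms.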
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>1000ms</value>
  <description>
    Interval between retry attempts on throttled requests.
  </description>
</property>
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md?plain=1#L1223
The file was created in HADOOP-13786, and the value has been left unchanged since then.
On the performance tuning page it has the up-to-date value of 500ms:
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>500ms</value>
  <description>
    Interval between retry attempts on throttled requests.
  </description>
</property>
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/performance.md?plain=1#L435
This change was introduced in HADOOP-15076.
The same issue affects:
- fs.s3a.retry.throttle.limit - the source code has the value 20, but some documents still show the old value ${fs.s3a.attempts.maximum}
- fs.s3a.connection.establish.timeout - the source code has the value 50_000, the config file & documentation have 5_000
- fs.s3a.attempts.maximum - the source code has the value 10, the config file & documentation have 20
- fs.s3a.threads.max - the source code & documentation have the value 10, the config file has 64
- fs.s3a.max.total.tasks - the source code & config file have the value 32, the documentation has 5
- fs.s3a.connection.maximum - the source code & config file have the value 96, the documentation has 15 or 30
Please sync these values; outdated documentation is very painful to work with.
As an idea: would it be possible to include core-default.xml directly in the documentation, or to generate this documentation from the docstrings in the Java code?
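Short of generating the documentation, a drift check along the following lines could flag mismatches like the ones listed above. This is only a sketch under stated assumptions: the class name `DefaultsDriftCheck` and both helper methods are hypothetical, the expected values are typed in by hand from this report rather than read from the Java constants, the XML is an inline excerpt rather than the real core-default.xml, and values are compared as plain strings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DefaultsDriftCheck {

    /** Reads name/value pairs from Hadoop-style property elements. */
    public static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                // Assumes every property has both a name and a value element.
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(name, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    /** Returns one message per key whose actual value differs from the expected one. */
    public static Map<String, String> findDrift(Map<String, String> expected,
                                                Map<String, String> actual) {
        Map<String, String> drift = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String found = actual.get(e.getKey());
            if (!e.getValue().equals(found)) {
                drift.put(e.getKey(), "expected " + e.getValue() + " but found " + found);
            }
        }
        return drift;
    }

    public static void main(String[] args) {
        // Source-code defaults as reported in this issue (entered by hand).
        Map<String, String> sourceDefaults = new LinkedHashMap<>();
        sourceDefaults.put("fs.s3a.retry.throttle.interval", "500ms");
        sourceDefaults.put("fs.s3a.threads.max", "10");
        sourceDefaults.put("fs.s3a.max.total.tasks", "32");

        // Inline excerpt standing in for core-default.xml.
        String xml = "<configuration>"
                + "<property><name>fs.s3a.retry.throttle.interval</name><value>100ms</value></property>"
                + "<property><name>fs.s3a.threads.max</name><value>64</value></property>"
                + "<property><name>fs.s3a.max.total.tasks</name><value>32</value></property>"
                + "</configuration>";

        findDrift(sourceDefaults, readProperties(xml))
                .forEach((key, message) -> System.out.println(key + ": " + message));
    }
}
```

With the sample data, the check reports fs.s3a.retry.throttle.interval and fs.s3a.threads.max as drifted and leaves fs.s3a.max.total.tasks alone. The same comparison could be pointed at the property blocks embedded in the markdown pages, although extracting them from markdown is not covered here.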