Details
- Type: Sub-task
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version/s: 3.4.0
- Hadoop Flags: Reviewed
Description
On third-party stores without lifecycle rules it's possible to accrue many GB of pending multipart uploads, including from the following (a minimal sketch of the failure mode follows the list):
- magic committer jobs where the Spark driver/MR AM failed before commit/abort
- DistCp jobs which time out and are aborted
- any client code writing datasets which is interrupted before close.
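The failure mode, sketched against the AWS SDK v2 directly (bucket and key names here are hypothetical placeholders):

    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest;
    import software.amazon.awssdk.services.s3.model.CreateMultipartUploadResponse;

    public class PendingUploadDemo {
      public static void main(String[] args) {
        S3Client s3 = S3Client.create();
        // Initiate a multipart upload; the store now tracks it as "pending".
        CreateMultipartUploadResponse mpu = s3.createMultipartUpload(
            CreateMultipartUploadRequest.builder()
                .bucket("example-bucket")
                .key("datasets/part-0000")
                .build());
        System.out.println("pending upload id: " + mpu.uploadId());
        // If the process dies here -- before completeMultipartUpload() or
        // abortMultipartUpload() -- the upload and any parts already written
        // stay pending (and billed) until a lifecycle rule or explicit abort.
      }
    }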
Although there's a purge-pending-uploads option (fs.s3a.multipart.purge), it's dangerous: if any filesystem is instantiated with it enabled, it can destroy in-flight work.
Otherwise, the "hadoop s3guard uploads" command does work, but it needs scheduling or manual execution.
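For reference, the existing CLI route looks like this (bucket/path are placeholders):

    # list pending uploads under a path
    hadoop s3guard uploads -list s3a://example-bucket/datasets/
    # abort them; -force skips the confirmation prompt
    hadoop s3guard uploads -abort -force s3a://example-bucket/datasets/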
Proposed: add a new property, fs.s3a.directory.operations.purge.uploads, which will automatically cancel all pending uploads under a path on:
- delete: everything under the directory
- rename: everything under the source directory
The aborts will be issued in parallel with the normal operation, but with no attempt to post the abortMultipartUpload calls across different threads; the assumption here is that such uploads are rare. The option will be off by default, as on AWS itself people should have lifecycle rules for these things. (A sketch of the underlying list-and-abort operation follows the task list below.)
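Enabling it would then be an ordinary S3A setting, e.g. in core-site.xml (the property name is the one proposed above; default false):

    <property>
      <name>fs.s3a.directory.operations.purge.uploads</name>
      <value>true</value>
    </property>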
+ docs (third_party?)
+ add a new counter/metric for abort operations: count and duration
+ tests to include cost assertions
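For reference, the list-and-abort work the new option implies is roughly the following against the AWS SDK v2. This is a sketch of the store-level operation, not the actual S3A implementation; bucket/prefix are hypothetical and pagination of the listing is ignored:

    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.AbortMultipartUploadRequest;
    import software.amazon.awssdk.services.s3.model.ListMultipartUploadsRequest;
    import software.amazon.awssdk.services.s3.model.MultipartUpload;

    public class AbortUploadsUnderPrefix {
      public static void main(String[] args) {
        S3Client s3 = S3Client.create();
        // List only the uploads pending under the directory being deleted/renamed.
        ListMultipartUploadsRequest listReq = ListMultipartUploadsRequest.builder()
            .bucket("example-bucket")
            .prefix("datasets/")
            .build();
        for (MultipartUpload upload : s3.listMultipartUploads(listReq).uploads()) {
          // One abort call per pending upload; this is the operation the new
          // option would issue alongside the delete/rename itself.
          s3.abortMultipartUpload(AbortMultipartUploadRequest.builder()
              .bucket("example-bucket")
              .key(upload.key())
              .uploadId(upload.uploadId())
              .build());
        }
      }
    }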
Issue Links
- is depended upon by:
  - SPARK-47008 Spark to support S3 Express One Zone Storage (Open)