[HADOOP-12107] long running apps may have a huge number of StatisticsData instances under FileSystem - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.7.0
Fix Version/s: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
Component/s: fs
Labels: None
Target Version/s: 2.8.0, 2.6.4

Description

We observed with some of our apps (non-mapreduce apps that use filesystems) that they end up accumulating a huge memory footprint coming from FileSystem$Statistics$StatisticsData (in the allData list of Statistics).

Although each StatisticsData holds only a weak reference to its thread, so that reference can be cleared once the thread goes away, the StatisticsData instances themselves stay in the list until one of the following methods is called on Statistics:

getBytesRead()
getBytesWritten()
getReadOps()
getLargeReadOps()
getWriteOps()
toString()

It is quite possible to have an application that interacts with a filesystem but never calls any of these methods on Statistics. If such an application runs for a long time with a high rate of thread churn, the allData list keeps growing and the memory footprint grows significantly with it.
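The accumulation pattern described above can be illustrated with a small stand-alone model. This is a simplified sketch, not the Hadoop source: the class `StatsModel`, its fields, and the `churn` helper are hypothetical names chosen for illustration, and only the general shape (per-thread data, weak thread reference, pruning inside the aggregate getter) is taken from the description.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical model of the pattern described above.
// It is NOT the Hadoop source; all names here are illustrative.
class StatsModel {
    // Per-thread counters that keep only a weak reference to the owning thread.
    static final class Data {
        final WeakReference<Thread> owner;
        long bytesRead;
        Data(Thread owner) { this.owner = new WeakReference<>(owner); }
    }

    private final List<Data> allData = new ArrayList<>();
    private final ThreadLocal<Data> threadData = new ThreadLocal<>();
    private long rootBytesRead; // totals folded in from pruned entries

    // Hot path: registers the calling thread on first use, never prunes.
    void incrementBytesRead(long n) {
        Data d = threadData.get();
        if (d == null) {
            d = new Data(Thread.currentThread());
            threadData.set(d);
            synchronized (this) { allData.add(d); }
        }
        d.bytesRead += n;
    }

    // Aggregate getter: the only place where stale entries are dropped.
    synchronized long getBytesRead() {
        long total = rootBytesRead;
        for (Iterator<Data> it = allData.iterator(); it.hasNext();) {
            Data d = it.next();
            total += d.bytesRead;
            if (d.owner.get() == null) {   // owner thread was garbage collected
                rootBytesRead += d.bytesRead;
                it.remove();
            }
        }
        return total;
    }

    synchronized int trackedEntries() { return allData.size(); }

    // Simulates thread churn: each short-lived thread touches the counters once.
    static void churn(StatsModel stats, int threads, long bytesEach) {
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> stats.incrementBytesRead(bytesEach));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

In this model, calling `StatsModel.churn(stats, 1000, 1)` without ever calling `getBytesRead()` leaves 1000 entries in the list even though every thread has exited. When a later `getBytesRead()` call actually drops them depends on whether the garbage collector has already cleared the weak references.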

The current workaround is either to limit thread churn or to invoke one of these methods occasionally to prune the list. However, this remains a deficiency in FileSystem$Statistics itself: memory is reclaimed only as a side effect of those operations.
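One way to apply the second workaround is a single long-lived background task that calls an aggregate getter at a fixed interval. The sketch below is an assumption about how an application might wire this up, not part of Hadoop: `StatisticsSweeper` is a hypothetical helper, and the getter is passed in as a plain `LongSupplier` so the example stays self-contained. In a real application the supplier would presumably wrap a call such as `getBytesRead()` on the relevant Statistics object.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical helper: periodically invokes an aggregate getter so that
// stale per-thread entries are pruned as a side effect of the call.
class StatisticsSweeper implements AutoCloseable {
    // One long-lived daemon thread, so the sweeper adds no thread churn itself.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stats-sweeper");
            t.setDaemon(true);
            return t;
        });

    StatisticsSweeper(LongSupplier aggregateGetter, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                aggregateGetter.getAsLong(); // result ignored; only the side effect matters
            } catch (RuntimeException e) {
                // Swallow so one failure does not cancel later runs.
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A caller would create one sweeper per Statistics object of interest, for example `new StatisticsSweeper(stats::getBytesRead, 1, TimeUnit.MINUTES)` where `stats` is assumed to be the Statistics instance, and close it on shutdown. The interval trades cleanup latency against the cost of the aggregate call.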

Attachments

HADOOP-12107.001.patch, 19/Jun/15 23:14, 7 kB, Sangjin Lee
HADOOP-12107.002.patch, 23/Jun/15 02:01, 11 kB, Sangjin Lee
HADOOP-12107.003.patch, 23/Jun/15 03:21, 11 kB, Sangjin Lee
HADOOP-12107.004.patch, 23/Jun/15 22:15, 11 kB, Sangjin Lee
HADOOP-12107.005.patch, 24/Jun/15 21:36, 11 kB, Sangjin Lee

Issue Links

breaks

HADOOP-12706 TestLocalFsFCStatistics#testStatisticsThreadLocalDataCleanUp times out occasionally (Closed)
HADOOP-12958 PhantomReference for filesystem statistics can trigger OOM (Closed)

is related to

MAPREDUCE-6735 Performance degradation caused by MAPREDUCE-5465 and HADOOP-12107 (Open)

relates to

HADOOP-12829 StatisticsDataReferenceCleaner swallows interrupt exceptions (Resolved)

People

Assignee: Sangjin Lee
Reporter: Sangjin Lee
Votes: 0
Watchers: 16

Dates

Created: 19/Jun/15 22:32
Updated: 30/Aug/16 01:26
Resolved: 29/Jun/15 22:09