Hadoop Map/Reduce
MAPREDUCE-5949

TaskTracker's Java threads hanging


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: FreeBSD-10/stable
      openjdk version "1.7.0_60"
      OpenJDK Runtime Environment (build 1.7.0_60-b19)
      OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode)

    Description

      I set up hadoop-1.2.1 (from ports) on FreeBSD-10/stable with openjdk version 1.7.0_60.

      At first glance it works well, except for one annoying thing: after executing some tasks, the tasktracker process starts to eat CPU while idle.
      Sometimes it is 10-20% (numbers from top(1) output), sometimes 100-150%.

      In the tasktracker's log I see numerous records like these:

      2014-06-09 13:08:29,858 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay59 times
      2014-06-09 13:08:29,859 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 59 times, canceled keys 944 times
      2014-06-09 13:09:29,862 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay58 times
      2014-06-09 13:09:29,862 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 58 times, canceled keys 928 times
      2014-06-09 13:10:29,901 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - injecting delay58 times
      2014-06-09 13:10:29,901 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@abdcc1c JVM BUG(s) - recreating selector 58 times, canceled keys 928 times
      <...>
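Those "JVM BUG(s)" lines come from Jetty's built-in workaround for a JDK NIO defect: when Selector.select() keeps waking up immediately with no ready keys (a busy spin that burns CPU), Jetty injects delays and, past a threshold, throws the selector away and re-registers everything on a fresh one. A minimal sketch of that rebuild pattern is below; the class name, threshold, and method are illustrative, not Jetty's actual code:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectorRebuild {
    // Illustrative threshold: how many consecutive zero-key wakeups
    // we tolerate before assuming the selector is spinning.
    static final int SPIN_THRESHOLD = 512;

    /**
     * If the selector has returned zero ready keys too many times in a row,
     * open a fresh selector, move every valid registration onto it, and
     * close the broken one. Otherwise return the selector unchanged.
     */
    public static Selector rebuildIfSpinning(Selector selector, int zeroReturns)
            throws IOException {
        if (zeroReturns < SPIN_THRESHOLD) {
            return selector; // still healthy, keep using it
        }
        Selector fresh = Selector.open();
        for (SelectionKey key : selector.keys()) {
            if (key.isValid()) {
                // A channel may be registered with several selectors,
                // so re-registering before the old selector closes is safe.
                key.channel().register(fresh, key.interestOps(), key.attachment());
            }
        }
        selector.close(); // cancels all keys on the broken selector
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        Selector s = Selector.open();
        Selector rebuilt = rebuildIfSpinning(s, SPIN_THRESHOLD);
        System.out.println(rebuilt != s && rebuilt.isOpen()); // true: selector replaced
        rebuilt.close();
    }
}
```

Each "recreating selector N times" log line corresponds to one such rebuild, which is why an idle tasktracker that hits this bug keeps spinning instead of blocking in select().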

      The more jobs I run, the more Java threads keep consuming CPU after all tasks have finished. After several job executions, top(1) output looks like this (split by thread; all rows with PID 79045 are threads of the same process):

      PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
      79045 hadoop 47 0 1948M 867M uwait 2 20:49 37.50% java
      79045 hadoop 31 0 1948M 867M uwait 31 1:45 19.29% java
      79045 hadoop 33 0 1948M 867M uwait 21 2:51 19.19% java
      79045 hadoop 30 0 1948M 867M uwait 17 2:51 18.65% java
      79045 hadoop 30 0 1948M 867M uwait 11 1:52 18.36% java
      79045 hadoop 30 0 1948M 867M uwait 22 1:45 18.36% java
      79045 hadoop 31 0 1948M 867M uwait 29 2:50 18.26% java
      79045 hadoop 31 0 1948M 867M uwait 6 1:57 18.16% java
      79045 hadoop 31 0 1948M 867M uwait 13 4:55 17.97% java
      79045 hadoop 31 0 1948M 867M uwait 26 3:39 17.77% java
      79045 hadoop 33 0 1948M 867M uwait 8 1:21 17.48% java
      79045 hadoop 30 0 1948M 867M uwait 1 3:32 16.70% java
      79045 hadoop 32 0 1948M 867M uwait 24 3:12 16.70% java
      79045 hadoop 26 0 1948M 867M uwait 4 1:27 10.35% java
      72417 root 20 0 19828K 3252K CPU21 21 0:00 0.29% top
      836 root 20 0 36104K 1952K select 14 6:51 0.00% snmpd
      79045 hadoop 20 0 1948M 867M uwait 20 6:51 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 27 3:45 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 30 2:37 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 15 0:54 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 2 0:48 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 14 0:48 0.00% java
      79045 hadoop 20 0 1948M 867M uwait 2 0:48 0.00% java
      <...>

      This is on an absolutely idle cluster; not a single task is running.
      I am attaching truss(1) output for that java process.

      Attachments

        1. task1.txt (110 kB), Dmitry Sivachenko


      People

        Assignee: Unassigned
        Reporter: Dmitry Sivachenko (trtrmitya)
        Votes: 0
        Watchers: 2
