Description
When running a Giraph job using GiraphRunner, if the vertex input path specified by the -vip argument references a non-existent file, the MapReduce job will hang indefinitely.
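One way to avoid the hang would be to check the input path on the client before the job is submitted. The following is only a minimal sketch of such a fail-fast check, not anything that exists in Giraph: `InputPathGuard`, `missingPaths` and `requireAllExist` are hypothetical names, and the existence check is passed in as a predicate so that the example stays self-contained. In a real job the predicate could presumably be backed by Hadoop's `FileSystem.exists(Path)`, with its checked `IOException` wrapped. The sketch assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical pre-submission guard; not part of Giraph or Hadoop. */
final class InputPathGuard {
    private InputPathGuard() { }

    /** Returns every input path for which the supplied existence check fails. */
    static List<String> missingPaths(List<String> inputPaths, Predicate<String> exists) {
        List<String> missing = new ArrayList<>();
        for (String path : inputPaths) {
            if (!exists.test(path)) {
                missing.add(path);
            }
        }
        return missing;
    }

    /** Throws on the client side if any input path is missing, so no job is submitted. */
    static void requireAllExist(List<String> inputPaths, Predicate<String> exists) {
        List<String> missing = missingPaths(inputPaths, exists);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                "Input path does not exist: " + String.join(", ", missing));
        }
    }
}
```

Calling `requireAllExist` with the value of the -vip argument would surface the missing file as an immediate client-side error rather than leaving the job running, as happens with the invocation reported here.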
Example command invocation:
bin/hadoop jar /Users/rvesse/Documents/Work/Code/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/rvesse/giraph_input/nosuchfile.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/rvesse/giraph_output/6 -w 1
And I get the following output on the command line:
2013-11-18 12:07:04.118 java[7995:1203] Unable to load realm info from SCDynamicStore
13/11/18 12:07:05 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
13/11/18 12:07:05 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
13/11/18 12:07:05 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
13/11/18 12:07:06 INFO job.GiraphJob: run: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201311181156_0003
13/11/18 12:08:03 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer mbp-rvesse.home:22181 --zkNode /_hadoopBsp/job_201311181156_0003/_haltComputation'
13/11/18 12:08:03 INFO mapred.JobClient: Running job: job_201311181156_0003
13/11/18 12:08:04 INFO mapred.JobClient: map 50% reduce 0%
And viewing this job in the Hadoop JobTracker, I see this log for the first map attempt:
2013-11-18 12:07:18,589 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2013-11-18 12:07:19,046 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists! 2013-11-18 12:07:19,245 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : null 2013-11-18 12:07:19,348 INFO org.apache.hadoop.mapred.MapTask: Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1 2013-11-18 12:07:19,686 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar. 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation 2013-11-18 12:07:20,475 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201311181156_0003 2013-11-18 12:07:20,477 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_task/mbp-rvesse.home 0 2013-11-18 12:07:20,490 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [mbp-rvesse.home] 1 hosts from 2 candidates when 1 required (polling period is 3000) on attempt 0 2013-11-18 12:07:20,490 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201311181156_0003/zkServerList_mbp-rvesse.home 0 ' 2013-11-18 12:07:20,495 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_mbp-rvesse.home 0 ' (polling period is 3000) 2013-11-18 12:07:20,495 INFO org.apache.giraph.zk.ZooKeeperManager: 
getZooKeeperServerList: Found [mbp-rvesse.home, 0] 2 hosts in filename 'zkServerList_mbp-rvesse.home 0 ' 2013-11-18 12:07:20,496 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper/zoo.cfg in /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper with base port 22181 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false 2013-11-18 12:07:20,528 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper/zoo.cfg] in directory /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper 2013-11-18 12:07:20,571 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added. 
2013-11-18 12:07:20,572 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to mbp-rvesse.home:22181 with poll msecs = 3000 2013-11-18 12:07:20,588 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431) at java.net.Socket.connect(Socket.java:527) at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:703) at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:369) at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:202) at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:59) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:89) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:23,589 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to mbp-rvesse.home:22181 with poll msecs = 3000 2013-11-18 12:07:23,590 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to mbp-rvesse.home/192.168.1.65:22181! 
2013-11-18 12:07:23,590 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_zkServer/mbp-rvesse.home 0 2013-11-18 12:07:23,597 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper... 2013-11-18 12:07:23,597 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)... 2013-11-18 12:07:23,739 INFO org.apache.giraph.bsp.BspService: BspService: Path to create to halt is /_hadoopBsp/job_201311181156_0003/_haltComputation 2013-11-18 12:07:23,739 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201311181156_0003, 0 on mbp-rvesse.home:22181 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=mbp-rvesse.home 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_65 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Apple Inc. 
2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/classes:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../conf:/Library/Java/Home/lib/tools.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/..:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-collections-3.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-configuration-1.6.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1
/libexec/../lib/commons-lang-2.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hsqldb-1.8.0.10.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/mockito-all-
1.8.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/oro-2.0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/servlet-api-2.5-20081211.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/native/Mac_OS_X-x86_64-64:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work/tmp 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Mac OS X 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=x86_64 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=10.8.5 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=rvesse 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/ 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work 2013-11-18 12:07:23,868 INFO org.apache.zookeeper.ZooKeeper: 
Initiating client connection, connectString=mbp-rvesse.home:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@637050f5 2013-11-18 12:07:23,971 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:23,973 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to mbp-rvesse.home/192.168.1.65:22181, initiating session 2013-11-18 12:07:24,067 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server mbp-rvesse.home/192.168.1.65:22181, sessionid = 0x1426b1b9c7f0000, negotiated timeout = 600000 2013-11-18 12:07:24,091 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete. 2013-11-18 12:07:24,171 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker 2013-11-18 12:07:24,171 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY 2013-11-18 12:07:24,392 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201311181156_0003/_masterElectionDir/mbp-rvesse.home_00000000000' and my bid is '/_hadoopBsp/job_201311181156_0003/_masterElectionDir/mbp-rvesse.home_00000000000' 2013-11-18 12:07:25,130 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder. 2013-11-18 12:07:25,440 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: mbp-rvesse.home/192.168.1.65:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1 2013-11-18 12:07:25,691 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder. 2013-11-18 12:07:25,749 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master! 
2013-11-18 12:07:25,793 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2013-11-18 12:07:25,801 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2013-11-18 12:07:27,498 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with IllegalStateException
java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
    at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:316)
    at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:627)
    at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:694)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/user/rvesse/giraph_input/nosuchfile.txt
    at org.apache.giraph.io.formats.GiraphFileInputFormat.listStatus(GiraphFileInputFormat.java:271)
    at org.apache.giraph.io.formats.GiraphFileInputFormat.listVertexStatus(GiraphFileInputFormat.java:286)
    at org.apache.giraph.io.formats.GiraphFileInputFormat.getVertexSplits(GiraphFileInputFormat.java:357)
    at org.apache.giraph.io.formats.TextVertexInputFormat.getSplits(TextVertexInputFormat.java:60)
    at org.apache.giraph.io.internal.WrappedVertexInputFormat.getSplits(WrappedVertexInputFormat.java:72)
    at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:314)
    ... 3 more
2013-11-18 12:07:27,498 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.IllegalStateException: generateVertexInputSplits: Got IOException, exiting...
java.lang.IllegalStateException: java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:185)
Caused by: java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
    at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:316)
    at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:627)
    at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:694)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/user/rvesse/giraph_input/nosuchfile.txt
    at org.apache.giraph.io.formats.GiraphFileInputFormat.listStatus(GiraphFileInputFormat.java:271)
    at org.apache.giraph.io.formats.GiraphFileInputFormat.listVertexStatus(GiraphFileInputFormat.java:286)
    at org.apache.giraph.io.formats.GiraphFileInputFormat.getVertexSplits(GiraphFileInputFormat.java:357)
    at org.apache.giraph.io.formats.TextVertexInputFormat.getSplits(TextVertexInputFormat.java:60)
    at org.apache.giraph.io.internal.WrappedVertexInputFormat.getSplits(WrappedVertexInputFormat.java:72)
    at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:314)
    ... 3 more
2013-11-18 12:07:27,499 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2013-11-18 12:07:27,499 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2013-11-18 12:07:28,065 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
2013-11-18 12:07:28,065 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x1426b1b9c7f0000, likely server has closed socket, closing socket connection and attempting reconnect
And this for the second map attempt:
2013-11-18 12:07:18,998 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2013-11-18 12:07:19,441 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists! 2013-11-18 12:07:19,612 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : null 2013-11-18 12:07:19,632 INFO org.apache.hadoop.mapred.MapTask: Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1 2013-11-18 12:07:19,692 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar. 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation 2013-11-18 12:07:20,475 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201311181156_0003 2013-11-18 12:07:20,478 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_task/mbp-rvesse.home 1 2013-11-18 12:07:20,489 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'null' (polling period is 3000) 2013-11-18 12:07:23,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_mbp-rvesse.home 0 ' (polling period is 3000) 2013-11-18 12:07:23,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [mbp-rvesse.home, 0] 2 hosts in filename 'zkServerList_mbp-rvesse.home 0 ' 2013-11-18 12:07:23,492 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperSErvers: Empty directory 
_bsp/_defaultZkManagerDir/job_201311181156_0003/_zkServer, waiting 3000 msecs. 2013-11-18 12:07:26,495 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [mbp-rvesse.home] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 1 2013-11-18 12:07:26,496 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker... 2013-11-18 12:07:26,583 INFO org.apache.giraph.bsp.BspService: BspService: Path to create to halt is /_hadoopBsp/job_201311181156_0003/_haltComputation 2013-11-18 12:07:26,583 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201311181156_0003, 1 on mbp-rvesse.home:22181 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=mbp-rvesse.home 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_65 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Apple Inc. 
2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/classes:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../conf:/Library/Java/Home/lib/tools.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/..:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-collections-3.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-configuration-1.6.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1
/libexec/../lib/commons-lang-2.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hsqldb-1.8.0.10.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/mockito-all-
1.8.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/oro-2.0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/servlet-api-2.5-20081211.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/native/Mac_OS_X-x86_64-64:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work/tmp 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Mac OS X 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=x86_64 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=10.8.5 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=rvesse 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/ 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work 2013-11-18 12:07:26,598 INFO org.apache.zookeeper.ZooKeeper: 
Initiating client connection, connectString=mbp-rvesse.home:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@14d964af 2013-11-18 12:07:26,609 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:26,609 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to mbp-rvesse.home/192.168.1.65:22181, initiating session 2013-11-18 12:07:26,615 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server mbp-rvesse.home/192.168.1.65:22181, sessionid = 0x1426b1b9c7f0001, negotiated timeout = 600000 2013-11-18 12:07:26,616 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete. 2013-11-18 12:07:26,936 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder. 2013-11-18 12:07:26,968 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: mbp-rvesse.home/192.168.1.65:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1 2013-11-18 12:07:26,976 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder. 2013-11-18 12:07:27,162 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker... 2013-11-18 12:07:27,421 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201311181156_0003/_masterJobState) 2013-11-18 12:07:27,424 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir already exists! 2013-11-18 12:07:27,426 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir already exists! 
2013-11-18 12:07:27,442 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 and workerInfo= Worker(hostname=mbp-rvesse.home, MRtaskID=1, port=30001) 2013-11-18 12:07:27,842 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x1426b1b9c7f0001, likely server has closed socket, closing socket connection and attempting reconnect 2013-11-18 12:07:27,944 WARN org.apache.giraph.bsp.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null 2013-11-18 12:07:29,332 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:29,333 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:29,443 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 0, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837) at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360) at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:31,144 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:31,145 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:33,003 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:33,004 
WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:34,276 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:34,277 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:35,980 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:35,981 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:36,082 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 1, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837) at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360) at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:37,345 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:37,346 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:38,543 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:38,544 
WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:40,141 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:40,141 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:41,826 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:41,827 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:41,928 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 2, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837) at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360) at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:43,279 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:43,280 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:44,513 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:44,514 
WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:46,383 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:46,384 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:46,929 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 on superstep -1 2013-11-18 12:07:47,936 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:47,936 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:48,037 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 0, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728) at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302) at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650) at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664) at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:49,210 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:49,210 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:50,851 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:50,852 WARN 
org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:52,704 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:52,705 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:54,744 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:54,744 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:54,846 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 1, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728) at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302) at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650) at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664) at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:07:56,259 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:56,260 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:57,672 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:57,673 WARN 
org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:07:59,265 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:07:59,266 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:08:01,207 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:08:01,207 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:08:01,309 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 2, waiting 5000 msecs before retrying. 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728) at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302) at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650) at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664) at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:08:02,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:08:02,441 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:08:04,086 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:08:04,087 WARN 
org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:08:05,945 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181 2013-11-18 12:08:05,945 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119) 2013-11-18 12:08:06,310 ERROR org.apache.giraph.graph.GraphTaskManager: run: Worker failure failed on another RuntimeException, original expection will be rethrown java.lang.IllegalStateException: deleteExt: Failed to delete /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 after 3 tries! 
at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:333) at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650) at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664) at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2013-11-18 12:08:06,313 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 2013-11-18 12:08:06,357 WARN org.apache.hadoop.mapred.Child: Error running child java.lang.IllegalStateException: run: Caught an unrecoverable exception exists: Failed to check /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries! at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.IllegalStateException: exists: Failed to check /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries! 
at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369) at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91) ... 7 more 2013-11-18 12:08:06,361 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
Eventually the job times out and Hadoop kills it off, but I would expect the job to fail fast (preferably before it is ever launched) if the input does not exist.
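To illustrate the kind of pre-submission check I have in mind, here is a minimal sketch. It is not Giraph code: the class and method names (InputPathCheck, missingInputs, requireInputsExist) are hypothetical, and it uses java.nio.file against the local file system purely as a stand-in. In a real job the same check would presumably go through Hadoop's FileSystem.exists(Path) for the path given by -vip before the job is submitted.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Giraph or Hadoop.
// Assumption: local file system paths stand in for HDFS paths here.
final class InputPathCheck {
    private InputPathCheck() { }

    /** Returns the subset of the given paths that do not exist, in input order. */
    static List<String> missingInputs(List<String> inputPaths) {
        List<String> missing = new ArrayList<>();
        for (String path : inputPaths) {
            if (!Files.exists(Paths.get(path))) {
                missing.add(path);
            }
        }
        return missing;
    }

    /** Throws before any job would be launched if at least one input path is missing. */
    static void requireInputsExist(List<String> inputPaths) {
        List<String> missing = missingInputs(inputPaths);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                "Input path(s) do not exist: " + missing);
        }
    }

    public static void main(String[] args) {
        try {
            // Each command-line argument is treated as one input path.
            requireInputsExist(Arrays.asList(args));
            System.out.println("All input paths exist; safe to submit the job.");
        } catch (IllegalArgumentException e) {
            // Fail fast with a clear message instead of waiting for a timeout.
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

With a check along these lines in the runner, a missing input would produce an immediate, readable error on the command line rather than a job that sits at "map 50% reduce 0%" until Hadoop times it out.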
I'll attach the full log files for reference.