Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Description
Sometimes, when a cron job is configured with the "After execution stop the interpreter" setting enabled, the last paragraph is marked as ABORT for no apparent reason. I found that the cause of this behavior is that Scheduler.getJobsRunning() returns finished jobs. Has anyone else run into this problem?
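The suspected failure mode can be sketched in isolation. The following is only an illustration, not Zeppelin's actual code: the `Job` and `Status` types and the `abortUnfinished` helper are hypothetical stand-ins. It assumes that the session-close step walks a list of jobs reported as running and aborts each one, and shows one possible guard: skipping entries that have already reached a terminal status.

```java
import java.util.ArrayList;
import java.util.List;

public class AbortOnlyRunningSketch {
    // Hypothetical status values; not Zeppelin's real enum.
    enum Status { PENDING, RUNNING, FINISHED, ERROR, ABORT }

    // Hypothetical stand-in for a paragraph job.
    static final class Job {
        final String id;
        Status status;

        Job(String id, Status status) {
            this.id = id;
            this.status = status;
        }

        // True once the job has reached a terminal status.
        boolean isTerminated() {
            return status == Status.FINISHED
                || status == Status.ERROR
                || status == Status.ABORT;
        }
    }

    // Aborts only jobs that are not yet terminated and returns the ids it aborted.
    // A stale entry (already FINISHED but still reported as running) is left untouched.
    static List<String> abortUnfinished(List<Job> reportedAsRunning) {
        List<String> aborted = new ArrayList<>();
        for (Job job : reportedAsRunning) {
            if (job.isTerminated()) {
                continue; // keep the FINISHED status instead of overwriting it
            }
            job.status = Status.ABORT;
            aborted.add(job.id);
        }
        return aborted;
    }

    public static void main(String[] args) {
        List<Job> reported = new ArrayList<>();
        reported.add(new Job("already-finished", Status.FINISHED)); // stale entry
        reported.add(new Job("still-running", Status.RUNNING));

        List<String> aborted = abortUnfinished(reported);
        System.out.println("aborted: " + aborted);
        System.out.println("first job status: " + reported.get(0).status);
    }
}
```

Without the `isTerminated()` check, the first job in this example would be flipped from FINISHED to ABORT, which matches the symptom in the logs below. Whether the real fix belongs in the close logic or in getJobsRunning() itself is not settled by this sketch.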
Short log (with additional log info from TinkoffCreditSystems fork):
INFO [2018-08-10 00:08:00,000] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:945) - Start schedule run note: 2C68U586U, cronExpr:"0 8 0 * * ?"
INFO [2018-08-10 00:08:00,047] ({pool-2-thread-266} SchedulerFactory.java[jobStarted]:109) - Job 20170814-171621_1685490119 started by scheduler
INFO [2018-08-10 00:10:35,387] ({pool-2-thread-266} SchedulerFactory.java[jobFinished]:115) - Job 20170814-171621_1685490119 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-greenplum_pd:user:2C68U586U-shared_session
INFO [2018-08-10 00:10:35,417] ({pool-2-thread-3838} SchedulerFactory.java[jobStarted]:109) - Job 20180402-171122_400058927 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,428] ({pool-2-thread-3838} SchedulerFactory.java[jobFinished]:115) - Job 20180402-171122_400058927 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,445] ({pool-2-thread-996} SchedulerFactory.java[jobStarted]:109) - Job 20180413-191933_1545337614 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,527] ({pool-2-thread-996} NotebookServer.java[afterStatusChange]:2631) - Job 20180413-191933_1545337614 is finished successfully, status: FINISHED
INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180423-134725_1702290212
INFO [2018-08-10 00:11:57,547] ({DefaultQuartzScheduler_Worker-47} Notebook.java[execute]:947) - End schedule run note: 2C68U586U
INFO [2018-08-10 00:11:57,548] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:100) - Close Session: shared_session for interpreter setting: spark
INFO [2018-08-10 00:11:57,553] ({pool-2-thread-996} VFSNotebookRepo.java[save]:196) - Saving note:2C68U586U
Third job status from FINISHED becomes ABORT
WARN [2018-08-10 00:11:57,555] ({DefaultQuartzScheduler_Worker-47} NotebookServer.java[afterStatusChange]:2633) - Job 20180413-191933_1545337614 is finished, status: ABORT, exception: null, result: %text 'sometext'
INFO [2018-08-10 00:11:57,577] ({pool-2-thread-996} SchedulerFactory.java[jobFinished]:115) - Job 20180413-191933_1545337614 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark:user:2C68U586U-shared_session
INFO [2018-08-10 00:11:57,585] ({DefaultQuartzScheduler_Worker-47} ManagedInterpreterGroup.java[close]:130) - Job paragraph_1523636373190_-1466164905 aborted
Full log with debug messages:
INFO [2018-08-10 17:31:37,193] ({pool-2-thread-123} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-124513_1104099490 is finished successfully, status: FINISHED
INFO [2018-08-10 17:31:37,215] ({pool-2-thread-123} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:37,216] ({pool-2-thread-123} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:37,228] ({pool-2-thread-131} SchedulerFactory.java[jobStarted]:109) - Job 20180810-132950_1064210956 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:37,229] ({pool-2-thread-131} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-124513_1104099490, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,207] ({pool-2-thread-35} SchedulerFactory.java[jobFinished]:115) - Job 20180810-124513_1104099490 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
INFO [2018-08-10 17:31:38,224] ({pool-2-thread-131} Paragraph.java[jobRun]:460) - End of Run paragraph [paragraph_id: 20180810-132950_1064210956, interpreter: spark.pyspark, note_id: 2DNHBQ5N2, user: user1]
INFO [2018-08-10 17:31:38,227] ({pool-2-thread-131} NotebookServer.java[afterStatusChange]:2631) - Job 20180810-132950_1064210956 is finished successfully, status: FINISHED
INFO [2018-08-10 17:31:38,229] ({MyScheduler_Worker-5} Paragraph.java[execute]:343) - skip to run blank paragraph. 20180810-133022_784315150
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} Notebook.java[execute]:947) - End schedule run note: 2DNHBQ5N2
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:102) - Close Session: shared_session for interpreter setting: spark
INFO [2018-08-10 17:31:38,230] ({MyScheduler_Worker-5} RemoteScheduler.java[getJobsRunning]:135) - [DEBUG] RemoteScheduler adds paragraph_1533896990379_-679637373 to running list, job status is FINISHED [DEBUG]
INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:132) - [DEBUG] job paragraph_1533896990379_-679637373 is instanceof paragraph [DEBUG]
INFO [2018-08-10 17:31:38,231] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:133) - [DEBUG] Job description before aborting:
ParagraphId: 20180810-132950_1064210956
Status: FINISHED
Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
Json: { "text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)", "user": "user1", "dateUpdated": "2018-08-10 16:01:58.663", "config": { "colWidth": 12.0, "fontSize": 9.0, "enabled": true, "results": {}, "editorSetting": { "language": "python", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": true }, "editorMode": "ace/mode/python", "editorHide": false, "tableHide": true }, "settings": { "params": {}, "forms": {} }, "results": { "code": "SUCCESS", "msg": [ { "type": "TEXT", "data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n" } ] }, "apps": [], "jobName": "paragraph_1533896990379_-679637373", "id": "20180810-132950_1064210956", "dateCreated": "2018-08-10 13:29:50.379", "dateStarted": "2018-08-10 17:31:37.229", "dateFinished": "2018-08-10 17:31:38.225", "status": "FINISHED", "progressUpdateIntervalMs": 500 } [DEBUG]
INFO [2018-08-10 17:31:38,253] ({pool-2-thread-131} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:38,254] ({pool-2-thread-131} SchedulerFactory.java[jobFinished]:115) - Job 20180810-132950_1064210956 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-spark::2DNHBQ5N2-shared_session
WARN [2018-08-10 17:31:38,262] ({MyScheduler_Worker-5} NotebookServer.java[afterStatusChange]:2633) - Job 20180810-132950_1064210956 is finished, status: ABORT, exception: null, result: %text '/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log'
INFO [2018-08-10 17:31:38,275] ({MyScheduler_Worker-5} VFSNotebookRepo.java[save]:196) - Saving note:2DNHBQ5N2
INFO [2018-08-10 17:31:38,276] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:165) - Job paragraph_1533896990379_-679637373 aborted
INFO [2018-08-10 17:31:38,277] ({MyScheduler_Worker-5} ManagedInterpreterGroup.java[close]:167) - [DEBUG] Job description after aborting:
ParagraphId: 20180810-132950_1064210956
Status: ABORT
Config: {colWidth=12.0, fontSize=9.0, enabled=true, results={}, editorSetting={language=python, editOnDblClick=false, completionKey=TAB, completionSupport=true}, editorMode=ace/mode/python, editorHide=false, tableHide=true}
Json: { "text": "%spark.pyspark\nimport os\nimport shutil \n\nshutil.copy2(\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/zeppelin-egklimov-ubuntu013.log\u0027, \u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027)", "user": "user1", "dateUpdated": "2018-08-10 16:01:58.663", "config": { "colWidth": 12.0, "fontSize": 9.0, "enabled": true, "results": {}, "editorSetting": { "language": "python", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": true }, "editorMode": "ace/mode/python", "editorHide": false, "tableHide": true }, "settings": { "params": {}, "forms": {} }, "results": { "code": "SUCCESS", "msg": [ { "type": "TEXT", "data": "\u0027/home/egklimov/IdeaProjects/tcs-zeppelin/logs/tmp/zeppelin-egklimov-ubuntu013.log\u0027\n" } ] }, "apps": [], "jobName": "paragraph_1533896990379_-679637373", "id": "20180810-132950_1064210956", "dateCreated": "2018-08-10 13:29:50.379", "dateStarted": "2018-08-10 17:31:37.229", "dateFinished": "2018-08-10 17:31:38.225", "status": "ABORT", "progressUpdateIntervalMs": 500 } [DEBUG]