Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Hi there,
We have run into an error case that the Generic Repository Connector still does not handle, and would like to suggest the following fix to its source code.
For details about this issue, please refer to the information below:
1. Connector name
Generic Repository Connector
2. Issue
When the Generic Repository Connector calls the REST API with action=seed and an error occurs, the corresponding error handling is not executed. As a result, the ManifoldCF crawling job freezes at status Starting up and no error message is output.
- When this happens in the Generic Repository Connector, the seed phase of jobs on other repositories also freezes (perhaps because the seeding thread itself is frozen).
- Restarting ManifoldCF does not help: jobs are started again automatically, so the same issue recurs.
- A temporary workaround is to abort the job and recheck the connection.
3. Reproduction
Reproduction method:
- When configuring the Generic repository connection, set a non-existent entry point (e.g. http://localhost/no*exist/). Then define a job that uses that connection and run it.
- Ten minutes or more after the job is started, its status is still Starting up; it never ends abnormally with a connection error or time-out.
Reproduction steps:
- Create a Generic repository connection with the following settings:
  - On the Entry Point tab, set a non-existent entry point (e.g. http://localhost/no*exist/)
- Create a job that uses the above Generic repository connection
- Start the job and keep track of its status
- The job freezes in the following state:
  - Status: Starting up
  - Start Time: Not started
  - Documents: 0
  - No new events appear in Document Status
  - No errors are logged in manifoldcf.log
4. Cause
In the GenericConnector$ExecuteSeedingThread class, seedBuffer.signalDone() is called only when the returned HTTP status code is 200.
- When the connector cannot reach the REST API, or the returned HTTP status code is anything other than 200, seedBuffer.signalDone() is not called.
- As a result, the complete flag is never set to true.
- Because the complete flag stays false and buffer.size() is 0, the job is stuck in wait(), inside the while loop of the XThreadStringBuffer#fetch() method:
  while (buffer.size() == 0 && !complete)
    wait();
⇒ This is why the job stays frozen at status Starting up.
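The blocking behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the ManifoldCF code: SeedBufferSketch and SeedingDemo are hypothetical stand-ins for XThreadStringBuffer and the seeding thread, and the HTTP call is simulated by a status code passed in as a parameter. It only illustrates how a consumer blocked in fetch() depends on signalDone() being reached on every path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for XThreadStringBuffer; not the ManifoldCF class itself.
class SeedBufferSketch {
  private final List<String> buffer = new ArrayList<>();
  private boolean complete = false;

  public synchronized void add(String id) {
    buffer.add(id);
    notifyAll();
  }

  // Marks the stream as finished and wakes any thread blocked in fetch().
  public synchronized void signalDone() {
    complete = true;
    notifyAll();
  }

  // Returns the next id, or null once the producer has signalled completion.
  public synchronized String fetch() throws InterruptedException {
    while (buffer.size() == 0 && !complete)
      wait();
    if (buffer.size() == 0)
      return null;
    return buffer.remove(0);
  }
}

class SeedingDemo {
  // Simulates one seeding request that "returns" httpStatus.
  // Returns true if the consumer left fetch() within timeoutMs.
  static boolean consumerFinishes(int httpStatus, boolean signalInFinally, long timeoutMs) {
    SeedBufferSketch seedBuffer = new SeedBufferSketch();
    Thread producer = new Thread(() -> {
      try {
        if (httpStatus != 200)
          throw new IllegalStateException("Unexpected HTTP status " + httpStatus);
        seedBuffer.add("doc-1");
        if (!signalInFinally)
          seedBuffer.signalDone(); // old placement: only reached on the 200 path
      } catch (IllegalStateException e) {
        // The simulated error is simply dropped in this sketch.
      } finally {
        if (signalInFinally)
          seedBuffer.signalDone(); // proposed placement: reached on every path
      }
    });
    Thread consumer = new Thread(() -> {
      try {
        while (seedBuffer.fetch() != null) { /* drain the buffer */ }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    consumer.setDaemon(true);
    producer.start();
    consumer.start();
    try {
      producer.join();
      consumer.join(timeoutMs);
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
    boolean finished = !consumer.isAlive();
    consumer.interrupt(); // release the consumer if it is still waiting
    return finished;
  }

  public static void main(String[] args) {
    System.out.println("old placement, 404: " + consumerFinishes(404, false, 500));
    System.out.println("finally placement, 404: " + consumerFinishes(404, true, 500));
  }
}
```

With the old placement and a non-200 status, the consumer should still be waiting when the timeout expires, which mirrors the job staying at Starting up. With signalDone() in the finally block it should return promptly.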
5. Solution
To resolve this issue, we suggest the following:
- seedBuffer.signalDone() should be called for every HTTP response status, not only 200.
- In addition, a ManifoldCFException is thrown when the HTTP status code is not 200, but the finishUp() method of the GenericConnector$ExecuteSeedingThread class has no branch that handles ManifoldCFException. A branch that rethrows this exception should be added.
6. Suggested source code (based on release 2.22.1)
-   seedBuffer.signalDone();
  } finally {
    EntityUtils.consume(response.getEntity());
    method.releaseConnection();
+   seedBuffer.signalDone();
  }

  if (thr instanceof RuntimeException) {
    throw (RuntimeException) thr;
  } else if (thr instanceof Error) {
    throw (Error) thr;
+ } else if (thr instanceof ManifoldCFException) {
+   throw (ManifoldCFException) thr;
  } else {
    throw new RuntimeException("Unhandled exception of type: " + thr.getClass().getName(), thr);
  }
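The effect of the added branch can be sketched on its own. This is an illustration under assumptions, not the connector's code: ConnectorException is a hypothetical checked exception standing in for ManifoldCFException, SeedingThreadSketch is a hypothetical stand-in for ExecuteSeedingThread, and finishUp() is assumed to join the thread and rethrow whatever it captured. The outcome() helper exists only for the demonstration.

```java
// Hypothetical checked exception standing in for ManifoldCFException.
class ConnectorException extends Exception {
  ConnectorException(String message) {
    super(message);
  }
}

// Hypothetical stand-in for GenericConnector$ExecuteSeedingThread.
class SeedingThreadSketch extends Thread {
  private final int httpStatus;
  private volatile Throwable exception = null;

  SeedingThreadSketch(int httpStatus) {
    this.httpStatus = httpStatus;
    setDaemon(true);
  }

  @Override
  public void run() {
    try {
      // The HTTP call is simulated: any status other than 200 is an error.
      if (httpStatus != 200)
        throw new ConnectorException("Seeding failed with HTTP status " + httpStatus);
    } catch (Throwable t) {
      exception = t; // captured here, rethrown by finishUp() on the caller's thread
    }
  }

  // Waits for the thread, then rethrows the captured throwable with its original type.
  public void finishUp() throws InterruptedException, ConnectorException {
    join();
    Throwable thr = exception;
    if (thr != null) {
      if (thr instanceof RuntimeException) {
        throw (RuntimeException) thr;
      } else if (thr instanceof Error) {
        throw (Error) thr;
      } else if (thr instanceof ConnectorException) {
        // The branch corresponding to the proposed ManifoldCFException case.
        throw (ConnectorException) thr;
      } else {
        throw new RuntimeException("Unhandled exception of type: " + thr.getClass().getName(), thr);
      }
    }
  }

  // Demonstration helper: reports which exception type reaches the caller.
  static String outcome(int httpStatus) {
    SeedingThreadSketch t = new SeedingThreadSketch(httpStatus);
    t.start();
    try {
      t.finishUp();
      return "none";
    } catch (Exception e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println("status 200: " + outcome(200));
    System.out.println("status 404: " + outcome(404));
  }
}
```

Without the dedicated branch, the checked exception would fall through to the final else and reach the caller wrapped in a RuntimeException, so the caller could no longer handle it as a connector error.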