Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Won't Fix
ManifoldCF 2.9.1
None
Description
I am crawling a file system mounted on a Linux machine, so the repository connection is of type "File System". ManifoldCF does not pick up files whose names contain certain special characters.
File ex: a_XY-SMnA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf
exception: java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""
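The trace above shows the exception being raised inside Long.parseLong, reached through the Long constructor called from DocumentFilter$SpecPacker, with an empty input string. The following is a minimal sketch that reproduces that exception message in isolation and shows one defensive way to parse such a value. The class and method names (EmptyLongParseDemo, describeFailure, parseLongOrDefault) are hypothetical and are not part of ManifoldCF; this is not the connector's actual code.

```java
// Hypothetical demo class; not ManifoldCF code.
class EmptyLongParseDemo {

    // Returns the NumberFormatException message produced when raw cannot
    // be parsed as a long, or null if parsing succeeds.
    static String describeFailure(String raw) {
        try {
            Long.parseLong(raw);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    // Parses raw as a long, returning fallback when raw is null, blank,
    // or not a valid number, instead of letting the exception propagate.
    static long parseLongOrDefault(String raw, long fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // An empty string triggers the same message seen in the log.
        System.out.println(describeFailure(""));
        // The defensive variant falls back instead of throwing.
        System.out.println(parseLongOrDefault("", -1L));
        System.out.println(parseLongOrDefault("1024", -1L));
    }
}
```

Because the failing call takes an empty string, it may be worth checking whether a numeric field in the job's Document Filter transformation settings was left blank, although the trace alone does not confirm that.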