TIKA-2514

Create alternate ForkParser that doesn't require serialization


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The ForkParser is a great option for handling OOM errors and permanent hangs, and from a code/design perspective, IMHO, it is a thing of beauty.

      On the user list, JimIdle recently pointed out that the ForkParser can't work with custom parsers that depend on non-serializable components.

      It would be great to allow users to specify a TIKA_HOME variable, or pass in a directory containing the Tika-related jars, and run the server as a separate process with that directory as its classpath. This would also make adding optional jars much easier and could prevent jar hell with the calling application.
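
A minimal sketch of the idea, not an existing Tika API: the parent resolves a jar directory (an explicit argument first, then TIKA_HOME) and starts a child JVM whose classpath is that directory's jars, so nothing has to be serialized across the fork. The class name ForkedTikaLauncher, its methods, and the child entry point com.example.TikaChildServer are all hypothetical names chosen for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ForkedTikaLauncher {

    // Hypothetical main class of the child process; not a real Tika class.
    static final String CHILD_MAIN_CLASS = "com.example.TikaChildServer";

    /** Pick the jar directory: an explicit directory wins, then TIKA_HOME. */
    static Path resolveJarDir(String explicitDir, String tikaHomeEnv) {
        if (explicitDir != null && !explicitDir.isEmpty()) {
            return Paths.get(explicitDir);
        }
        if (tikaHomeEnv != null && !tikaHomeEnv.isEmpty()) {
            return Paths.get(tikaHomeEnv);
        }
        throw new IllegalArgumentException("No jar directory given and TIKA_HOME is not set");
    }

    /** Build the child JVM command line; "dir/*" puts every jar in dir on the classpath. */
    static List<String> buildCommand(Path jarDir) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        // The JVM itself expands the trailing "*"; no shell is involved.
        cmd.add(jarDir.toString() + File.separator + "*");
        cmd.add(CHILD_MAIN_CLASS);
        return cmd;
    }

    /** Start the child process; its stderr is forwarded to the parent's stderr. */
    static Process launch(Path jarDir) throws IOException {
        return new ProcessBuilder(buildCommand(jarDir))
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
    }
}
```

A caller might use `ForkedTikaLauncher.launch(ForkedTikaLauncher.resolveJarDir(null, System.getenv("TIKA_HOME")))` and then talk to the child over its stdin/stdout. Because the child loads parsers from its own classpath, optional jars only need to be dropped into that directory, and they never touch the calling application's classpath.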

      Bonus points for enabling the RecursiveParserWrapper to work.


          People

            Assignee: Unassigned
            Reporter: Tim Allison (tallison)
            Votes: 0
            Watchers: 1
