[CONNECTORS-946] Add support for pipeline connector - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: ManifoldCF 1.7
Fix Version/s: ManifoldCF 1.7
Component/s: Framework crawler agent
Labels: None

Description

In the Amazon Search Connector, we finally found an example of an output connector that needs to do full document processing in order to work. This ticket covers the framework work needed to introduce the concept of a "pipeline connector". Pipeline connections would receive RepositoryDocument objects and transform them into new RepositoryDocument objects. There would be a single important method:

public void transformDocument(RepositoryDocument rd, ITransformationActivities activities) throws ServiceInterruption, ManifoldCFException;

... where ITransformationActivities would include a method that would send a RepositoryDocument object onward to either the output connection or to the next pipeline connection.
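
To make the proposed contract concrete, here is a minimal, self-contained sketch of one pipeline stage. It is not the ManifoldCF API: Doc, Activities, Transformer and AddFieldTransformer are hypothetical stand-ins for RepositoryDocument, ITransformationActivities and a pipeline connector, the method name sendDocument is an assumption (the ticket only says such a method would exist), and the checked exceptions ServiceInterruption and ManifoldCFException are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PipelineSketch {
    /** Hypothetical stand-in for RepositoryDocument: only a bag of metadata fields. */
    static final class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** Hypothetical stand-in for ITransformationActivities; sendDocument is an assumed name. */
    interface Activities {
        void sendDocument(Doc doc);
    }

    /** Hypothetical stand-in for a pipeline connector with its single transform method. */
    interface Transformer {
        void transformDocument(Doc doc, Activities activities);
    }

    /** Example stage: copies the incoming document, adds one field, forwards the copy. */
    static final class AddFieldTransformer implements Transformer {
        private final String name;
        private final String value;

        AddFieldTransformer(String name, String value) {
            this.name = name;
            this.value = value;
        }

        @Override
        public void transformDocument(Doc doc, Activities activities) {
            Doc out = new Doc();
            out.fields.putAll(doc.fields);   // leave the incoming document untouched
            out.fields.put(name, value);     // the transformation itself
            activities.sendDocument(out);    // hand the new document to the next stage or the output
        }
    }

    /** Runs one stage and collects whatever it sends onward. */
    static List<Doc> runOnce(Transformer stage, Doc input) {
        List<Doc> received = new ArrayList<>();
        stage.transformDocument(input, received::add);
        return received;
    }

    public static void main(String[] args) {
        Doc input = new Doc();
        input.fields.put("title", "hello");
        List<Doc> out = runOnce(new AddFieldTransformer("source", "test"), input);
        System.out.println(out.get(0).fields);
    }
}
```

Because the stage pushes its result through the activities object rather than returning it, the same shape would also allow a stage to drop a document (by never calling sendDocument) or to emit more than one; whether the real framework permits either is not stated in this ticket.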

Each pipeline connection would have:

A name
A description
Configuration data
An optional prerequisite pipeline connection

Every output connection would have a new field, which is an optional prerequisite pipeline connection.
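
The prerequisite links described above form a chain that ends at the output connection. The following sketch shows one way such a chain could be resolved into an execution order. It assumes that a prerequisite runs before the connection that names it, and that connections are looked up by name; PipelineConnection, resolveOrder and the field names are hypothetical and do not come from the ManifoldCF code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PrerequisiteChainSketch {
    /** Hypothetical record of a pipeline connection with the four attributes from the ticket. */
    static final class PipelineConnection {
        final String name;
        final String description;
        final Map<String, String> configuration;
        final String prerequisiteName; // null when there is no prerequisite

        PipelineConnection(String name, String description,
                           Map<String, String> configuration, String prerequisiteName) {
            this.name = name;
            this.description = description;
            this.configuration = configuration;
            this.prerequisiteName = prerequisiteName;
        }
    }

    /**
     * Walks backwards from the output connection's prerequisite and returns
     * the stage names in the order they would run (first stage first).
     * Assumption: a prerequisite always runs before the connection naming it.
     */
    static List<String> resolveOrder(String outputPrerequisite,
                                     Map<String, PipelineConnection> byName) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = outputPrerequisite;
        while (current != null) {
            if (!seen.add(current)) {
                throw new IllegalStateException("Prerequisite cycle at: " + current);
            }
            PipelineConnection connection = byName.get(current);
            if (connection == null) {
                throw new IllegalArgumentException("Unknown pipeline connection: " + current);
            }
            order.add(current);
            current = connection.prerequisiteName;
        }
        Collections.reverse(order); // collected last-to-first, so flip it
        return order;
    }
}
```

With an output connection whose prerequisite is a stage "metadata", which in turn has the prerequisite "extract", resolveOrder would return the stages as extract, then metadata. An output connection with no prerequisite yields an empty list, meaning documents go straight to the output.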

This design is based loosely on how mapping connections and authority connections interrelate. An alternate design would involve having per-job specification information, but I think this would wind up being way too complex for very little benefit, since each pipeline connection/stage would be expected to do relatively simple/granular things, not usually involving interaction with an external system.

Issue Links

is depended upon by

CONNECTORS-954 Amazon Cloud Search connector's use of Tika should be revisited after pipelines are added (Resolved)

CONNECTORS-955 Forced Metadata transformation connector would be a nice addition (Resolved)

People

Assignee: Karl Wright

Reporter: Karl Wright

Votes: 0

Watchers: 1

Dates

Created: 27/May/14 23:12

Updated: 09/Jun/14 23:20

Resolved: 09/Jun/14 23:20