Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
ManifoldCF 0.1, ManifoldCF 0.2
None
Description
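If HTML files are excluded from a job, the links in those files are not followed, so documents reachable only through those pages are never discovered. If we add inclusion and exclusion filters that are applied after link extraction, it will be possible to crawl through HTML pages while fetching only certain types of documents, such as PDFs.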
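
The idea can be illustrated with a minimal sketch. This is not ManifoldCF code or its API. The in-memory `SITE` map, the `crawl` function, and the `index_include` / `index_exclude` parameters are all hypothetical names chosen for illustration. The sketch assumes that every reachable page is parsed for links first and that the inclusion and exclusion patterns are applied only afterwards, when deciding which documents to keep.

```python
import re

# Hypothetical in-memory "site": URL -> outgoing links found on that page.
SITE = {
    "http://example.com/index.html": [
        "http://example.com/docs.html",
        "http://example.com/a.pdf",
    ],
    "http://example.com/docs.html": [
        "http://example.com/b.pdf",
        "http://example.com/logo.png",
    ],
    "http://example.com/a.pdf": [],
    "http://example.com/b.pdf": [],
    "http://example.com/logo.png": [],
}


def matches(url, patterns):
    """Return True if any regular expression in patterns matches the URL."""
    return any(re.search(p, url) for p in patterns)


def crawl(seed, index_include, index_exclude=()):
    """Breadth-first crawl that follows every link, then filters afterwards.

    Link extraction happens for every visited URL. The include/exclude
    patterns only decide which visited URLs end up in the result.
    """
    seen, queue, indexed = {seed}, [seed], []
    while queue:
        url = queue.pop(0)
        # Step 1: extract links regardless of the filters.
        for link in SITE.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
        # Step 2: apply the post-extraction filters.
        if matches(url, index_include) and not matches(url, index_exclude):
            indexed.append(url)
    return indexed


if __name__ == "__main__":
    print(crawl("http://example.com/index.html", [r"\.pdf$"]))
```

With the pattern `\.pdf$`, both PDFs are returned even though one of them is linked only from an HTML page that itself does not match the filter. If the same pattern were applied before link extraction, `docs.html` would be skipped and `b.pdf` would never be found.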