Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.7, 0.8
- Fix Version/s: None
Description
Running 'forrest' starts at the virtual document called linkmap.html, where the Cocoon crawler gathers the initial set of links; it then starts crawling and generating pages. Any new links are pushed onto the linkmap. However, for some sites, such as our own "seed-sample" and "site-author", there is a sudden jump in the number of URIs remaining to be processed.
This is due to a URI with a leading slash (e.g. /samples/faq.html). When that URI is processed, it gains a whole new set of links all with leading slashes, and so the list of URIs is potentially doubled.
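The doubling can be reproduced with a toy model. The sketch below is not Cocoon's crawler: the three-page `SITE` map and the `crawl` function are hypothetical, and it assumes that links found on a page reached through a leading-slash URI come back with leading slashes too, as described above.

```python
from collections import deque

# Hypothetical three-page site: each page lists the links it contains,
# written without a leading slash.
SITE = {
    "index.html": ["samples/index.html", "samples/faq.html"],
    "samples/index.html": ["index.html", "samples/faq.html"],
    "samples/faq.html": ["index.html", "samples/index.html"],
}


def crawl(start, extra_links=None):
    """Return the set of URI strings a naive crawler would process.

    extra_links maps a page URI to additional links found on that page;
    it is used here to inject a single leading-slash link.
    """
    extra_links = extra_links or {}
    seen = {start}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        rooted = uri.startswith("/")
        links = SITE[uri.lstrip("/")] + extra_links.get(uri, [])
        for link in links:
            # Assumption: a page reached via a leading-slash URI yields
            # links that also carry a leading slash.
            if rooted and not link.startswith("/"):
                link = "/" + link
            # URIs are compared as plain strings, so "/x" and "x" differ.
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


clean = crawl("index.html")
polluted = crawl("index.html", {"index.html": ["/samples/faq.html"]})
print(len(clean), len(polluted))
```

In this model a single leading-slash link on one page is enough for every page to be queued twice, once under each spelling.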
This issue could be due to a user error, i.e. adding a link that deliberately begins with a slash. Sometimes, that is unavoidable.
However, we do have a sitemap transformer to "relativize" and "absolutize" the links. Should it always trim the leading slash? Or are there cases where that should not happen, so that we cannot generalise?
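To make the question concrete, here is a minimal sketch of what "always trim the leading slash" might look like. It is not the existing transformer: `trim_leading_slash` and `dedupe` are hypothetical helpers, and the choice to leave protocol-relative (`//host/...`) and scheme-qualified URIs untouched is an assumption about one case where trimming would be wrong.

```python
def trim_leading_slash(uri):
    """Strip leading slashes from a site-internal URI.

    Assumption: URIs starting with "//" or containing "://" point
    outside the site and are returned unchanged.
    """
    if uri.startswith("//") or "://" in uri:
        return uri
    return uri.lstrip("/")


def dedupe(uris):
    """Normalise each URI and drop repeats, keeping first-seen order."""
    seen = set()
    result = []
    for uri in uris:
        key = trim_leading_slash(uri)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


queue = ["samples/faq.html", "/samples/faq.html", "index.html"]
print(dedupe(queue))
```

With this normalisation the two spellings of samples/faq.html collapse into one queue entry. Whether that is safe for every site is exactly the open question.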
Attachments
Issue Links
- is related to FOR-271 extra leading slash when absolute path URIs used in site.xml (Closed)