Description
When any23 is asked to extract semantics from a web document that is not encoded in UTF-8 and in which the TITLE element precedes the encoding declaration, any23 fails with the error "Invalid content ''".
Example of such a URL:
http://www.kinopoisk.ru/film/565993/
Compressed dump of this page is attached.
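One general way to cope with an encoding declaration that comes after the TITLE is to scan the raw bytes for a charset declaration before decoding anything. The Python sketch below illustrates that idea only; it is not any23's actual code. The names META_CHARSET and decode_html, the 4096-byte scan window, and the UTF-8 fallback are all assumptions made for illustration.

```python
import re

# Hypothetical helper: looks for charset=... inside a <meta> tag in raw bytes.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def decode_html(raw: bytes, default: str = "utf-8") -> str:
    """Decode HTML bytes using a declared charset, wherever it appears.

    The scan runs on undecoded bytes, so a TITLE that precedes the
    declaration is still decoded with the declared charset.
    """
    match = META_CHARSET.search(raw[:4096])  # assumed scan window
    charset = match.group(1).decode("ascii") if match else default
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back rather than fail outright.
        return raw.decode(default, errors="replace")


# A made-up page in which TITLE comes before the encoding declaration.
page = (
    b"<html><head><title>"
    + "Пираньи 3DD".encode("windows-1251")
    + b'</title><meta http-equiv="Content-Type" '
    + b'content="text/html; charset=windows-1251"></head></html>'
)
text = decode_html(page)
```

With this approach the title decodes correctly even though it appears in the document before the charset is declared.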
any23 http://www.kinopoisk.ru/film/565993/
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
------------------------------------------------------------------------
Apache Any23 :: rover
------------------------------------------------------------------------
@prefix dcterms: <http://purl.org/dc/terms/> .
<http://www.kinopoisk.ru/film/565993/> dcterms:title "Ïèðàíüè 3DD" .
------------------------------------------------------------------------
Apache Any23 FAILURE
Execution terminated with errors: Invalid content ''
Total time: 1s
Finished at: Mon Jul 15 20:31:14 MSK 2013
Final Memory: 67M/479M
------------------------------------------------------------------------
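The garbled dcterms:title in the output above is consistent with windows-1251 bytes having been decoded as ISO-8859-1. The short Python sketch below reproduces that effect. The windows-1251 page encoding and the original Cyrillic title are assumptions inferred from the garbled string, not something the log states.

```python
# Assumed original title, inferred by reversing the garbled output.
title = "Пираньи 3DD"

# Bytes as a windows-1251 page would serve them (assumed encoding).
raw = title.encode("windows-1251")

# Decoding those bytes with the wrong charset yields the mojibake.
garbled = raw.decode("iso-8859-1")
print(garbled)  # Ïèðàíüè 3DD

# The mapping is lossless, so the original text can be recovered.
recovered = garbled.encode("iso-8859-1").decode("windows-1251")
print(recovered)  # Пираньи 3DD
```

This only explains the garbled title; it does not by itself account for the "Invalid content ''" failure.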
Attachments
Issue Links
- duplicates ANY23-115 Empty spans seem to break ANY23 (Resolved)