[STANBOL-1417] Create Language Annotation for parsed "Content-Language" header - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.12.0
Fix Version/s: 1.0.0, 0.12.1
Component/s: Enhancement Engines
Labels: None

Description

Since STANBOL-660, Stanbol allows the language of the content to be passed explicitly via the "Content-Language" header. Currently, however, only the `dc:language` property is set on the ContentItem.

According to the specification in STANBOL-613, this property is only used as a fallback when no language annotation is present in the ContentItem. As soon as any Language Identification Engine is part of the Chain, the "Content-Language" value supplied by the user is ignored. This is not what a user who explicitly specifies the language intends.

To force Stanbol to use the supplied language, a Language Annotation with a confidence of 1.0 needs to be added to the metadata of the ContentItem instead.
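
The difference between the two behaviours can be illustrated with a small, self-contained sketch. It does not use the Stanbol API: `Metadata`, `LanguageAnnotation`, and `selectLanguage` are hypothetical stand-ins, and the rule that the annotation with the highest confidence wins is an assumption made for illustration. Only the fallback role of `dc:language` and the confidence of 1.0 come from the description above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative model only; these types are hypothetical and not part of Stanbol. */
final class LanguageSelectionSketch {

    /** A language annotation: a language code plus a confidence in [0, 1]. */
    static final class LanguageAnnotation {
        final String language;
        final double confidence;

        LanguageAnnotation(String language, double confidence) {
            this.language = language;
            this.confidence = confidence;
        }
    }

    /** Simplified ContentItem metadata: a dc:language value and language annotations. */
    static final class Metadata {
        String dcLanguage; // fallback value, null if not set
        final List<LanguageAnnotation> annotations = new ArrayList<>();
    }

    /**
     * Assumed selection rule: the annotation with the highest confidence wins;
     * dc:language is consulted only when no annotation exists.
     */
    static Optional<String> selectLanguage(Metadata metadata) {
        Optional<String> annotated = metadata.annotations.stream()
                .max(Comparator.comparingDouble((LanguageAnnotation a) -> a.confidence))
                .map(a -> a.language);
        return annotated.isPresent() ? annotated : Optional.ofNullable(metadata.dcLanguage);
    }

    /** Behaviour described as current: the header value only sets dc:language. */
    static void applyHeaderAsDcLanguage(Metadata metadata, String headerLanguage) {
        metadata.dcLanguage = headerLanguage;
    }

    /** Behaviour proposed in this issue: add an annotation with confidence 1.0. */
    static void applyHeaderAsAnnotation(Metadata metadata, String headerLanguage) {
        metadata.annotations.add(new LanguageAnnotation(headerLanguage, 1.0));
    }

    public static void main(String[] args) {
        // The user sends "Content-Language: de"; an engine later detects "en" at 0.8.
        Metadata current = new Metadata();
        applyHeaderAsDcLanguage(current, "de");
        current.annotations.add(new LanguageAnnotation("en", 0.8));
        System.out.println(selectLanguage(current).orElse("unknown")); // prints "en"

        Metadata proposed = new Metadata();
        applyHeaderAsAnnotation(proposed, "de");
        proposed.annotations.add(new LanguageAnnotation("en", 0.8));
        System.out.println(selectLanguage(proposed).orElse("unknown")); // prints "de"
    }
}
```

In the first case the detected language overrides the user's choice because `dc:language` is never consulted once an annotation exists. In the second case the user's annotation cannot be outranked by a detection result with a lower confidence.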

Issue Links

is related to

STANBOL-613 Define a standard way on how to obtain the extracted language

Closed

supersedes

STANBOL-660 Stanbol Enhancer should allow to manually parse the language of the Content

Resolved

People

Assignee: Rupert Westenthaler

Reporter: Rupert Westenthaler

Votes: 0

Watchers: 1

Dates

Created: 16/Apr/15 05:44

Updated: 23/Apr/15 12:44

Resolved: 23/Apr/15 12:44