Details
-
Documentation
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
1.7.0
Description
The chunker training data format is described as follows: "The train data consist of three columns separated by spaces. Each word has been put on a separate line and there is an empty line after each sentence." However, in the example, several spaces appear between the tokens and tags. First, the spacing looks like tabs, which are not allowed. Second, runs of several spaces are not allowed either (apparently each line String is split with split(" ")). Suggestion: emphasize that columns are separated by exactly one space and that tabs are not allowed.
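The following is a minimal sketch of why the extra whitespace matters, assuming the parser really does split each line with `String.split(" ")` as the description suspects. The class name `ChunkerLineCheck`, the helper `normalize`, and the sample line are hypothetical and only serve as illustration; they are not taken from the OpenNLP code base.

```java
class ChunkerLineCheck {

    // Assumed parser behaviour from the description: split on exactly one space.
    static String[] splitOnSingleSpace(String line) {
        return line.split(" ");
    }

    // Hypothetical helper: collapse any whitespace run (tabs included)
    // into a single space so the line has exactly three columns again.
    static String normalize(String line) {
        return line.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String good = "He PRP B-NP";       // one space between columns
        String padded = "He   PRP  B-NP";  // several spaces between columns
        String tabbed = "He\tPRP\tB-NP";   // tabs between columns

        System.out.println(splitOnSingleSpace(good).length);   // 3 columns
        System.out.println(splitOnSingleSpace(padded).length); // 6, empty strings included
        System.out.println(splitOnSingleSpace(tabbed).length); // 1, no space to split on

        // After normalization both malformed lines yield three columns.
        System.out.println(splitOnSingleSpace(normalize(padded)).length); // 3
        System.out.println(splitOnSingleSpace(normalize(tabbed)).length); // 3
    }
}
```

Only the single-space line produces the expected three columns; padded lines produce empty extra fields and tab-separated lines are not split at all, which is why the documentation example is misleading.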