Details
- Bug
- Status: Closed
- Major
- Resolution: Won't Fix
- 2.3.1, 2.3.2
- None
- None
- Java 5
- New
Description
As of 2.3.1, the documentation for the StandardTokenizer states that it "Recognizes email addresses and internet hostnames as one token."
However, a hostname such as "my-host.com" is split into two tokens: "my" and "host.com".
No hostname that contains a dash is recognized as a single token.
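The split described above can be illustrated with a small standalone sketch. It is not Lucene's tokenizer or its grammar. It is a simplified model that assumes a hostname token is a series of alphanumeric runs joined by dots, and that a dash is not part of an alphanumeric run. The class name `HostTokenSketch` and its `tokenize` method are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model for illustration only; NOT Lucene's StandardTokenizer.
class HostTokenSketch {
    // Assumption: an alphanumeric run does not include '-'.
    private static final String ALPHANUM = "[A-Za-z0-9]+";

    // Assumption: a host is two or more alphanumeric runs joined by dots.
    // The host alternative is listed first so it is preferred over a bare run.
    private static final Pattern TOKEN = Pattern.compile(
            ALPHANUM + "(?:\\." + ALPHANUM + ")+" + "|" + ALPHANUM);

    // Returns the tokens found in the text; any other character is a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The dash ends the first run, so the host pattern only matches after it.
        System.out.println(tokenize("my-host.com")); // prints [my, host.com]
        // Without a dash the whole hostname is one token.
        System.out.println(tokenize("myhost.com"));  // prints [myhost.com]
    }
}
```

Under these assumptions the dash acts as a separator, so "my" is emitted on its own and only "host.com" matches the host pattern, which is the behavior reported here.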
Issue Links
- is related to: LUCENE-1373 Most of the contributed Analyzers suffer from invalid recognition of acronyms. (Resolved)