Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Description
The Italian stemmer does not work: SnowballAnalyzer returns Italian words unstemmed.
Consider this code:
var englishAnalyzer = new SnowballAnalyzer("English");
var tk = englishAnalyzer.TokenStream("text", new StringReader("horses"));
var ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
tk.IncrementToken();
Console.WriteLine("English stemmer: horses -> " + ta.Term());
var italianAnalyzer = new SnowballAnalyzer("Italian");
tk = italianAnalyzer.TokenStream("text", new StringReader("abbandonata"));
ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
tk.IncrementToken();
Console.WriteLine("Italian stemmer: abbandonata -> " + ta.Term());
In Lucene.Net it outputs:
English stemmer: horses -> hors
Italian stemmer: abbandonata -> abbandonata
Java Lucene 2.9.4, by contrast, outputs:
English stemmer: horses -> hors
Italian stemmer: abbandonata -> abbandon
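To make the comparison easier to repeat, the repro above could be wrapped in a small check like the following. This is only a sketch: the `StemCheck` class and `Stem` helper are hypothetical names, the `using` namespaces are assumed for Lucene.Net 2.9.x with the Snowball contrib assembly, and the expected stems are the Java Lucene 2.9.4 results quoted above.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Snowball;        // assumed namespace of SnowballAnalyzer
using Lucene.Net.Analysis.Tokenattributes; // assumed namespace of TermAttribute

static class StemCheck
{
    // Hypothetical helper: returns the first token the analyzer emits for the word.
    static string Stem(string language, string word)
    {
        var analyzer = new SnowballAnalyzer(language);
        var stream = analyzer.TokenStream("text", new StringReader(word));
        var term = (TermAttribute)stream.GetAttribute(typeof(TermAttribute));
        return stream.IncrementToken() ? term.Term() : null;
    }

    static void Main()
    {
        // Expected stems are taken from the Java Lucene 2.9.4 output in this report.
        string[][] cases =
        {
            new[] { "English", "horses", "hors" },
            new[] { "Italian", "abbandonata", "abbandon" },
        };

        foreach (var c in cases)
        {
            var actual = Stem(c[0], c[1]);
            var verdict = actual == c[2] ? "OK" : "MISMATCH (expected " + c[2] + ")";
            Console.WriteLine(c[0] + " stemmer: " + c[1] + " -> " + actual + " " + verdict);
        }
    }
}
```

On the affected versions, the Italian case would be expected to report a mismatch, since the word comes back unstemmed.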