  1. Lucene.Net
  2. LUCENENET-459

Italian stemmer (from SnowballAnalyzer) does not work

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
    • Fix Version/s: Lucene.Net 2.9.4g
    • Component/s: Lucene.Net Contrib
    • Labels: None

    Description

      The Italian stemmer does not work: it returns its input unchanged.

      Consider this code:

      using System;
      using System.IO;
      using Lucene.Net.Analysis.Snowball;
      using Lucene.Net.Analysis.Tokenattributes;

      // English: stems as expected.
      var englishAnalyzer = new SnowballAnalyzer("English");
      var tk = englishAnalyzer.TokenStream("text", new StringReader("horses"));
      var ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
      tk.IncrementToken();
      Console.WriteLine("English stemmer: horses -> " + ta.Term());

      // Italian: same steps, but the term comes back unstemmed.
      var italianAnalyzer = new SnowballAnalyzer("Italian");
      tk = italianAnalyzer.TokenStream("text", new StringReader("abbandonata"));
      ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
      tk.IncrementToken();
      Console.WriteLine("Italian stemmer: abbandonata -> " + ta.Term());

      It outputs:

      English stemmer: horses -> hors
      Italian stemmer: abbandonata -> abbandonata

      Java Lucene 2.9.4, by contrast, outputs:

      English stemmer: horses -> hors
      Italian stemmer: abbandonata -> abbandon

          People

            Assignee: Digy (digydigy)
            Reporter: Santiago M. Mola (smolav)
            Votes: 0
            Watchers: 1
