Namespace Lucene.Net.Analysis.Snowball

Classes

SnowballAnalyzer

Filters StandardTokenizer with StandardFilter, LowerCaseFilter, StopFilter and SnowballFilter.

Available stemmers are listed in Lucene.Net.Tartarus.Snowball.Ext. The name of a stemmer is the part of the class name before "Stemmer"; for example, the stemmer in EnglishStemmer is named "English".

NOTE: This class uses the same LuceneVersion-dependent settings as StandardAnalyzer, with the following addition:

As of 3.1, uses TurkishLowerCaseFilter for the Turkish language.

SnowballFilter

A filter that stems words using a Snowball-generated stemmer.

Available stemmers are listed in Lucene.Net.Tartarus.Snowball.Ext.

NOTE: SnowballFilter expects lowercased text.

For the Turkish language, see TurkishLowerCaseFilter.
For other languages, see LowerCaseFilter.
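
As an illustration of the lowercasing requirement, the following is a minimal sketch of a Solr-style field type for Turkish text, written in the same configuration style as the SnowballPorterFilterFactory example on this page. The field type name text_snowball_tr is hypothetical, and the sketch assumes the TurkishLowerCaseFilterFactory and the "Turkish" stemmer are available in your installation.

<fieldType name="text_snowball_tr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Turkish-aware lowercasing (handles dotted and dotless I) must run before stemming -->
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/>
  </analyzer>
</fieldType>

For any other language, the same chain would use solr.LowerCaseFilterFactory in place of the Turkish filter.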

NOTE: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed to the stemmer, set IsKeyword to true in an earlier TokenStream.

NOTE: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
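
One way to keep the original term next to its stem is sketched below as a Solr-style field type. The field type name text_snowball_keep_original is hypothetical, and the trailing RemoveDuplicatesTokenFilterFactory is an optional addition that is not mentioned above; treat this as a starting point rather than a recommended configuration.

<fieldType name="text_snowball_keep_original" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits each token twice: one copy flagged as a keyword, one not -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <!-- Stems only the copy that is not flagged as a keyword -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Drops the second copy when stemming left the term unchanged -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Because the keyword-flagged copy skips the stemmer, both the surface form and the stem are indexed at the same position.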

SnowballPorterFilterFactory

Factory for SnowballFilter, with a configurable language.

Note: Use of the "Lovins" stemmer is not recommended, as it is implemented with reflection.

<fieldType name="text_snowballstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protectedkeyword.txt" language="English"/>
  </analyzer>
</fieldType>