Namespace Lucene.Net.Analysis.Snowball
Classes
SnowballAnalyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter, StopFilter and SnowballFilter.
Available stemmers are listed in Lucene.Net.Tartarus.Snowball.Ext. The name of a stemmer is the part of the class name before "Stemmer"; for example, the stemmer in EnglishStemmer is named "English".
NOTE: This class uses the same LuceneVersion dependent settings as StandardAnalyzer, with the following addition:
- As of 3.1, uses TurkishLowerCaseFilter for the Turkish language.
SnowballFilter
A filter that stems words using a Snowball-generated stemmer.
Available stemmers are listed in Lucene.Net.Tartarus.Snowball.Ext.
NOTE: SnowballFilter expects lowercased text.
- For the Turkish language, see TurkishLowerCaseFilter.
- For other languages, see LowerCaseFilter.
Note: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed to the stemmer, set IsKeyword to true in a previous TokenStream.
Note: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
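As a rough illustration of keeping both the original and the stemmed term, the following configuration sketch places a keyword-repeat filter ahead of the stemmer. The field type name is hypothetical, and the trailing RemoveDuplicatesTokenFilterFactory is an addition not mentioned above; it is assumed to be available in the same analysis package and is there to drop the second copy when stemming leaves a term unchanged.

```xml
<fieldType name="text_snowball_keep_original" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits each token twice: one copy flagged as a keyword, one not -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <!-- Stems only the copy that is not flagged as a keyword -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Assumption: removes the duplicate when the stem equals the original term -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

The order matters in this sketch: the repeat filter must run before the stemmer so that the keyword-flagged copy passes through unstemmed.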
SnowballPorterFilterFactory
Factory for SnowballFilter, with configurable language.
Note: Use of the "Lovins" stemmer is not recommended, as it is implemented with reflection.
<fieldType name="text_snowballstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protectedkeyword.txt" language="English"/>
  </analyzer>
</fieldType>
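Because SnowballFilter expects lowercased text and Turkish needs its own lowercasing rules, a Turkish variant of the configuration above might look like the following sketch. The field type name is hypothetical, and the example assumes that a TurkishLowerCaseFilterFactory and a "Turkish" Snowball stemmer are available in your distribution.

```xml
<fieldType name="text_snowball_tr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Turkish-aware lowercasing (dotted and dotless i) in place of LowerCaseFilterFactory -->
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <!-- Stemmer name is the class name without the "Stemmer" suffix -->
    <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/>
  </analyzer>
</fieldType>
```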