Namespace Lucene.Net.Analysis.En
Analyzer for English.
Classes
EnglishAnalyzer
Lucene.
EnglishMinimalStemFilter
A Lucene.
To prevent terms from being stemmed use an instance of
Set
EnglishMinimalStemFilterFactory
Factory for English
<fieldType name="text_enminstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>
EnglishMinimalStemmer
Minimal plural stemmer for English.
This stemmer implements the "S-Stemmer" from "How Effective Is Suffixing?" by Donna Harman.
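The suffix rules of the S-Stemmer can be sketched in a few lines. The following is a minimal, self-contained illustration of the rules as commonly stated for Harman's S-Stemmer, not the source of EnglishMinimalStemmer; the class name SStemmerSketch and the method Stem are hypothetical, and the library's handling of edge cases may differ. The input is assumed to be lowercased already.

```csharp
using System;

// Hypothetical sketch of the S-Stemmer suffix rules; not the Lucene.NET implementation.
static class SStemmerSketch
{
    private static bool Ends(string word, string suffix) =>
        word.EndsWith(suffix, StringComparison.Ordinal);

    public static string Stem(string word)
    {
        // Very short words and words not ending in "s" are left alone.
        if (word.Length < 3 || !Ends(word, "s"))
            return word;

        // Rule 1: "ies" -> "y", unless the word ends in "eies" or "aies".
        if (Ends(word, "ies"))
        {
            if (Ends(word, "eies") || Ends(word, "aies"))
                return word;
            return word.Substring(0, word.Length - 3) + "y";
        }

        // Rule 2: "es" -> "e", unless the word ends in "aes", "ees" or "oes".
        if (Ends(word, "es"))
        {
            if (Ends(word, "aes") || Ends(word, "ees") || Ends(word, "oes"))
                return word;
            return word.Substring(0, word.Length - 1);
        }

        // Rule 3: drop a final "s", unless the word ends in "us" or "ss".
        if (Ends(word, "us") || Ends(word, "ss"))
            return word;
        return word.Substring(0, word.Length - 1);
    }
}
```

Only the first rule whose suffix matches is considered, so a word blocked by an exception (for example one ending in "oes") is returned unchanged rather than falling through to the next rule.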
EnglishPossessiveFilter
TokenFilter that removes possessives (trailing 's) from words.
You must specify the required Lucene.
- As of 3.6, U+2019 RIGHT SINGLE QUOTATION MARK and U+FF07 FULLWIDTH APOSTROPHE are also treated as quotation marks.
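The possessive rule described above can be illustrated without any library types. This is a hedged sketch, not the filter's source: PossessiveSketch and StripPossessive are hypothetical names, the version check is omitted so all three apostrophe forms are always accepted, and matching an uppercase S is an assumption.

```csharp
using System;

// Hypothetical sketch of trailing-possessive removal; not the Lucene.NET implementation.
static class PossessiveSketch
{
    public static string StripPossessive(string term)
    {
        if (term.Length >= 2)
        {
            char mark = term[term.Length - 2];
            char last = term[term.Length - 1];

            // U+0027 APOSTROPHE, U+2019 RIGHT SINGLE QUOTATION MARK, U+FF07 FULLWIDTH APOSTROPHE.
            bool isApostrophe = mark == '\'' || mark == '\u2019' || mark == '\uFF07';

            // Assumption: both 's and 'S count as a possessive suffix.
            if (isApostrophe && (last == 's' || last == 'S'))
                return term.Substring(0, term.Length - 2);
        }
        return term;
    }
}
```

Terms without a trailing apostrophe-s pair are returned unchanged, so a plain plural such as "dogs" is not affected.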
EnglishPossessiveFilterFactory
Factory for English
<fieldType name="text_enpossessive" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
  </analyzer>
</fieldType>
KStemFilter
A high-performance KStem filter for English.
See "Viewing Morphology as an Inference Process" (Krovetz, R., Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 191-203, 1993).
All terms must already be lowercased for this filter to work correctly.
Note: This filter is aware of the Keyword
true
in a previous Lucene.
KStemFilterFactory
Factory for KStem
<fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
KStemmer
This class implements the KStem algorithm.
PorterStemFilter
Transforms the token stream as per the Porter stemming algorithm.
Note: the input to the stemming filter must already be in lower case, so a LowerCaseFilter or LowerCaseTokenizer must come before this filter in the analysis chain for it to work properly.
To use this filter with other analyzers, you'll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you'd write an analyzer like this:
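class MyAnalyzer : Analyzer
{
    // Assumption: pick the compatibility version your application targets.
    private readonly LuceneVersion version = LuceneVersion.LUCENE_48;

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        // LowerCaseTokenizer lowercases while tokenizing, as PorterStemFilter requires.
        Tokenizer source = new LowerCaseTokenizer(version, reader);
        return new TokenStreamComponents(source, new PorterStemFilter(source));
    }
}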
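To see what such a chain produces, the analyzer's token stream can be consumed directly. The following is a sketch under assumptions, not a definitive usage pattern: it assumes the Lucene.NET 4.8 packages (core and analysis-common) are referenced, the class name LowerCasePorterAnalyzer, the field name "body", and the sample text are made up for illustration, and LuceneVersion.LUCENE_48 is assumed to be the version you target.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.En;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical analyzer: LowerCaseTokenizer followed by PorterStemFilter.
sealed class LowerCasePorterAnalyzer : Analyzer
{
    // Assumption: 4.8 compatibility version.
    private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        Tokenizer source = new LowerCaseTokenizer(MatchVersion, reader);
        return new TokenStreamComponents(source, new PorterStemFilter(source));
    }
}

static class Program
{
    static void Main()
    {
        using (Analyzer analyzer = new LowerCasePorterAnalyzer())
        using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader("Running Runners")))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                  // required before the first IncrementToken()
            while (stream.IncrementToken())
                Console.WriteLine(term.ToString());
            stream.End();                    // finalizes end-of-stream state
        }
    }
}
```

Each printed line is one lowercased, stemmed term. The Reset, IncrementToken, End sequence is the consumer workflow the token stream expects, and disposing the stream allows the analyzer to reuse its components.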
Note: This filter is aware of the Keyword
true
in a previous Lucene.
PorterStemFilterFactory
Factory for Porter
<fieldType name="text_porterstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>