Namespace Lucene.Net.Analysis.Fr
Analyzer for French.
Classes
FrenchAnalyzer
Lucene.Net.Analysis.Analyzer for the French language.
Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.
You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating FrenchAnalyzer:
- As of 3.6, FrenchLightStemFilter is used for less aggressive stemming.
- As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and ElisionFilter and Snowball stopwords are used by default.
- As of 2.9, StopFilter preserves position increments.
NOTE: This class uses the same Lucene.Net.Util.LuceneVersion-dependent settings as StandardAnalyzer.
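The following is a minimal sketch of how the stopword and exclusion lists described above might be supplied, assuming the Lucene.NET 4.8 API (LuceneVersion.LUCENE_48, FrenchAnalyzer.DefaultStopSet, and the three-argument FrenchAnalyzer constructor). The field name "body", the sample sentence, and the single exclusion entry are made up for illustration.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class FrenchAnalyzerExample
{
    public static void Main()
    {
        // Assumption: targeting the 4.8 compatibility level.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical exclusion list: terms here are indexed but not stemmed.
        var exclusions = new CharArraySet(version, new[] { "chevaux" }, true);

        // Keep the default stopwords and add the exclusion list.
        using (var analyzer = new FrenchAnalyzer(version, FrenchAnalyzer.DefaultStopSet, exclusions))
        using (TokenStream stream = analyzer.GetTokenStream("body", "Les chevaux galopent dans l'herbe"))
        {
            var term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                  // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            stream.End();                    // finalizes offsets once the stream is consumed
        }
    }
}
```

Passing only the version, as in new FrenchAnalyzer(version), uses the default stopwords and an empty exclusion list.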
FrenchLightStemFilter
A Lucene.Net.Analysis.TokenFilter that applies FrenchLightStemmer to stem French words.
To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute before this Lucene.Net.Analysis.TokenStream.
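As a rough sketch of the keyword-marking approach, the chain below places SetKeywordMarkerFilter ahead of FrenchLightStemFilter so that marked terms pass through unstemmed. It assumes the Lucene.NET 4.8 API (Analyzer.NewAnonymous, StandardTokenizer, LowerCaseFilter); the keyword set, field name, and sample text are hypothetical.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class LightStemWithKeywordsExample
{
    public static void Main()
    {
        // Assumption: targeting the 4.8 compatibility level.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical set of terms that must not be stemmed.
        var keywords = new CharArraySet(version, new[] { "chevaux" }, true);

        Analyzer analyzer = Analyzer.NewAnonymous(createComponents: (fieldName, reader) =>
        {
            Tokenizer source = new StandardTokenizer(version, reader);
            TokenStream chain = new LowerCaseFilter(version, source);
            // The marker must come before the stemmer so the keyword flag is already set.
            chain = new SetKeywordMarkerFilter(chain, keywords);
            chain = new FrenchLightStemFilter(chain);
            return new TokenStreamComponents(source, chain);
        });

        using (analyzer)
        using (TokenStream stream = analyzer.GetTokenStream("body", "chevaux galopent"))
        {
            var term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            stream.End();
        }
    }
}
```

The same ordering applies to FrenchMinimalStemFilter: whichever filter sets the keyword flag has to run earlier in the chain than the stem filter.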
FrenchLightStemFilterFactory
Factory for FrenchLightStemFilter.
<fieldType name="text_frlgtstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ElisionFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
  </analyzer>
</fieldType>
FrenchLightStemmer
Light Stemmer for French.
This stemmer implements the "UniNE" algorithm in:
Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages
Jacques Savoy
FrenchMinimalStemFilter
A Lucene.Net.Analysis.TokenFilter that applies FrenchMinimalStemmer to stem French words.
To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute before this Lucene.Net.Analysis.TokenStream.
FrenchMinimalStemFilterFactory
Factory for FrenchMinimalStemFilter.
<fieldType name="text_frminstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ElisionFilterFactory"/>
    <filter class="solr.FrenchMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>
FrenchMinimalStemmer
Light Stemmer for French.
This stemmer implements the following algorithm:
A Stemming procedure and stopword list for general French corpora.
Jacques Savoy.
FrenchStemFilter
A Lucene.Net.Analysis.TokenFilter that stems French words.
The stemmer in use can be changed at run time after the filter object is created (as long as it is a FrenchStemmer).
To prevent terms from being stemmed, use an instance of KeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute before this Lucene.Net.Analysis.TokenStream.
FrenchStemmer
A stemmer for French words.
The algorithm is based on the work of Dr. Martin Porter on his Snowball project.
Refer to http://snowball.sourceforge.net/french/stemmer.html (French stemming algorithm) for details.