
    Namespace Lucene.Net.Analysis.Fr

    Analyzer for French.

    Classes

    FrenchAnalyzer

    Analyzer for French language.

    Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

    You must specify the required LuceneVersion compatibility when creating FrenchAnalyzer:

    • As of 3.6, FrenchLightStemFilter is used for less aggressive stemming.
    • As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and ElisionFilter and Snowball stopwords are used by default.
    • As of 2.9, StopFilter preserves position increments.

    NOTE: This class uses the same LuceneVersion dependent settings as StandardAnalyzer.
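    The stopword and exclusion lists described above can be passed to the FrenchAnalyzer constructor together with the LuceneVersion. The following is a minimal sketch, assuming Lucene.NET 4.8 and its Lucene.Net.Analysis.Common package; the class name FrenchAnalyzerSketch, the helper Analyze, the field name "body", and the sample words are illustrative assumptions, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical demo class; not part of Lucene.NET.
public static class FrenchAnalyzerSketch
{
    // Collects the terms an analyzer emits for a piece of text.
    public static List<string> Analyze(Analyzer analyzer, string text)
    {
        var terms = new List<string>();
        // "body" is an arbitrary field name chosen for this sketch.
        using (TokenStream ts = analyzer.GetTokenStream("body", text))
        {
            ICharTermAttribute termAtt = ts.AddAttribute<ICharTermAttribute>();
            ts.Reset();
            while (ts.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            ts.End();
        }
        return terms;
    }

    public static void Main()
    {
        LuceneVersion version = LuceneVersion.LUCENE_48;

        // Custom stopwords replace the default set: these words are dropped.
        var stopwords = new CharArraySet(version, new[] { "les", "des" }, true);
        // Exclusions are still indexed, but the stemmer leaves them untouched.
        var exclusions = new CharArraySet(version, new[] { "chevaux" }, true);

        using (var analyzer = new FrenchAnalyzer(version, stopwords, exclusions))
        {
            foreach (string term in Analyze(analyzer, "Les chevaux des voisins"))
            {
                Console.WriteLine(term);
            }
        }
    }
}
```

    With these lists, "les" and "des" are removed from the stream, and "chevaux" passes through unstemmed because it is in the exclusion set. Passing only the version to the constructor uses the default stopwords and an empty exclusion list.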

    FrenchLightStemFilter

    A TokenFilter that applies FrenchLightStemmer to stem French words.

    To prevent terms from being stemmed use an instance of SetKeywordMarkerFilter or a custom TokenFilter that sets the KeywordAttribute before this TokenStream.
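    One way to protect terms is to place a SetKeywordMarkerFilter directly before the stem filter in a custom analysis chain. The following is a sketch only, assuming Lucene.NET 4.8; the class name ProtectedFrenchLightStemAnalyzer is hypothetical, and elision handling is left out to keep the chain short.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical analyzer name; not part of Lucene.NET.
public sealed class ProtectedFrenchLightStemAnalyzer : Analyzer
{
    private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;
    private readonly CharArraySet protectedTerms;

    public ProtectedFrenchLightStemAnalyzer(CharArraySet protectedTerms)
    {
        this.protectedTerms = protectedTerms;
    }

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        Tokenizer source = new StandardTokenizer(MatchVersion, reader);
        // Lowercase first so that protected terms can be listed in lowercase.
        TokenStream stream = new LowerCaseFilter(MatchVersion, source);
        // Sets the KeywordAttribute on every token found in protectedTerms.
        stream = new SetKeywordMarkerFilter(stream, protectedTerms);
        // Skips tokens whose KeywordAttribute is set; stems the rest.
        stream = new FrenchLightStemFilter(stream);
        return new TokenStreamComponents(source, stream);
    }
}
```

    The order matters: the marker filter has to run before FrenchLightStemFilter, otherwise the term is already stemmed by the time the KeywordAttribute is set.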

    FrenchLightStemFilterFactory

    Factory for FrenchLightStemFilter.

    <fieldType name="text_frlgtstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ElisionFilterFactory"/>
        <filter class="solr.FrenchLightStemFilterFactory"/>
      </analyzer>
    </fieldType>

    FrenchLightStemmer

    Light Stemmer for French.

    This stemmer implements the "UniNE" algorithm described in "Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages" by Jacques Savoy.

    FrenchMinimalStemFilter

    A TokenFilter that applies FrenchMinimalStemmer to stem French words.

    To prevent terms from being stemmed use an instance of SetKeywordMarkerFilter or a custom TokenFilter that sets the KeywordAttribute before this TokenStream.

    FrenchMinimalStemFilterFactory

    Factory for FrenchMinimalStemFilter.

    <fieldType name="text_frminstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ElisionFilterFactory"/>
        <filter class="solr.FrenchMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>

    FrenchMinimalStemmer

    Minimal stemmer for French.

    This stemmer implements the algorithm described in "A Stemming procedure and stopword list for general French corpora" by Jacques Savoy.

    FrenchStemFilter

    A TokenFilter that stems French words.

    The stemmer in use can be changed at runtime after the filter object is created, as long as the replacement is a FrenchStemmer.

    To prevent terms from being stemmed use an instance of KeywordMarkerFilter or a custom TokenFilter that sets the KeywordAttribute before this TokenStream.

    FrenchStemmer

    A stemmer for French words.

    The algorithm is based on the work of Dr. Martin Porter on his Snowball project.

    Refer to http://snowball.sourceforge.net/french/stemmer.html (French stemming algorithm) for details.

    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)