Class FrenchAnalyzer

Analyzer for the French language.

Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

You must specify the required LuceneVersion compatibility when creating FrenchAnalyzer:

As of 3.6, FrenchLightStemFilter is used for less aggressive stemming.
As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and ElisionFilter and Snowball stopwords are used by default.
As of 2.9, StopFilter preserves position increments.

NOTE: This class uses the same LuceneVersion dependent settings as StandardAnalyzer.
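As a minimal sketch of typical usage, the following helper runs a string through a FrenchAnalyzer built with the default stop words and collects the resulting terms. The class name `FrenchAnalyzerDemo`, the field name `"body"`, and the choice of `LuceneVersion.LUCENE_48` are assumptions made for illustration; pick the compatibility version that matches your index.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class FrenchAnalyzerDemo
{
    // Hypothetical helper: analyzes the text and returns the emitted terms in order.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();

        // LUCENE_48 is an assumed compatibility version for this sketch.
        using (var analyzer = new FrenchAnalyzer(LuceneVersion.LUCENE_48))
        using (TokenStream stream = analyzer.GetTokenStream("body", text))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                     // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());     // copy the term text; the attribute is reused
            }
            stream.End();                       // finalize end-of-stream state
        }

        return terms;
    }
}
```

Calling `FrenchAnalyzerDemo.Tokenize("L'avion décolle vers les montagnes")` returns the terms that survive elision, lowercasing, stop word removal, and stemming.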

Inheritance

System.Object

Analyzer

StopwordAnalyzerBase

FrenchAnalyzer

Inherited Members

StopwordAnalyzerBase.m_stopwords

StopwordAnalyzerBase.m_matchVersion

StopwordAnalyzerBase.StopwordSet

StopwordAnalyzerBase.LoadStopwordSet(Boolean, Type, String, String)

StopwordAnalyzerBase.LoadStopwordSet(FileInfo, LuceneVersion)

StopwordAnalyzerBase.LoadStopwordSet(TextReader, LuceneVersion)

Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>)

Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, ReuseStrategy)

Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, Func<String, TextReader, TextReader>)

Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, Func<String, TextReader, TextReader>, ReuseStrategy)

Analyzer.GetTokenStream(String, TextReader)

Analyzer.GetTokenStream(String, String)

Analyzer.InitReader(String, TextReader)

Analyzer.GetPositionIncrementGap(String)

Analyzer.GetOffsetGap(String)

Analyzer.Strategy

Analyzer.Dispose()

Analyzer.Dispose(Boolean)

Analyzer.GLOBAL_REUSE_STRATEGY

Analyzer.PER_FIELD_REUSE_STRATEGY

Namespace: Lucene.Net.Analysis.Fr

Assembly: Lucene.Net.Analysis.Common.dll

Syntax

public sealed class FrenchAnalyzer : StopwordAnalyzerBase

Constructors

FrenchAnalyzer(LuceneVersion)

Builds an analyzer with the default stop words (DefaultStopSet).

Declaration

public FrenchAnalyzer(LuceneVersion matchVersion)

Parameters

Type	Name	Description
LuceneVersion	matchVersion	Lucene compatibility version

FrenchAnalyzer(LuceneVersion, CharArraySet)

Builds an analyzer with the given stop words.

Declaration

public FrenchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)

Parameters

Type	Name	Description
LuceneVersion	matchVersion	Lucene compatibility version
CharArraySet	stopwords	a stopword set

FrenchAnalyzer(LuceneVersion, CharArraySet, CharArraySet)

Builds an analyzer with the given stop words and stem exclusion set.

Declaration

public FrenchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclutionSet)

Parameters

Type	Name	Description
LuceneVersion	matchVersion	Lucene compatibility version
CharArraySet	stopwords	a stopword set
CharArraySet	stemExclutionSet	a stemming exclusion set
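The three-argument constructor can be exercised roughly as follows. This is a sketch under stated assumptions: the class name `CustomFrenchAnalyzerDemo`, the extra stop word `"chapitre"`, the protected term `"lucene"`, and `LuceneVersion.LUCENE_48` are all illustrative choices, not values prescribed by the library.

```csharp
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomFrenchAnalyzerDemo
{
    // Hypothetical factory: extends the default stop words and protects one term from stemming.
    public static FrenchAnalyzer Create()
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48; // assumed compatibility version

        // DefaultStopSet is unmodifiable, so copy it into a new case-insensitive set first.
        var stopwords = new CharArraySet(version, FrenchAnalyzer.DefaultStopSet, true);
        stopwords.Add("chapitre"); // illustrative domain-specific stop word

        // Terms in this set are indexed as-is instead of being stemmed.
        var stemExclusions = new CharArraySet(version, new[] { "lucene" }, true);

        return new FrenchAnalyzer(version, stopwords, stemExclusions);
    }
}
```

Copying the default set rather than starting from an empty one keeps the standard French stop words in effect while adding your own.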

Fields

DEFAULT_ARTICLES

Default set of articles for ElisionFilter.

Declaration

public static readonly CharArraySet DEFAULT_ARTICLES

Field Value

Type	Description
CharArraySet

DEFAULT_STOPWORD_FILE

File containing default French stopwords.

Declaration

public const string DEFAULT_STOPWORD_FILE = null

Field Value

Type	Description
System.String

Properties

DefaultStopSet

Returns an unmodifiable instance of the default stop-words set.

Declaration

public static CharArraySet DefaultStopSet { get; }

Property Value

Type	Description
CharArraySet	an unmodifiable instance of the default stop-words set.

Methods

CreateComponents(String, TextReader)

Creates TokenStreamComponents used to tokenize all the text in the provided TextReader.

Declaration

protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)

Parameters

Type	Name	Description
System.String	fieldName
TextReader	reader

Returns

Type	Description
TokenStreamComponents	TokenStreamComponents built from a StandardTokenizer filtered with StandardFilter, ElisionFilter, LowerCaseFilter, StopFilter, SetKeywordMarkerFilter if a stem exclusion set is provided, and FrenchLightStemFilter.
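Because CreateComponents is protected and the class is sealed, the chain cannot be customized by subclassing. The sketch below rebuilds the filter order described above with Analyzer.NewAnonymous, which may be useful as a starting point for a variant. It is an approximation for illustration, not the class's actual source: the class name `FrenchChainSketch` is hypothetical, `LuceneVersion.LUCENE_48` is an assumed version, and the version-dependent branches for older compatibility levels are omitted.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Fr;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class FrenchChainSketch
{
    // Hypothetical factory: pass null or an empty set to skip the keyword marker step.
    public static Analyzer Create(CharArraySet stemExclusions)
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48; // assumed compatibility version

        return Analyzer.NewAnonymous((fieldName, reader) =>
        {
            Tokenizer source = new StandardTokenizer(version, reader);
            TokenStream result = new StandardFilter(version, source);

            // Strip elided articles such as l' and d' before lowercasing.
            result = new ElisionFilter(result, FrenchAnalyzer.DEFAULT_ARTICLES);
            result = new LowerCaseFilter(version, result);
            result = new StopFilter(version, result, FrenchAnalyzer.DefaultStopSet);

            // Mark excluded terms as keywords so the stemmer leaves them untouched.
            if (stemExclusions != null && stemExclusions.Count > 0)
            {
                result = new SetKeywordMarkerFilter(result, stemExclusions);
            }

            result = new FrenchLightStemFilter(result);
            return new TokenStreamComponents(source, result);
        });
    }
}
```

Swapping or inserting filters in this chain (for example, a different stemmer) is then a local change inside the lambda.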