Class DutchAnalyzer
Lucene.Net.Analysis.Analyzer for the Dutch language.
Supports an external list of stopwords (words that will not be indexed at all), an external list of exclusions (words that will not be stemmed, but will still be indexed) and an external list of word-stem pairs that override the stemming algorithm (dictionary stemming). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.
You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating DutchAnalyzer:
- As of 3.6, DutchAnalyzer(LuceneVersion, CharArraySet) and DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet) also populate the default entries for the stem override dictionary
- As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and Snowball stopwords are used by default.
- As of 2.9, StopFilter preserves position increments
NOTE: This class uses the same Lucene.Net.Util.LuceneVersion-dependent settings as StandardAnalyzer.
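The following is a minimal sketch of how the analyzer might be used to tokenize Dutch text with the default stopwords. The field name "body", the sample sentence, and the choice of LuceneVersion.LUCENE_48 are illustrative assumptions, not part of this reference; the actual tokens produced depend on the compatibility version chosen.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class DutchAnalyzerExample
{
    public static void Main()
    {
        // Assumption: LUCENE_48 is the compatibility version you want.
        using (var analyzer = new DutchAnalyzer(LuceneVersion.LUCENE_48))
        // "body" is a hypothetical field name used only for this sketch.
        using (TokenStream stream = analyzer.GetTokenStream(
            "body", new StringReader("De fietsen staan in de schuur")))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            // Reset must be called before the first IncrementToken call.
            stream.Reset();
            while (stream.IncrementToken())
            {
                // Each token has been lowercased, stop-filtered, and stemmed.
                Console.WriteLine(term.ToString());
            }
            stream.End();
        }
    }
}
```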
Namespace: Lucene.Net.Analysis.Nl
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class DutchAnalyzer : Analyzer, IDisposable
Constructors
DutchAnalyzer(LuceneVersion)
Builds an analyzer with the default stop words (DefaultStopSet), an empty stem exclusion set, and a few default entries for the stem override dictionary.
Declaration
public DutchAnalyzer(LuceneVersion matchVersion)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | The Lucene compatibility version. |
DutchAnalyzer(LuceneVersion, CharArraySet)
Builds an analyzer with the given stop words and an empty stem exclusion set. As of 3.6, the default entries for the stem override dictionary are also populated.
Declaration
public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | The Lucene compatibility version. |
CharArraySet | stopwords | Words that will not be indexed at all. |
DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet)
Builds an analyzer with the given stop words and stem exclusion set. As of 3.6, the default entries for the stem override dictionary are also populated.
Declaration
public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable)
Parameters
Type | Name | Description |
---|---|---|
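LuceneVersion | matchVersion | The Lucene compatibility version. |
CharArraySet | stopwords | Words that will not be indexed at all. |
CharArraySet | stemExclusionTable | Words that will not be stemmed, but will still be indexed. |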
DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet, CharArrayDictionary<string>)
Builds an analyzer with the given stop words, stem exclusion set, and stem override dictionary.
Declaration
public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable, CharArrayDictionary<string> stemOverrideDict)
Parameters
Type | Name | Description |
---|---|---|
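LuceneVersion | matchVersion | The Lucene compatibility version. |
CharArraySet | stopwords | Words that will not be indexed at all. |
CharArraySet | stemExclusionTable | Words that will not be stemmed, but will still be indexed. |
CharArrayDictionary<string> | stemOverrideDict | Word-stem pairs that override the stemming algorithm. |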
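As an illustration of the four-argument constructor, the sketch below supplies all three external lists. The word lists, the factory class name, and the LuceneVersion.LUCENE_48 setting are hypothetical. The CharArraySet and CharArrayDictionary<string> constructor overloads used here (version, collection, ignoreCase) are assumed from the Lucene.Net.Analysis.Util API and should be checked against the version you have installed.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomDutchAnalyzerFactory
{
    public static DutchAnalyzer Create()
    {
        // Assumption: LUCENE_48 is the compatibility version you want.
        var version = LuceneVersion.LUCENE_48;

        // Hypothetical stopwords: these words are dropped and never indexed.
        var stopwords = new CharArraySet(
            version, new[] { "de", "het", "een" }, true);

        // Hypothetical exclusions: indexed as-is, never stemmed.
        var stemExclusions = new CharArraySet(
            version, new[] { "lucene" }, true);

        // Hypothetical word-stem pairs that take precedence over the stemmer.
        var stemOverrides = new CharArrayDictionary<string>(
            version,
            new Dictionary<string, string> { { "fietsen", "fiets" } },
            false);

        return new DutchAnalyzer(version, stopwords, stemExclusions, stemOverrides);
    }
}
```

Passing every list explicitly means none of the version-dependent defaults are filled in for you, which can make analysis behavior easier to reason about across upgrades.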
Fields
DEFAULT_STOPWORD_FILE
File containing default Dutch stopwords.
Declaration
public const string DEFAULT_STOPWORD_FILE = "dutch_stop.txt"
Field Value
Type | Description |
---|---|
string |
Properties
DefaultStopSet
Returns an unmodifiable instance of the default stop-words set.
Declaration
public static CharArraySet DefaultStopSet { get; }
Property Value
Type | Description |
---|---|
CharArraySet | an unmodifiable instance of the default stop-words set. |
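Because the returned set is unmodifiable, adding project-specific stopwords requires a copy. The sketch below shows one possible way to do that; the extra stopword, the class name, and the LuceneVersion.LUCENE_48 setting are illustrative assumptions, and the CharArraySet copy constructor shown is assumed from the Lucene.Net.Analysis.Util API.

```csharp
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class ExtendedStopwordsExample
{
    public static DutchAnalyzer Create()
    {
        // Assumption: LUCENE_48 is the compatibility version you want.
        var version = LuceneVersion.LUCENE_48;

        // Copy the defaults into a new, modifiable set (ignoreCase: true).
        var stopwords = new CharArraySet(version, DutchAnalyzer.DefaultStopSet, true);

        // "voorbeeld" is a hypothetical domain-specific stopword.
        stopwords.Add("voorbeeld");

        return new DutchAnalyzer(version, stopwords);
    }
}
```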
Methods
CreateComponents(string, TextReader)
Returns a (possibly reused) Lucene.Net.Analysis.TokenStream which tokenizes all the text in the provided TextReader.
Declaration
protected override TokenStreamComponents CreateComponents(string fieldName, TextReader aReader)
Parameters
Type | Name | Description |
---|---|---|
string | fieldName | The name of the field being analyzed. |
TextReader | aReader | The reader that supplies the text to tokenize. |
Returns
Type | Description |
---|---|
TokenStreamComponents | A Lucene.Net.Analysis.TokenStream built from a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, StopFilter, SetKeywordMarkerFilter if a stem exclusion set is provided, StemmerOverrideFilter, and SnowballFilter |