    Class DutchAnalyzer

    Lucene.Net.Analysis.Analyzer for the Dutch language.

    Supports an external list of stopwords (words that will not be indexed at all), an external list of exclusions (words that will be indexed but not stemmed), and an external list of word-stem pairs that override the stemming algorithm (dictionary stemming). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

    You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating DutchAnalyzer:

    • As of 3.6, DutchAnalyzer(LuceneVersion, CharArraySet) and DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet) also populate the default entries for the stem override dictionary.
    • As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and Snowball stopwords are used by default.
    • As of 2.9, StopFilter preserves position increments.

    NOTE: This class uses the same Lucene.Net.Util.LuceneVersion dependent settings as StandardAnalyzer.
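
    As a minimal usage sketch of the description above, the following creates an analyzer with the default stop words and prints the terms it produces. The field name "body", the sample sentence, and the class name DutchAnalyzerDemo are illustrative assumptions, not part of this API; LuceneVersion.LUCENE_48 is assumed to be the compatibility version you want.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class DutchAnalyzerDemo
{
    public static void Main()
    {
        // The match version selects which of the version-dependent behaviors apply.
        using (var analyzer = new DutchAnalyzer(LuceneVersion.LUCENE_48))
        // "body" is a hypothetical field name; the sentence is sample input.
        using (TokenStream stream = analyzer.GetTokenStream("body", "De fietsen staan in de tuinen"))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                      // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            stream.End();                        // finalizes end-of-stream state
        }
    }
}
```

    Stop words such as "de" should not appear in the output, and the remaining terms are lowercased and stemmed; the exact stems depend on the Snowball Dutch stemmer and the default override dictionary.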

    Inheritance
    object
    Analyzer
    DutchAnalyzer
    Implements
    IDisposable
    Inherited Members
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, ReuseStrategy)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, Func<string, TextReader, TextReader>)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, Func<string, TextReader, TextReader>, ReuseStrategy)
    Analyzer.GetTokenStream(string, TextReader)
    Analyzer.GetTokenStream(string, string)
    Analyzer.GetPositionIncrementGap(string)
    Analyzer.GetOffsetGap(string)
    Analyzer.Strategy
    Analyzer.Dispose()
    Analyzer.GLOBAL_REUSE_STRATEGY
    Analyzer.PER_FIELD_REUSE_STRATEGY
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Analysis.Nl
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public sealed class DutchAnalyzer : Analyzer, IDisposable

    Constructors

    DutchAnalyzer(LuceneVersion)

    Builds an analyzer with the default stop words (DefaultStopSet) and a few default entries for the stem override dictionary. The stem exclusion table is empty.

    Declaration
    public DutchAnalyzer(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    LuceneVersion matchVersion

    DutchAnalyzer(LuceneVersion, CharArraySet)

    Builds an analyzer with the given stop words. The behavior that depends on Lucene.Net.Util.LuceneVersion is listed in the class description.

    Declaration
    public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
    Parameters
    Type Name Description
    LuceneVersion matchVersion
    CharArraySet stopwords

    DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet)

    Builds an analyzer with the given stop words and stem exclusion table. The behavior that depends on Lucene.Net.Util.LuceneVersion is listed in the class description.

    Declaration
    public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable)
    Parameters
    Type Name Description
    LuceneVersion matchVersion
    CharArraySet stopwords
    CharArraySet stemExclusionTable

    DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet, CharArrayDictionary<string>)

    Builds an analyzer with the given stop words, stem exclusion table, and stem override dictionary. The behavior that depends on Lucene.Net.Util.LuceneVersion is listed in the class description.

    Declaration
    public DutchAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable, CharArrayDictionary<string> stemOverrideDict)
    Parameters
    Type Name Description
    LuceneVersion matchVersion
    CharArraySet stopwords
    CharArraySet stemExclusionTable
    CharArrayDictionary<string> stemOverrideDict
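
    The following sketch shows one way the four-argument constructor might be called. The word lists, the override pair, and the CustomDutchAnalyzerFactory class are hypothetical examples chosen for illustration. The CharArraySet and CharArrayDictionary<string> constructors used here (match version, source collection, ignoreCase) are assumed to be available in your Lucene.Net version.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomDutchAnalyzerFactory
{
    public static DutchAnalyzer Create()
    {
        const LuceneVersion matchVersion = LuceneVersion.LUCENE_48;

        // Words that are dropped entirely (illustrative list, not the default set).
        var stopwords = new CharArraySet(
            matchVersion, new List<string> { "de", "het", "een" }, true);

        // Words that are indexed as-is and never stemmed (illustrative).
        var stemExclusions = new CharArraySet(
            matchVersion, new List<string> { "fietsen" }, true);

        // Word-stem pairs that take precedence over the stemming algorithm (illustrative).
        var stemOverrides = new CharArrayDictionary<string>(
            matchVersion,
            new Dictionary<string, string> { ["kinderen"] = "kind" },
            true);

        return new DutchAnalyzer(matchVersion, stopwords, stemExclusions, stemOverrides);
    }
}
```

    Passing true for ignoreCase keeps lookups case-insensitive, which matters less here because LowerCaseFilter runs before the stop, exclusion, and override steps.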

    Fields

    DEFAULT_STOPWORD_FILE

    File containing default Dutch stopwords.

    Declaration
    public const string DEFAULT_STOPWORD_FILE = "dutch_stop.txt"
    Field Value
    Type Description
    string

    Properties

    DefaultStopSet

    Returns an unmodifiable instance of the default stop-words set.

    Declaration
    public static CharArraySet DefaultStopSet { get; }
    Property Value
    Type Description
    CharArraySet

    An unmodifiable instance of the default stop-words set.
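
    Because the returned set is unmodifiable, extending the defaults requires copying them first. A minimal sketch, assuming the CharArraySet(LuceneVersion, int, bool) constructor and with "bijlage" and "betreft" as hypothetical domain-specific stop words:

```csharp
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class ExtendedStopWords
{
    public static DutchAnalyzer CreateAnalyzer()
    {
        const LuceneVersion matchVersion = LuceneVersion.LUCENE_48;
        CharArraySet defaults = DutchAnalyzer.DefaultStopSet;

        // Copy the defaults into a new, modifiable set.
        var stopwords = new CharArraySet(matchVersion, defaults.Count + 2, true);
        foreach (string word in defaults)
        {
            stopwords.Add(word);
        }

        // Hypothetical additions for a particular corpus.
        stopwords.Add("bijlage");
        stopwords.Add("betreft");

        return new DutchAnalyzer(matchVersion, stopwords);
    }
}
```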

    Methods

    CreateComponents(string, TextReader)

    Returns the Lucene.Net.Analysis.TokenStreamComponents (possibly reused by the analyzer's reuse strategy) whose Lucene.Net.Analysis.TokenStream tokenizes all the text in the provided TextReader.

    Declaration
    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader aReader)
    Parameters
    Type Name Description
    string fieldName
    TextReader aReader
    Returns
    Type Description
    TokenStreamComponents

    A Lucene.Net.Analysis.TokenStream built from a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, StopFilter, SetKeywordMarkerFilter (if a stem exclusion set is provided), StemmerOverrideFilter, and SnowballFilter.

    Overrides
    Analyzer.CreateComponents(string, TextReader)
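
    To make the filter order described above concrete, the following is a rough approximation of that chain built with Analyzer.NewAnonymous. It is a sketch under stated assumptions, not the actual implementation: the StemmerOverrideFilter step is omitted for brevity, the DutchChainSketch class is hypothetical, and the constructor signatures are assumed to match the 4.8 filters.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Snowball;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class DutchChainSketch
{
    public static Analyzer Create(CharArraySet stemExclusions)
    {
        const LuceneVersion v = LuceneVersion.LUCENE_48;

        return Analyzer.NewAnonymous((fieldName, reader) =>
        {
            Tokenizer source = new StandardTokenizer(v, reader);
            TokenStream result = new StandardFilter(v, source);
            result = new LowerCaseFilter(v, result);
            result = new StopFilter(v, result, DutchAnalyzer.DefaultStopSet);

            // Mark excluded words as keywords so the stemmer leaves them alone.
            if (stemExclusions.Count > 0)
            {
                result = new SetKeywordMarkerFilter(result, stemExclusions);
            }

            // (StemmerOverrideFilter would sit here; omitted in this sketch.)

            // Fully qualified to avoid a clash with the legacy DutchStemmer in Lucene.Net.Analysis.Nl.
            result = new SnowballFilter(result, new Lucene.Net.Tartarus.Snowball.Ext.DutchStemmer());

            return new TokenStreamComponents(source, result);
        });
    }
}
```

    Placing the keyword marker before the stemmer is what lets excluded words pass through SnowballFilter unchanged.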

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.