Namespace Lucene.Net.Analysis.Nl

Analyzer for Dutch.

Classes

DutchAnalyzer

Lucene.Net.Analysis.Analyzer for the Dutch language.

Supports an external list of stopwords (words that will not be indexed at all), an external list of exclusions (words that will be indexed but not stemmed), and an external list of word-stem pairs that override the algorithm (dictionary stemming). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating DutchAnalyzer:

As of 3.6, DutchAnalyzer(LuceneVersion, CharArraySet) and DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet) also populate the default entries for the stem override dictionary.
As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is applied before StopFilter, and Snowball stopwords are used by default.
As of 2.9, StopFilter preserves position increments.

NOTE: This class uses the same Lucene.Net.Util.LuceneVersion dependent settings as StandardAnalyzer.
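
The constructor options described above can be sketched as follows. This is a minimal illustration, assuming the Lucene.NET 4.8 API surface (LuceneVersion.LUCENE_48, DutchAnalyzer.DefaultStopSet, and the GetTokenStream(string, string) overload); the field name "body", the sample sentence, and the exclusion term are hypothetical.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class DutchAnalyzerSketch
{
    public static void Main()
    {
        // Assumption: targeting the 4.8 compatibility level.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical exclusion list: terms that are indexed but not stemmed.
        var exclusions = new CharArraySet(version, new[] { "fietsen" }, true);

        // Keep the default stopwords and supply a custom stem exclusion set.
        using (var analyzer = new DutchAnalyzer(version, DutchAnalyzer.DefaultStopSet, exclusions))
        using (TokenStream stream = analyzer.GetTokenStream("body", "De fietsen staan in de straten"))
        {
            var term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();
            while (stream.IncrementToken())
            {
                // Print each term that survives stopword removal and stemming.
                Console.WriteLine(term.ToString());
            }
            stream.End();
        }
    }
}
```

Passing DutchAnalyzer.DefaultStopSet explicitly keeps the default stopwords while still allowing a non-empty exclusion set; using the single-argument constructor instead would leave the exclusion list empty.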

DutchStemFilter

A Lucene.Net.Analysis.TokenFilter that stems Dutch words.

It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a DutchStemmer).

To prevent terms from being stemmed, use an instance of KeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the Lucene.Net.Analysis.TokenAttributes.IKeywordAttribute, and place it before this Lucene.Net.Analysis.TokenStream in the chain.
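
A hedged sketch of such a chain is shown here. It assumes the Lucene.NET 4.8 API, where SetKeywordMarkerFilter is the concrete KeywordMarkerFilter implementation and Analyzer.NewAnonymous builds an analyzer from a delegate; the class name and the protectedTerms parameter are hypothetical. DutchStemFilter may be flagged as obsolete in recent releases, in which case the compiler emits a warning.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class KeywordProtectedDutchChain
{
    // Builds an analyzer whose stemmer skips every term in protectedTerms.
    public static Analyzer Create(CharArraySet protectedTerms)
    {
        // Assumption: targeting the 4.8 compatibility level.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        return Analyzer.NewAnonymous(createComponents: (fieldName, reader) =>
        {
            Tokenizer source = new StandardTokenizer(version, reader);
            TokenStream chain = new LowerCaseFilter(version, source);

            // Mark protected terms as keywords before the stemmer sees them.
            chain = new SetKeywordMarkerFilter(chain, protectedTerms);

            // The stem filter leaves keyword-marked tokens unchanged.
            chain = new DutchStemFilter(chain);

            return new TokenStreamComponents(source, chain);
        });
    }
}
```

The order matters: the keyword marker must come before the stem filter, because the stemmer checks the keyword attribute on each token as it passes through.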

DutchStemmer

A stemmer for Dutch words.

The algorithm is an implementation of the Dutch stemming algorithm from Martin Porter's Snowball project.