Class UkrainianMorfologikAnalyzer
A dictionary-based Lucene.Net.Analysis.Analyzer for Ukrainian.
Namespace: Lucene.Net.Analysis.Uk
Assembly: Lucene.Net.Analysis.Morfologik.dll
Syntax
public sealed class UkrainianMorfologikAnalyzer : StopwordAnalyzerBase, IDisposable
Constructors
UkrainianMorfologikAnalyzer(LuceneVersion)
Builds an analyzer with the default stop words, loaded from DEFAULT_STOPWORD_FILE.
Declaration
public UkrainianMorfologikAnalyzer(LuceneVersion matchVersion)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene.Net.Util.LuceneVersion to match. |
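The following is a minimal sketch of constructing the analyzer with its default stop words and handing it to an index writer configuration. It assumes the Lucene.Net and Lucene.Net.Analysis.Morfologik packages are referenced and that LuceneVersion.LUCENE_48 is the compatibility version in use. The names DefaultAnalyzerExample and CreateConfig are illustrative and not part of the library.

```csharp
using Lucene.Net.Analysis.Uk;
using Lucene.Net.Index;
using Lucene.Net.Util;

public static class DefaultAnalyzerExample
{
    // Hypothetical helper: builds an IndexWriterConfig whose analyzer
    // uses the default Ukrainian stop words.
    public static IndexWriterConfig CreateConfig()
    {
        // Assumption: LUCENE_48 is the version your index targets.
        LuceneVersion version = LuceneVersion.LUCENE_48;
        var analyzer = new UkrainianMorfologikAnalyzer(version);
        return new IndexWriterConfig(version, analyzer);
    }
}
```

Passing the same matchVersion to both the analyzer and the writer configuration keeps their compatibility behavior aligned.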
UkrainianMorfologikAnalyzer(LuceneVersion, CharArraySet)
Builds an analyzer with the given stop words.
Declaration
public UkrainianMorfologikAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene.Net.Util.LuceneVersion to match. |
CharArraySet | stopwords | A stopword set. |
UkrainianMorfologikAnalyzer(LuceneVersion, CharArraySet, CharArraySet)
Builds an analyzer with the given stop words. If a non-empty stem exclusion set is provided, this analyzer adds a Lucene.Net.Analysis.Miscellaneous.SetKeywordMarkerFilter before stemming.
Declaration
public UkrainianMorfologikAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionSet)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene.Net.Util.LuceneVersion to match. |
CharArraySet | stopwords | A stopword set. |
CharArraySet | stemExclusionSet | A set of terms not to be stemmed. |
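As a sketch of how the two sets might be supplied, the example below builds a custom stop word set and a stem exclusion set and passes both to the three-argument constructor. The word lists are placeholders chosen for illustration, and the class name CustomSetsExample is hypothetical. It assumes LuceneVersion.LUCENE_48 and case-insensitive matching.

```csharp
using Lucene.Net.Analysis.Uk;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomSetsExample
{
    // Hypothetical helper: creates an analyzer with caller-defined sets.
    public static UkrainianMorfologikAnalyzer Create()
    {
        LuceneVersion version = LuceneVersion.LUCENE_48;

        // Illustrative stop words; replace with a list suited to your corpus.
        var stopwords = new CharArraySet(version, new[] { "і", "та", "або" }, true);

        // Illustrative terms to protect from stemming. The third argument
        // (ignoreCase: true) makes membership checks case-insensitive.
        var stemExclusions = new CharArraySet(version, new[] { "київ" }, true);

        return new UkrainianMorfologikAnalyzer(version, stopwords, stemExclusions);
    }
}
```

The caller owns the returned analyzer and should dispose it when finished.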
Fields
DEFAULT_STOPWORD_FILE
File containing default Ukrainian stopwords.
Declaration
public const string DEFAULT_STOPWORD_FILE = "stopwords.txt"
Field Value
Type | Description |
---|---|
string | Name of the default stop word file ("stopwords.txt"). |
Properties
DefaultStopSet
Returns an unmodifiable instance of the default stop words set.
Declaration
public static CharArraySet DefaultStopSet { get; }
Property Value
Type | Description |
---|---|
CharArraySet | Default stop words set. |
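Because the default set is unmodifiable, extending it requires a copy. The sketch below shows one way to do that by constructing a new CharArraySet from the default set and adding extra words. ExtendedStopSetExample and Build are hypothetical names, and LuceneVersion.LUCENE_48 is an assumption.

```csharp
using Lucene.Net.Analysis.Uk;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class ExtendedStopSetExample
{
    // Hypothetical helper: copies the default stop words and adds extra ones.
    public static CharArraySet Build(params string[] extraWords)
    {
        // The default set cannot be modified, so start from a mutable copy.
        var copy = new CharArraySet(
            LuceneVersion.LUCENE_48,
            UkrainianMorfologikAnalyzer.DefaultStopSet,
            true);

        foreach (string word in extraWords)
        {
            copy.Add(word);
        }
        return copy;
    }
}
```

The resulting set can be passed as the stopwords argument of either constructor that accepts a CharArraySet.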
Methods
CreateComponents(string, TextReader)
Creates a Lucene.Net.Analysis.TokenStreamComponents which tokenizes all the text in the provided TextReader.
Declaration
protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
Parameters
Type | Name | Description |
---|---|---|
string | fieldName | Name of the field being analyzed. |
TextReader | reader | TextReader supplying the text to tokenize. |
Returns
Type | Description |
---|---|
TokenStreamComponents | A Lucene.Net.Analysis.TokenStreamComponents built from a Lucene.Net.Analysis.Standard.StandardTokenizer filtered with Lucene.Net.Analysis.Core.LowerCaseFilter, Lucene.Net.Analysis.Core.StopFilter, Lucene.Net.Analysis.Miscellaneous.SetKeywordMarkerFilter (only if a stem exclusion set is provided), and MorfologikFilter on the Ukrainian dictionary. |
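CreateComponents is protected, so application code normally reaches this filter chain through Analyzer.GetTokenStream. The sketch below collects the terms the chain emits for a piece of text. The field name "body" and the names TokenizeExample and Analyze are illustrative assumptions, as is the use of LuceneVersion.LUCENE_48.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Uk;
using Lucene.Net.Util;

public static class TokenizeExample
{
    // Hypothetical helper: returns the terms produced for the given text.
    public static IList<string> Analyze(string text)
    {
        var terms = new List<string>();
        using (var analyzer = new UkrainianMorfologikAnalyzer(LuceneVersion.LUCENE_48))
        using (TokenStream stream = analyzer.GetTokenStream("body", text))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                  // required before the first IncrementToken
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());
            }
            stream.End();                    // signals end of stream before disposal
        }
        return terms;
    }
}
```

The Reset, IncrementToken, End, Dispose sequence follows the TokenStream consumer contract; skipping Reset typically causes an exception at the first IncrementToken call.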
InitReader(string, TextReader)
Override this if you want to add a Lucene.Net.Analysis.CharFilter chain.
The default implementation returns reader unchanged.
Declaration
protected override TextReader InitReader(string fieldName, TextReader reader)
Parameters
Type | Name | Description |
---|---|---|
string | fieldName | Lucene.Net.Index.IIndexableField name being indexed |
TextReader | reader | original TextReader |
Returns
Type | Description |
---|---|
TextReader | reader, optionally decorated with Lucene.Net.Analysis.CharFilter(s) |