Class CJKAnalyzer
A Lucene.Net.Analysis.Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter.
Namespace: Lucene.Net.Analysis.Cjk
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class CJKAnalyzer : StopwordAnalyzerBase, IDisposable
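As a rough illustration of the pipeline described in the summary, the sketch below runs a short Chinese sentence through the analyzer and collects the emitted terms. It assumes Lucene.Net 4.8 (hence `LuceneVersion.LUCENE_48`); the field name `"body"` and the sample sentence are placeholders chosen for the example, not part of the API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Terms emitted by the analyzer are collected here.
var tokens = new List<string>();

// Assumption: Lucene.Net 4.8, so LUCENE_48 is the match version.
using (var analyzer = new CJKAnalyzer(LuceneVersion.LUCENE_48))
// "body" is a hypothetical field name used only for this sketch.
using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader("我购买了道具和服装。")))
{
    ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();                      // must be called before IncrementToken
    while (stream.IncrementToken())
    {
        tokens.Add(term.ToString());     // copy the current term text
    }
    stream.End();                        // signal end of stream before disposal
}

Console.WriteLine(string.Join(" | ", tokens));
```

For runs of CJK characters the output should consist of overlapping two-character terms (bigrams) rather than single characters or whole words.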
Constructors
CJKAnalyzer(LuceneVersion)
Builds an analyzer which removes words in DefaultStopSet.
Declaration
public CJKAnalyzer(LuceneVersion matchVersion)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene compatibility version |
CJKAnalyzer(LuceneVersion, CharArraySet)
Builds an analyzer with the given stop words.
Declaration
public CJKAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene compatibility version |
CharArraySet | stopwords | a stopword set |
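A minimal sketch of this constructor follows, assuming Lucene.Net 4.8. The stop word `"solr"`, the field name `"body"`, and the sample text are hypothetical. The supplied set is used in place of the default one, so a word such as "and" is no longer filtered unless it is added explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Assumption: Lucene.Net 4.8.
var version = LuceneVersion.LUCENE_48;

// Hypothetical custom stop word set; the last argument enables case-insensitive matching.
var stopwords = new CharArraySet(version, new[] { "solr" }, true);

var tokens = new List<string>();

using (var analyzer = new CJKAnalyzer(version, stopwords))
using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader("Lucene and Solr")))
{
    ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();
    while (stream.IncrementToken())
    {
        tokens.Add(term.ToString());
    }
    stream.End();
}

// "solr" is dropped by the custom set; "and" is kept because
// the default stop set is not consulted by this constructor.
Console.WriteLine(string.Join(" | ", tokens));
```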
Fields
DEFAULT_STOPWORD_FILE
File containing default CJK stopwords.
Currently it contains some common English words that are not usually useful for searching and some double-byte punctuation marks.
Declaration
public const string DEFAULT_STOPWORD_FILE = "stopwords.txt"
Field Value
Type | Description |
---|---|
string |
Properties
DefaultStopSet
Returns an unmodifiable instance of the default stop-words set.
Declaration
public static CharArraySet DefaultStopSet { get; }
Property Value
Type | Description |
---|---|
CharArraySet | an unmodifiable instance of the default stop-words set. |
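Because the returned set is unmodifiable, extending it means copying it first. The sketch below shows one way to do that, assuming Lucene.Net 4.8 and the `CharArraySet` constructor that accepts an enumerable of strings; the extra word `"lucene"` is a placeholder.

```csharp
using System;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Assumption: Lucene.Net 4.8.
var version = LuceneVersion.LUCENE_48;

// Copy the shared default set into a new, modifiable set (case-insensitive).
var extended = new CharArraySet(version, CJKAnalyzer.DefaultStopSet, true);

// Add a hypothetical domain-specific stop word to the copy only.
extended.Add("lucene");

// The copy can then be handed to the two-argument constructor.
using (var analyzer = new CJKAnalyzer(version, extended))
{
    Console.WriteLine($"default: {CJKAnalyzer.DefaultStopSet.Count}, extended: {extended.Count}");
}
```

The shared default set itself is left untouched, which matters because it is a static instance visible to every analyzer that uses it.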
Methods
CreateComponents(string, TextReader)
Creates a new Lucene.Net.Analysis.TokenStreamComponents instance for this analyzer.
Declaration
protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
Parameters
Type | Name | Description |
---|---|---|
string | fieldName | the name of the field whose content is passed to the Lucene.Net.Analysis.TokenStreamComponents sink as a reader |
TextReader | reader | the reader passed to the Lucene.Net.Analysis.Tokenizer constructor |
Returns
Type | Description |
---|---|
TokenStreamComponents | the Lucene.Net.Analysis.TokenStreamComponents for this analyzer. |
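Since CJKAnalyzer is sealed, this method cannot be overridden directly. To customize the chain, one option is to assemble a comparable analyzer by hand. The following is only a sketch of what such a chain might look like, following the filter order given in the class summary and assuming Lucene.Net 4.8 with `Analyzer.NewAnonymous`; it is not taken from the library's source, and the field name and sample text are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Assumption: Lucene.Net 4.8.
var version = LuceneVersion.LUCENE_48;

// Hand-built chain mirroring the order in the class summary:
// StandardTokenizer -> CJKWidthFilter -> LowerCaseFilter -> CJKBigramFilter -> StopFilter.
using Analyzer chain = Analyzer.NewAnonymous(createComponents: (fieldName, reader) =>
{
    Tokenizer source = new StandardTokenizer(version, reader);
    TokenStream result = new CJKWidthFilter(source);        // width normalization
    result = new LowerCaseFilter(version, result);          // case folding
    result = new CJKBigramFilter(result);                   // CJK bigrams
    result = new StopFilter(version, result, CJKAnalyzer.DefaultStopSet);
    return new TokenStreamComponents(source, result);
});

var tokens = new List<string>();
using (TokenStream stream = chain.GetTokenStream("body", new StringReader("Lucene 检索")))
{
    ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();
    while (stream.IncrementToken())
    {
        tokens.Add(term.ToString());
    }
    stream.End();
}

Console.WriteLine(string.Join(" | ", tokens));
```

A filter can be inserted or swapped at any point in the lambda, for example replacing the stop set or adding a further normalization step after the bigram filter.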