    Class CJKAnalyzer

    A Lucene.Net.Analysis.Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK characters with CJKBigramFilter, and filters stopwords with StopFilter.

    Inheritance
    object
    Analyzer
    StopwordAnalyzerBase
    CJKAnalyzer
    Implements
    IDisposable
    Inherited Members
    StopwordAnalyzerBase.StopwordSet
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, ReuseStrategy)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, Func<string, TextReader, TextReader>)
    Analyzer.NewAnonymous(Func<string, TextReader, TokenStreamComponents>, Func<string, TextReader, TextReader>, ReuseStrategy)
    Analyzer.GetTokenStream(string, TextReader)
    Analyzer.GetTokenStream(string, string)
    Analyzer.GetPositionIncrementGap(string)
    Analyzer.GetOffsetGap(string)
    Analyzer.Strategy
    Analyzer.Dispose()
    Analyzer.GLOBAL_REUSE_STRATEGY
    Analyzer.PER_FIELD_REUSE_STRATEGY
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Analysis.Cjk
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public sealed class CJKAnalyzer : StopwordAnalyzerBase, IDisposable
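
The filter chain described above can be observed by running a short string through the analyzer and printing each token. The following is a minimal sketch, assuming Lucene.Net 4.8 (hence LuceneVersion.LUCENE_48); the field name "body", the sample text, and the class name CjkAnalyzerDemo are illustrative choices, not part of the API.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class CjkAnalyzerDemo
{
    public static void Main()
    {
        // Assumption: Lucene.Net 4.8; pick the LuceneVersion matching your package.
        using (var analyzer = new CJKAnalyzer(LuceneVersion.LUCENE_48))
        // "body" is a hypothetical field name; CJKAnalyzer does not depend on it.
        using (TokenStream stream = analyzer.GetTokenStream("body", "The 日本語 text"))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            // The TokenStream contract: Reset, then IncrementToken until false, then End.
            stream.Reset();
            while (stream.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            stream.End();
        }
    }
}
```

With the default stop set, English stopwords should be dropped, remaining Latin text lowercased, and runs of adjacent CJK characters emitted as overlapping bigrams.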

    Constructors

    CJKAnalyzer(LuceneVersion)

    Builds an analyzer which removes words in DefaultStopSet.

    Declaration
    public CJKAnalyzer(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    LuceneVersion matchVersion

    Lucene compatibility version

    CJKAnalyzer(LuceneVersion, CharArraySet)

    Builds an analyzer with the given stop words.

    Declaration
    public CJKAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
    Parameters
    Type Name Description
    LuceneVersion matchVersion

    Lucene compatibility version

    CharArraySet stopwords

    a stopword set
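
To illustrate the two-argument constructor, the sketch below builds a small case-insensitive CharArraySet and passes it to CJKAnalyzer. It assumes Lucene.Net 4.8 and the CharArraySet constructor that takes a version, a collection of strings, and an ignoreCase flag; the stop words and the class name CustomStopwordsDemo are made up for the example.

```csharp
using System;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomStopwordsDemo
{
    public static void Main()
    {
        // Assumption: Lucene.Net 4.8.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical stop words; 'true' makes lookups case-insensitive.
        var stopwords = new CharArraySet(version, new[] { "foo", "bar" }, true);

        using (var analyzer = new CJKAnalyzer(version, stopwords))
        {
            // StopwordSet is inherited from StopwordAnalyzerBase and
            // exposes the set the analyzer was built with.
            Console.WriteLine(analyzer.StopwordSet.Contains("foo"));
        }
    }
}
```

Because the supplied set replaces the default one, words in DefaultStopSet are no longer removed unless they are also added to the custom set.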

    Fields

    DEFAULT_STOPWORD_FILE

    File containing default CJK stopwords.

    Currently it contains some common English words that are not usually useful for searching and some double-byte punctuation marks.
    Declaration
    public const string DEFAULT_STOPWORD_FILE = "stopwords.txt"
    Field Value
    Type Description
    string

    Properties

    DefaultStopSet

    Returns an unmodifiable instance of the default stop-words set.

    Declaration
    public static CharArraySet DefaultStopSet { get; }
    Property Value
    Type Description
    CharArraySet

    an unmodifiable instance of the default stop-words set.
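
Since the default set is unmodifiable, one way to extend it is to copy its entries into a new CharArraySet and add to the copy. This is a sketch under assumptions: Lucene.Net 4.8, the CharArraySet constructor accepting an enumerable of strings, and an invented extra stop word ("example").

```csharp
using System;
using Lucene.Net.Analysis.Cjk;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class ExtendDefaultStopSetDemo
{
    public static void Main()
    {
        // Assumption: Lucene.Net 4.8.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Copy the shared, read-only defaults into a fresh, modifiable set.
        var extended = new CharArraySet(version, CJKAnalyzer.DefaultStopSet, true);

        // "example" is a hypothetical domain-specific stop word.
        extended.Add("example");

        using (var analyzer = new CJKAnalyzer(version, extended))
        {
            Console.WriteLine(analyzer.StopwordSet.Contains("example"));
        }
    }
}
```

Copying keeps the shared default instance untouched, which matters because the same DefaultStopSet is returned to every caller.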

    Methods

    CreateComponents(string, TextReader)

    Creates a new Lucene.Net.Analysis.TokenStreamComponents instance for this analyzer.

    Declaration
    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    Parameters
    Type Name Description
    string fieldName

    the name of the field whose content is passed to the Lucene.Net.Analysis.TokenStreamComponents sink as a reader

    TextReader reader

    the reader passed to the Lucene.Net.Analysis.Tokenizer constructor

    Returns
    Type Description
    TokenStreamComponents

    the Lucene.Net.Analysis.TokenStreamComponents for this analyzer.

    Overrides
    Analyzer.CreateComponents(string, TextReader)

    Implements

    IDisposable
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.