    Class CzechAnalyzer

    Lucene.Net.Analysis.Analyzer for the Czech language.

    Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.

    You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating CzechAnalyzer:

    • As of 3.1, words are stemmed with CzechStemFilter
    • As of 2.9, StopFilter preserves position increments
    • As of 2.4, tokens incorrectly identified as acronyms are corrected (see LUCENE-1068)

    Inheritance
    System.Object
    Lucene.Net.Analysis.Analyzer
    StopwordAnalyzerBase
    CzechAnalyzer
    Implements
    System.IDisposable
    Inherited Members
    StopwordAnalyzerBase.m_stopwords
    StopwordAnalyzerBase.m_matchVersion
    StopwordAnalyzerBase.StopwordSet
    StopwordAnalyzerBase.LoadStopwordSet(Boolean, Type, String, String)
    StopwordAnalyzerBase.LoadStopwordSet(FileInfo, LuceneVersion)
    StopwordAnalyzerBase.LoadStopwordSet(TextReader, LuceneVersion)
    Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>)
    Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, ReuseStrategy)
    Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, Func<String, TextReader, TextReader>)
    Analyzer.NewAnonymous(Func<String, TextReader, TokenStreamComponents>, Func<String, TextReader, TextReader>, ReuseStrategy)
    Analyzer.GetTokenStream(String, TextReader)
    Analyzer.GetTokenStream(String, String)
    Analyzer.InitReader(String, TextReader)
    Analyzer.GetPositionIncrementGap(String)
    Analyzer.GetOffsetGap(String)
    Lucene.Net.Analysis.Analyzer.Strategy
    Lucene.Net.Analysis.Analyzer.Dispose()
    Analyzer.Dispose(Boolean)
    Lucene.Net.Analysis.Analyzer.GLOBAL_REUSE_STRATEGY
    Lucene.Net.Analysis.Analyzer.PER_FIELD_REUSE_STRATEGY
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Analysis.Cz
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public sealed class CzechAnalyzer : StopwordAnalyzerBase, IDisposable

    Constructors

    CzechAnalyzer(LuceneVersion)

    Builds an analyzer with the default stop words (DefaultStopSet).

    Declaration
    public CzechAnalyzer(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion matchVersion

    Lucene.Net.Util.LuceneVersion to match
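
    As an illustration of this constructor, the following minimal sketch builds an analyzer with the default stop words and prints the tokens it produces. It assumes the Lucene.Net and Lucene.Net.Analysis.Common 4.8 packages are referenced; the field name "content", the sample sentence, and the Tokenize helper are hypothetical and not part of the library.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical helper (not a library API): runs the analyzer over a string
// and collects the resulting terms.
static List<string> Tokenize(Analyzer analyzer, string text)
{
    var terms = new List<string>();
    // "content" is an arbitrary field name chosen for this example.
    using (TokenStream stream = analyzer.GetTokenStream("content", text))
    {
        ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
        stream.Reset();                 // required before the first IncrementToken()
        while (stream.IncrementToken())
        {
            terms.Add(term.ToString());
        }
        stream.End();
    }
    return terms;
}

// Default stop words; LUCENE_48 is assumed as the compatibility version.
using var analyzer = new CzechAnalyzer(LuceneVersion.LUCENE_48);

List<string> tokens = Tokenize(analyzer, "Praha je hlavní město České republiky");
Console.WriteLine(string.Join(" | ", tokens));
```

    The exact terms printed depend on the default stop set and on CzechStemFilter, so no expected output is shown here.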

    CzechAnalyzer(LuceneVersion, CharArraySet)

    Builds an analyzer with the given stop words.

    Declaration
    public CzechAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion matchVersion

    Lucene.Net.Util.LuceneVersion to match

    CharArraySet stopwords

    a stopword set
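
    A custom stop word list can be supplied as a CharArraySet. The sketch below is one possible way to do this, assuming Lucene.Net 4.8; the two stop words, the field name "content", and the Tokenize helper are illustrative choices, not library defaults.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper (not a library API): collects the terms an analyzer emits.
static List<string> Tokenize(Analyzer analyzer, string text)
{
    var terms = new List<string>();
    using (TokenStream stream = analyzer.GetTokenStream("content", text))
    {
        ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
        stream.Reset();
        while (stream.IncrementToken())
        {
            terms.Add(term.ToString());
        }
        stream.End();
    }
    return terms;
}

var version = LuceneVersion.LUCENE_48;

// Illustrative stop words only; "true" makes the set ignore case.
var stopwords = new CharArraySet(version, new[] { "a", "je" }, true);

using var analyzer = new CzechAnalyzer(version, stopwords);

// "a" is in the custom set, so it is dropped; the other two words remain
// (possibly stemmed).
List<string> tokens = Tokenize(analyzer, "Pes a kočka");
Console.WriteLine(string.Join(" | ", tokens));
```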

    CzechAnalyzer(LuceneVersion, CharArraySet, CharArraySet)

    Builds an analyzer with the given stop words and a set of words to be excluded from the CzechStemFilter.

    Declaration
    public CzechAnalyzer(LuceneVersion matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion matchVersion

    Lucene.Net.Util.LuceneVersion to match

    CharArraySet stopwords

    a stopword set

    CharArraySet stemExclusionTable

    a stemming exclusion set
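
    To keep selected words out of the stemmer, pass a second CharArraySet. The following sketch assumes Lucene.Net 4.8; the excluded word "praze", the sample sentence, and the Tokenize helper are hypothetical. Because the exclusion check runs after lower-casing, the set is created as case-insensitive here.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper (not a library API): collects the terms an analyzer emits.
static List<string> Tokenize(Analyzer analyzer, string text)
{
    var terms = new List<string>();
    using (TokenStream stream = analyzer.GetTokenStream("content", text))
    {
        ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
        stream.Reset();
        while (stream.IncrementToken())
        {
            terms.Add(term.ToString());
        }
        stream.End();
    }
    return terms;
}

var version = LuceneVersion.LUCENE_48;

// Words in this set are marked as keywords and skipped by CzechStemFilter.
var stemExclusions = new CharArraySet(version, new[] { "praze" }, true);

using var analyzer = new CzechAnalyzer(
    version,
    CzechAnalyzer.DefaultStopSet,   // keep the default stop words
    stemExclusions);

// "Praze" is lower-cased and then passed through unstemmed.
List<string> tokens = Tokenize(analyzer, "Bydlím v Praze");
Console.WriteLine(string.Join(" | ", tokens));
```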

    Fields

    DEFAULT_STOPWORD_FILE

    File containing default Czech stopwords.

    Declaration
    public const string DEFAULT_STOPWORD_FILE = "stopwords.txt"
    Field Value
    Type Description
    System.String

    Properties

    DefaultStopSet

    Returns the default set of Czech stop words.

    Declaration
    public static CharArraySet DefaultStopSet { get; }
    Property Value
    Type Description
    CharArraySet

    The default set of Czech stop words.
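
    One common use of this property is to extend the defaults with domain-specific words. The sketch below copies the default entries into a new set rather than modifying the shared one; it assumes Lucene.Net 4.8, and the extra word "lucene" is purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

var version = LuceneVersion.LUCENE_48;

CharArraySet defaults = CzechAnalyzer.DefaultStopSet;
Console.WriteLine($"Default Czech stop words: {defaults.Count}");

// Copy the default entries into a list, then add a hypothetical
// domain-specific word.
var words = new List<string>(defaults);
words.Add("lucene");

// Build a new case-insensitive set for use with
// CzechAnalyzer(LuceneVersion, CharArraySet).
var extended = new CharArraySet(version, words, true);

using var analyzer = new CzechAnalyzer(version, extended);
Console.WriteLine($"Extended stop words: {extended.Count}");
```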

    Methods

    CreateComponents(String, TextReader)

    Creates Lucene.Net.Analysis.TokenStreamComponents used to tokenize all the text in the provided System.IO.TextReader.

    Declaration
    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    Parameters
    Type Name Description
    System.String fieldName
    System.IO.TextReader reader
    Returns
    Type Description
    Lucene.Net.Analysis.TokenStreamComponents

    Lucene.Net.Analysis.TokenStreamComponents built from a StandardTokenizer filtered with StandardFilter, LowerCaseFilter, StopFilter, and CzechStemFilter (only if the version is >= LUCENE_31). If the version is >= LUCENE_31 and a stem exclusion set is provided via CzechAnalyzer(LuceneVersion, CharArraySet, CharArraySet), a SetKeywordMarkerFilter is added before CzechStemFilter.
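
    The chain described above can be approximated with Analyzer.NewAnonymous. This is a sketch of an equivalent pipeline under stated assumptions, not the actual CzechAnalyzer source: it assumes Lucene.Net 4.8 with LUCENE_48 as the match version (so the stemming branch always applies), and the exclusion word, field name, and sample text are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

var version = LuceneVersion.LUCENE_48;
CharArraySet stopwords = CzechAnalyzer.DefaultStopSet;
var stemExclusions = new CharArraySet(version, new[] { "praze" }, true);

// Approximation of the documented chain: tokenizer, then standard,
// lower-case, stop, (optional) keyword-marker, and stem filters.
using Analyzer chain = Analyzer.NewAnonymous((fieldName, reader) =>
{
    Tokenizer source = new StandardTokenizer(version, reader);
    TokenStream result = new StandardFilter(version, source);
    result = new LowerCaseFilter(version, result);
    result = new StopFilter(version, result, stopwords);
    if (stemExclusions.Count > 0)
    {
        // Marks excluded words as keywords so the stemmer leaves them alone.
        result = new SetKeywordMarkerFilter(result, stemExclusions);
    }
    result = new CzechStemFilter(result);
    return new TokenStreamComponents(source, result);
});

var tokens = new List<string>();
// "content" is an arbitrary field name chosen for this example.
using (TokenStream stream = chain.GetTokenStream("content", "Bydlím v Praze"))
{
    ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();
    while (stream.IncrementToken())
    {
        tokens.Add(term.ToString());
    }
    stream.End();
}
Console.WriteLine(string.Join(" | ", tokens));
```

    In application code, CzechAnalyzer itself is normally the better choice; spelling the chain out like this is mainly useful when a filter needs to be added, removed, or reordered.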

    Overrides
    Analyzer.CreateComponents(String, TextReader)

    Implements

    System.IDisposable
    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.