    Class HMMChineseTokenizerFactory

    Factory for HMMChineseTokenizer

    Note: this class currently emits tokens for punctuation. To remove them, either add a Lucene.Net.Analysis.Miscellaneous.WordDelimiterFilter after the tokenizer (with concatenation turned off), or apply the SmartChinese stop list through a StopFilterFactory configured with:
    words="org/apache/lucene/analysis/cn/smart/stopwords.txt"
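
    The stop-list route can also be wired up directly in C#. The sketch below is an illustration, not the StopFilterFactory configuration described above: it swaps in a plain StopFilter, and it assumes that SmartChineseAnalyzer.DefaultStopSet exposes the stop words bundled with the SmartCn package and that LuceneVersion.LUCENE_48 is the match version you want. The names PunctuationFreeTokens and Tokenize are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cn.Smart;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class PunctuationFreeTokens
{
    // Hypothetical helper: tokenizes text and drops every token found in the stop set.
    public static List<string> Tokenize(string text)
    {
        // The factory takes no tokenizer-specific arguments, so an empty map is passed.
        var factory = new HMMChineseTokenizerFactory(new Dictionary<string, string>());
        Tokenizer tokenizer = factory.Create(new StringReader(text));

        // Assumption: DefaultStopSet is the SmartChinese stop list shipped with the assembly.
        // Disposing the filter also disposes the wrapped tokenizer.
        using TokenStream stream = new StopFilter(
            LuceneVersion.LUCENE_48, tokenizer, SmartChineseAnalyzer.DefaultStopSet);

        var term = stream.AddAttribute<ICharTermAttribute>();
        var tokens = new List<string>();

        stream.Reset();                      // required before the first IncrementToken()
        while (stream.IncrementToken())
        {
            tokens.Add(term.ToString());
        }
        stream.End();                        // finalizes end-of-stream state
        return tokens;
    }
}
```

    Calling PunctuationFreeTokens.Tokenize on a short sentence should return the segmented words without the punctuation tokens, provided the stop set covers the punctuation the tokenizer emits.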

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    AbstractAnalysisFactory
    TokenizerFactory
    HMMChineseTokenizerFactory
    Inherited Members
    TokenizerFactory.ForName(string, IDictionary<string, string>)
    TokenizerFactory.LookupClass(string)
    TokenizerFactory.AvailableTokenizers
    TokenizerFactory.ReloadTokenizers()
    TokenizerFactory.Create(TextReader)
    AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM
    AbstractAnalysisFactory.OriginalArgs
    AbstractAnalysisFactory.LuceneMatchVersion
    AbstractAnalysisFactory.Require(IDictionary<string, string>, string)
    AbstractAnalysisFactory.Require(IDictionary<string, string>, string, ICollection<string>)
    AbstractAnalysisFactory.Require(IDictionary<string, string>, string, ICollection<string>, bool)
    AbstractAnalysisFactory.Get(IDictionary<string, string>, string, string)
    AbstractAnalysisFactory.Get(IDictionary<string, string>, string, ICollection<string>)
    AbstractAnalysisFactory.Get(IDictionary<string, string>, string, ICollection<string>, string)
    AbstractAnalysisFactory.Get(IDictionary<string, string>, string, ICollection<string>, string, bool)
    AbstractAnalysisFactory.RequireChar(IDictionary<string, string>, string)
    AbstractAnalysisFactory.GetChar(IDictionary<string, string>, string, char)
    AbstractAnalysisFactory.GetSet(IDictionary<string, string>, string)
    AbstractAnalysisFactory.GetClassArg()
    AbstractAnalysisFactory.IsExplicitLuceneMatchVersion
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Analysis.Cn.Smart
    Assembly: Lucene.Net.Analysis.SmartCn.dll
    Syntax
    public sealed class HMMChineseTokenizerFactory : TokenizerFactory

    Constructors

    HMMChineseTokenizerFactory(IDictionary<string, string>)

    Creates a new HMMChineseTokenizerFactory

    Declaration
    public HMMChineseTokenizerFactory(IDictionary<string, string> args)
    Parameters
    Type                           Name    Description
    IDictionary<string, string>    args
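
    The args map follows the usual analysis-factory convention. As a minimal sketch, assuming the behaviour matches the upstream Lucene implementation (the factory defines no parameters of its own and rejects any leftover key with an ArgumentException), construction might look like this. The class name FactoryConstruction and the key bogusArg are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis.Cn.Smart;

public static class FactoryConstruction
{
    // Returns true when the factory rejects an argument it does not recognise.
    public static bool RejectsUnknownArgument()
    {
        // An empty map is enough: no tokenizer-specific parameters are needed.
        _ = new HMMChineseTokenizerFactory(new Dictionary<string, string>());

        try
        {
            // "bogusArg" is a made-up key used only to trigger the unknown-parameter check.
            _ = new HMMChineseTokenizerFactory(
                new Dictionary<string, string> { ["bogusArg"] = "bogusValue" });
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```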

    Methods

    Create(AttributeFactory, TextReader)

    Creates a Lucene.Net.Analysis.TokenStream for the specified input, using the given Lucene.Net.Util.AttributeSource.AttributeFactory.

    Declaration
    public override Tokenizer Create(AttributeSource.AttributeFactory factory, TextReader reader)
    Parameters
    Type                                Name       Description
    AttributeSource.AttributeFactory    factory
    TextReader                          reader
    Returns
    Type Description
    Tokenizer
    Overrides
    TokenizerFactory.Create(AttributeSource.AttributeFactory, TextReader)
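
    A short usage sketch for this overload follows. It assumes the default attribute factory (AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY) is acceptable and follows the standard TokenStream workflow of Reset, IncrementToken, End, then Dispose. The names CreateExample and Tokenize are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cn.Smart;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class CreateExample
{
    // Hypothetical helper: returns the raw tokens (punctuation tokens included).
    public static List<string> Tokenize(string text)
    {
        var factory = new HMMChineseTokenizerFactory(new Dictionary<string, string>());

        // Explicitly pass the default attribute factory to the two-argument overload.
        using Tokenizer tokenizer = factory.Create(
            AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY,
            new StringReader(text));

        var term = tokenizer.AddAttribute<ICharTermAttribute>();
        var tokens = new List<string>();

        tokenizer.Reset();                   // required before the first IncrementToken()
        while (tokenizer.IncrementToken())
        {
            tokens.Add(term.ToString());
        }
        tokenizer.End();
        return tokens;
    }
}
```

    The inherited Create(TextReader) overload can be used instead when no custom attribute factory is needed.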
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.