    Class ICUTokenizerConfig

    Base class for configurations that tailor Unicode text segmentation on a per-writing-system basis.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    ICUTokenizerConfig
    DefaultICUTokenizerConfig
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Analysis.Icu.Segmentation
    Assembly: Lucene.Net.ICU.dll
    Syntax
    public abstract class ICUTokenizerConfig

    Constructors

    ICUTokenizerConfig()

    Sole constructor. (For invocation by subclass constructors, typically implicit.)

    Declaration
    protected ICUTokenizerConfig()

    Fields

    EMOJI_SEQUENCE_STATUS

    Rule status for emoji sequences.

    Declaration
    public const int EMOJI_SEQUENCE_STATUS = 299
    Field Value
    Type Description
    int

    Properties

    CombineCJ

    true if Han, Hiragana, and Katakana scripts should all be returned as Japanese; otherwise, false.

    Declaration
    public abstract bool CombineCJ { get; }
    Property Value
    Type Description
    bool

    Methods

    GetBreakIterator(int)

    Returns a RuleBasedBreakIterator capable of processing the given script.

    Declaration
    public abstract RuleBasedBreakIterator GetBreakIterator(int script)
    Parameters
    Type Name Description
    int script
    Returns
    Type Description
    RuleBasedBreakIterator

    GetType(int, int)

    Returns the token type value for the given script and break iterator rule status.

    Declaration
    public abstract string GetType(int script, int ruleStatus)
    Parameters
    Type Name Description
    int script
    int ruleStatus
    Returns
    Type Description
    string
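
    Example

    As a rough illustration of how the three abstract members fit together, the sketch below wraps an existing configuration and changes only the token type reported for emoji sequences. The class name DelegatingTokenizerConfig and the "<PICTOGRAM>" label are hypothetical and not part of Lucene.NET. The sketch also assumes that RuleBasedBreakIterator lives in the ICU4N.Text namespace and that the ruleStatus passed to GetType(int, int) equals EMOJI_SEQUENCE_STATUS for emoji sequences.

    ```csharp
    using System;
    using ICU4N.Text; // assumed namespace of RuleBasedBreakIterator
    using Lucene.Net.Analysis.Icu.Segmentation;

    // Hypothetical subclass: forwards every member to a wrapped configuration
    // and replaces only the token type used for emoji sequences.
    public sealed class DelegatingTokenizerConfig : ICUTokenizerConfig
    {
        private readonly ICUTokenizerConfig inner;

        public DelegatingTokenizerConfig(ICUTokenizerConfig inner)
        {
            this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        }

        // Keep the wrapped configuration's handling of Han, Hiragana, and Katakana.
        public override bool CombineCJ => inner.CombineCJ;

        // Reuse the wrapped configuration's break iterator for each script.
        public override RuleBasedBreakIterator GetBreakIterator(int script)
        {
            return inner.GetBreakIterator(script);
        }

        // Return a custom label for emoji sequences; defer to the wrapped
        // configuration for every other rule status.
        public override string GetType(int script, int ruleStatus)
        {
            if (ruleStatus == EMOJI_SEQUENCE_STATUS)
            {
                return "<PICTOGRAM>"; // hypothetical token type label
            }
            return inner.GetType(script, ruleStatus);
        }
    }
    ```

    An instance of such a class could be supplied wherever an ICUTokenizerConfig is expected, for example wrapping a DefaultICUTokenizerConfig. Delegation keeps the sketch short; a subclass that needs different segmentation rules would instead return its own RuleBasedBreakIterator from GetBreakIterator(int).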
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.