    Class ICUTokenizerConfig

    Class that allows for tailored Unicode text segmentation on a per-writing-system basis.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    System.Object
    ICUTokenizerConfig
    DefaultICUTokenizerConfig
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Analysis.Icu.Segmentation
    Assembly: Lucene.Net.ICU.dll
    Syntax
    public abstract class ICUTokenizerConfig

    Constructors

    ICUTokenizerConfig()

    Sole constructor. (For invocation by subclass constructors, typically implicit.)

    Declaration
    protected ICUTokenizerConfig()

    Fields

    EMOJI_SEQUENCE_STATUS

    Rule status for emoji sequences.

    Declaration
    public const int EMOJI_SEQUENCE_STATUS = 299
    Field Value
    Type Description
    System.Int32

    Properties

    CombineCJ

    true if Han, Hiragana, and Katakana scripts should all be returned as Japanese; otherwise, false.

    Declaration
    public abstract bool CombineCJ { get; }
    Property Value
    Type Description
    System.Boolean

    Methods

    GetBreakIterator(Int32)

    Returns a break iterator capable of processing the given script.

    Declaration
    public abstract RuleBasedBreakIterator GetBreakIterator(int script)
    Parameters
    Type Name Description
    System.Int32 script
    Returns
    Type Description
    ICU4N.Text.RuleBasedBreakIterator

    GetType(Int32, Int32)

    Returns a token type value for the given script and BreakIterator rule status.

    Declaration
    public abstract string GetType(int script, int ruleStatus)
    Parameters
    Type Name Description
    System.Int32 script
    System.Int32 ruleStatus
    Returns
    Type Description
    System.String
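
Because the class is abstract, using it means overriding CombineCJ, GetBreakIterator(Int32), and GetType(Int32, Int32). The following is a minimal sketch, not the library's own implementation. The class name EmojiLabelingConfig and the "<MY_EMOJI>" label are hypothetical, and the sketch assumes DefaultICUTokenizerConfig exposes a constructor taking two booleans (treat CJK as words, treat Myanmar as words). It delegates everything to the default configuration and only relabels tokens whose rule status equals EMOJI_SEQUENCE_STATUS.

```csharp
using ICU4N.Text;
using Lucene.Net.Analysis.Icu.Segmentation;

// Hypothetical subclass for illustration only: wraps the default
// configuration and changes the type string reported for emoji sequences.
public sealed class EmojiLabelingConfig : ICUTokenizerConfig
{
    // Assumption: DefaultICUTokenizerConfig has a (bool, bool) constructor
    // (CJK as words, Myanmar as words).
    private readonly ICUTokenizerConfig inner =
        new DefaultICUTokenizerConfig(true, true);

    // Reuse the wrapped configuration's CJ-combining behavior.
    public override bool CombineCJ => inner.CombineCJ;

    // Reuse the wrapped configuration's per-script break iterators.
    public override RuleBasedBreakIterator GetBreakIterator(int script)
    {
        return inner.GetBreakIterator(script);
    }

    // Return a custom (hypothetical) label for emoji sequences and
    // defer to the wrapped configuration for every other rule status.
    public override string GetType(int script, int ruleStatus)
    {
        return ruleStatus == EMOJI_SEQUENCE_STATUS
            ? "<MY_EMOJI>"
            : inner.GetType(script, ruleStatus);
    }
}
```

An instance of such a class would typically be passed to a tokenizer that accepts an ICUTokenizerConfig. Since this API is marked experimental, the member signatures above may differ in later releases.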
    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.