Class ICUTokenizerConfig

Class that allows for tailored Unicode Text Segmentation on a per-writing-system basis.

This is a Lucene.NET EXPERIMENTAL API; use at your own risk.

Inheritance
System.Object
ICUTokenizerConfig

Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()

public abstract class ICUTokenizerConfig

Sole constructor. (For invocation by subclass constructors, typically implicit.)

public ICUTokenizerConfig()

true if Han, Hiragana, and Katakana scripts should all be returned as Japanese; otherwise, false.

public abstract bool CombineCJ { get; }

Property Value
Type	Description
System.Boolean

Return a BreakIterator capable of processing a given script.

public abstract BreakIterator GetBreakIterator(int script)

Parameters
Type	Name	Description
System.Int32	script

Returns
Type	Description
ICU4N.Text.BreakIterator

Return a token type value for a given script and BreakIterator rule status.

public abstract string GetType(int script, int ruleStatus)

Parameters
Type	Name	Description
System.Int32	script
System.Int32	ruleStatus

Returns
Type	Description
System.String
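
Example

Because the class is abstract, a custom configuration has to override all three members listed above. The following is a minimal sketch rather than a recommended implementation: it wraps an existing configuration and changes only the token type reported for one script code. The class name RelabelingTokenizerConfig, its constructor parameters, and the Lucene.Net.Analysis.Icu.Segmentation namespace are assumptions made for illustration; only CombineCJ, GetBreakIterator(int), and GetType(int, int) come from the members documented here.

```csharp
using System;
using ICU4N.Text;
using Lucene.Net.Analysis.Icu.Segmentation; // assumed namespace of ICUTokenizerConfig

// Hypothetical decorator: delegates to an existing configuration and
// overrides only the token type returned for a single script code.
public sealed class RelabelingTokenizerConfig : ICUTokenizerConfig
{
    private readonly ICUTokenizerConfig inner;
    private readonly int targetScript;
    private readonly string typeLabel;

    public RelabelingTokenizerConfig(ICUTokenizerConfig inner, int targetScript, string typeLabel)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        this.targetScript = targetScript;
        this.typeLabel = typeLabel ?? throw new ArgumentNullException(nameof(typeLabel));
    }

    // Keep the wrapped configuration's CJ handling unchanged.
    public override bool CombineCJ => inner.CombineCJ;

    // Reuse the wrapped configuration's BreakIterator for every script.
    public override BreakIterator GetBreakIterator(int script)
        => inner.GetBreakIterator(script);

    // Return the custom label for the target script; defer to the wrapped
    // configuration for all other scripts.
    public override string GetType(int script, int ruleStatus)
        => script == targetScript ? typeLabel : inner.GetType(script, ruleStatus);
}
```

Delegating to a wrapped instance keeps the sketch independent of how break iterators are created or cached, so it relies on nothing beyond the abstract members above. Any existing ICUTokenizerConfig instance can serve as the inner configuration.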