Class ICUTokenizerConfig

Class that allows for tailored Unicode Text Segmentation on a per-writing-system basis.

This is a Lucene.NET EXPERIMENTAL API; use at your own risk.

System.Object

ICUTokenizerConfig

public abstract class ICUTokenizerConfig : object


Sole constructor. (For invocation by subclass constructors, typically implicit.)

public ICUTokenizerConfig()


true if Han, Hiragana, and Katakana scripts should all be returned as Japanese

public abstract bool CombineCJ { get; }

Type	Description
System.Boolean


Return a BreakIterator capable of processing a given script.

public abstract BreakIterator GetBreakIterator(int script)

Type	Name	Description
System.Int32	script

Type	Description
BreakIterator


Return a token type value for a given script and BreakIterator rule status.

public abstract string GetType(int script, int ruleStatus)

Type	Name	Description
System.Int32	script
System.Int32	ruleStatus

Type	Description
System.String
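
The three abstract members above are what a subclass has to supply. The following is a minimal sketch of that shape, not the library's own implementation. So that it compiles on its own, it declares local stand-ins for ICUTokenizerConfig and BreakIterator that mirror only the signatures listed on this page; in real code those two stand-ins would be removed in favor of the library types. SimpleTokenizerConfig, PlaceholderBreakIterator, the threshold value, and the "<WORD>" and "<OTHER>" labels are all hypothetical names chosen for illustration.

```csharp
using System;

// Stand-ins so the sketch compiles without Lucene.NET or ICU4N installed.
// They mirror only the signatures documented above; in real code, delete
// these two types and derive from the library's own classes instead.
public abstract class BreakIterator { }

public abstract class ICUTokenizerConfig
{
    public abstract bool CombineCJ { get; }
    public abstract BreakIterator GetBreakIterator(int script);
    public abstract string GetType(int script, int ruleStatus);
}

// Hypothetical break iterator; a real one would be a configured ICU iterator.
public sealed class PlaceholderBreakIterator : BreakIterator { }

// Hypothetical configuration showing one way to fill in the abstract members.
public sealed class SimpleTokenizerConfig : ICUTokenizerConfig
{
    // Assumed cut-off, for illustration only: rule status values at or
    // above it are labeled as words, everything else as "other".
    private const int WordStatusThreshold = 200;

    // Treat Han, Hiragana, and Katakana as a single Japanese script.
    public override bool CombineCJ => true;

    // Return a fresh iterator per call; this sketch ignores the script code.
    public override BreakIterator GetBreakIterator(int script) =>
        new PlaceholderBreakIterator();

    // Map a (script, rule status) pair to a token type label.
    public override string GetType(int script, int ruleStatus) =>
        ruleStatus >= WordStatusThreshold ? "<WORD>" : "<OTHER>";
}

public static class Program
{
    public static void Main()
    {
        ICUTokenizerConfig config = new SimpleTokenizerConfig();
        Console.WriteLine(config.CombineCJ);
        Console.WriteLine(config.GetType(0, 250));
    }
}
```

A real subclass would typically choose a different break iterator per script code and derive its token type strings from the iterator's actual rule status ranges, which this sketch deliberately simplifies.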