Enum JapaneseTokenizerMode
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
Namespace: Lucene.Net.Analysis.Ja
Assembly: Lucene.Net.Analysis.Kuromoji.dll
Syntax
public enum JapaneseTokenizerMode : int
Fields
Name | Description |
---|---|
EXTENDED | Extended mode outputs unigrams for unknown words. This is a Lucene.NET EXPERIMENTAL API; use at your own risk. |
NORMAL | Ordinary segmentation: no decomposition for compounds. |
SEARCH | Segmentation geared towards search: long nouns are decompounded, and the full compound token is also emitted as a synonym. |
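
The following is a minimal sketch of how the three modes might be compared on the same input. It assumes the `JapaneseTokenizer(TextReader, UserDictionary, bool, JapaneseTokenizerMode)` constructor and the `ICharTermAttribute` interface from Lucene.NET 4.8; neither is documented on this page, so check the signatures against your installed version. The class name `TokenizerModeDemo` and the sample text are illustrative only.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Ja;
using Lucene.Net.Analysis.TokenAttributes;

public static class TokenizerModeDemo
{
    public static void Main()
    {
        // Sample compound noun (illustrative input, not taken from this page).
        const string text = "関西国際空港";

        var modes = new[]
        {
            JapaneseTokenizerMode.NORMAL,
            JapaneseTokenizerMode.SEARCH,
            JapaneseTokenizerMode.EXTENDED
        };

        foreach (JapaneseTokenizerMode mode in modes)
        {
            // Assumed constructor: no user dictionary (null), punctuation discarded (true).
            using (var tokenizer = new JapaneseTokenizer(new StringReader(text), null, true, mode))
            {
                ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

                tokenizer.Reset();
                Console.Write(mode + ":");
                while (tokenizer.IncrementToken())
                {
                    // Print each token produced under the current mode.
                    Console.Write(" " + term.ToString());
                }
                Console.WriteLine();
                tokenizer.End();
            }
        }
    }
}
```

Running this against one compound noun makes the difference between the modes visible: per the table above, SEARCH should emit the decompounded parts alongside the full compound, while NORMAL leaves compounds undecomposed.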