Enum JapaneseTokenizerMode
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
Namespace: Lucene.Net.Analysis.Ja
Assembly: Lucene.Net.Analysis.Kuromoji.dll
Syntax
public enum JapaneseTokenizerMode
Fields
| Name | Description |
|---|---|
| EXTENDED | Extended mode outputs unigrams for unknown words. This is a Lucene.NET EXPERIMENTAL API; use at your own risk. |
| NORMAL | Ordinary segmentation: no decomposition for compounds. |
| SEARCH | Segmentation geared towards search: long compound nouns are decompounded, and the full compound token is also emitted as a synonym. |
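
Example

The following is a minimal sketch, not taken from the library documentation, of one way to compare the three modes on the same input. It assumes the Lucene.NET 4.8 `JapaneseTokenizer` constructor that takes a `TextReader`, a `UserDictionary` (here `null`, meaning no user dictionary), a `discardPunctuation` flag, and a `JapaneseTokenizerMode`. The class name `TokenizerModeDemo` and the sample text are illustrative only.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Ja;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical demo class; the name is not part of Lucene.NET.
public static class TokenizerModeDemo
{
    public static void Main()
    {
        // Sample input chosen for illustration: a long compound noun.
        const string text = "関西国際空港";

        // Run the same text through every mode defined by the enum.
        foreach (JapaneseTokenizerMode mode in Enum.GetValues(typeof(JapaneseTokenizerMode)))
        {
            // Assumed constructor: (TextReader, UserDictionary, bool discardPunctuation, mode).
            using (var tokenizer = new JapaneseTokenizer(new StringReader(text), null, true, mode))
            {
                var term = tokenizer.AddAttribute<ICharTermAttribute>();

                tokenizer.Reset();
                Console.Write(mode + ":");
                while (tokenizer.IncrementToken())
                {
                    // Print each emitted token's surface text.
                    Console.Write(" " + term.ToString());
                }
                Console.WriteLine();
                tokenizer.End();
            }
        }
    }
}
```

Going by the field descriptions above, NORMAL should keep a compound intact, SEARCH should emit its parts together with the full compound, and EXTENDED should differ from SEARCH only on words missing from the dictionary. The exact tokens depend on the bundled dictionary, so no expected output is shown here.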