Namespace Lucene.Net.Analysis.Ja.Dict
Kuromoji dictionary implementation.
Classes
BinaryDictionary
Base class for a binary-encoded in-memory dictionary.
NOTE: To use a dictionary other than the built-in one, put the data files in a subdirectory of your application named "kuromoji-data". This subdirectory can be placed in any directory up to and including the root directory (if OS permissions allow). To place the files in an alternate location, set an environment variable named "kuromoji.data.dir" to the directory that contains the data files.
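The lookup described in the note can be pictured with the following minimal sketch. It is not the Lucene.NET implementation: the class name `KuromojiDataLocator` and the method `FindDataDirectory` are hypothetical, and the choice to check the environment variable before walking up the directory tree is an assumption, since the note does not state which source takes precedence. Only standard `System.IO` calls are used.

```csharp
using System;
using System.IO;

// Illustrative sketch only; names here are hypothetical, not part of Lucene.NET.
public static class KuromojiDataLocator
{
    public const string DataDirName = "kuromoji-data";     // directory name from the note
    public const string EnvVarName = "kuromoji.data.dir";  // variable name from the note

    // Returns the first existing candidate directory, or null when none is
    // found (a caller would then fall back to the built-in dictionary).
    public static string FindDataDirectory(string startDirectory, string envValue)
    {
        // Assumption: an explicitly configured directory wins over the search.
        if (!string.IsNullOrEmpty(envValue) && Directory.Exists(envValue))
        {
            return envValue;
        }

        // Walk from the start directory up to and including the root.
        var current = new DirectoryInfo(startDirectory);
        while (current != null)
        {
            string candidate = Path.Combine(current.FullName, DataDirName);
            if (Directory.Exists(candidate))
            {
                return candidate;
            }
            current = current.Parent; // Parent is null once the root is reached
        }

        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        string found = KuromojiDataLocator.FindDataDirectory(
            AppContext.BaseDirectory,
            Environment.GetEnvironmentVariable(KuromojiDataLocator.EnvVarName));

        Console.WriteLine(found ?? "No custom data directory found; built-in data would be used.");
    }
}
```

Because the variable name contains dots, some shells will not accept it in a plain `export` statement; setting it from the hosting process or a launcher that allows arbitrary variable names may be necessary.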
CharacterDefinition
Character category data.
ConnectionCosts
N-gram connection cost data.
Dictionary
TokenInfoDictionary
Binary dictionary implementation for a known-word dictionary model: Words are encoded into an FST mapping to a list of wordIDs.
TokenInfoFST
Thin wrapper around an FST with root-arc caching for Japanese.
Depending upon fasterButMoreRam, either just kana (191 arcs), or kana and han (28,607 arcs) are cached. The latter offers additional performance at the cost of more RAM.
UnknownDictionary
Dictionary for unknown-word handling.
UserDictionary
Class for building a User Dictionary. This class allows for custom segmentation of phrases.
Interfaces
IDictionary
Dictionary interface for retrieving morphological data by id.