    Namespace Lucene.Net.Analysis.Ja.Dict

    Kuromoji dictionary implementation.

    Classes

    BinaryDictionary

    Base class for a binary-encoded in-memory dictionary.

    NOTE: To use a dictionary other than the built-in one, put the data files in a subdirectory of your application named "kuromoji-data". This subdirectory can be placed in any directory up to and including the root directory (if OS permissions allow). To place the files in a different location, set an environment variable named "kuromoji.data.dir" to the directory that contains the data files.
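
    The two options in the note can be sketched roughly as follows. This is only an illustration under stated assumptions: FindKuromojiDataDir is a hypothetical helper (not part of Lucene.Net) that mimics an upward search starting at the application base directory, and "/path/to/dictionary-files" is a placeholder path. The library's actual lookup order is not described here, and the environment variable presumably needs to be set before the dictionary classes are first used.

```csharp
using System;
using System.IO;

// Hypothetical helper, not a Lucene.Net API: walks from startDir toward the
// file-system root and returns the first "kuromoji-data" subdirectory found,
// or null if no ancestor directory contains one.
static string FindKuromojiDataDir(string startDir)
{
    for (DirectoryInfo dir = new DirectoryInfo(startDir); dir != null; dir = dir.Parent)
    {
        string candidate = Path.Combine(dir.FullName, "kuromoji-data");
        if (Directory.Exists(candidate))
        {
            return candidate;
        }
    }
    return null;
}

// Option 1: check where a "kuromoji-data" directory would be picked up,
// assuming the search starts at the application base directory.
string found = FindKuromojiDataDir(AppContext.BaseDirectory);
Console.WriteLine(found ?? "No kuromoji-data directory found above the application directory.");

// Option 2: name an explicit directory through the environment variable
// from the note. The path is a placeholder; this sets the variable for the
// current process only.
Environment.SetEnvironmentVariable("kuromoji.data.dir", "/path/to/dictionary-files");
```

    Setting the variable in code affects only the current process; setting it in the shell or service configuration that launches the application avoids any dependence on initialization order.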

    CharacterDefinition

    Character category data.

    ConnectionCosts

    N-gram connection cost data.

    Dictionary

    Kuromoji dictionary implementation.

    TokenInfoDictionary

    Binary dictionary implementation for a known-word dictionary model: Words are encoded into an FST mapping to a list of wordIDs.

    TokenInfoFST

    Thin wrapper around an FST with root-arc caching for Japanese.

    Depending upon fasterButMoreRam, either just kana (191 arcs), or kana and han (28,607 arcs) are cached. The latter offers additional performance at the cost of more RAM.

    UnknownDictionary

    Dictionary for unknown-word handling.

    UserDictionary

    Class for building a User Dictionary. This class allows for custom segmentation of phrases.

    Interfaces

    IDictionary

    Dictionary interface for retrieving morphological data by id.

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.