Class BinaryDictionary
Base class for a binary-encoded in-memory dictionary.
NOTE: To use a dictionary other than the built-in one, put the data files in a subdirectory of your application named "kuromoji-data". This subdirectory can be placed in any directory up to and including the root directory (if OS permissions allow). To place the files in an alternate location, set an environment variable named "kuromoji.data.dir" to the name of the directory in which the data files are located.
Implements
Inherited Members
Namespace: Lucene.Net.Analysis.Ja.Dict
Assembly: Lucene.Net.Analysis.Kuromoji.dll
Syntax
public abstract class BinaryDictionary : IDictionary
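The lookup described in the note on the "kuromoji-data" subdirectory and the "kuromoji.data.dir" environment variable can be pictured with a minimal sketch. This is not the library's implementation: the class `KuromojiDataLocator`, the method `FindDataDirectory`, and the rule that the environment variable takes precedence over the directory walk are all assumptions made for illustration. Only the two names in quotes come from the note.

```csharp
using System;
using System.IO;

// Hypothetical helper; not part of Lucene.Net.Analysis.Kuromoji.
public static class KuromojiDataLocator
{
    // Both names are taken from the class note.
    private const string DataDirName = "kuromoji-data";
    private const string EnvVarName = "kuromoji.data.dir";

    // Returns the directory expected to hold the data files, or null if none is found.
    public static string FindDataDirectory(string startDirectory)
    {
        // Assumption: an explicit environment variable wins over the directory walk.
        string fromEnv = Environment.GetEnvironmentVariable(EnvVarName);
        if (!string.IsNullOrEmpty(fromEnv) && Directory.Exists(fromEnv))
        {
            return fromEnv;
        }

        // Walk from the start directory up to and including the root,
        // looking for a "kuromoji-data" subdirectory at each level.
        var current = new DirectoryInfo(startDirectory);
        while (current != null)
        {
            string candidate = Path.Combine(current.FullName, DataDirName);
            if (Directory.Exists(candidate))
            {
                return candidate;
            }
            current = current.Parent;
        }

        return null;
    }
}
```

A caller might pass `AppContext.BaseDirectory` as the start directory. A `null` result would then mean that no alternate dictionary was found by this sketch.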
Constructors
BinaryDictionary()
Declaration
protected BinaryDictionary()
Fields
DICT_FILENAME_SUFFIX
Declaration
public static readonly string DICT_FILENAME_SUFFIX
Field Value
Type | Description |
---|---|
string |
DICT_HEADER
Declaration
public static readonly string DICT_HEADER
Field Value
Type | Description |
---|---|
string |
HAS_BASEFORM
Flag indicating that the entry has base form data. Otherwise, the entry is not inflected (its base form is the same as the surface form).
Declaration
public static readonly int HAS_BASEFORM
Field Value
Type | Description |
---|---|
int |
HAS_PRONUNCIATION
Flag indicating that the entry has pronunciation data. Otherwise, the pronunciation is the same as the reading.
Declaration
public static readonly int HAS_PRONUNCIATION
Field Value
Type | Description |
---|---|
int |
HAS_READING
Flag indicating that the entry has reading data. Otherwise, the reading is the surface form converted to katakana.
Declaration
public static readonly int HAS_READING
Field Value
Type | Description |
---|---|
int |
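The fallback rules for the three `HAS_*` flags can be sketched as follows. This is an illustration only: it assumes the flags are distinct bits combined into a single integer per entry, and the constant values, the class `EntryFlags`, and its methods are hypothetical and not taken from this page.

```csharp
using System;

// Hypothetical illustration; values and names are assumptions, not the library's.
public static class EntryFlags
{
    public const int HasBaseForm = 1;      // assumed bit value
    public const int HasReading = 2;       // assumed bit value
    public const int HasPronunciation = 4; // assumed bit value

    // True when the given flag bit is set on the entry.
    public static bool Has(int entryFlags, int flag)
    {
        return (entryFlags & flag) != 0;
    }

    // Without pronunciation data, the pronunciation is the reading.
    public static string ResolvePronunciation(int entryFlags, string storedPronunciation, string reading)
    {
        return Has(entryFlags, HasPronunciation) ? storedPronunciation : reading;
    }

    // Without base form data, the entry is not inflected, so the surface form is used.
    public static string ResolveBaseForm(int entryFlags, string storedBaseForm, string surfaceForm)
    {
        return Has(entryFlags, HasBaseForm) ? storedBaseForm : surfaceForm;
    }
}
```

The reading fallback (surface form converted to katakana) is left out because it needs a character conversion step that this page does not describe.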
POSDICT_FILENAME_SUFFIX
Declaration
public static readonly string POSDICT_FILENAME_SUFFIX
Field Value
Type | Description |
---|---|
string |
POSDICT_HEADER
Declaration
public static readonly string POSDICT_HEADER
Field Value
Type | Description |
---|---|
string |
TARGETMAP_FILENAME_SUFFIX
Declaration
public static readonly string TARGETMAP_FILENAME_SUFFIX
Field Value
Type | Description |
---|---|
string |
TARGETMAP_HEADER
Declaration
public static readonly string TARGETMAP_HEADER
Field Value
Type | Description |
---|---|
string |
VERSION
Declaration
public static readonly int VERSION
Field Value
Type | Description |
---|---|
int |
Methods
GetBaseForm(int, char[], int, int)
Get base form of word.
Declaration
public virtual string GetBaseForm(int wordId, char[] surfaceForm, int off, int len)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
char[] | surfaceForm | |
int | off | |
int | len |
Returns
Type | Description |
---|---|
string | Base form (only different for inflected words, otherwise null). |
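Because the documented return value is null for words that are not inflected, callers typically need a fallback to the surface form. The sketch below shows one way to do that. It assumes that `off` and `len` delimit the token inside the `surfaceForm` buffer, which this page does not state. The helper `BaseFormOrSurface` and the delegate standing in for a dictionary's `GetBaseForm` are hypothetical.

```csharp
using System;

// Hypothetical helper; the delegate stands in for a dictionary's GetBaseForm method.
public static class BaseFormExample
{
    public static string BaseFormOrSurface(
        Func<int, char[], int, int, string> getBaseForm,
        int wordId, char[] buffer, int off, int len)
    {
        // A null result is documented as "not inflected".
        string baseForm = getBaseForm(wordId, buffer, off, len);

        // Assumption: off/len select the token's characters within the buffer.
        return baseForm ?? new string(buffer, off, len);
    }
}
```

With a concrete dictionary instance `dict`, the call might look like `BaseFormExample.BaseFormOrSurface(dict.GetBaseForm, wordId, buffer, off, len)`.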
GetInflectionForm(int)
Get inflection form of tokens.
Declaration
public virtual string GetInflectionForm(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
string | Inflection form, or null. |
GetInflectionType(int)
Get inflection type of tokens.
Declaration
public virtual string GetInflectionType(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
string | Inflection type, or null. |
GetLeftId(int)
Get left id of specified word.
Declaration
public virtual int GetLeftId(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
int | Left id. |
GetPartOfSpeech(int)
Get part-of-speech of tokens.
Declaration
public virtual string GetPartOfSpeech(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
string | Part-Of-Speech of the token. |
GetPronunciation(int, char[], int, int)
Get pronunciation of tokens.
Declaration
public virtual string GetPronunciation(int wordId, char[] surface, int off, int len)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
char[] | surface | |
int | off | |
int | len |
Returns
Type | Description |
---|---|
string | Pronunciation of the token. |
GetReading(int, char[], int, int)
Get reading of tokens.
Declaration
public virtual string GetReading(int wordId, char[] surface, int off, int len)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
char[] | surface | |
int | off | |
int | len |
Returns
Type | Description |
---|---|
string | Reading of the token. |
GetResource(string)
Declaration
protected Stream GetResource(string suffix)
Parameters
Type | Name | Description |
---|---|---|
string | suffix |
Returns
Type | Description |
---|---|
Stream |
GetRightId(int)
Get right id of specified word.
Declaration
public virtual int GetRightId(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
int | Right id. |
GetTypeResource(Type, string)
Declaration
public static Stream GetTypeResource(Type clazz, string suffix)
Parameters
Type | Name | Description |
---|---|---|
Type | clazz | |
string | suffix |
Returns
Type | Description |
---|---|
Stream |
GetWordCost(int)
Get word cost of specified word.
Declaration
public virtual int GetWordCost(int wordId)
Parameters
Type | Name | Description |
---|---|---|
int | wordId | Word ID of token. |
Returns
Type | Description |
---|---|
int | Word's cost. |
LookupWordIds(int, Int32sRef)
Declaration
public virtual void LookupWordIds(int sourceId, Int32sRef @ref)
Parameters
Type | Name | Description |
---|---|---|
int | sourceId | |
Int32sRef | ref |