
    Class SpellChecker

    Spell checker class (main class), initially inspired by David Spencer's code.

    Example Usage (C#):

    SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
    // To index a field of a user index (analyzer is a placeholder for your own Analyzer):
    spellchecker.IndexDictionary(new LuceneDictionary(my_lucene_reader, a_field),
        new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer), true);
    // To index a file containing words:
    spellchecker.IndexDictionary(new PlainTextDictionary(new FileInfo("myfile.txt")),
        new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer), true);
    string[] suggestions = spellchecker.SuggestSimilar("misspelt", 5);
    Inheritance
    object
    SpellChecker
    Implements
    IDisposable
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Search.Spell
    Assembly: Lucene.Net.Suggest.dll
    Syntax
    public class SpellChecker : IDisposable

    Constructors

    SpellChecker(Directory)

    Uses the given directory as a spell checker index, with LevensteinDistance as the default IStringDistance. The directory is created if it does not exist yet.

    Declaration
    public SpellChecker(Directory spellIndex)
    Parameters
    Type Name Description
    Directory spellIndex

    the spell index directory

    Exceptions
    Type Condition
    IOException

    if the SpellChecker cannot open the directory

    SpellChecker(Directory, IStringDistance)

    Uses the given directory as a spell checker index. The directory is created if it does not exist yet.

    Declaration
    public SpellChecker(Directory spellIndex, IStringDistance sd)
    Parameters
    Type Name Description
    Directory spellIndex

    the spell index directory

    IStringDistance sd

    the IStringDistance measure to use

    Exceptions
    Type Condition
    IOException

    if the SpellChecker cannot open the directory

    SpellChecker(Directory, IStringDistance, IComparer<SuggestWord>)

    Uses the given directory as a spell checker index, with the given IStringDistance measure and the given IComparer<T> for sorting the results.

    Declaration
    public SpellChecker(Directory spellIndex, IStringDistance sd, IComparer<SuggestWord> comparer)
    Parameters
    Type Name Description
    Directory spellIndex

    the spell index directory

    IStringDistance sd

    the IStringDistance measure to use

    IComparer<SuggestWord> comparer

    the IComparer<T> used to sort the suggestions

    Exceptions
    Type Condition
    IOException

    if there is a problem opening the index
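As a rough illustration of this constructor, the sketch below passes a non-default distance measure and comparer. It assumes the Lucene.Net and Lucene.Net.Suggest 4.8 packages are referenced, and that JaroWinklerDistance and SuggestWordFrequencyComparer are available in Lucene.Net.Search.Spell. Both are illustrative choices that this page does not document, and the in-memory RAMDirectory only stands in for a real spell index location.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;

// Assumption: an in-memory directory stands in for a real spell index location.
using var spellIndex = new RAMDirectory();

// Assumption: these two implementations ship in Lucene.Net.Search.Spell.
IStringDistance distance = new JaroWinklerDistance();
IComparer<SuggestWord> comparer = new SuggestWordFrequencyComparer();

using var spellChecker = new SpellChecker(spellIndex, distance, comparer);

// The instance exposes what it was constructed with through its properties.
Console.WriteLine(spellChecker.StringDistance.GetType().Name);
Console.WriteLine(spellChecker.Comparer.GetType().Name);
```

Both values can also be changed later through the StringDistance and Comparer properties.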

    Fields

    DEFAULT_ACCURACY

    The default minimum score to use when none is specified, either by setting Accuracy or by passing a value to SuggestSimilar(string, int, IndexReader, string, SuggestMode, float).

    Declaration
    public const float DEFAULT_ACCURACY = 0.5f
    Field Value
    Type Description
    float

    F_WORD

    Field name for each word in the ngram index.

    Declaration
    public const string F_WORD = "word"
    Field Value
    Type Description
    string

    Properties

    Accuracy

    Gets or sets the accuracy (minimum score) used to decide whether a suggestion is included, unless overridden in SuggestSimilar(string, int, IndexReader, string, SuggestMode, float). The value must satisfy 0 < accuracy < 1; the default is DEFAULT_ACCURACY.

    Declaration
    public virtual float Accuracy { get; set; }
    Property Value
    Type Description
    float
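A short sketch of reading and changing the threshold, assuming the Lucene.Net and Lucene.Net.Suggest 4.8 packages are referenced. The RAMDirectory and the value 0.8f are illustrative assumptions only.

```csharp
using System;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;

// Assumption: an in-memory directory stands in for a real spell index location.
using var spellIndex = new RAMDirectory();
using var spellChecker = new SpellChecker(spellIndex);

// A new instance starts at DEFAULT_ACCURACY.
float initial = spellChecker.Accuracy;
Console.WriteLine(initial);

// Raise the minimum score so that only closer matches are suggested.
spellChecker.Accuracy = 0.8f;
Console.WriteLine(spellChecker.Accuracy);
```

The SuggestSimilar overloads that take an accuracy argument apply their own value for that call instead of this property.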

    Comparer

    Gets or sets the IComparer<T> for the SuggestWordQueue.

    Declaration
    public virtual IComparer<SuggestWord> Comparer { get; set; }
    Property Value
    Type Description
    IComparer<SuggestWord>

    StringDistance

    Gets or sets the IStringDistance implementation for this SpellChecker instance.

    Declaration
    public virtual IStringDistance StringDistance { get; set; }
    Property Value
    Type Description
    IStringDistance

    Methods

    ClearIndex()

    Removes all terms from the spell check index.

    Declaration
    public virtual void ClearIndex()
    Exceptions
    Type Condition
    IOException

    If there is a low-level I/O error.

    ObjectDisposedException

    if the SpellChecker is already disposed

    Dispose()

    Disposes the underlying Lucene.Net.Search.IndexSearcher used by this SpellChecker.

    Declaration
    public void Dispose()
    Exceptions
    Type Condition
    IOException

    if the close operation causes an IOException

    ObjectDisposedException

    if the SpellChecker is already disposed

    Dispose(bool)

    Releases resources used by the SpellChecker and if overridden in a derived class, optionally releases unmanaged resources.

    Declaration
    protected virtual void Dispose(bool disposing)
    Parameters
    Type Name Description
    bool disposing

    true to release both managed and unmanaged resources; false to release only unmanaged resources.

    Exist(string)

    Check whether the word exists in the index.

    Declaration
    public virtual bool Exist(string word)
    Parameters
    Type Name Description
    string word

    word to check

    Returns
    Type Description
    bool

    true if the word exists in the index

    Exceptions
    Type Condition
    IOException

    If there is a low-level I/O error.

    ObjectDisposedException

    if the SpellChecker is already disposed

    IndexDictionary(IDictionary, IndexWriterConfig, bool)

    Indexes the data from the given IDictionary.

    Declaration
    public void IndexDictionary(IDictionary dict, IndexWriterConfig config, bool fullMerge)
    Parameters
    Type Name Description
    IDictionary dict

    Dictionary to index

    IndexWriterConfig config

    Lucene.Net.Index.IndexWriterConfig to use

    bool fullMerge

    whether or not the spellcheck index should be fully merged

    Exceptions
    Type Condition
    ObjectDisposedException

    if the SpellChecker is already disposed

    IOException

    If there is a low-level I/O error.
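A minimal sketch of indexing a word list and then checking it with Exist(string). It assumes the Lucene.Net, Lucene.Net.Analysis.Common, and Lucene.Net.Suggest 4.8 packages are referenced. The RAMDirectory, the StandardAnalyzer, the TextReader-based PlainTextDictionary constructor, and the sample words are illustrative assumptions, not something this page prescribes.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;

using var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);

// Assumption: an in-memory directory stands in for a real spell index location.
using var spellIndex = new RAMDirectory();
using var spellChecker = new SpellChecker(spellIndex);

// Hypothetical word list, one word per line.
var words = new StringReader("hello\nhelp\nworld\n");

// A fresh IndexWriterConfig is created for this call;
// fullMerge: true asks for the spell index to be fully merged afterwards.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
spellChecker.IndexDictionary(new PlainTextDictionary(words), config, fullMerge: true);

bool known = spellChecker.Exist("hello");
Console.WriteLine(known);
```

The sketch creates a new IndexWriterConfig for each IndexDictionary call rather than reusing one, because a configuration object is normally tied to a single writer.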

    SetSpellIndex(Directory)

    Sets a different index as the spell checker index, or reopens the existing index if spellIndexDir is the same value as given in the constructor.

    Declaration
    public void SetSpellIndex(Directory spellIndexDir)
    Parameters
    Type Name Description
    Directory spellIndexDir

    the spell directory to use

    Exceptions
    Type Condition
    ObjectDisposedException

    if the SpellChecker is already disposed

    IOException

    if the SpellChecker cannot open the directory

    SuggestSimilar(string, int)

    Suggest similar words.

    The Lucene similarity used to fetch the most relevant n-grammed terms differs from the edit-distance strategy used to pick the best-matching word among the hits Lucene found, so one usually has to request several suggestions (numSug) to get the true best match.

    That is, if numSug == 1, do not count on that suggestion being the best one; set this value to at least 5 for a good suggestion.

    Declaration
    public virtual string[] SuggestSimilar(string word, int numSug)
    Parameters
    Type Name Description
    string word

    the word you want a spell check done on

    int numSug

    the number of suggested words

    Returns
    Type Description
    string[]

    The suggested words, sorted by two criteria: first, the edit distance; second (only in restricted mode), the popularity of the suggested word in the field of the user index.

    Exceptions
    Type Condition
    IOException

    if the underlying index throws an IOException

    ObjectDisposedException

    if the SpellChecker is already disposed

    See Also
    SuggestSimilar(string, int, IndexReader, string, SuggestMode, float)
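The advice about numSug can be sketched as follows: request five candidates and keep the first entry of the sorted result. This assumes the Lucene.Net, Lucene.Net.Analysis.Common, and Lucene.Net.Suggest 4.8 packages are referenced. The RAMDirectory, the StandardAnalyzer, the TextReader-based PlainTextDictionary constructor, and the sample words are illustrative assumptions.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;

using var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);

// Assumption: an in-memory directory stands in for a real spell index location.
using var spellIndex = new RAMDirectory();
using var spellChecker = new SpellChecker(spellIndex);

// Hypothetical word list, one word per line.
spellChecker.IndexDictionary(
    new PlainTextDictionary(new StringReader("hello\nhelp\nworld\n")),
    new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer),
    fullMerge: true);

// Ask for several candidates rather than one, then keep the top-ranked entry.
string[] suggestions = spellChecker.SuggestSimilar("helo", 5);
var best = suggestions.Length > 0 ? suggestions[0] : "(no suggestion)";
Console.WriteLine(best);
```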

    SuggestSimilar(string, int, IndexReader, string, SuggestMode)

    Calls SuggestSimilar(string, int, IndexReader, string, SuggestMode, float) with the current Accuracy, that is, SuggestSimilar(word, numSug, ir, field, suggestMode, this.Accuracy).

    Declaration
    public virtual string[] SuggestSimilar(string word, int numSug, IndexReader ir, string field, SuggestMode suggestMode)
    Parameters
    Type Name Description
    string word
    int numSug
    IndexReader ir
    string field
    SuggestMode suggestMode
    Returns
    Type Description
    string[]

    SuggestSimilar(string, int, IndexReader, string, SuggestMode, float)

    Suggest similar words (optionally restricted to a field of an index).

    The Lucene similarity used to fetch the most relevant n-grammed terms differs from the edit-distance strategy used to pick the best-matching word among the hits Lucene found, so one usually has to request several suggestions (numSug) to get the true best match.

    That is, if numSug == 1, do not count on that suggestion being the best one; set this value to at least 5 for a good suggestion.

    Declaration
    public virtual string[] SuggestSimilar(string word, int numSug, IndexReader ir, string field, SuggestMode suggestMode, float accuracy)
    Parameters
    Type Name Description
    string word

    the word you want a spell check done on

    int numSug

    the number of suggested words

    IndexReader ir

    the IndexReader of the user index (can be null; see the field parameter)

    string field

    the field of the user index; if field is not null, the suggested words are restricted to the words present in this field.

    SuggestMode suggestMode

    the suggest mode to use (NOTE: if ir == null and/or field == null, this is overridden with SuggestMode.SUGGEST_ALWAYS)

    float accuracy

    The minimum score a suggestion must have in order to qualify for inclusion in the results

    Returns
    Type Description
    string[]

    The suggested words, sorted by two criteria: first, the edit distance; second (only in restricted mode), the popularity of the suggested word in the field of the user index.

    Exceptions
    Type Condition
    IOException

    if the underlying index throws an IOException

    ObjectDisposedException

    if the SpellChecker is already disposed
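A sketch of the restricted mode, where suggestions are limited to words present in a field of a user index. It assumes the Lucene.Net, Lucene.Net.Analysis.Common, and Lucene.Net.Suggest 4.8 packages are referenced. The "title" field, the sample document, the RAMDirectory instances, the StandardAnalyzer, and the SUGGEST_WHEN_NOT_IN_INDEX mode are all hypothetical choices made for illustration.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;

const LuceneVersion matchVersion = LuceneVersion.LUCENE_48;
using var analyzer = new StandardAnalyzer(matchVersion);

// Hypothetical user index containing one document with a "title" field.
using var userIndex = new RAMDirectory();
using (var writer = new IndexWriter(userIndex, new IndexWriterConfig(matchVersion, analyzer)))
{
    var doc = new Document();
    doc.Add(new TextField("title", "hello world", Field.Store.NO));
    writer.AddDocument(doc);
}

using var reader = DirectoryReader.Open(userIndex);
using var spellIndex = new RAMDirectory();
using var spellChecker = new SpellChecker(spellIndex);

// Build the spell index from the terms of the "title" field.
spellChecker.IndexDictionary(
    new LuceneDictionary(reader, "title"),
    new IndexWriterConfig(matchVersion, analyzer),
    fullMerge: true);

// Restrict suggestions to words present in "title" and set a per-call minimum score.
string[] suggestions = spellChecker.SuggestSimilar(
    "helo", 5, reader, "title", SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, 0.5f);
Console.WriteLine(string.Join(", ", suggestions));
```

Passing null for the reader or the field drops the restriction, and the suggest mode is then treated as SuggestMode.SUGGEST_ALWAYS, as noted for the suggestMode parameter.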

    SuggestSimilar(string, int, float)

    Suggest similar words.

    The Lucene similarity used to fetch the most relevant n-grammed terms differs from the edit-distance strategy used to pick the best-matching word among the hits Lucene found, so one usually has to request several suggestions (numSug) to get the true best match.

    That is, if numSug == 1, do not count on that suggestion being the best one; set this value to at least 5 for a good suggestion.

    Declaration
    public virtual string[] SuggestSimilar(string word, int numSug, float accuracy)
    Parameters
    Type Name Description
    string word

    the word you want a spell check done on

    int numSug

    the number of suggested words

    float accuracy

    The minimum score a suggestion must have in order to qualify for inclusion in the results

    Returns
    Type Description
    string[]

    The suggested words, sorted by two criteria: first, the edit distance; second (only in restricted mode), the popularity of the suggested word in the field of the user index.

    Exceptions
    Type Condition
    IOException

    if the underlying index throws an IOException

    ObjectDisposedException

    if the SpellChecker is already disposed

    See Also
    SuggestSimilar(string, int, IndexReader, string, SuggestMode, float)

    Implements

    IDisposable
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.