Class SpellChecker
Spell Checker class (Main class)
(initially inspired by the David Spencer code).
Example Usage (C#):
SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
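// To index a field of a user index (IndexDictionary also takes an IndexWriterConfig and a fullMerge flag):
spellchecker.IndexDictionary(new LuceneDictionary(my_lucene_reader, a_field), my_writer_config, false);
// To index a file containing words:
spellchecker.IndexDictionary(new PlainTextDictionary(new FileInfo("myfile.txt")), my_other_writer_config, false);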
string[] suggestions = spellchecker.SuggestSimilar("misspelt", 5);
Inheritance
Namespace: Lucene.Net.Search.Spell
Assembly: Lucene.Net.Suggest.dll
Syntax
public class SpellChecker : IDisposable
Constructors
SpellChecker(Store.Directory)
Use the given directory as a spell checker index with a Levenstein string distance.
Declaration
public SpellChecker(Store.Directory spellIndex)
Parameters
Type | Name | Description |
---|---|---|
Store.Directory | spellIndex | the spell index directory |
SpellChecker(Store.Directory, IStringDistance)
Use the given directory as a spell checker index. The directory is created if it doesn't exist yet.
Declaration
public SpellChecker(Store.Directory spellIndex, IStringDistance sd)
Parameters
Type | Name | Description |
---|---|---|
Store.Directory | spellIndex | the spell index directory |
IStringDistance | sd | the IStringDistance to use |
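As a rough sketch of this overload, the snippet below passes a JaroWinklerDistance (another IStringDistance implementation in Lucene.Net.Search.Spell) instead of the default. The in-memory RAMDirectory and the SpellCheckerWithDistance helper class are assumptions made for illustration only.

```csharp
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;

public static class SpellCheckerWithDistance
{
    // Hypothetical helper: creates a spell checker over an in-memory
    // directory that scores candidates with Jaro-Winkler similarity.
    public static SpellChecker Create()
    {
        // RAMDirectory is used here only to keep the sketch self-contained.
        var spellIndex = new RAMDirectory();
        return new SpellChecker(spellIndex, new JaroWinklerDistance());
    }
}
```

The same distance can presumably also be swapped in later through the StringDistance property described below.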
SpellChecker(Store.Directory, IStringDistance, IComparer<SuggestWord>)
Use the given directory as a spell checker index with the given IStringDistance and comparer.
Declaration
public SpellChecker(Store.Directory spellIndex, IStringDistance sd, IComparer<SuggestWord> comparer)
Parameters
Type | Name | Description |
---|---|---|
Store.Directory | spellIndex | The spelling index |
IStringDistance | sd | The distance |
IComparer<SuggestWord> | comparer | The comparer |
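A minimal sketch of supplying a custom comparer follows. It assumes that SuggestWord exposes Freq and Score members (they are not documented on this page), and the FreqThenScoreComparer and SpellCheckerWithComparer names are hypothetical.

```csharp
using System.Collections.Generic;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;

// Hypothetical comparer: orders suggestions by frequency, then by score.
// Assumes SuggestWord has Freq (int) and Score (float) members.
public sealed class FreqThenScoreComparer : IComparer<SuggestWord>
{
    public int Compare(SuggestWord x, SuggestWord y)
    {
        int byFreq = x.Freq.CompareTo(y.Freq);
        return byFreq != 0 ? byFreq : x.Score.CompareTo(y.Score);
    }
}

public static class SpellCheckerWithComparer
{
    // Creates a spell checker over an in-memory directory (an assumption
    // of this sketch) with an explicit distance and comparer.
    public static SpellChecker Create()
    {
        return new SpellChecker(
            new RAMDirectory(),
            new JaroWinklerDistance(),
            new FreqThenScoreComparer());
    }
}
```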
Fields
DEFAULT_ACCURACY
The default minimum score to use, if not specified by setting Accuracy
or overridden per call via a SuggestSimilar overload that takes an accuracy argument.
Declaration
public const float DEFAULT_ACCURACY = 0.5f
Field Value
Type | Description |
---|---|
System.Single | |
F_WORD
Field name for each word in the ngram index.
Declaration
public const string F_WORD = "word"
Field Value
Type | Description |
---|---|
System.String | |
Properties
Accuracy
Gets or sets the accuracy (minimum score) to be used, unless overridden in
a SuggestSimilar overload that takes an accuracy argument.
Declaration
public virtual float Accuracy { get; set; }
Property Value
Type | Description |
---|---|
System.Single | |
Comparer
Gets or sets the comparer (IComparer<SuggestWord>) for suggested words.
Declaration
public virtual IComparer<SuggestWord> Comparer { get; set; }
Property Value
Type | Description |
---|---|
IComparer<SuggestWord> | |
StringDistance
Gets or sets the IStringDistance used by this SpellChecker.
Declaration
public virtual IStringDistance StringDistance { get; set; }
Property Value
Type | Description |
---|---|
IStringDistance | |
Methods
ClearIndex()
Removes all terms from the spell check index.
Declaration
public virtual void ClearIndex()
Dispose()
Disposes the underlying IndexSearcher used by this SpellChecker.
Declaration
public void Dispose()
Exist(String)
Check whether the word exists in the index.
Declaration
public virtual bool Exist(string word)
Parameters
Type | Name | Description |
---|---|---|
System.String | word | word to check |
Returns
Type | Description |
---|---|
System.Boolean | true if the word exists in the index |
IndexDictionary(IDictionary, IndexWriterConfig, Boolean)
Indexes the data from the given IDictionary.
Declaration
public void IndexDictionary(IDictionary dict, IndexWriterConfig config, bool fullMerge)
Parameters
Type | Name | Description |
---|---|---|
IDictionary | dict | Dictionary to index |
IndexWriterConfig | config | |
System.Boolean | fullMerge | whether or not the spellcheck index should be fully merged |
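The following sketch shows one way IndexDictionary and Exist might be used together. The word list, the KeywordAnalyzer passed to IndexWriterConfig, the RAMDirectory, and the IndexDictionarySketch helper are all assumptions of this example, not requirements of the API.

```csharp
using System.IO;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;

public static class IndexDictionarySketch
{
    // Hypothetical helper: indexes a small word list into an in-memory
    // spell index and reports whether `word` ended up in it.
    public static bool IndexAndCheck(string word)
    {
        using (var spellChecker = new SpellChecker(new RAMDirectory()))
        {
            // PlainTextDictionary reads one word per line.
            var words = new StringReader("hello\nhelmet\nworld\n");

            // The analyzer choice is an assumption of this sketch.
            var config = new IndexWriterConfig(
                LuceneVersion.LUCENE_48, new KeywordAnalyzer());

            // fullMerge: true asks for the spell index to be fully merged.
            spellChecker.IndexDictionary(
                new PlainTextDictionary(words), config, true);

            return spellChecker.Exist(word);
        }
    }
}
```

Passing false for fullMerge skips the full merge, which may be preferable when the dictionary is indexed incrementally.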
SetSpellIndex(Store.Directory)
Sets a different index as the spell checker index, or re-opens the existing index if spellIndex is the same value as given in the constructor.
Declaration
public virtual void SetSpellIndex(Store.Directory spellIndexDir)
Parameters
Type | Name | Description |
---|---|---|
Store.Directory | spellIndexDir | the spell directory to use |
SuggestSimilar(String, Int32)
Suggest similar words.
Lucene fetches the most relevant n-grammed terms using its own similarity, which is not the same as the edit distance strategy used to pick the best matching spell-checked word from those hits. As a result, one usually has to retrieve several suggestions (numSug) in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion.
Declaration
public virtual string[] SuggestSimilar(string word, int numSug)
Parameters
Type | Name | Description |
---|---|---|
System.String | word | the word you want a spell check done on |
System.Int32 | numSug | the number of suggested words |
Returns
Type | Description |
---|---|
System.String[] | the list of suggested words, sorted by two criteria: first the edit distance, then (only in restricted mode) the popularity of the suggested word in the field of the user index |
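To illustrate why a small numSug can miss the best match, here is a simplified two-stage sketch in plain C#. It is not Lucene's implementation: shared character bigrams stand in for the n-gram retrieval step, a textbook edit distance stands in for the re-ranking step, and the TwoStageSuggestSketch class and its fetch parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoStageSuggestSketch
{
    // Stage 1 stand-in: count bigrams of `candidate` that also occur in `word`.
    static int SharedBigrams(string word, string candidate)
    {
        var grams = new HashSet<string>();
        for (int i = 0; i + 2 <= word.Length; i++)
            grams.Add(word.Substring(i, 2));

        int shared = 0;
        for (int i = 0; i + 2 <= candidate.Length; i++)
            if (grams.Contains(candidate.Substring(i, 2))) shared++;
        return shared;
    }

    // Stage 2 stand-in: classic dynamic-programming edit distance.
    static int EditDistance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        return d[a.Length, b.Length];
    }

    // Fetch `fetch` candidates by bigram overlap, then re-rank only
    // those candidates by edit distance (smallest distance first).
    public static string[] Suggest(string word, IEnumerable<string> dictionary, int fetch)
    {
        return dictionary
            .OrderByDescending(w => SharedBigrams(word, w))
            .Take(fetch)
            .OrderBy(w => EditDistance(word, w))
            .ToArray();
    }
}
```

In this sketch, for the word "speling" and the dictionary { "dispelling", "spelling", "spilling" }, the first two entries tie on bigram overlap, so fetching a single candidate yields "dispelling" even though "spelling" is closer by edit distance. Fetching all three lets the second stage put "spelling" first.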
SuggestSimilar(String, Int32, IndexReader, String, SuggestMode)
Calls SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single) with the current Accuracy.
Declaration
public virtual string[] SuggestSimilar(string word, int numSug, IndexReader ir, string field, SuggestMode suggestMode)
Parameters
Type | Name | Description |
---|---|---|
System.String | word | |
System.Int32 | numSug | |
IndexReader | ir | |
System.String | field | |
SuggestMode | suggestMode | |
Returns
Type | Description |
---|---|
System.String[] | |
SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single)
Suggest similar words (optionally restricted to a field of an index).
Lucene fetches the most relevant n-grammed terms using its own similarity, which is not the same as the edit distance strategy used to pick the best matching spell-checked word from those hits. As a result, one usually has to retrieve several suggestions (numSug) in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion.
Declaration
public virtual string[] SuggestSimilar(string word, int numSug, IndexReader ir, string field, SuggestMode suggestMode, float accuracy)
Parameters
Type | Name | Description |
---|---|---|
System.String | word | the word you want a spell check done on |
System.Int32 | numSug | the number of suggested words |
IndexReader | ir | the IndexReader of the user index (can be null; see the field param) |
System.String | field | the field of the user index: if field is not null, the suggested words are restricted to the words present in this field. |
SuggestMode | suggestMode | (NOTE: if indexReader==null and/or field==null, then this is overridden with SuggestMode.SUGGEST_ALWAYS) |
System.Single | accuracy | The minimum score a suggestion must have in order to qualify for inclusion in the results |
Returns
Type | Description |
---|---|
System.String[] | the list of suggested words, sorted by two criteria: first the edit distance, then (only in restricted mode) the popularity of the suggested word in the field of the user index |
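A possible way to use the restricted mode is sketched below: a tiny user index is built, the spell index is fed from one of its fields, and suggestions are then restricted to that field. The field name "title", the sample terms, the KeywordAnalyzer, the in-memory directories, and the RestrictedSuggestSketch helper are assumptions of this example.

```csharp
using Lucene.Net.Analysis.Core;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;

public static class RestrictedSuggestSketch
{
    public static string[] SuggestFromField(string word)
    {
        const string field = "title"; // hypothetical field name
        var userIndex = new RAMDirectory();

        // Build a tiny user index with one term per document.
        using (var writer = new IndexWriter(userIndex,
            new IndexWriterConfig(LuceneVersion.LUCENE_48, new KeywordAnalyzer())))
        {
            foreach (string title in new[] { "hello", "helmet", "world" })
            {
                var doc = new Document();
                doc.Add(new StringField(field, title, Field.Store.NO));
                writer.AddDocument(doc);
            }
        }

        using (var reader = DirectoryReader.Open(userIndex))
        using (var spellChecker = new SpellChecker(new RAMDirectory()))
        {
            // Feed the spell index from the same field of the user index.
            spellChecker.IndexDictionary(
                new LuceneDictionary(reader, field),
                new IndexWriterConfig(LuceneVersion.LUCENE_48, new KeywordAnalyzer()),
                false);

            // Restrict suggestions to words present in `field`, using an
            // explicit accuracy equal to the documented default constant.
            return spellChecker.SuggestSimilar(
                word, 5, reader, field,
                SuggestMode.SUGGEST_MORE_POPULAR,
                SpellChecker.DEFAULT_ACCURACY);
        }
    }
}
```

If either the reader or the field were null, the suggest mode would be overridden with SuggestMode.SUGGEST_ALWAYS, as noted in the parameter table above.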
SuggestSimilar(String, Int32, Single)
Suggest similar words.
Lucene fetches the most relevant n-grammed terms using its own similarity, which is not the same as the edit distance strategy used to pick the best matching spell-checked word from those hits. As a result, one usually has to retrieve several suggestions (numSug) in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion.
Declaration
public virtual string[] SuggestSimilar(string word, int numSug, float accuracy)
Parameters
Type | Name | Description |
---|---|---|
System.String | word | the word you want a spell check done on |
System.Int32 | numSug | the number of suggested words |
System.Single | accuracy | The minimum score a suggestion must have in order to qualify for inclusion in the results |
Returns
Type | Description |
---|---|
System.String[] | the list of suggested words, sorted by two criteria: first the edit distance, then (only in restricted mode) the popularity of the suggested word in the field of the user index |
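The two ways of setting the minimum score can be contrasted as in the sketch below. It assumes a SpellChecker that has already been populated via IndexDictionary. The 0.8f threshold and the AccuracySketch helper are arbitrary choices for illustration.

```csharp
using Lucene.Net.Search.Spell;

public static class AccuracySketch
{
    // Per-call threshold: only this call requires a score of at least 0.8.
    // Assumes `spellChecker` has already been populated.
    public static string[] SuggestStrict(SpellChecker spellChecker, string word)
    {
        return spellChecker.SuggestSimilar(word, 5, 0.8f);
    }

    // Instance-wide threshold: calls without an accuracy argument use
    // the value stored in the Accuracy property.
    public static string[] SuggestStrictByDefault(SpellChecker spellChecker, string word)
    {
        spellChecker.Accuracy = 0.8f;
        return spellChecker.SuggestSimilar(word, 5);
    }
}
```

The per-call form leaves the instance untouched, which may be the safer choice when one SpellChecker is shared between callers with different requirements.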