Class FuzzySuggester
Implements a fuzzy AnalyzingSuggester. The similarity measurement is
based on the Damerau-Levenshtein (optimal string alignment) algorithm, though
you can explicitly choose classic Levenshtein by passing false
for the transpositions parameter.
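The difference between the two metrics can be illustrated with a small dynamic-programming sketch. This is not how the suggester computes matches (it relies on Lucene.Net.Util.Automaton.LevenshteinAutomata); the EditDistance helper below is hypothetical and compares UTF-16 chars rather than analyzed bytes. It only shows why a swapped pair of adjacent characters costs one edit with transpositions enabled and two without.

```csharp
using System;

// Hypothetical helper: optimal string alignment distance.
// With transpositions == false it reduces to classic Levenshtein distance.
static int EditDistance(string a, string b, bool transpositions)
{
    var d = new int[a.Length + 1, b.Length + 1];
    for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete all of a
    for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert all of b

    for (int i = 1; i <= a.Length; i++)
    {
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            int best = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // deletion, insertion
                d[i - 1, j - 1] + cost);                    // match or substitution

            // Adjacent swap counts as a single edit only when transpositions are on.
            if (transpositions && i > 1 && j > 1
                && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
            {
                best = Math.Min(best, d[i - 2, j - 2] + 1);
            }
            d[i, j] = best;
        }
    }
    return d[a.Length, b.Length];
}

Console.WriteLine(EditDistance("lucnee", "lucene", transpositions: true));  // 1
Console.WriteLine(EditDistance("lucnee", "lucene", transpositions: false)); // 2
```

With the default limit of one edit, "lucnee" is therefore within reach of "lucene" only when transpositions are enabled.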
This suggester matches terms within at most Lucene.Net.Util.Automaton.LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE edits; higher distances are not supported. Note that the fuzzy distance is measured in "byte space" on the bytes returned by the Lucene.Net.Analysis.TokenStream's Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute, usually UTF8. By default, the analyzed form must be at least 3 bytes long (DEFAULT_MIN_FUZZY_LENGTH) before any edits are considered, the first byte (DEFAULT_NON_FUZZY_PREFIX is 1) may not be edited, and at most 1 edit (DEFAULT_MAX_EDITS) is allowed. If the unicodeAware constructor parameter is set to true, maxEdits, minFuzzyLength, transpositions and nonFuzzyPrefix are measured in Unicode code points (actual letters) instead of bytes.
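The practical effect of measuring in bytes versus code points shows up as soon as a key contains non-ASCII characters. The sketch below uses only the .NET base class library; Utf8ByteLength and CodePointLength are hypothetical helpers, and the raw input string stands in for the analyzed form, which in practice depends on the analyzer.

```csharp
using System;
using System.Text;

// Hypothetical helper: length of the string once encoded as UTF-8.
static int Utf8ByteLength(string s) => Encoding.UTF8.GetByteCount(s);

// Hypothetical helper: number of Unicode code points (a surrogate pair counts once).
static int CodePointLength(string s)
{
    int count = 0;
    for (int i = 0; i < s.Length; i++)
    {
        if (i + 1 < s.Length && char.IsSurrogatePair(s[i], s[i + 1])) i++;
        count++;
    }
    return count;
}

string key = "n\u00E9";  // "né": 2 code points, 3 UTF-8 bytes
int minFuzzyLength = 3;  // same value as DEFAULT_MIN_FUZZY_LENGTH

// Byte space (unicodeAware = false): the key is long enough for edits.
Console.WriteLine(Utf8ByteLength(key) >= minFuzzyLength);   // True
// Code point space (unicodeAware = true): the key is still too short.
Console.WriteLine(CodePointLength(key) >= minFuzzyLength);  // False
```

The same reasoning applies to edits: in byte space, replacing one multi-byte character may count as more than one edit, so setting unicodeAware to true is worth considering for non-ASCII text.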
NOTE: This suggester does not boost suggestions that required no edits over suggestions that did require edits. This is a known limitation.
Note: complex query analyzers can have a significant impact on the lookup performance. It's recommended to not use analyzers that drop or inject terms like synonyms to keep the complexity of the prefix intersection low for good lookup performance. At index time, complex analyzers can safely be used.
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Search.Suggest.Analyzing
Assembly: Lucene.Net.Suggest.dll
Syntax
public sealed class FuzzySuggester : AnalyzingSuggester
Constructors
FuzzySuggester(Analyzer)
Creates a FuzzySuggester instance initialized with default values.
Declaration
public FuzzySuggester(Analyzer analyzer)
Parameters
Type | Name | Description |
---|---|---|
Analyzer | analyzer | The Lucene.Net.Analysis.Analyzer used for this suggester. |
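A minimal usage sketch for this constructor follows. StandardAnalyzer, LuceneVersion.LUCENE_48, FileDictionary, Build and DoLookup are not documented on this page; they are assumed here from the Lucene.NET 4.8 packages (Lucene.Net.Analysis.Common and Lucene.Net.Suggest), and the sample entries are made up for illustration.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Search.Suggest;
using Lucene.Net.Search.Suggest.Analyzing;
using Lucene.Net.Util;

// Same analyzer at index and query time; all fuzzy settings use the defaults.
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
var suggester = new FuzzySuggester(analyzer);

// Assumed FileDictionary format: one entry per line, suggestion text, a tab, then its weight.
var entries = new StringReader("lucene\t10\nlucid\t5\nsearch\t3\n");
suggester.Build(new FileDictionary(entries));

// "lucnee" is one transposition away from "lucene".
// onlyMorePopular is passed as false; num caps the number of results at 5.
var hits = suggester.DoLookup("lucnee", false, 5);
foreach (var hit in hits)
{
    Console.WriteLine($"{hit.Key} (weight {hit.Value})");
}
```

Because suggestions that needed no edits are not boosted over those that did, the order of the results reflects the weights only.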
FuzzySuggester(Analyzer, Analyzer)
Creates a FuzzySuggester instance with separate index and query analyzers, initialized with default values.
Declaration
public FuzzySuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer)
Parameters
Type | Name | Description |
---|---|---|
Analyzer | indexAnalyzer | Lucene.Net.Analysis.Analyzer that will be used for analyzing suggestions while building the index. |
Analyzer | queryAnalyzer | Lucene.Net.Analysis.Analyzer that will be used for analyzing query text during lookup. |
FuzzySuggester(Analyzer, Analyzer, SuggesterOptions, int, int, bool, int, bool, int, int, bool)
Creates a FuzzySuggester instance.
Declaration
public FuzzySuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer, SuggesterOptions options, int maxSurfaceFormsPerAnalyzedForm, int maxGraphExpansions, bool preservePositionIncrements, int maxEdits, bool transpositions, int nonFuzzyPrefix, int minFuzzyLength, bool unicodeAware)
Parameters
Type | Name | Description |
---|---|---|
Analyzer | indexAnalyzer | The Lucene.Net.Analysis.Analyzer that will be used for analyzing suggestions while building the index. |
Analyzer | queryAnalyzer | The Lucene.Net.Analysis.Analyzer that will be used for analyzing query text during lookup. |
SuggesterOptions | options | Suggester options; see EXACT_FIRST and PRESERVE_SEP. |
int | maxSurfaceFormsPerAnalyzedForm | Maximum number of surface forms to keep for a single analyzed form. When there are too many surface forms we discard the lowest weighted ones. |
int | maxGraphExpansions | Maximum number of graph paths to expand from the analyzed form. Set this to -1 for no limit. |
bool | preservePositionIncrements | Whether position holes should appear in the automaton. |
int | maxEdits | Must be >= 0 and <= Lucene.Net.Util.Automaton.LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE. |
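bool | transpositions | true to use Damerau-Levenshtein (optimal string alignment) distance; false to use classic Levenshtein distance. |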
int | nonFuzzyPrefix | Length of the common (non-fuzzy) prefix (see default DEFAULT_NON_FUZZY_PREFIX). |
int | minFuzzyLength | Minimum length of the lookup key before any edits are allowed (see default DEFAULT_MIN_FUZZY_LENGTH). |
bool | unicodeAware | Operate on Unicode code points instead of bytes. |
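As a sketch of how the full constructor might be called, the example below uses named arguments that match the declaration above. The values 256, -1 and true for the first three tuning parameters are illustrative choices, not documented defaults, and StandardAnalyzer with LuceneVersion.LUCENE_48 is assumed from the Lucene.Net.Analysis.Common package.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Search.Suggest.Analyzing;
using Lucene.Net.Util;

var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);

var suggester = new FuzzySuggester(
    indexAnalyzer: analyzer,
    queryAnalyzer: analyzer,
    options: SuggesterOptions.EXACT_FIRST | SuggesterOptions.PRESERVE_SEP,
    maxSurfaceFormsPerAnalyzedForm: 256,   // illustrative value
    maxGraphExpansions: -1,                // -1 means no limit
    preservePositionIncrements: true,      // illustrative value
    // Allow as many edits as the automaton supports.
    maxEdits: Lucene.Net.Util.Automaton.LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE,
    transpositions: false,                 // classic Levenshtein
    nonFuzzyPrefix: FuzzySuggester.DEFAULT_NON_FUZZY_PREFIX,
    minFuzzyLength: FuzzySuggester.DEFAULT_MIN_FUZZY_LENGTH,
    unicodeAware: true);                   // measure lengths and edits in code points

Console.WriteLine(FuzzySuggester.DEFAULT_MAX_EDITS); // 1
```

Raising maxEdits widens the set of candidate prefixes the lookup has to consider, so it is reasonable to measure lookup latency before adopting a higher value.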
Fields
DEFAULT_MAX_EDITS
The default maximum number of edits for fuzzy suggestions.
Declaration
public const int DEFAULT_MAX_EDITS = 1
Field Value
Type | Description |
---|---|
int |
DEFAULT_MIN_FUZZY_LENGTH
The default minimum length of the key passed to Lookup before any edits are allowed.
Declaration
public const int DEFAULT_MIN_FUZZY_LENGTH = 3
Field Value
Type | Description |
---|---|
int |
DEFAULT_NON_FUZZY_PREFIX
The default prefix length where edits are not allowed.
Declaration
public const int DEFAULT_NON_FUZZY_PREFIX = 1
Field Value
Type | Description |
---|---|
int |
DEFAULT_TRANSPOSITIONS
The default transposition value passed to Lucene.Net.Util.Automaton.LevenshteinAutomata.
Declaration
public const bool DEFAULT_TRANSPOSITIONS = true
Field Value
Type | Description |
---|---|
bool |
DEFAULT_UNICODE_AWARE
The default value of the unicodeAware parameter. When unicodeAware is true, the maxEdits, minFuzzyLength, transpositions, and nonFuzzyPrefix parameters are measured in Unicode code points (actual letters) instead of bytes.
Declaration
public const bool DEFAULT_UNICODE_AWARE = false
Field Value
Type | Description |
---|---|
bool |
Methods
ConvertAutomaton(Automaton)
Used by subclass to change the lookup automaton, if necessary.
Declaration
protected override Automaton ConvertAutomaton(Automaton a)
Parameters
Type | Name | Description |
---|---|---|
Automaton | a |
Returns
Type | Description |
---|---|
Automaton |
Overrides
GetFullPrefixPaths(IList<Path<Pair>>, Automaton, FST<Pair>)
Returns all prefix paths to initialize the search.
Declaration
protected override IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> GetFullPrefixPaths(IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> prefixPaths, Automaton lookupAutomaton, FST<PairOutputs<Int64, BytesRef>.Pair> fst)
Parameters
Type | Name | Description |
---|---|---|
IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> | prefixPaths | |
Automaton | lookupAutomaton | |
FST<PairOutputs<Int64, BytesRef>.Pair> | fst |
Returns
Type | Description |
---|---|
IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> |