    Class FuzzySuggester

    Implements a fuzzy AnalyzingSuggester. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm, though you can explicitly choose classic Levenshtein by passing false for the transpositions parameter.
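
    The difference between the two similarity measures can be illustrated with a small standalone sketch. It is not the automaton-based implementation used by Lucene.Net: the class EditDistanceSketch and its Distance method are hypothetical names introduced here, and the sketch compares UTF-16 chars rather than analyzed bytes.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class EditDistanceSketch
{
    // Edit distance between a and b using dynamic programming.
    // transpositions == true : swapping two adjacent characters costs one edit
    //                          (optimal string alignment).
    // transpositions == false: classic Levenshtein (insert, delete, substitute).
    public static int Distance(string a, string b, bool transpositions)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete all of a
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert all of b

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int best = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // delete, insert
                    d[i - 1, j - 1] + cost);                    // match or substitute

                // Adjacent swap, counted as a single edit when enabled.
                if (transpositions && i > 1 && j > 1
                    && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                {
                    best = Math.Min(best, d[i - 2, j - 2] + 1);
                }

                d[i, j] = best;
            }
        }
        return d[a.Length, b.Length];
    }
}
```

    With this sketch, Distance("lucene", "lucnee", true) is 1 and Distance("lucene", "lucnee", false) is 2, so a swapped pair of letters stays within a one-edit budget only when transpositions are enabled.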

    This suggester matches terms within at most Lucene.Net.Util.Automaton.LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE edits; higher distances are not supported. Note that the fuzzy distance is measured in "byte space" on the bytes returned by the Lucene.Net.Analysis.TokenStream's Lucene.Net.Analysis.TokenAttributes.ITermToBytesRefAttribute, usually UTF8. By default, the analyzed bytes must be at least 3 bytes long (DEFAULT_MIN_FUZZY_LENGTH) before any edits are considered, the first byte (DEFAULT_NON_FUZZY_PREFIX = 1) may not be edited, and at most 1 edit (DEFAULT_MAX_EDITS) is allowed. If the unicodeAware parameter of the constructor is set to true, maxEdits, minFuzzyLength, transpositions and nonFuzzyPrefix are measured in Unicode code points (actual letters) instead of bytes.
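
    To see why the byte versus code point distinction matters for non-ASCII text, the following standalone sketch compares the two lengths of a string. LengthSketch and its methods are hypothetical helpers introduced here, and the sketch assumes the analyzed bytes are UTF-8 (the usual case noted above).

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class LengthSketch
{
    // Length as seen when lengths are measured in bytes (assuming UTF-8).
    public static int Utf8ByteLength(string s) => Encoding.UTF8.GetByteCount(s);

    // Length as seen when lengths are measured in Unicode code points.
    public static int CodePointLength(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            count++;
            // A valid surrogate pair is two UTF-16 chars but one code point.
            if (char.IsHighSurrogate(s[i]) && i + 1 < s.Length
                && char.IsLowSurrogate(s[i + 1]))
            {
                i++;
            }
        }
        return count;
    }
}
```

    For "caf\u00E9" (café) the helpers return 5 bytes but 4 code points. A two-letter key such as "\u00E9\u00E9" is 4 bytes long, so in byte space it already exceeds a minimum length of 3 even though it contains only two letters.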

    NOTE: This suggester does not boost suggestions that required no edits over suggestions that did require edits. This is a known limitation.

    Note: complex query analyzers can significantly degrade lookup performance. To keep the prefix intersection simple, avoid query analyzers that drop or inject terms (for example, synonym filters). Complex analyzers can safely be used at index time.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    Lookup
    AnalyzingSuggester
    FuzzySuggester
    Inherited Members
    AnalyzingSuggester.GetSizeInBytes()
    AnalyzingSuggester.Build(IInputEnumerator)
    AnalyzingSuggester.Store(DataOutput)
    AnalyzingSuggester.Load(DataInput)
    AnalyzingSuggester.DoLookup(string, IEnumerable<BytesRef>, bool, int)
    AnalyzingSuggester.Count
    AnalyzingSuggester.Get(string)
    Lookup.CHARSEQUENCE_COMPARER
    Lookup.Build(IDictionary)
    Lookup.Load(Stream)
    Lookup.Store(Stream)
    Lookup.DoLookup(string, bool, int)
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Search.Suggest.Analyzing
    Assembly: Lucene.Net.Suggest.dll
    Syntax
    public sealed class FuzzySuggester : AnalyzingSuggester

    Constructors

    FuzzySuggester(Analyzer)

    Creates a FuzzySuggester instance initialized with default values.

    Declaration
    public FuzzySuggester(Analyzer analyzer)
    Parameters
    Type Name Description
    Analyzer analyzer

    The Lucene.Net.Analysis.Analyzer used for this suggester.

    FuzzySuggester(Analyzer, Analyzer)

    Creates a FuzzySuggester instance with an index analyzer and a query analyzer, initialized with default values.

    Declaration
    public FuzzySuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer)
    Parameters
    Type Name Description
    Analyzer indexAnalyzer

    The Lucene.Net.Analysis.Analyzer that will be used for analyzing suggestions while building the index.

    Analyzer queryAnalyzer

    The Lucene.Net.Analysis.Analyzer that will be used for analyzing query text during lookup.

    FuzzySuggester(Analyzer, Analyzer, SuggesterOptions, int, int, bool, int, bool, int, int, bool)

    Creates a FuzzySuggester instance.

    Declaration
    public FuzzySuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer, SuggesterOptions options, int maxSurfaceFormsPerAnalyzedForm, int maxGraphExpansions, bool preservePositionIncrements, int maxEdits, bool transpositions, int nonFuzzyPrefix, int minFuzzyLength, bool unicodeAware)
    Parameters
    Type Name Description
    Analyzer indexAnalyzer

    The Lucene.Net.Analysis.Analyzer that will be used for analyzing suggestions while building the index.

    Analyzer queryAnalyzer

    The Lucene.Net.Analysis.Analyzer that will be used for analyzing query text during lookup.

    SuggesterOptions options

    See EXACT_FIRST and PRESERVE_SEP.

    int maxSurfaceFormsPerAnalyzedForm

    Maximum number of surface forms to keep for a single analyzed form. When there are too many surface forms we discard the lowest weighted ones.

    int maxGraphExpansions

    Maximum number of graph paths to expand from the analyzed form. Set this to -1 for no limit.

    bool preservePositionIncrements

    Whether position holes should appear in the automaton.

    int maxEdits

    Maximum number of edits. Must be >= 0 and <= Lucene.Net.Util.Automaton.LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE.

    bool transpositions

    true if transpositions should be treated as a primitive edit operation. If this is false, comparisons will implement the classic Levenshtein algorithm.

    int nonFuzzyPrefix

    Length of the common (non-fuzzy) prefix (see default DEFAULT_NON_FUZZY_PREFIX).

    int minFuzzyLength

    Minimum length of the lookup key before any edits are allowed (see default DEFAULT_MIN_FUZZY_LENGTH).

    bool unicodeAware

    Operate on Unicode code points instead of bytes.
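
    The interplay of maxEdits, nonFuzzyPrefix and minFuzzyLength described above can be sketched as a simple rule. This is a reading of the documented behavior, not the library's implementation: FuzzyRuleSketch and its members are hypothetical, the constants only mirror the documented defaults, and returning zero edits for a key no longer than the non-fuzzy prefix is an assumption made here.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class FuzzyRuleSketch
{
    // Values mirroring the documented defaults.
    public const int MaxEdits = 1;       // DEFAULT_MAX_EDITS
    public const int MinFuzzyLength = 3; // DEFAULT_MIN_FUZZY_LENGTH
    public const int NonFuzzyPrefix = 1; // DEFAULT_NON_FUZZY_PREFIX

    // Number of edits this sketch allows for an analyzed key of the given
    // length (in bytes, or in code points when unicodeAware is true).
    public static int AllowedEdits(
        int keyLength,
        int maxEdits = MaxEdits,
        int minFuzzyLength = MinFuzzyLength,
        int nonFuzzyPrefix = NonFuzzyPrefix)
    {
        // Keys shorter than minFuzzyLength must match exactly.
        if (keyLength < minFuzzyLength) return 0;
        // Assumption: if the whole key lies inside the protected prefix,
        // there is nothing left to edit.
        if (keyLength <= nonFuzzyPrefix) return 0;
        return maxEdits;
    }

    // True when the unit at the given zero-based position lies outside the
    // protected prefix and may therefore be edited.
    public static bool IsEditablePosition(int position, int nonFuzzyPrefix = NonFuzzyPrefix)
        => position >= nonFuzzyPrefix;
}
```

    Under the default values, a two-unit key gets no edits, a three-unit key gets one, and position 0 is never editable.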

    Fields

    DEFAULT_MAX_EDITS

    The default maximum number of edits for fuzzy suggestions.

    Declaration
    public const int DEFAULT_MAX_EDITS = 1
    Field Value
    Type Description
    int

    DEFAULT_MIN_FUZZY_LENGTH

    The default minimum length of the key passed to Lookup before any edits are allowed.

    Declaration
    public const int DEFAULT_MIN_FUZZY_LENGTH = 3
    Field Value
    Type Description
    int

    DEFAULT_NON_FUZZY_PREFIX

    The default prefix length where edits are not allowed.

    Declaration
    public const int DEFAULT_NON_FUZZY_PREFIX = 1
    Field Value
    Type Description
    int

    DEFAULT_TRANSPOSITIONS

    The default transposition value passed to Lucene.Net.Util.Automaton.LevenshteinAutomata.

    Declaration
    public const bool DEFAULT_TRANSPOSITIONS = true
    Field Value
    Type Description
    bool

    DEFAULT_UNICODE_AWARE

    Measure maxEdits, minFuzzyLength, transpositions, and nonFuzzyPrefix parameters in Unicode code points (actual letters) instead of bytes.

    Declaration
    public const bool DEFAULT_UNICODE_AWARE = false
    Field Value
    Type Description
    bool

    Methods

    ConvertAutomaton(Automaton)

    Used by subclass to change the lookup automaton, if necessary.

    Declaration
    protected override Automaton ConvertAutomaton(Automaton a)
    Parameters
    Type Name Description
    Automaton a
    Returns
    Type Description
    Automaton
    Overrides
    AnalyzingSuggester.ConvertAutomaton(Automaton)

    GetFullPrefixPaths(IList<Path<Pair>>, Automaton, FST<Pair>)

    Returns all prefix paths to initialize the search.

    Declaration
    protected override IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> GetFullPrefixPaths(IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> prefixPaths, Automaton lookupAutomaton, FST<PairOutputs<Int64, BytesRef>.Pair> fst)
    Parameters
    Type Name Description
    IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>> prefixPaths
    Automaton lookupAutomaton
    FST<PairOutputs<Int64, BytesRef>.Pair> fst
    Returns
    Type Description
    IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>>
    Overrides
    AnalyzingSuggester.GetFullPrefixPaths(IList<FSTUtil.Path<PairOutputs<Int64, BytesRef>.Pair>>, Automaton, FST<PairOutputs<Int64, BytesRef>.Pair>)
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.