    Class FuzzyQuery

    Implements the fuzzy search query. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm, though you can explicitly choose classic Levenshtein by passing false to the transpositions parameter.
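To make the difference between the two similarity measures concrete, here is a minimal, self-contained sketch of an edit-distance function with a transpositions switch. It is not the Lucene.NET implementation: the class and method names are hypothetical, and it compares UTF-16 `char` values with a plain dynamic-programming table for illustration only.

```csharp
using System;

// Hypothetical helper, for illustration only; not part of Lucene.NET.
public static class EditDistanceSketch
{
    // Edit distance between a and b.
    // transpositions == true:  swapping two adjacent characters costs one edit
    //                          (optimal string alignment).
    // transpositions == false: classic Levenshtein (insert, delete, substitute).
    // Assumption: compares UTF-16 chars, not Unicode code points.
    public static int Distance(string a, string b, bool transpositions)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete all of a
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert all of b

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int best = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // delete, insert
                    d[i - 1, j - 1] + cost);                    // match or substitute

                // Adjacent swap counted as a single edit.
                if (transpositions && i > 1 && j > 1
                    && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                {
                    best = Math.Min(best, d[i - 2, j - 2] + 1);
                }

                d[i, j] = best;
            }
        }
        return d[a.Length, b.Length];
    }
}
```

With this sketch, `EditDistanceSketch.Distance("lucene", "lucnee", true)` returns 1 (one adjacent swap), while passing `false` returns 2 (two substitutions).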

    This query uses MultiTermQuery.TopTermsScoringBooleanQueryRewrite by default, so matching terms are collected and scored according to their edit distance. Only the top-scoring terms are used to build the BooleanQuery. Changing the rewrite mode for fuzzy queries is not recommended.

    This query matches terms that are at most MAXIMUM_SUPPORTED_DISTANCE edits away. Higher distances (especially with transpositions enabled) are generally not useful and would match a significant portion of the term dictionary. If you really need them, consider an n-gram indexing technique instead (such as the SpellChecker in the suggest module).

    NOTE: terms of length 1 or 2 will sometimes not match because of how the scaled distance between two terms is computed. For a term to match, the edit distance between the terms must be less than the length of the shorter term (either the input term or the candidate term). For example, FuzzyQuery on term "abcd" with maxEdits=2 will not match an indexed term "ab", and FuzzyQuery on term "a" with maxEdits=2 will not match an indexed term "abc".
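The rule in the NOTE can be written as a small predicate. This is a sketch of the rule as stated above, not the library's internal code; the class and method names are hypothetical, and the caller supplies the edit distance.

```csharp
using System;

// Hypothetical helper, for illustration only; not part of Lucene.NET.
public static class ShortTermRuleSketch
{
    // Returns true when a candidate at the given edit distance can match:
    // the distance must be within maxEdits and strictly less than the
    // length of the shorter of the two terms.
    public static bool CanMatch(int queryLength, int candidateLength, int editDistance, int maxEdits)
    {
        int shorter = Math.Min(queryLength, candidateLength);
        return editDistance <= maxEdits && editDistance < shorter;
    }
}
```

For the two examples in the NOTE, `CanMatch(4, 2, 2, 2)` ("abcd" against "ab") and `CanMatch(1, 3, 2, 2)` ("a" against "abc") both return false, because the distance of 2 is not less than the shorter term's length.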
    Inheritance
    object
    Query
    MultiTermQuery
    FuzzyQuery
    Inherited Members
    MultiTermQuery.m_field
    MultiTermQuery.m_rewriteMethod
    MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE
    MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE
    MultiTermQuery.CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE
    MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT
    MultiTermQuery.Field
    MultiTermQuery.GetTermsEnum(Terms)
    MultiTermQuery.Rewrite(IndexReader)
    MultiTermQuery.MultiTermRewriteMethod
    Query.Boost
    Query.ToString()
    Query.CreateWeight(IndexSearcher)
    Query.ExtractTerms(ISet<Term>)
    Query.Clone()
    object.Equals(object, object)
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    Namespace: Lucene.Net.Search
    Assembly: Lucene.Net.dll
    Syntax
    public class FuzzyQuery : MultiTermQuery

    Constructors

    FuzzyQuery(Term)

    Calls FuzzyQuery(term, DefaultMaxEdits).

    Declaration
    public FuzzyQuery(Term term)
    Parameters
    Type Name Description
    Term term

    FuzzyQuery(Term, int)

    Calls FuzzyQuery(term, maxEdits, DefaultPrefixLength).

    Declaration
    public FuzzyQuery(Term term, int maxEdits)
    Parameters
    Type Name Description
    Term term
    int maxEdits

    FuzzyQuery(Term, int, int)

    Calls FuzzyQuery(term, maxEdits, prefixLength, DefaultMaxExpansions, DefaultTranspositions).

    Declaration
    public FuzzyQuery(Term term, int maxEdits, int prefixLength)
    Parameters
    Type Name Description
    Term term
    int maxEdits
    int prefixLength

    FuzzyQuery(Term, int, int, int, bool)

    Create a new FuzzyQuery that will match terms with an edit distance of at most maxEdits to term. If a prefixLength > 0 is specified, a common prefix of that length is also required.

    Declaration
    public FuzzyQuery(Term term, int maxEdits, int prefixLength, int maxExpansions, bool transpositions)
    Parameters
    Type Name Description
    Term term

    The term to search for

    int maxEdits

    Must be >= 0 and <= MAXIMUM_SUPPORTED_DISTANCE.

    int prefixLength

    Length of common (non-fuzzy) prefix

    int maxExpansions

    The maximum number of terms to match. If this number is greater than MaxClauseCount when the query is rewritten, MaxClauseCount is used instead.

    bool transpositions

    true if transpositions should be treated as a primitive edit operation. If this is false, comparisons will implement the classic Levenshtein algorithm.
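A short usage sketch of the five-argument constructor follows. It assumes the Lucene.Net package is referenced and that `Term` lives in `Lucene.Net.Index`; the field name "title", the term text "lucene", and the wrapper class are made up for illustration.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical wrapper, for illustration only.
public static class FuzzyQueryUsageSketch
{
    public static FuzzyQuery Build()
    {
        // "title" and "lucene" are example values, not taken from this page.
        var term = new Term("title", "lucene");

        return new FuzzyQuery(
            term,
            1,                               // maxEdits: allow a single edit
            2,                               // prefixLength: first two characters must match exactly
            FuzzyQuery.DefaultMaxExpansions, // maxExpansions: keep the documented default
            false);                          // transpositions: classic Levenshtein
    }
}
```

Requiring a non-zero prefix narrows the set of candidate terms that have to be examined, at the cost of missing terms that differ from the query within the prefix.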

    Fields

    DefaultMaxEdits

    Declaration
    public const int DefaultMaxEdits = 2
    Field Value
    Type Description
    int

    DefaultMaxExpansions

    Declaration
    public const int DefaultMaxExpansions = 50
    Field Value
    Type Description
    int

    DefaultMinSimilarity

    Declaration
    [Obsolete("pass integer edit distances instead.")]
    public const float DefaultMinSimilarity = 2
    Field Value
    Type Description
    float

    DefaultPrefixLength

    Declaration
    public const int DefaultPrefixLength = 0
    Field Value
    Type Description
    int

    DefaultTranspositions

    Declaration
    public const bool DefaultTranspositions = true
    Field Value
    Type Description
    bool

    Properties

    MaxEdits

    Declaration
    public virtual int MaxEdits { get; }
    Property Value
    Type Description
    int

    The maximum edit distance allowed for a term to match this query.

    PrefixLength

    Returns the non-fuzzy prefix length. This is the number of characters at the start of a term that must be identical (not fuzzy) to the query term if the query is to match that term.

    Declaration
    public virtual int PrefixLength { get; }
    Property Value
    Type Description
    int

    Term

    Returns the pattern term.

    Declaration
    public virtual Term Term { get; }
    Property Value
    Type Description
    Term

    Transpositions

    Returns true if transpositions should be treated as a primitive edit operation. If this is false, comparisons will implement the classic Levenshtein algorithm.

    Declaration
    public virtual bool Transpositions { get; }
    Property Value
    Type Description
    bool

    Methods

    Equals(object)

    Determines whether the specified object is equal to the current object.

    Declaration
    public override bool Equals(object obj)
    Parameters
    Type Name Description
    object obj

    The object to compare with the current object.

    Returns
    Type Description
    bool

    true if the specified object is equal to the current object; otherwise, false.

    Overrides
    MultiTermQuery.Equals(object)

    GetHashCode()

    Serves as the default hash function.

    Declaration
    public override int GetHashCode()
    Returns
    Type Description
    int

    A hash code for the current object.

    Overrides
    MultiTermQuery.GetHashCode()

    GetTermsEnum(Terms, AttributeSource)

    Constructs the enumeration to be used, expanding the pattern term. This method should only be called if the field exists (i.e., implementations can assume the field does exist). This method should not return null (it should instead return EMPTY if no terms match). The TermsEnum must already be positioned to the first matching term. The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to provide attributes that the rewrite method uses to communicate information such as maximum competitive boosts. This is currently only used by TopTermsRewrite<Q>.

    Declaration
    protected override TermsEnum GetTermsEnum(Terms terms, AttributeSource atts)
    Parameters
    Type Name Description
    Terms terms
    AttributeSource atts
    Returns
    Type Description
    TermsEnum
    Overrides
    MultiTermQuery.GetTermsEnum(Terms, AttributeSource)

    SingleToEdits(float, int)

    Helper function to convert from deprecated "minimumSimilarity" fractions to raw edit distances.

    NOTE: this was floatToEdits() in Lucene
    Declaration
    [Obsolete("pass integer edit distances instead.")]
    public static int SingleToEdits(float minimumSimilarity, int termLen)
    Parameters
    Type Name Description
    float minimumSimilarity

    Scaled similarity

    int termLen

    Length (in Unicode code points) of the term.

    Returns
    Type Description
    int

    Equivalent number of maxEdits
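The sketch below shows one plausible way such a conversion can work, modeled on how Lucene's floatToEdits is commonly described. It is not the library source: the class and method names are hypothetical, and the branch thresholds and the cap of 2 (standing in for MAXIMUM_SUPPORTED_DISTANCE) are assumptions not stated on this page.

```csharp
using System;

// Hypothetical helper, for illustration only; not part of Lucene.NET.
public static class SimilarityConversionSketch
{
    // Assumed value of MAXIMUM_SUPPORTED_DISTANCE; not stated on this page.
    private const int MaxSupportedDistance = 2;

    public static int ToEdits(float minimumSimilarity, int termLen)
    {
        if (minimumSimilarity >= 1f)
        {
            // Assumption: values >= 1 are already raw edit counts; only cap them.
            return (int)Math.Min(minimumSimilarity, MaxSupportedDistance);
        }
        if (minimumSimilarity == 0.0f)
        {
            // Assumption: zero means exact match, not unlimited edits.
            return 0;
        }
        // Fractional similarity: scale the allowed dissimilarity by the
        // term length, truncate, then cap at the supported maximum.
        return Math.Min((int)((1.0 - minimumSimilarity) * termLen), MaxSupportedDistance);
    }
}
```

Under these assumptions, `ToEdits(0.5f, 4)` yields 2 and `ToEdits(0.8f, 6)` yields 1, since the scaled value of roughly 1.2 is truncated.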

    ToString(string)

    Prints a query to a string, with field assumed to be the default field and omitted.

    Declaration
    public override string ToString(string field)
    Parameters
    Type Name Description
    string field
    Returns
    Type Description
    string
    Overrides
    Query.ToString(string)
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.