    Class CommonTermsQuery

    A query that executes high-frequency terms in an optional sub-query to prevent slow queries caused by "common" terms such as stopwords. This query builds two queries from the terms added via Add(Term): low-frequency terms are added to a required boolean clause and high-frequency terms are added to an optional boolean clause. The optional clause is only executed if the required "low-frequency" clause matches. Scores produced by this query will differ slightly from those of a plain BooleanQuery scorer, mainly because the number of leaf queries in the required boolean clause differs. In most cases, high-frequency terms are unlikely to contribute significantly to the document score unless at least one of the low-frequency terms is matched. When applicable, this query can improve query execution times significantly.

    CommonTermsQuery has several advantages over stopword filtering at index or query time since a term can be "classified" based on the actual document frequency in the index and can prevent slow queries even across domains without specialized stopword files.

    Note: if the query contains only high-frequency terms, it is rewritten into a plain conjunction query, i.e., all high-frequency terms must match for a document to match.
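As a rough illustration of the behaviour described above, the following sketch builds a CommonTermsQuery with the three-argument constructor and runs it against an existing searcher. The class and method names (CommonTermsQueryExample, Create, Search) and the 0.1f threshold are assumptions made for this example and are not part of Lucene.NET.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Queries;
using Lucene.Net.Search;

// Hypothetical helper class, shown only to illustrate typical usage.
public static class CommonTermsQueryExample
{
    // Builds a query in which high-frequency terms are optional (SHOULD)
    // and low-frequency terms are required (MUST).
    public static CommonTermsQuery Create(string field, params string[] words)
    {
        // 0.1f is an illustrative threshold: a value in [0..1) is read as a
        // fraction of documents, per the maxTermFrequency description.
        var query = new CommonTermsQuery(Occur.SHOULD, Occur.MUST, 0.1f);
        foreach (string word in words)
        {
            query.Add(new Term(field, word));
        }
        return query;
    }

    // Runs the query against a caller-supplied searcher and returns the top 10 hits.
    public static TopDocs Search(IndexSearcher searcher, CommonTermsQuery query)
    {
        return searcher.Search(query, 10);
    }
}
```

Which terms end up in the required or the optional clause depends on the document frequencies in the index being searched, so the same query can behave differently across indexes.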

    Collection initializer note: To create and populate a CommonTermsQuery in a single statement, you can use the following example as a guide:

    var query = new CommonTermsQuery(Occur.SHOULD, Occur.SHOULD, 0.1f) { // illustrative argument values
        new Term("field", "microsoft"),
        new Term("field", "office")
    };
    Inheritance
    System.Object
    Query
    CommonTermsQuery
    Implements
    IEnumerable<Term>
    Inherited Members
    Query.Boost
    Query.ToString()
    Query.CreateWeight(IndexSearcher)
    Query.ExtractTerms(ISet<Term>)
    Query.Clone()
    Namespace: Lucene.Net.Queries
    Assembly: Lucene.Net.Queries.dll
    Syntax
    public class CommonTermsQuery : Query, IEnumerable<Term>

    Constructors

    CommonTermsQuery(Occur, Occur, Single)

    Creates a new CommonTermsQuery

    Declaration
    public CommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur, float maxTermFrequency)
    Parameters
    Type Name Description
    Occur highFreqOccur

    Occur used for high frequency terms

    Occur lowFreqOccur

    Occur used for low frequency terms

    System.Single maxTermFrequency

    a value in [0..1) (or an absolute number >=1) representing the maximum threshold of a term's document frequency for it to be considered a low-frequency term.
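The two interpretations of maxTermFrequency can be sketched as follows. This is a minimal, self-contained illustration, not the library's code: the helper name IsHighFrequency is hypothetical, and the strict comparison and upward rounding of the fractional threshold are assumptions.

```csharp
using System;

// Hypothetical helper, not part of Lucene.NET.
public static class TermFrequencyClassifier
{
    // Returns true when a term would be treated as high-frequency.
    // docFreq: number of documents containing the term.
    // maxDoc:  number of documents in the index.
    public static bool IsHighFrequency(int docFreq, int maxDoc, float maxTermFrequency)
    {
        if (maxTermFrequency >= 1f)
        {
            // Absolute threshold: compare the document count directly.
            return docFreq > maxTermFrequency;
        }

        // Fractional threshold: scale by the index size (rounding up is an assumption).
        float scaled = maxTermFrequency * (float)maxDoc;
        int threshold = (int)Math.Ceiling(scaled);
        return docFreq > threshold;
    }
}
```

For example, under these assumptions a threshold of 0.25f on an index of 1000 documents treats a term found in 251 documents as high-frequency and a term found in 250 documents as low-frequency.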

    CommonTermsQuery(Occur, Occur, Single, Boolean)

    Creates a new CommonTermsQuery

    Declaration
    public CommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur, float maxTermFrequency, bool disableCoord)
    Parameters
    Type Name Description
    Occur highFreqOccur

    Occur used for high frequency terms

    Occur lowFreqOccur

    Occur used for low frequency terms

    System.Single maxTermFrequency

    a value in [0..1) (or an absolute number >=1) representing the maximum threshold of a term's document frequency for it to be considered a low-frequency term.

    System.Boolean disableCoord

    disables Similarity.Coord(int, int) in scoring for the low / high frequency sub-queries

    Fields

    m_disableCoord

    Declaration
    protected readonly bool m_disableCoord
    Field Value
    Type Description
    System.Boolean

    m_highFreqBoost

    Declaration
    protected float m_highFreqBoost
    Field Value
    Type Description
    System.Single

    m_highFreqMinNrShouldMatch

    Declaration
    protected float m_highFreqMinNrShouldMatch
    Field Value
    Type Description
    System.Single

    m_highFreqOccur

    Declaration
    protected readonly Occur m_highFreqOccur
    Field Value
    Type Description
    Occur

    m_lowFreqBoost

    Declaration
    protected float m_lowFreqBoost
    Field Value
    Type Description
    System.Single

    m_lowFreqMinNrShouldMatch

    Declaration
    protected float m_lowFreqMinNrShouldMatch
    Field Value
    Type Description
    System.Single

    m_lowFreqOccur

    Declaration
    protected readonly Occur m_lowFreqOccur
    Field Value
    Type Description
    Occur

    m_maxTermFrequency

    Declaration
    protected readonly float m_maxTermFrequency
    Field Value
    Type Description
    System.Single

    m_terms

    Declaration
    protected readonly IList<Term> m_terms
    Field Value
    Type Description
    IList<Term>

    Properties

    HighFreqMinimumNumberShouldMatch

    Gets or sets the minimum number of high-frequency optional BooleanClauses that must be satisfied in order to produce a match on the low frequency terms query part. This property accepts a float value in the range [0..1) as a fraction of the actual query terms in the low frequent clause, or a number >=1 as an absolute number of clauses that need to match.

    By default no optional clauses are necessary for a match (unless there are no required clauses). If this property is set, then the specified number of clauses is required.

    Declaration
    public virtual float HighFreqMinimumNumberShouldMatch { get; set; }
    Property Value
    Type Description
    System.Single

    IsCoordDisabled

    Returns true if and only if Similarity.Coord(int, int) is disabled in scoring for the high and low frequency query instance. The top level query will always disable coords.

    Declaration
    public virtual bool IsCoordDisabled { get; }
    Property Value
    Type Description
    System.Boolean

    LowFreqMinimumNumberShouldMatch

    Gets or sets the minimum number of low-frequency optional BooleanClauses that must be satisfied in order to produce a match on the low frequency terms query part. This property accepts a float value in the range [0..1) as a fraction of the actual query terms in the low frequent clause, or a number >=1 as an absolute number of clauses that need to match.

    By default no optional clauses are necessary for a match (unless there are no required clauses). If this property is set, then the specified number of clauses is required.

    Declaration
    public virtual float LowFreqMinimumNumberShouldMatch { get; set; }
    Property Value
    Type Description
    System.Single
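The fraction-or-absolute convention used by HighFreqMinimumNumberShouldMatch and LowFreqMinimumNumberShouldMatch can be sketched as a small conversion function. This is a self-contained illustration under stated assumptions: the helper name ToClauseCount is hypothetical, and the rounding mode for fractional values is an assumption rather than documented behaviour.

```csharp
using System;

// Hypothetical helper, not part of Lucene.NET.
public static class MinimumShouldMatch
{
    // Converts a "minimum should match" setting into a number of clauses.
    // minShouldMatch: value in [0..1) (fraction) or >= 1 (absolute count).
    // numOptional:    number of optional clauses in the sub-query.
    public static int ToClauseCount(float minShouldMatch, int numOptional)
    {
        if (minShouldMatch >= 1f || minShouldMatch == 0f)
        {
            // Absolute count (or 0, meaning no optional clause is required).
            return (int)minShouldMatch;
        }

        // Fraction of the optional clauses; the rounding mode is an assumption.
        return (int)Math.Round(minShouldMatch * numOptional, MidpointRounding.AwayFromZero);
    }
}
```

With this sketch, a setting of 0.5f over four optional clauses requires two of them to match, while a setting of 3f requires three regardless of how many clauses exist.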

    Methods

    Add(Term)

    Adds a term to the CommonTermsQuery

    Declaration
    public virtual void Add(Term term)
    Parameters
    Type Name Description
    Term term

    the term to add

    BuildQuery(Int32, TermContext[], Term[])

    Declaration
    protected virtual Query BuildQuery(int maxDoc, TermContext[] contextArray, Term[] queryTerms)
    Parameters
    Type Name Description
    System.Int32 maxDoc
    TermContext[] contextArray
    Term[] queryTerms
    Returns
    Type Description
    Query

    CalcHighFreqMinimumNumberShouldMatch(Int32)

    Declaration
    protected virtual int CalcHighFreqMinimumNumberShouldMatch(int numOptional)
    Parameters
    Type Name Description
    System.Int32 numOptional
    Returns
    Type Description
    System.Int32

    CalcLowFreqMinimumNumberShouldMatch(Int32)

    Declaration
    protected virtual int CalcLowFreqMinimumNumberShouldMatch(int numOptional)
    Parameters
    Type Name Description
    System.Int32 numOptional
    Returns
    Type Description
    System.Int32

    CollectTermContext(IndexReader, IList<AtomicReaderContext>, TermContext[], Term[])

    Declaration
    public virtual void CollectTermContext(IndexReader reader, IList<AtomicReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms)
    Parameters
    Type Name Description
    IndexReader reader
    IList<AtomicReaderContext> leaves
    TermContext[] contextArray
    Term[] queryTerms

    Equals(Object)

    Declaration
    public override bool Equals(object obj)
    Parameters
    Type Name Description
    System.Object obj
    Returns
    Type Description
    System.Boolean

    ExtractTerms(ISet<Term>)

    Declaration
    public override void ExtractTerms(ISet<Term> terms)
    Parameters
    Type Name Description
    ISet<Term> terms

    GetEnumerator()

    Returns an enumerator that iterates through the m_terms collection.

    Declaration
    public IEnumerator<Term> GetEnumerator()
    Returns
    Type Description
    IEnumerator<Term>

    An enumerator that can be used to iterate through the m_terms collection.
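Because the class implements IEnumerable<Term>, the added terms can be read back with foreach or LINQ. The sketch below is illustrative only: the class and method names (EnumerateTermsExample, Describe) and the constructor arguments are assumptions made for the example.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Queries;
using Lucene.Net.Search;

// Hypothetical helper class, not part of Lucene.NET.
public static class EnumerateTermsExample
{
    // Returns a textual description of every term added to the query.
    public static List<string> Describe(CommonTermsQuery query)
    {
        var result = new List<string>();
        foreach (Term term in query)
        {
            result.Add(term.ToString());
        }
        return result;
    }

    // Builds a small query to enumerate; the argument values are illustrative.
    public static CommonTermsQuery Sample()
    {
        var query = new CommonTermsQuery(Occur.SHOULD, Occur.SHOULD, 0.25f);
        query.Add(new Term("field", "microsoft"));
        query.Add(new Term("field", "office"));
        return query;
    }
}
```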

    GetHashCode()

    Declaration
    public override int GetHashCode()
    Returns
    Type Description
    System.Int32
    Overrides
    Query.GetHashCode()

    NewTermQuery(Term, TermContext)

    Builds a new TermQuery instance.

    This is intended for subclasses that wish to customize the generated queries.

    Declaration
    protected virtual Query NewTermQuery(Term term, TermContext context)
    Parameters
    Type Name Description
    Term term

    term

    TermContext context

    the TermContext to be used to create the low level term query. Can be null.

    Returns
    Type Description
    Query

    new TermQuery instance
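A subclass might customize the generated term queries along these lines. This is a sketch, not a recommended design: the class name BoostedCommonTermsQuery and its termBoost parameter are hypothetical, and it delegates to the base implementation so that a null TermContext is handled the same way as in the base class.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Queries;
using Lucene.Net.Search;

// Hypothetical subclass that applies a fixed boost to every generated term query.
public class BoostedCommonTermsQuery : CommonTermsQuery
{
    private readonly float termBoost;

    public BoostedCommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur,
        float maxTermFrequency, float termBoost)
        : base(highFreqOccur, lowFreqOccur, maxTermFrequency)
    {
        this.termBoost = termBoost;
    }

    protected override Query NewTermQuery(Term term, TermContext context)
    {
        // Let the base class build the query (context may be null),
        // then adjust it before it is added to a boolean clause.
        Query query = base.NewTermQuery(term, context);
        query.Boost = termBoost;
        return query;
    }
}
```

A subclass that adds state in this way would probably also want to override Equals(Object) and GetHashCode() so that the extra field takes part in query comparison and caching.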

    Rewrite(IndexReader)

    Declaration
    public override Query Rewrite(IndexReader reader)
    Parameters
    Type Name Description
    IndexReader reader
    Returns
    Type Description
    Query
    Overrides
    Query.Rewrite(IndexReader)

    ToString(String)

    Declaration
    public override string ToString(string field)
    Parameters
    Type Name Description
    System.String field
    Returns
    Type Description
    System.String

    Implements

    IEnumerable<Term>
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)