Class CommonTermsQuery
A query that executes high-frequency terms in an optional sub-query to prevent
slow queries caused by "common" terms such as stopwords. This query
builds two sub-queries from the terms added via Add(Term): low-frequency
terms are added to a required boolean clause and high-frequency terms are
added to an optional boolean clause. The optional clause is only executed if
the required "low-frequency" clause matches. Scores produced by this query
differ slightly from those of a plain BooleanQuery, mainly because the
Coord(Int32, Int32) factor sees a different number of leaf queries
in the required boolean clause. In most cases, high-frequency terms are
unlikely to contribute significantly to the document score unless at least
one of the low-frequency terms is matched. This query can significantly
reduce query execution time when queries contain such common terms.
CommonTermsQuery has several advantages over stopword filtering at
index or query time: a term is "classified" by its actual
document frequency in the index, which can prevent slow queries even across
domains without specialized stopword files.
Note: if the query contains only high-frequency terms, it is
rewritten into a plain conjunction query, i.e., all high-frequency terms must
match for a document to match.
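The classification described above can be sketched without any Lucene.NET types. This is a minimal illustration, not the library's code: `TermClassifierSketch`, `Classify`, and the plain dictionary of document frequencies are hypothetical names, and the cutoff logic assumes that a maxTermFrequency below 1 is a fraction of the index's document count while a value of 1 or more is an absolute document count.

```csharp
using System;
using System.Collections.Generic;

public static class TermClassifierSketch
{
    // Splits terms into low- and high-frequency buckets by document frequency.
    // Assumption: maxTermFrequency < 1 is a fraction of maxDoc,
    // maxTermFrequency >= 1 is an absolute document count.
    public static (List<string> LowFreq, List<string> HighFreq) Classify(
        IDictionary<string, int> docFreqs, int maxDoc, float maxTermFrequency)
    {
        var lowFreq = new List<string>();
        var highFreq = new List<string>();

        // Resolve the threshold to an absolute number of documents.
        int cutoff = maxTermFrequency >= 1f
            ? (int)maxTermFrequency
            : (int)Math.Ceiling(maxTermFrequency * maxDoc);

        foreach (var entry in docFreqs)
        {
            if (entry.Value > cutoff)
                highFreq.Add(entry.Key);   // would go to the optional clause
            else
                lowFreq.Add(entry.Key);    // would go to the required clause
        }
        return (lowFreq, highFreq);
    }
}
```

With 1000 documents and a threshold of 0.25f, a term found in 900 documents lands in the high-frequency bucket and a term found in 12 documents lands in the low-frequency bucket.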
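Collection initializer note: To create and populate a CommonTermsQuery
in a single statement, you can use the following example as a guide:
// The Occur values and the 0.1f threshold are illustrative; this page
// lists no parameterless constructor.
var query = new CommonTermsQuery(Occur.SHOULD, Occur.SHOULD, 0.1f) {
    new Term("field", "microsoft"),
    new Term("field", "office")
};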
Inheritance
System.Object
Lucene.Net.Search.Query
CommonTermsQuery
Implements
System.Collections.Generic.IEnumerable<Lucene.Net.Index.Term>
System.Collections.IEnumerable
Inherited Members
Lucene.Net.Search.Query.Boost
Lucene.Net.Search.Query.ToString()
Lucene.Net.Search.Query.CreateWeight(Lucene.Net.Search.IndexSearcher)
Lucene.Net.Search.Query.Clone()
System.Object.Equals(System.Object, System.Object)
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
Assembly: Lucene.Net.Queries.dll
Syntax
public class CommonTermsQuery : Query, IEnumerable<Term>, IEnumerable
Constructors
CommonTermsQuery(Occur, Occur, Single)
Declaration
public CommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur, float maxTermFrequency)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Search.Occur | highFreqOccur | Lucene.Net.Search.Occur used for high-frequency terms |
| Lucene.Net.Search.Occur | lowFreqOccur | Lucene.Net.Search.Occur used for low-frequency terms |
| System.Single | maxTermFrequency | A value in [0..1) (a fraction of the documents in the index) or an absolute number >= 1, giving the maximum document frequency at which a term is still considered a low-frequency term |
Exceptions
| Type | Condition |
| --- | --- |
| System.ArgumentException | If MUST_NOT is passed as lowFreqOccur or highFreqOccur |
CommonTermsQuery(Occur, Occur, Single, Boolean)
Declaration
public CommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur, float maxTermFrequency, bool disableCoord)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Search.Occur | highFreqOccur | Lucene.Net.Search.Occur used for high-frequency terms |
| Lucene.Net.Search.Occur | lowFreqOccur | Lucene.Net.Search.Occur used for low-frequency terms |
| System.Single | maxTermFrequency | A value in [0..1) (a fraction of the documents in the index) or an absolute number >= 1, giving the maximum document frequency at which a term is still considered a low-frequency term |
| System.Boolean | disableCoord | Disables Coord(Int32, Int32) in scoring for the low- and high-frequency sub-queries |
Exceptions
| Type | Condition |
| --- | --- |
| System.ArgumentException | If MUST_NOT is passed as lowFreqOccur or highFreqOccur |
Fields
m_disableCoord
Declaration
protected readonly bool m_disableCoord
Field Value
| Type | Description |
| --- | --- |
| System.Boolean | |
m_highFreqBoost
Declaration
protected float m_highFreqBoost
Field Value
| Type | Description |
| --- | --- |
| System.Single | |
m_highFreqMinNrShouldMatch
Declaration
protected float m_highFreqMinNrShouldMatch
Field Value
| Type | Description |
| --- | --- |
| System.Single | |
m_highFreqOccur
Declaration
protected readonly Occur m_highFreqOccur
Field Value
| Type | Description |
| --- | --- |
| Lucene.Net.Search.Occur | |
m_lowFreqBoost
Declaration
protected float m_lowFreqBoost
Field Value
| Type | Description |
| --- | --- |
| System.Single | |
m_lowFreqMinNrShouldMatch
Declaration
protected float m_lowFreqMinNrShouldMatch
Field Value
| Type | Description |
| --- | --- |
| System.Single | |
m_lowFreqOccur
Declaration
protected readonly Occur m_lowFreqOccur
Field Value
| Type | Description |
| --- | --- |
| Lucene.Net.Search.Occur | |
m_maxTermFrequency
Declaration
protected readonly float m_maxTermFrequency
Field Value
| Type | Description |
| --- | --- |
| System.Single | |
m_terms
Declaration
protected readonly IList<Term> m_terms
Field Value
| Type | Description |
| --- | --- |
| System.Collections.Generic.IList<Lucene.Net.Index.Term> | |
Properties
HighFreqMinimumNumberShouldMatch
Gets or sets the minimum number of high-frequency optional BooleanClauses that must be
satisfied to produce a match on the high-frequency terms query
part. The value is either a float in the range [0..1), interpreted as a fraction
of the actual query terms in the high-frequency clause, or a number
>=1, interpreted as an absolute number of clauses that need to match.
By default no optional clauses are necessary for a match (unless there are
no required clauses). If this property is set, then the specified number of
clauses is required.
Declaration
public virtual float HighFreqMinimumNumberShouldMatch { get; set; }
Property Value
| Type | Description |
| --- | --- |
| System.Single | |
IsCoordDisabled
Returns true if and only if Coord(Int32, Int32) is disabled in scoring
for the high- and low-frequency sub-queries. The top-level query
always disables coord.
Declaration
public virtual bool IsCoordDisabled { get; }
Property Value
| Type | Description |
| --- | --- |
| System.Boolean | |
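For context, a coordination factor of this kind rewards documents that match a larger share of a boolean query's clauses. The sketch below assumes the simple ratio used by Lucene's default similarity (overlap divided by maxOverlap); `CoordSketch` is a hypothetical name, and custom similarities may compute the factor differently.

```csharp
public static class CoordSketch
{
    // overlap: number of clauses that matched the document.
    // maxOverlap: total number of clauses in the boolean query.
    // Assumption: the factor is the plain ratio of the two.
    public static float Coord(int overlap, int maxOverlap)
    {
        return overlap / (float)maxOverlap;
    }
}
```

Because the ratio depends on how many clauses a query has, splitting terms into two sub-queries changes the value each sub-query sees, which is one reason scores can differ from those of a single BooleanQuery.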
LowFreqMinimumNumberShouldMatch
Gets or sets the minimum number of low-frequency optional BooleanClauses that must be
satisfied to produce a match on the low-frequency terms query
part. The value is either a float in the range [0..1), interpreted as a fraction
of the actual query terms in the low-frequency clause, or a number
>=1, interpreted as an absolute number of clauses that need to match.
By default no optional clauses are necessary for a match (unless there are
no required clauses). If this property is set, then the specified number of
clauses is required.
Declaration
public virtual float LowFreqMinimumNumberShouldMatch { get; set; }
Property Value
| Type | Description |
| --- | --- |
| System.Single | |
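The way a minimum-should-match value is interpreted can be sketched as a small helper. This is an illustration under assumptions rather than the library's implementation: `MinShouldMatchSketch` and `Resolve` are hypothetical names, and rounding the fractional case to the nearest integer is an assumption.

```csharp
using System;

public static class MinShouldMatchSketch
{
    // minNrShouldMatch: the property value (fraction in [0..1) or absolute count >= 1).
    // numOptional: number of optional clauses in the sub-query.
    public static int Resolve(float minNrShouldMatch, int numOptional)
    {
        // 0 means "no minimum"; values >= 1 are taken as absolute clause counts.
        if (minNrShouldMatch >= 1.0f || minNrShouldMatch == 0.0f)
        {
            return (int)minNrShouldMatch;
        }
        // Otherwise the value is a fraction of the optional clauses.
        // Assumption: round to the nearest integer, midpoints away from zero.
        return (int)Math.Round(minNrShouldMatch * numOptional, MidpointRounding.AwayFromZero);
    }
}
```

Under these assumptions, a value of 0.5f with four optional clauses requires two of them to match, while a value of 3f requires three regardless of how many clauses there are.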
Methods
Add(Term)
Declaration
public virtual void Add(Term term)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Index.Term | term | the term to add |
BuildQuery(Int32, TermContext[], Term[])
Declaration
protected virtual Query BuildQuery(int maxDoc, TermContext[] contextArray, Term[] queryTerms)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | maxDoc | |
| Lucene.Net.Index.TermContext[] | contextArray | |
| Lucene.Net.Index.Term[] | queryTerms | |
Returns
| Type | Description |
| --- | --- |
| Lucene.Net.Search.Query | |
CalcHighFreqMinimumNumberShouldMatch(Int32)
Declaration
protected virtual int CalcHighFreqMinimumNumberShouldMatch(int numOptional)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | numOptional | |
Returns
| Type | Description |
| --- | --- |
| System.Int32 | |
CalcLowFreqMinimumNumberShouldMatch(Int32)
Declaration
protected virtual int CalcLowFreqMinimumNumberShouldMatch(int numOptional)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | numOptional | |
Returns
| Type | Description |
| --- | --- |
| System.Int32 | |
CollectTermContext(IndexReader, IList<AtomicReaderContext>, TermContext[], Term[])
Declaration
public virtual void CollectTermContext(IndexReader reader, IList<AtomicReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Index.IndexReader | reader | |
| System.Collections.Generic.IList<Lucene.Net.Index.AtomicReaderContext> | leaves | |
| Lucene.Net.Index.TermContext[] | contextArray | |
| Lucene.Net.Index.Term[] | queryTerms | |
Equals(Object)
Declaration
public override bool Equals(object obj)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Object | obj | |
Returns
| Type | Description |
| --- | --- |
| System.Boolean | |
Overrides
Lucene.Net.Search.Query.Equals(System.Object)
ExtractTerms(ISet<Term>)
Declaration
public override void ExtractTerms(ISet<Term> terms)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Collections.Generic.ISet<Lucene.Net.Index.Term> | terms | |
Overrides
Lucene.Net.Search.Query.ExtractTerms(System.Collections.Generic.ISet<Lucene.Net.Index.Term>)
GetEnumerator()
Returns an enumerator that iterates through the m_terms collection.
Declaration
public IEnumerator<Term> GetEnumerator()
Returns
| Type | Description |
| --- | --- |
| System.Collections.Generic.IEnumerator<Lucene.Net.Index.Term> | An enumerator that can be used to iterate through the m_terms collection. |
GetHashCode()
Declaration
public override int GetHashCode()
Returns
| Type | Description |
| --- | --- |
| System.Int32 | |
Overrides
Lucene.Net.Search.Query.GetHashCode()
NewTermQuery(Term, TermContext)
Builds a new TermQuery instance.
This is intended for subclasses that wish to customize the generated queries.
Declaration
protected virtual Query NewTermQuery(Term term, TermContext context)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Index.Term | term | the term to create the query for |
| Lucene.Net.Index.TermContext | context | the Lucene.Net.Index.TermContext to be used to create the low-level term query. Can be null. |
Returns
| Type | Description |
| --- | --- |
| Lucene.Net.Search.Query | new TermQuery instance |
Rewrite(IndexReader)
Declaration
public override Query Rewrite(IndexReader reader)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Lucene.Net.Index.IndexReader | reader | |
Returns
| Type | Description |
| --- | --- |
| Lucene.Net.Search.Query | |
Overrides
Lucene.Net.Search.Query.Rewrite(Lucene.Net.Index.IndexReader)
ToString(String)
Declaration
public override string ToString(string field)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.String | field | |
Returns
| Type | Description |
| --- | --- |
| System.String | |
Overrides
Lucene.Net.Search.Query.ToString(System.String)
Explicit Interface Implementations
IEnumerable.GetEnumerator()
Returns an enumerator that iterates through the m_terms collection.
Declaration
IEnumerator IEnumerable.GetEnumerator()
Returns
| Type | Description |
| --- | --- |
| System.Collections.IEnumerator | An enumerator that can be used to iterate through the m_terms collection. |
Implements
System.Collections.Generic.IEnumerable<T>
System.Collections.IEnumerable