Namespace Lucene.Net.Queries
Filters and Queries that add to core Lucene.
Classes
BooleanFilter
A container Filter that allows Boolean composition of Filters. Filters are allocated into one of three logical constructs: SHOULD, MUST NOT, and MUST. The resulting Filter BitSet is constructed as follows:
- SHOULD Filters are OR'd together.
- The resulting Filter is NOT'd with the MUST NOT Filters.
- The resulting Filter is AND'd with the MUST Filters.
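The composition order described above can be illustrated with plain bit sets. This is a minimal, self-contained sketch and not the library's implementation: BooleanFilterSketch and Combine are hypothetical names, System.Collections.BitArray stands in for Lucene's bit sets, and starting from "all documents" when no SHOULD filters are supplied is an assumption made for the illustration.

```csharp
using System.Collections;

// Hypothetical helper, not part of Lucene.Net.
public static class BooleanFilterSketch
{
    // Each BitArray has one bit per document (true = document accepted by that filter).
    public static BitArray Combine(int maxDoc, BitArray[] should, BitArray[] mustNot, BitArray[] must)
    {
        // Assumption: with no SHOULD filters, every document starts as a candidate.
        var result = new BitArray(maxDoc, should.Length == 0);

        // Step 1: SHOULD filters are OR'd together.
        foreach (var s in should)
        {
            result.Or(s);
        }

        // Step 2: documents accepted by any MUST NOT filter are cleared (AND NOT).
        foreach (var n in mustNot)
        {
            result.And(new BitArray(n).Not()); // invert a copy, leave the input untouched
        }

        // Step 3: the result is AND'd with every MUST filter.
        foreach (var m in must)
        {
            result.And(m);
        }

        return result;
    }
}
```

All bit arrays passed in are assumed to have exactly maxDoc bits; BitArray throws if the lengths differ.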
BoostingQuery
The BoostingQuery class can be used to effectively demote results that match a given query. Unlike the "NOT" clause, this still selects documents that contain undesirable terms, but reduces their overall score:
Query balancedQuery = new BoostingQuery(positiveQuery, negativeQuery, 0.01f);
In this scenario the positiveQuery contains the mandatory, desirable criteria, which are used to select all matching documents, and the negativeQuery contains the undesirable elements, which are used only to lessen the scores. Documents that match the negativeQuery have their score multiplied by the supplied "boost" parameter, so this should be less than 1 to achieve a demoting effect.
This code was originally made available here: http://marc.theaimsgroup.com/?l=lucene-user&m=108058407130459&w=2
and is documented here: http://wiki.apache.org/lucene-java/CommunityContributions
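The demotion rule described for BoostingQuery comes down to one multiplication. The sketch below only models that arithmetic under the stated rule; BoostingScoreSketch and Demote are hypothetical names and do not reflect how the library's scorer is actually structured.

```csharp
// Hypothetical helper, not part of Lucene.Net.
public static class BoostingScoreSketch
{
    // positiveScore: the score produced by the positive query for a document.
    // matchesNegativeQuery: whether the same document also matches the negative query.
    // boost: the demotion factor; values below 1 lower the score.
    public static float Demote(float positiveScore, bool matchesNegativeQuery, float boost)
    {
        // The document is never excluded; only its score is scaled.
        return matchesNegativeQuery ? positiveScore * boost : positiveScore;
    }
}
```

With a boost of 0.01f, a document that matches both queries keeps its place in the result set but ends up with roughly one hundredth of its original score.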
ChainedFilter
Allows multiple Filters to be chained. Logical operations such as NOT and XOR are applied between filters. One operation can be used for all filters, or a specific operation can be declared for each filter.
The order in which filters are called depends on the position of each filter in the chain. It is probably more efficient to place the most restrictive and least computationally intensive filters first.
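Applying one operation per filter, in chain order, can be pictured as a left-to-right fold over bit sets. The following is a simplified sketch under assumptions, not the ChainedFilter implementation: ChainOp, ChainedFilterSketch, and Apply are hypothetical names, and seeding the result with all documents when the chain begins with And or AndNot is an assumption chosen so that the first operation has something to narrow.

```csharp
using System.Collections;

// Hypothetical operation names for this sketch only.
public enum ChainOp { Or, And, AndNot, Xor }

// Hypothetical helper, not part of Lucene.Net.
public static class ChainedFilterSketch
{
    // chain[i] is combined into the running result using ops[i], in order.
    public static BitArray Apply(int maxDoc, BitArray[] chain, ChainOp[] ops)
    {
        // Assumption: start from "all documents" if the first operation narrows the set.
        bool startFull = ops.Length > 0 && (ops[0] == ChainOp.And || ops[0] == ChainOp.AndNot);
        var result = new BitArray(maxDoc, startFull);

        for (int i = 0; i < chain.Length; i++)
        {
            switch (ops[i])
            {
                case ChainOp.Or:
                    result.Or(chain[i]);
                    break;
                case ChainOp.And:
                    result.And(chain[i]);
                    break;
                case ChainOp.AndNot:
                    result.And(new BitArray(chain[i]).Not()); // invert a copy of the filter bits
                    break;
                case ChainOp.Xor:
                    result.Xor(chain[i]);
                    break;
            }
        }

        return result;
    }
}
```

Because each step works on the output of the previous one, reordering the chain can change both the result and the amount of work done.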
CommonTermsQuery
A query that executes high-frequency terms in an optional sub-query to prevent slow queries due to "common" terms like stopwords. This query builds two queries from the terms added via Add(Term): low-frequency terms are added to a required boolean clause and high-frequency terms are added to an optional boolean clause. The optional clause is only executed if the required "low-frequency" clause matches. Scores produced by this query will be slightly different from those of a plain BooleanQuery scorer, mainly due to differences in the
CommonTermsQuery has several advantages over stopword filtering at index or query time since a term can be "classified" based on the actual document frequency in the index and can prevent slow queries even across domains without specialized stopword files.
Note: if the query only contains high-frequency terms, the query is rewritten into a plain conjunction query, i.e., all high-frequency terms need to match in order to match a document.
Collection initializer note: To create and populate a CommonTermsQuery in a single statement, you can use the following example as a guide:
var query = new CommonTermsQuery() {
    new Term("field", "microsoft"),
    new Term("field", "office")
};
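The split between required (low-frequency) and optional (high-frequency) terms described above can be sketched as a simple partition by document frequency. This is an illustration under assumptions, not the query's actual rewrite logic: CommonTermsSketch and Classify are hypothetical names, and treating the cutoff as a fraction of the total document count is an assumption.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of Lucene.Net.
public static class CommonTermsSketch
{
    // termDocFreqs: each term paired with the number of documents that contain it.
    // maxDoc: total number of documents in the index.
    // maxTermFrequency: assumed cutoff, expressed as a fraction of maxDoc.
    public static void Classify(
        IEnumerable<KeyValuePair<string, int>> termDocFreqs,
        int maxDoc,
        float maxTermFrequency,
        List<string> required,
        List<string> optional)
    {
        foreach (var entry in termDocFreqs)
        {
            float fraction = (float)entry.Value / maxDoc;
            if (fraction > maxTermFrequency)
            {
                optional.Add(entry.Key); // high-frequency: goes to the optional clause
            }
            else
            {
                required.Add(entry.Key); // low-frequency: goes to the required clause
            }
        }
    }
}
```

If the required list ends up empty, the note above applies: the query is treated as a conjunction of the high-frequency terms.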
CustomScoreProvider
An instance of this subclass should be returned by GetCustomScoreProvider(AtomicReaderContext), if you want to modify the custom score calculation of a CustomScoreQuery.
Since Lucene 2.9, queries operate on each segment of an index separately, so the protected m_context field can be used to resolve doc IDs, as the supplied doc ID is per-segment, and without knowledge of the IndexReader you cannot access the document or IFieldCache.
@lucene.experimental @since 2.9.2
CustomScoreQuery
Query that sets document score as a programmatic function of several (sub) scores:
- the score of its subQuery (any query)
- (optional) the score of its FunctionQuery (or queries).
FilterClause
A Filter wrapped with an indication of how that filter is used when composed with another filter. (Follows the boolean logic in BooleanClause for composition of queries.)
TermFilter
A filter that includes documents that match a specific term.
TermsFilter
Constructs a filter for docs matching any of the terms added to this class. Unlike a RangeFilter, this can be used for filtering on multiple terms that are not necessarily in a sequence. An example might be a collection of primary keys from a database query result, or perhaps a choice of "category" labels picked by the end user. As a filter, this is much faster than the equivalent query (a BooleanQuery with many "should" TermQuerys).