Class NumericRangeQuery<T>
A Query that matches numeric values within a
specified range. To use this, you must first index the
numeric values using Int32Field, SingleField, Int64Field, or
DoubleField (expert: NumericTokenStream).
You create a new NumericRangeQuery<T> with the static factory methods, e.g.:
Query q = NumericRangeQuery.NewFloatRange("weight", 0.03f, 0.10f, true, true);
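matches all documents whose float-valued "weight" field ranges from 0.03 to 0.10, inclusive.
The performance of NumericRangeQuery<T> is much better
than the corresponding TermRangeQuery, because far fewer terms usually have to be searched thanks to the trie indexing described below.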
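You can optionally specify a precisionStep when creating this query. This is necessary if you changed it from its default (4) during indexing. A good starting point for testing is 4, which is the default value for all Numeric*
classes. See below for
details.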
This query defaults to
CONSTANT_SCORE_AUTO_REWRITE_DEFAULT.
With precision steps of <=4, this query can be run with
one of the BooleanQuery rewrite methods without changing BooleanQuery's default max clause count.
How it works
See the publication about panFMP,
where this algorithm was described (referred to as TrieRangeQuery):
Schindler, U, Diepenbroek, M, 2008. Generic XML-based Framework for Metadata Portals. Computers & Geosciences 34 (12), 1947-1955. doi:10.1016/j.cageo.2008.02.023
A quote from this paper: Because Apache Lucene is a full-text
search engine and not a conventional database, it cannot handle numerical ranges
(e.g., field value is inside user defined bounds, even dates are numerical values).
We have developed an extension to Apache Lucene that stores
the numerical values in a special string-encoded format with variable precision
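(all numerical values like doubles, longs, floats, and ints are converted to lexicographically sortable string representations and stored with different precisions).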
For the variant that stores long values in 8 different precisions (each reduced by 8 bits) that uses a lowest precision of 1 byte, the index contains only a maximum of 256 distinct values in the lowest precision. Overall, a range could consist of a theoretical maximum of
7*255*2 + 255 = 3825
distinct terms (when there is a term for every distinct value of an
8-byte-number in the index and the range covers almost all of them; a maximum of 255 distinct values is used
because it would always be possible to reduce the full 256 values to one term with degraded precision).
In practice, we have seen up to 300 terms in most cases (index with 500,000 metadata records
and a uniform value distribution).
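The splitting idea can be illustrated with a simplified sketch. It is not Lucene.Net's implementation: the class and member names (`TrieRangeSketch`, `SubRange`, `Split`, `CountTerms`) are hypothetical, and it assumes non-negative values of at most 32 bits so that plain `long` arithmetic cannot overflow. The edges of the range are matched at fine precision, while the center is handed to the next coarser precision.

```csharp
using System;
using System.Collections.Generic;

public static class TrieRangeSketch
{
    // One sub-range matched at a single precision: every term whose value,
    // shifted right by Shift bits, lies in [MinPrefix, MaxPrefix].
    public sealed class SubRange
    {
        public SubRange(int shift, long minPrefix, long maxPrefix)
        {
            Shift = shift;
            MinPrefix = minPrefix;
            MaxPrefix = maxPrefix;
        }

        public int Shift { get; }
        public long MinPrefix { get; }
        public long MaxPrefix { get; }
        public long TermCount => MaxPrefix - MinPrefix + 1;
    }

    // Splits the inclusive range [min, max] into sub-ranges of increasingly
    // coarse precision. Assumption: 0 <= min, max < 2^valSize, valSize <= 32.
    public static List<SubRange> Split(int valSize, int precisionStep, long min, long max)
    {
        if (valSize < 1 || valSize > 32)
            throw new ArgumentOutOfRangeException(nameof(valSize));
        if (precisionStep < 1 || precisionStep > valSize)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));
        if (min < 0 || max >= (1L << valSize))
            throw new ArgumentOutOfRangeException(nameof(max), "bounds must fit in valSize bits");

        var result = new List<SubRange>();
        if (min > max) return result;

        for (int shift = 0; ; shift += precisionStep)
        {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            bool hasLower = (min & mask) != 0L;   // lower edge is not aligned
            bool hasUpper = (max & mask) != mask; // upper edge is not aligned
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;

            // Coarsest precision reached, or nothing is left for a coarser one.
            if (shift + precisionStep >= valSize || nextMin > nextMax)
            {
                result.Add(new SubRange(shift, min >> shift, max >> shift));
                break;
            }

            if (hasLower)
                result.Add(new SubRange(shift, min >> shift, (min | mask) >> shift));
            if (hasUpper)
                result.Add(new SubRange(shift, (max & ~mask) >> shift, max >> shift));

            min = nextMin;
            max = nextMax;
        }
        return result;
    }

    // Total number of terms a query would have to visit for these sub-ranges.
    public static long CountTerms(List<SubRange> parts)
    {
        long total = 0;
        foreach (var p in parts) total += p.TermCount;
        return total;
    }
}
```

For an 8-bit value space with a precision step of 4, the range 0x12..0x45 splits into 14 and 6 terms at full precision plus 2 terms at the coarser precision, so 22 terms cover what would otherwise be 52 distinct values.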
Precision Step
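You can choose any precisionStep when encoding values. Lower step values mean more precisions and therefore more terms in the index. The number of indexed terms per value is: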
indexedTermsPerValue = ceil(bitsPerValue / precisionStep)
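As a quick check of the formula above, here is a minimal sketch in integer arithmetic. The class and method names are hypothetical and not part of Lucene.Net.

```csharp
using System;

public static class TrieTermMath
{
    // ceil(bitsPerValue / precisionStep) without floating point.
    // long arithmetic avoids overflow when precisionStep is int.MaxValue.
    public static int IndexedTermsPerValue(int bitsPerValue, int precisionStep)
    {
        if (bitsPerValue < 1)
            throw new ArgumentOutOfRangeException(nameof(bitsPerValue));
        if (precisionStep < 1)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));
        return (int)(((long)bitsPerValue + precisionStep - 1) / precisionStep);
    }
}
```

For example, a 64-bit value with a precision step of 4 yields 16 indexed terms, and with a precision step of 8 it yields 8.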
As the lower precision terms are shared by many values, the additional terms only slightly grow the term dictionary (approx. 7% for precisionStep=4), but have a larger
impact on the postings (the postings file will have more entries, as every document is linked to
indexedTermsPerValue terms instead of one). The formula to estimate the growth
of the term dictionary in comparison to one term per value:
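One way to reproduce the roughly 7% figure is a geometric-series estimate. This is an assumption made for illustration, not a formula taken from this page: it supposes the indexed values are dense enough that each coarser precision level holds 1/2^precisionStep as many distinct terms as the level before it. The names below are hypothetical.

```csharp
using System;

public static class TermDictionaryEstimate
{
    // Estimated term dictionary size relative to one term per value,
    // ASSUMING each coarser level has 1/2^precisionStep as many distinct terms.
    public static double RelativeSize(int bitsPerValue, int precisionStep)
    {
        if (bitsPerValue < 1)
            throw new ArgumentOutOfRangeException(nameof(bitsPerValue));
        if (precisionStep < 1 || precisionStep > 62)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));

        int levels = (bitsPerValue + precisionStep - 1) / precisionStep;
        double ratio = 1.0 / (1L << precisionStep);

        double total = 0.0;
        double share = 1.0; // the full-precision level counts as 1
        for (int i = 0; i < levels; i++)
        {
            total += share;
            share *= ratio;
        }
        return total;
    }
}
```

Under that assumption, 64-bit values with a precision step of 4 give a relative size of about 1.067, which is consistent with the approximately 7% growth quoted above.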
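On the other hand, if the precisionStep is smaller, the maximum number of terms to match is reduced, which optimizes query speed. Generalizing the worked examples in this section, the maximum number of terms visited while executing the query is:
maxQueryTerms = (indexedTermsPerValue - 1) * (2^precisionStep - 1) * 2 + (2^precisionStep - 1)
For longs stored using a precision step of 4, maxQueryTerms = 15*15*2 + 15 = 465, and for a precision
step of 2, maxQueryTerms = 31*3*2 + 3 = 189. But the faster search speed is reduced by more seeking
in the term enum of the index. Because of this, the ideal precisionStep value can only be found by testing.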
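The worked figures in this section (3825, 465 and 189) all follow the same pattern: two edges of at most 2^precisionStep - 1 terms at every precision except the coarsest, plus at most 2^precisionStep - 1 terms at the coarsest. A small sketch of that arithmetic, with hypothetical names and assuming the precision step divides the bit width evenly:

```csharp
using System;

public static class QueryTermBound
{
    // Upper bound on the number of terms a range query may visit:
    // (levels - 1) * (2^step - 1) * 2 + (2^step - 1)
    public static long MaxQueryTerms(int bitsPerValue, int precisionStep)
    {
        if (bitsPerValue < 1 || bitsPerValue > 64)
            throw new ArgumentOutOfRangeException(nameof(bitsPerValue));
        if (precisionStep < 1 || precisionStep > 32)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));

        long levels = (bitsPerValue + precisionStep - 1) / precisionStep;
        long perEdge = (1L << precisionStep) - 1;
        return (levels - 1) * perEdge * 2 + perEdge;
    }
}
```

Calling `MaxQueryTerms(64, 8)`, `MaxQueryTerms(64, 4)` and `MaxQueryTerms(64, 2)` reproduces 3825, 465 and 189 respectively. Smaller steps lower the bound but add more precision levels to seek through.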
Good values for precisionStep depend on usage and data type:
- The default for all data types is 4, which is used when no precisionStep is given.
- Ideal value in most cases for 64 bit data types (long, double) is 6 or 8.
- Ideal value in most cases for 32 bit data types (int, float) is 4.
- For low cardinality fields larger precision steps are good. If the cardinality is < 100, it is
fair to use int.MaxValue (see below).
- Steps >=64 for long/double and >=32 for int/float produce one token
per value in the index, and querying is as slow as a conventional TermRangeQuery. But it can be used to produce fields that are solely used for sorting (in this case simply use int.MaxValue as precisionStep). Using Int32Field, Int64Field, SingleField, or DoubleField for sorting is ideal, because building the field cache is much faster than with text-only numbers. These fields have one term per value and therefore also work with term enumeration for building distinct lists (e.g. facets / preselected values to search for). Sorting is also possible with range query optimized fields using one of the above precisionSteps.
Comparisons of the different types of RangeQueries on an index with about 500,000 docs showed
that TermRangeQuery took considerably longer to complete than this class.
@since 2.9
Inherited Members
Namespace: Lucene.Net.Search
Assembly: Lucene.Net.dll
Syntax
public sealed class NumericRangeQuery<T> : MultiTermQuery where T : struct, IComparable<T>
Type Parameters
Name | Description |
---|---|
T |
Properties
IncludesMax
Returns true if the upper endpoint is inclusive.
Declaration
public bool IncludesMax { get; }
Property Value
Type | Description |
---|---|
System.Boolean | |
IncludesMin
Returns true if the lower endpoint is inclusive.
Declaration
public bool IncludesMin { get; }
Property Value
Type | Description |
---|---|
System.Boolean | |
Max
Returns the upper value of this range query
Declaration
public T? Max { get; }
Property Value
Type | Description |
---|---|
System.Nullable<T> | |
Min
Returns the lower value of this range query
Declaration
public T? Min { get; }
Property Value
Type | Description |
---|---|
System.Nullable<T> | |
PrecisionStep
Returns the precision step.
Declaration
public int PrecisionStep { get; }
Property Value
Type | Description |
---|---|
System.Int32 | |
Methods
Equals(Object)
Declaration
public override bool Equals(object o)
Parameters
Type | Name | Description |
---|---|---|
System.Object | o | |
Returns
Type | Description |
---|---|
System.Boolean | |
Overrides
GetHashCode()
Declaration
public override int GetHashCode()
Returns
Type | Description |
---|---|
System.Int32 | |
Overrides
GetTermsEnum(Terms, AttributeSource)
Declaration
protected override TermsEnum GetTermsEnum(Terms terms, AttributeSource atts)
Parameters
Type | Name | Description |
---|---|---|
Terms | terms | |
AttributeSource | atts | |
Returns
Type | Description |
---|---|
TermsEnum | |
Overrides
ToString(String)
Declaration
public override string ToString(string field)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | |
Returns
Type | Description |
---|---|
System.String | |