This class provides a {@link Field} that enables indexing
of numeric values for efficient range filtering and
sorting. Here's an example usage, adding an int value:
document.add(new NumericField(name).setIntValue(value));
For optimal performance, re-use the
{@code NumericField} and {@link Document} instances for more than
one document:
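// Create the field and the document once, outside the loop.
NumericField field = new NumericField(name);
Document document = new Document();
document.add(field);
for (all documents) {
  ...
  // Only the value changes per document; both instances are re-used.
  field.setIntValue(value);
  writer.addDocument(document);
  ...
}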
The Java native types {@code int}, {@code long},
{@code float} and {@code double} are
directly supported. However, any value that can be
converted into these native types can also be indexed.
For example, date/time values represented by a
{@link java.util.Date} can be translated into a long
value using the {@link java.util.Date#getTime} method. If you
don't need millisecond precision, you can quantize the
value, either by dividing the result of
{@link java.util.Date#getTime} or using the separate getters
(for year, month, etc.) to construct an {@code int} or
{@code long} value.
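The two quantization approaches described above can be sketched in plain Java. This is only an illustration: the class name DateQuantization, its method names, and the choice of UTC and of a yyyyMMdd layout are assumptions made for the example, not part of this API. The resulting {@code int} or {@code long} would then be indexed like any other numeric value.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not part of Lucene: shows ways to turn a Date into a number.
class DateQuantization {
    static final long MILLIS_PER_SECOND = 1000L;

    /** Full millisecond precision: the raw result of Date#getTime. */
    static long toMillis(Date date) {
        return date.getTime();
    }

    /**
     * Second precision, obtained by dividing the millisecond value.
     * floorDiv keeps the buckets evenly sized for dates before 1970 as well.
     */
    static long toSeconds(Date date) {
        return Math.floorDiv(date.getTime(), MILLIS_PER_SECOND);
    }

    /**
     * Day precision as an int of the form yyyyMMdd, built from the separate
     * calendar getters. UTC is an assumption of this sketch.
     */
    static int toDayStamp(Date date) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTime(date);
        int year = cal.get(Calendar.YEAR);
        int month = cal.get(Calendar.MONTH) + 1; // Calendar months are zero-based
        int day = cal.get(Calendar.DAY_OF_MONTH);
        return year * 10000 + month * 100 + day;
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println("millis:  " + toMillis(now));
        System.out.println("seconds: " + toSeconds(now));
        System.out.println("day:     " + toDayStamp(now));
    }
}
```

Whichever form is chosen, the same conversion has to be applied to the bounds of any range query, since the index only ever sees the quantized numbers.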
To perform range querying or filtering against a
{@code NumericField}, use {@link NumericRangeQuery} or {@link
NumericRangeFilter}. To sort according to a
{@code NumericField}, use the normal numeric sort types, e.g.
{@link SortField#INT} (note that {@link SortField#AUTO}
will not work with these fields). {@code NumericField} values
can also be loaded directly from {@link FieldCache}.
By default, a {@code NumericField}'s value is not stored but
is indexed for range filtering and sorting. You can use
the {@link #NumericField(String,Field.Store,boolean)}
constructor if you need to change these defaults.
You may add the same field name as a {@code NumericField} to
the same document more than once. Range querying and
filtering will be the logical OR of all values; so a range query
will hit all documents that have at least one value in
the range. However, sort behavior is not defined. If you need to sort,
you should separately index a single-valued {@code NumericField}.
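The logical-OR behavior for multi-valued fields can be modeled with a few lines of plain Java. This sketch only mirrors the matching semantics described above; it says nothing about how the index evaluates a range internally. The class and method names are hypothetical, and inclusive bounds are an assumption of the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the matching rule, not Lucene code.
class MultiValuedRangeMatch {
    /** Returns true if at least one of the document's values lies in [min, max]. */
    static boolean matches(List<Integer> values, int min, int max) {
        for (int v : values) {
            if (v >= min && v <= max) {
                return true; // a single value in range is enough for a hit
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> docValues = Arrays.asList(5, 50, 500);
        System.out.println(matches(docValues, 40, 60));
        System.out.println(matches(docValues, 60, 400));
    }
}
```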
A {@code NumericField} will consume somewhat more disk space
in the index than an ordinary single-valued field.
However, for a typical index that includes substantial
textual content per document, this increase will likely
be in the noise.
Within Lucene, each numeric value is indexed as a
trie structure, where each term is logically
assigned to larger and larger pre-defined brackets (which
are simply lower-precision representations of the value).
The step size between each successive bracket is called the
{@code precisionStep}, measured in bits. Smaller
{@code precisionStep} values result in a larger number
of brackets, which consumes more disk space in the index
but may result in faster range search performance. The
default value, 4, was selected for a reasonable tradeoff
of disk space consumption versus performance. You can
use the expert constructor {@link
#NumericField(String,int,Field.Store,boolean)} if you'd
like to change the value. Note that you must also
specify a congruent value when creating {@link
NumericRangeQuery} or {@link NumericRangeFilter}.
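To make the relationship between the precision step and the number of brackets concrete, here is a conceptual sketch for a 32-bit value. It is not the actual index encoding (that format is described in {@link NumericUtils}); it only assumes that each bracket is the value with progressively more low-order bits cleared. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual illustration only; Lucene's real term encoding differs.
class PrecisionStepSketch {
    /** Number of bits in an int value. */
    static final int INT_BITS = 32;

    /**
     * Returns one lower-precision representation of value per bracket,
     * clearing 0, step, 2*step, ... low-order bits.
     */
    static List<Integer> brackets(int value, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<Integer> result = new ArrayList<>();
        // A long counter avoids overflow for huge steps such as Integer.MAX_VALUE.
        for (long shift = 0; shift < INT_BITS; shift += precisionStep) {
            int mask = -1 << (int) shift; // all ones except the lowest 'shift' bits
            result.add(value & mask);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int bracket : brackets(0x12345678, 4)) {
            System.out.println(Integer.toHexString(bracket));
        }
    }
}
```

In this model a step of 4 yields 8 brackets per 32-bit value, a step of 8 yields 4, and a step of Integer.MAX_VALUE yields exactly one, which is why smaller steps cost more index space.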
For low-cardinality fields, larger precision steps work well.
If the cardinality is < 100, it is fair
to use {@link Integer#MAX_VALUE}, which produces one
term per value.
For more information on the internals of numeric trie
indexing, including the {@code precisionStep}
configuration, see {@link NumericRangeQuery}. The format of
indexed values is described in {@link NumericUtils}.
If you only need to sort by numeric value, and never
run range querying/filtering, you can index using a
{@code precisionStep} of {@link Integer#MAX_VALUE}.
This will minimize disk space consumed.
More advanced users can instead use {@link
NumericTokenStream} directly, when indexing numbers. This
class is a wrapper around this token stream type for
easier, more intuitive usage.
NOTE: This class is only used during
indexing. When retrieving the stored field value from a
{@link Document} instance after search, you will get a
conventional {@link Fieldable} instance where the numeric
values are returned as {@link String}s (according to
{@code toString(value)} of the used data type).
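Because stored values come back as strings, the caller has to parse them back into the numeric type. The following is a minimal sketch of that round trip using only the standard Java parsing methods. It assumes the stored form is the standard string form of the value, as the note above states; the class and method names are hypothetical, and obtaining the string from the retrieved document is not shown.

```java
// Hypothetical helper showing the string round trip for stored numeric values.
class StoredValueRoundTrip {
    /** The string form assumed to be stored for an int value. */
    static String storedForm(int value) {
        return Integer.toString(value);
    }

    /** Parses a stored int value back into its numeric form. */
    static int parseStoredInt(String stored) {
        return Integer.parseInt(stored);
    }

    /** Parses a stored long value back into its numeric form. */
    static long parseStoredLong(String stored) {
        return Long.parseLong(stored);
    }

    /** Parses a stored double value back into its numeric form. */
    static double parseStoredDouble(String stored) {
        return Double.parseDouble(stored);
    }

    public static void main(String[] args) {
        String stored = storedForm(-42);
        System.out.println(parseStoredInt(stored) + 1);
    }
}
```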
NOTE: This API is
experimental and might change in incompatible ways in the
next release.