
    Class NumericUtils

    This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.

    To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: the center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
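
    The recursive division described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Lucene.NET source: `RangeSplitSketch`, `Split`, and the `(Min, Max, Shift)` tuple output are hypothetical names, and the sketch handles only 32-bit ranges, doing its arithmetic in 64 bits so that no overflow handling is needed.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSplitSketch
{
    // Hypothetical helper. Each (Min, Max, Shift) triple stands for every value
    // whose bits above Shift lie between Min >> Shift and Max >> Shift.
    public static List<(long Min, long Max, int Shift)> Split(
        int precisionStep, int minBound, int maxBound)
    {
        if (precisionStep < 1)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));

        var ranges = new List<(long Min, long Max, int Shift)>();
        long min = minBound, max = maxBound;
        if (min > max) return ranges;

        for (int shift = 0; ; shift += precisionStep)
        {
            // Bits that belong to the current precision level.
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            bool hasLower = (min & mask) != 0L;   // ragged lower edge
            bool hasUpper = (max & mask) != mask; // ragged upper edge
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;

            if (shift + precisionStep >= 32 || nextMin > nextMax)
            {
                // No coarser level is available: emit what is left and stop.
                ranges.Add((min, max, shift));
                break;
            }

            // Match the boundaries exactly at the current (finer) precision.
            if (hasLower) ranges.Add((min, min | mask, shift));
            if (hasUpper) ranges.Add((max & ~mask, max, shift));

            // Continue with the center of the range at the next coarser level.
            min = nextMin;
            max = nextMax;
        }
        return ranges;
    }
}
```

    With a precision step of 4, `RangeSplitSketch.Split(4, 3, 70)` produces three sub-ranges: (3, 15) and (64, 70) at shift 0 for the boundaries, and (16, 48) at shift 4 for the center, which covers the values 16 to 63 using only three lower-precision prefixes.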

    This class generates terms to achieve this: first, the numerical integer values need to be converted to bytes. To do so, integer values (32 bit or 64 bit) are made unsigned and their bits are converted to ASCII chars that carry 7 bits each. The resulting byte[] is sortable like the original integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.
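
    To make the byte layout concrete, here is a minimal sketch of the encoding for 32-bit values. It is an illustration written from the description above, not the library implementation: `PrefixCodingSketch`, `Encode`, and `Decode` are hypothetical names, the sketch returns a plain byte[] rather than filling a BytesRef, and the constant 0x60 is assumed to correspond to SHIFT_START_INT32.

```csharp
using System;

public static class PrefixCodingSketch
{
    // Assumption: same value as SHIFT_START_INT32.
    private const byte ShiftStart = 0x60;

    public static byte[] Encode(int value, int shift)
    {
        if (shift < 0 || shift > 31)
            throw new ArgumentOutOfRangeException(nameof(shift), "shift must be 0..31");

        // Flip the sign bit so that unsigned order matches signed order,
        // then drop the `shift` lowest bits to reduce precision.
        uint sortable = unchecked((uint)value) ^ 0x80000000u;
        sortable >>= shift;

        // Number of 7-bit groups needed for the remaining bits.
        int groups = (31 - shift) / 7 + 1;
        byte[] result = new byte[groups + 1];

        // The first byte records the shift used during encoding.
        result[0] = (byte)(ShiftStart + shift);

        // Store 7 bits per byte, most significant group first.
        for (int i = groups; i > 0; i--)
        {
            result[i] = (byte)(sortable & 0x7f);
            sortable >>= 7;
        }
        return result;
    }

    public static int Decode(byte[] encoded)
    {
        int shift = encoded[0] - ShiftStart;
        uint sortable = 0;
        for (int i = 1; i < encoded.Length; i++)
            sortable = (sortable << 7) | (uint)encoded[i];

        // Restore the bit positions and undo the sign-bit flip;
        // the bits removed by the shift come back as zeros.
        return unchecked((int)((sortable << shift) ^ 0x80000000u));
    }
}
```

    At full precision (shift 0) the sketch emits 6 bytes: one shift byte plus five 7-bit groups. Encoding 1234 with a shift of 8 and decoding it again yields 1024, because the eight lowest bits are stripped.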

    To also index floating point numbers, this class supplies two methods to convert them to integer values by changing their bit layout: DoubleToSortableInt64(Double) and SingleToSortableInt32(Single). No precision is lost by converting floating point numbers to integers and back (although the integer form is not usable as a numeric value). Other data types such as dates can easily be converted to longs or ints (e.g. a date to a long).
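
    The bit-layout conversion can be sketched in a few lines. The class below is a hypothetical stand-in rather than the Lucene.NET source: `SortableDoubleSketch`, `ToSortable`, and `FromSortable` are illustrative names, and the explicit NaN canonicalization is an assumption made here so that NaN sorts above positive infinity.

```csharp
using System;

public static class SortableDoubleSketch
{
    public static long ToSortable(double value)
    {
        // Assumption: map every NaN to one canonical positive bit pattern,
        // which places it above positive infinity.
        long bits = double.IsNaN(value)
            ? 0x7ff8000000000000L
            : BitConverter.DoubleToInt64Bits(value);

        // Negative doubles have the sign bit set. Flipping the other 63 bits
        // reverses their order, so more-negative values compare as smaller.
        if (bits < 0) bits ^= 0x7fffffffffffffffL;
        return bits;
    }

    public static double FromSortable(long sortable)
    {
        // The same flip applied again restores the original bit layout.
        if (sortable < 0) sortable ^= 0x7fffffffffffffffL;
        return BitConverter.Int64BitsToDouble(sortable);
    }
}
```

    Because the transformation only rearranges bits, `FromSortable(ToSortable(x))` returns `x` unchanged for every non-NaN input, while comparing the long values gives the same order as comparing the doubles.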

    For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double values. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

    This class can also be used to generate lexicographically sortable (according to UTF8SortedAsUTF16Comparer) representations of numeric data types for other usages (e.g. sorting).

    Internal API (@lucene.internal). Since 2.9; the API changed in a non-backwards-compatible way in 4.0.

    Inheritance
    System.Object
    NumericUtils
    Namespace: Lucene.Net.Util
    Assembly: Lucene.Net.dll
    Syntax
    public sealed class NumericUtils : object

    Fields


    BUF_SIZE_INT32

    The maximum term length (used for byte[] buffer size) for encoding int values.

    NOTE: This was BUF_SIZE_INT in Lucene

    Declaration
    public const int BUF_SIZE_INT32 = 6
    Field Value
    Type Description
    System.Int32
    See Also
    Int32ToPrefixCodedBytes(Int32, Int32, BytesRef)

    BUF_SIZE_INT64

    The maximum term length (used for byte[] buffer size) for encoding long values.

    NOTE: This was BUF_SIZE_LONG in Lucene

    Declaration
    public const int BUF_SIZE_INT64 = 11
    Field Value
    Type Description
    System.Int32
    See Also
    Int64ToPrefixCodedBytes(Int64, Int32, BytesRef)

    PRECISION_STEP_DEFAULT

    The default precision step used by Int32Field, SingleField, Int64Field, DoubleField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.

    Declaration
    public const int PRECISION_STEP_DEFAULT = 4
    Field Value
    Type Description
    System.Int32

    SHIFT_START_INT32

    Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT32+shift in the first byte.

    NOTE: This was SHIFT_START_INT in Lucene

    Declaration
    public const byte SHIFT_START_INT32 = 0x60
    Field Value
    Type Description
    System.Byte

    SHIFT_START_INT64

    Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT64+shift in the first byte.

    NOTE: This was SHIFT_START_LONG in Lucene

    Declaration
    public const char SHIFT_START_INT64 = (char)0x20
    Field Value
    Type Description
    System.Char

    Methods


    DoubleToSortableInt64(Double)

    Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. Precision is not reduced by this, and the value can easily be used as a long. The sort order, including NaN, is well defined: NaN is greater than positive infinity.

    NOTE: This was doubleToSortableLong() in Lucene

    Declaration
    public static long DoubleToSortableInt64(double val)
    Parameters
    Type Name Description
    System.Double val
    Returns
    Type Description
    System.Int64
    See Also
    SortableInt64ToDouble(Int64)

    FilterPrefixCodedInt32s(TermsEnum)

    Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0.

    NOTE: This was filterPrefixCodedInts() in Lucene

    Declaration
    public static TermsEnum FilterPrefixCodedInt32s(TermsEnum termsEnum)
    Parameters
    Type Name Description
    TermsEnum termsEnum

    The terms enum to filter

    Returns
    Type Description
    TermsEnum

    A filtered TermsEnum that only returns prefix coded 32 bit terms with a shift value of 0.


    FilterPrefixCodedInt64s(TermsEnum)

    Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0.

    NOTE: This was filterPrefixCodedLongs() in Lucene

    Declaration
    public static TermsEnum FilterPrefixCodedInt64s(TermsEnum termsEnum)
    Parameters
    Type Name Description
    TermsEnum termsEnum

    The terms enum to filter

    Returns
    Type Description
    TermsEnum

    A filtered TermsEnum that only returns prefix coded 64 bit terms with a shift value of 0.


    GetPrefixCodedInt32Shift(BytesRef)

    Returns the shift value from a prefix encoded int.

    NOTE: This was getPrefixCodedIntShift() in Lucene

    Declaration
    public static int GetPrefixCodedInt32Shift(BytesRef val)
    Parameters
    Type Name Description
    BytesRef val
    Returns
    Type Description
    System.Int32

    GetPrefixCodedInt64Shift(BytesRef)

    Returns the shift value from a prefix encoded long.

    NOTE: This was getPrefixCodedLongShift() in Lucene

    Declaration
    public static int GetPrefixCodedInt64Shift(BytesRef val)
    Parameters
    Type Name Description
    BytesRef val
    Returns
    Type Description
    System.Int32

    Int32ToPrefixCoded(Int32, Int32, BytesRef)

    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.Offset will always be 0.

    NOTE: This was intToPrefixCoded() in Lucene

    Declaration
    public static void Int32ToPrefixCoded(int val, int shift, BytesRef bytes)
    Parameters
    Type Name Description
    System.Int32 val

    The numeric value

    System.Int32 shift

    How many bits to strip from the right

    BytesRef bytes

    Will contain the encoded value


    Int32ToPrefixCodedBytes(Int32, Int32, BytesRef)

    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.Offset will always be 0.

    NOTE: This was intToPrefixCodedBytes() in Lucene

    Declaration
    public static void Int32ToPrefixCodedBytes(int val, int shift, BytesRef bytes)
    Parameters
    Type Name Description
    System.Int32 val

    The numeric value

    System.Int32 shift

    How many bits to strip from the right

    BytesRef bytes

    Will contain the encoded value


    Int64ToPrefixCoded(Int64, Int32, BytesRef)

    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.Offset will always be 0.

    NOTE: This was longToPrefixCoded() in Lucene

    Declaration
    public static void Int64ToPrefixCoded(long val, int shift, BytesRef bytes)
    Parameters
    Type Name Description
    System.Int64 val

    The numeric value

    System.Int32 shift

    How many bits to strip from the right

    BytesRef bytes

    Will contain the encoded value


    Int64ToPrefixCodedBytes(Int64, Int32, BytesRef)

    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.Offset will always be 0.

    NOTE: This was longToPrefixCodedBytes() in Lucene

    Declaration
    public static void Int64ToPrefixCodedBytes(long val, int shift, BytesRef bytes)
    Parameters
    Type Name Description
    System.Int64 val

    The numeric value

    System.Int32 shift

    How many bits to strip from the right

    BytesRef bytes

    Will contain the encoded value


    PrefixCodedToInt32(BytesRef)

    Returns an int from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.

    NOTE: This was prefixCodedToInt() in Lucene

    Declaration
    public static int PrefixCodedToInt32(BytesRef val)
    Parameters
    Type Name Description
    BytesRef val
    Returns
    Type Description
    System.Int32
    See Also
    Int32ToPrefixCodedBytes(Int32, Int32, BytesRef)

    PrefixCodedToInt64(BytesRef)

    Returns a long from prefix coded bytes. The rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.

    NOTE: This was prefixCodedToLong() in Lucene

    Declaration
    public static long PrefixCodedToInt64(BytesRef val)
    Parameters
    Type Name Description
    BytesRef val
    Returns
    Type Description
    System.Int64
    See Also
    Int64ToPrefixCodedBytes(Int64, Int32, BytesRef)

    SingleToSortableInt32(Single)

    Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. Precision is not reduced by this, and the value can easily be used as an int. The sort order, including NaN, is well defined: NaN is greater than positive infinity.

    NOTE: This was floatToSortableInt() in Lucene

    Declaration
    public static int SingleToSortableInt32(float val)
    Parameters
    Type Name Description
    System.Single val
    Returns
    Type Description
    System.Int32
    See Also
    SortableInt32ToSingle(Int32)

    SortableInt32ToSingle(Int32)

    Converts a sortable int back to a float.

    NOTE: This was sortableIntToFloat() in Lucene

    Declaration
    public static float SortableInt32ToSingle(int val)
    Parameters
    Type Name Description
    System.Int32 val
    Returns
    Type Description
    System.Single
    See Also
    SingleToSortableInt32(Single)

    SortableInt64ToDouble(Int64)

    Converts a sortable long back to a double.

    NOTE: This was sortableLongToDouble() in Lucene

    Declaration
    public static double SortableInt64ToDouble(long val)
    Parameters
    Type Name Description
    System.Int64 val
    Returns
    Type Description
    System.Double
    See Also
    DoubleToSortableInt64(Double)

    SplitInt32Range(NumericUtils.Int32RangeBuilder, Int32, Int32, Int32)

    Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its AddRange(BytesRef, BytesRef) method.

    This method is used by NumericRangeQuery.

    NOTE: This was splitIntRange() in Lucene

    Declaration
    public static void SplitInt32Range(NumericUtils.Int32RangeBuilder builder, int precisionStep, int minBound, int maxBound)
    Parameters
    Type Name Description
    NumericUtils.Int32RangeBuilder builder
    System.Int32 precisionStep
    System.Int32 minBound
    System.Int32 maxBound

    SplitInt64Range(NumericUtils.Int64RangeBuilder, Int32, Int64, Int64)

    Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its AddRange(BytesRef, BytesRef) method.

    This method is used by NumericRangeQuery.

    NOTE: This was splitLongRange() in Lucene

    Declaration
    public static void SplitInt64Range(NumericUtils.Int64RangeBuilder builder, int precisionStep, long minBound, long maxBound)
    Parameters
    Type Name Description
    NumericUtils.Int64RangeBuilder builder
    System.Int32 precisionStep
    System.Int64 minBound
    System.Int64 maxBound
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)