Lucene.Net  3.0.3
Lucene.Net is a .NET port of the Java Lucene Indexing Library
Lucene.Net.Util.NumericUtils Class Reference

This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.

Classes

class  IntRangeBuilder
 Expert: Callback for SplitIntRange. You need to override only one of the methods. NOTE: This is a very low-level interface; the method signatures may change in later versions.
 
class  LongRangeBuilder
 Expert: Callback for SplitLongRange. You need to override only one of the methods. NOTE: This is a very low-level interface; the method signatures may change in later versions.
 

Static Public Member Functions

static int LongToPrefixCoded (long val, int shift, char[] buffer)
 Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
 
static System.String LongToPrefixCoded (long val, int shift)
 Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by LongRangeBuilder.
 
static System.String LongToPrefixCoded (long val)
 This is a convenience method that returns the prefix coded bits of a long without reducing the precision. It can be used to store the full-precision value as a stored field in the index. To decode, use PrefixCodedToLong.
 
static int IntToPrefixCoded (int val, int shift, char[] buffer)
 Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
 
static System.String IntToPrefixCoded (int val, int shift)
 Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by IntRangeBuilder.
 
static System.String IntToPrefixCoded (int val)
 This is a convenience method that returns the prefix coded bits of an int without reducing the precision. It can be used to store the full-precision value as a stored field in the index. To decode, use PrefixCodedToInt.
 
static long PrefixCodedToLong (System.String prefixCoded)
 Returns a long from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.
 
static int PrefixCodedToInt (System.String prefixCoded)
 Returns an int from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.
 
static long DoubleToSortableLong (double val)
 Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. The precision is not reduced, but the value can easily be used as a long.
 
static System.String DoubleToPrefixCoded (double val)
 Convenience method: this just returns: LongToPrefixCoded(DoubleToSortableLong(val))
 
static double SortableLongToDouble (long val)
 Converts a sortable long back to a double.
 
static double PrefixCodedToDouble (System.String val)
 Convenience method: this just returns: SortableLongToDouble(PrefixCodedToLong(val))
 
static int FloatToSortableInt (float val)
 Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. The precision is not reduced, but the value can easily be used as an int.
 
static System.String FloatToPrefixCoded (float val)
 Convenience method: this just returns: IntToPrefixCoded(FloatToSortableInt(val))
 
static float SortableIntToFloat (int val)
 Converts a sortable int back to a float.
 
static float PrefixCodedToFloat (System.String val)
 Convenience method: this just returns: SortableIntToFloat(PrefixCodedToInt(val))
 
static void SplitLongRange (LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)
 Expert: Splits a long range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its LongRangeBuilder.AddRange(String,String) method. This method is used by NumericRangeQuery{T}.
 
static void SplitIntRange (IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)
 Expert: Splits an int range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its IntRangeBuilder.AddRange(String,String) method. This method is used by NumericRangeQuery{T}.
 

Public Attributes

const int PRECISION_STEP_DEFAULT = 4
 The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery{T}, and NumericRangeFilter{T}.
 
const int BUF_SIZE_LONG = 63 / 7 + 2
 Expert: The maximum term length (used for char[] buffer size) for encoding long values.
 
const int BUF_SIZE_INT = 31 / 7 + 2
 Expert: The maximum term length (used for char[] buffer size) for encoding int values.
 

Static Public Attributes

static char SHIFT_START_LONG = (char) 0x20
 Expert: Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first character.
 
static char SHIFT_START_INT = (char) 0x60
 Expert: Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first character.
 

Detailed Description

This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.

To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

This class generates terms to achieve this: first, the numerical integer values need to be converted to strings. For that, integer values (32-bit or 64-bit) are made unsigned and the bits are converted to ASCII chars, 7 bits per char. The resulting string sorts in the same order as the original integer value. Each value is also prefixed (in the first char) with the shift value (the number of bits removed) used during encoding.
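
The encoding steps above can be sketched as follows. This is a minimal illustration in Python of the scheme as described here (sign bit flipped to make the value unsigned, 7 bits per char, shift stored in the first char); it is not the Lucene.Net source. The function names and the 64-bit masking constants are assumptions made for this sketch; only SHIFT_START_LONG and its value are taken from this page.

```python
SHIFT_START_LONG = 0x20          # value documented on this page
MASK64 = (1 << 64) - 1           # assumption: emulate 64-bit arithmetic
SIGN_BIT = 1 << 63


def long_to_prefix_coded(val, shift=0):
    """Encode a signed 64-bit value, dropping `shift` low bits (sketch)."""
    if not 0 <= shift <= 63:
        raise ValueError("shift must be in 0..63")
    if not -(1 << 63) <= val < (1 << 63):
        raise ValueError("val does not fit in a signed 64-bit long")
    # Flip the sign bit so that unsigned order equals signed order.
    sortable = ((val & MASK64) ^ SIGN_BIT) >> shift
    n_chars = (63 - shift) // 7 + 1
    chars = []
    for _ in range(n_chars):
        chars.append(chr(sortable & 0x7F))   # 7 bits per char
        sortable >>= 7
    chars.reverse()                          # most significant group first
    return chr(SHIFT_START_LONG + shift) + "".join(chars)


def prefix_coded_to_long(coded):
    """Decode a string produced by long_to_prefix_coded (sketch)."""
    shift = ord(coded[0]) - SHIFT_START_LONG
    if not 0 <= shift <= 63:
        raise ValueError("invalid shift char")
    bits = 0
    for ch in coded[1:]:
        if ord(ch) > 0x7F:
            raise ValueError("invalid char in prefix-coded string")
        bits = (bits << 7) | ord(ch)
    unsigned = ((bits << shift) & MASK64) ^ SIGN_BIT
    # Convert back from unsigned 64-bit to signed.
    return unsigned - (1 << 64) if unsigned >= SIGN_BIT else unsigned
```

With shift 0 every term has the same length, so comparing the strings char by char gives the same order as comparing the numbers. With a non-zero shift, the decoded value has its lowest `shift` bits set to zero.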

To also index floating-point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: DoubleToSortableLong and FloatToSortableInt. Converting floating-point numbers to integers and back loses no precision (although the integer form is only useful for comparison, not as a numeric value). Other data types, such as dates, can easily be converted to longs or ints (e.g. date to long: DateTime).
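
As an illustration of the double-to-long conversion, here is a minimal Python sketch. It assumes the common approach of reinterpreting the IEEE 754 bits as a signed 64-bit integer and flipping the 63 non-sign bits of negative values; the function names are hypothetical and the code is not taken from NumericUtils.cs. A float version would work the same way with 32 bits.

```python
import struct


def double_to_sortable_long(val):
    """Map a double to a signed 64-bit int with the same ordering (sketch)."""
    # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
    (bits,) = struct.unpack(">q", struct.pack(">d", val))
    if bits < 0:
        # Negative doubles sort in reverse when read as integers,
        # so flip every bit except the sign bit.
        bits ^= 0x7FFFFFFFFFFFFFFF
    return bits


def sortable_long_to_double(val):
    """Inverse of double_to_sortable_long (sketch)."""
    if val < 0:
        val ^= 0x7FFFFFFFFFFFFFFF
    (d,) = struct.unpack(">d", struct.pack(">q", val))
    return d
```

Because the mapping only rearranges bits, converting back returns exactly the original double.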

For ease of use, the trie algorithm for indexing is implemented in NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery{T} and NumericRangeFilter{T} implement the query part for the same data types.

This class can also be used to generate lexicographically sortable (according to String.CompareTo(String)) representations of numeric data types for other uses (e.g. sorting).

NOTE: This API is experimental and might change in incompatible ways in the next release.

Since: 2.9

Definition at line 65 of file NumericUtils.cs.

Member Function Documentation

static System.String Lucene.Net.Util.NumericUtils.DoubleToPrefixCoded (double val)

Convenience method: this just returns: LongToPrefixCoded(DoubleToSortableLong(val))

Definition at line 285 of file NumericUtils.cs.

static long Lucene.Net.Util.NumericUtils.DoubleToSortableLong (double val)

Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. The precision is not reduced, but the value can easily be used as a long.

See Also
SortableLongToDouble

Definition at line 274 of file NumericUtils.cs.

static System.String Lucene.Net.Util.NumericUtils.FloatToPrefixCoded (float val)

Convenience method: this just returns: IntToPrefixCoded(FloatToSortableInt(val))

Definition at line 326 of file NumericUtils.cs.

static int Lucene.Net.Util.NumericUtils.FloatToSortableInt (float val)

Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. The precision is not reduced, but the value can easily be used as an int.

See Also
SortableIntToFloat

Definition at line 315 of file NumericUtils.cs.

static int Lucene.Net.Util.NumericUtils.IntToPrefixCoded (int val, int shift, char[] buffer)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.

Parameters
val: the numeric value
shift: how many bits to strip from the right
buffer: receives the encoded chars; must have a length of at least BUF_SIZE_INT
Returns
number of chars written to buffer

Definition at line 168 of file NumericUtils.cs.

static System.String Lucene.Net.Util.NumericUtils.IntToPrefixCoded (int val, int shift)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by IntRangeBuilder.

Parameters
val: the numeric value
shift: how many bits to strip from the right

Definition at line 194 of file NumericUtils.cs.

static System.String Lucene.Net.Util.NumericUtils.IntToPrefixCoded (int val)

This is a convenience method that returns the prefix coded bits of an int without reducing the precision. It can be used to store the full-precision value as a stored field in the index. To decode, use PrefixCodedToInt.

Definition at line 206 of file NumericUtils.cs.

static int Lucene.Net.Util.NumericUtils.LongToPrefixCoded (long val, int shift, char[] buffer)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.

Parameters
val: the numeric value
shift: how many bits to strip from the right
buffer: receives the encoded chars; must have a length of at least BUF_SIZE_LONG
Returns
number of chars written to buffer

Definition at line 113 of file NumericUtils.cs.

static System.String Lucene.Net.Util.NumericUtils.LongToPrefixCoded (long val, int shift)

Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by LongRangeBuilder.

Parameters
val: the numeric value
shift: how many bits to strip from the right

Definition at line 139 of file NumericUtils.cs.

static System.String Lucene.Net.Util.NumericUtils.LongToPrefixCoded (long val)

This is a convenience method that returns the prefix coded bits of a long without reducing the precision. It can be used to store the full-precision value as a stored field in the index. To decode, use PrefixCodedToLong.

Definition at line 151 of file NumericUtils.cs.

static double Lucene.Net.Util.NumericUtils.PrefixCodedToDouble (System.String val)

Convenience method: this just returns: SortableLongToDouble(PrefixCodedToLong(val))

Definition at line 303 of file NumericUtils.cs.

static float Lucene.Net.Util.NumericUtils.PrefixCodedToFloat (System.String val)

Convenience method: this just returns: SortableIntToFloat(PrefixCodedToInt(val))

Definition at line 344 of file NumericUtils.cs.

static int Lucene.Net.Util.NumericUtils.PrefixCodedToInt (System.String prefixCoded)

Returns an int from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.

Throws
NumberFormatException if the supplied string is not correctly prefix encoded.

See Also
IntToPrefixCoded(int)

Definition at line 248 of file NumericUtils.cs.

static long Lucene.Net.Util.NumericUtils.PrefixCodedToLong (System.String prefixCoded)

Returns a long from prefixCoded characters. Rightmost bits will be zero for lower precision codes. This method can be used to decode e.g. a stored field.

Throws
NumberFormatException if the supplied string is not correctly prefix encoded.

See Also
LongToPrefixCoded(long)

Definition at line 220 of file NumericUtils.cs.

static float Lucene.Net.Util.NumericUtils.SortableIntToFloat (int val)

Converts a sortable int back to a float.

See Also
FloatToSortableInt

Definition at line 334 of file NumericUtils.cs.

static double Lucene.Net.Util.NumericUtils.SortableLongToDouble (long val)

Converts a sortable long back to a double.

See Also
DoubleToSortableLong

Definition at line 293 of file NumericUtils.cs.

static void Lucene.Net.Util.NumericUtils.SplitIntRange (IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)

Expert: Splits an int range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its IntRangeBuilder.AddRange(String,String) method. This method is used by NumericRangeQuery{T}.

Definition at line 368 of file NumericUtils.cs.

static void Lucene.Net.Util.NumericUtils.SplitLongRange (LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)

Expert: Splits a long range recursively. You may implement a builder that adds clauses to a Lucene.Net.Search.BooleanQuery for each call to its LongRangeBuilder.AddRange(String,String) method. This method is used by NumericRangeQuery{T}.

Definition at line 356 of file NumericUtils.cs.
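
To make the recursive splitting from the class description concrete, the following Python sketch shows one way a range can be divided so that the boundaries are matched at full precision and the center at lower precision. It is an illustration under stated assumptions, not the code in NumericUtils.cs: the name split_range is hypothetical, it returns (lower, upper, shift) triples with numeric bounds instead of calling AddRange(String,String) with prefix-coded strings, and it only handles non-negative bounds, so the overflow checks a fixed-width implementation would need are omitted.

```python
def split_range(min_bound, max_bound, precision_step=4, val_size=64):
    """Split [min_bound, max_bound] into sub-ranges per precision level.

    Returns a list of (lower, upper, shift) triples; each triple covers
    lower..upper inclusive and would be matched on terms encoded with
    the given shift. Sketch for non-negative bounds only.
    """
    if precision_step < 1:
        raise ValueError("precision_step must be >= 1")
    ranges = []
    if min_bound > max_bound:
        return ranges

    def add_range(lo, hi, shift):
        # At a given shift the low `shift` bits are ignored, so the
        # upper bound covers every value with those bits set.
        ranges.append((lo, hi | ((1 << shift) - 1), shift))

    shift = 0
    while True:
        diff = 1 << (shift + precision_step)
        mask = ((1 << precision_step) - 1) << shift
        has_lower = (min_bound & mask) != 0
        has_upper = (max_bound & mask) != mask
        next_min = ((min_bound + diff) if has_lower else min_bound) & ~mask
        next_max = ((max_bound - diff) if has_upper else max_bound) & ~mask
        if shift + precision_step >= val_size or next_min > next_max:
            # Lowest precision reached, or no room left for a coarser
            # center: emit what remains at the current precision.
            add_range(min_bound, max_bound, shift)
            break
        if has_lower:
            add_range(min_bound, min_bound | mask, shift)
        if has_upper:
            add_range(max_bound & ~mask, max_bound, shift)
        min_bound, max_bound = next_min, next_max
        shift += precision_step
    return ranges
```

For example, with a precision step of 4 the range 3..100 is split into 3..15 and 96..100 at shift 0 and the center 16..95 at shift 4, so the center is matched with far fewer terms than a term-by-term enumeration would need.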

Member Data Documentation

const int Lucene.Net.Util.NumericUtils.BUF_SIZE_INT = 31 / 7 + 2

Expert: The maximum term length (used for char[] buffer size) for encoding int values.

See Also
IntToPrefixCoded(int,int,char[])

Definition at line 99 of file NumericUtils.cs.

const int Lucene.Net.Util.NumericUtils.BUF_SIZE_LONG = 63 / 7 + 2

Expert: The maximum term length (used for char[] buffer size) for encoding long values.

See Also
LongToPrefixCoded(long,int,char[])

Definition at line 87 of file NumericUtils.cs.
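
The two buffer-size constants follow from the encoding layout: one char for the shift plus one char per group of 7 value bits. The short Python sketch below works through that arithmetic; term_length is a hypothetical helper, and the per-shift formula is an assumption that is consistent with the constants shown on this page.

```python
def term_length(shift, value_bits):
    """Chars in a prefix-coded term: 1 shift char + 7-bit groups (sketch)."""
    return (value_bits - 1 - shift) // 7 + 2


# Integer division, as in the constant definitions on this page.
BUF_SIZE_LONG = 63 // 7 + 2   # 11 chars for a full-precision long
BUF_SIZE_INT = 31 // 7 + 2    # 6 chars for a full-precision int
```

The longest term is produced at shift 0, which is why the constants are computed for that case; larger shifts yield shorter terms.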

const int Lucene.Net.Util.NumericUtils.PRECISION_STEP_DEFAULT = 4

The default precision step used by NumericField, NumericTokenStream, NumericRangeQuery{T}, and NumericRangeFilter{T}.

Definition at line 75 of file NumericUtils.cs.

static char Lucene.Net.Util.NumericUtils.SHIFT_START_INT = (char) 0x60

Expert: Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first character.

Definition at line 92 of file NumericUtils.cs.

static char Lucene.Net.Util.NumericUtils.SHIFT_START_LONG = (char) 0x20

Expert: Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first character.

Definition at line 80 of file NumericUtils.cs.
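
The first character of a term therefore identifies both the value type and the shift. The Python sketch below illustrates this with the two constants from this page; describe_prefix is a hypothetical helper, and the valid shift ranges (0..63 for longs, 0..31 for ints) are assumptions derived from the bit widths. Under those assumptions the two character ranges, 0x20..0x5F and 0x60..0x7F, do not overlap.

```python
SHIFT_START_LONG = 0x20   # value documented on this page
SHIFT_START_INT = 0x60    # value documented on this page


def describe_prefix(first_char):
    """Return (kind, shift) for the first char of a prefix-coded term."""
    code = ord(first_char)
    if SHIFT_START_LONG <= code <= SHIFT_START_LONG + 63:
        return ("long", code - SHIFT_START_LONG)
    if SHIFT_START_INT <= code <= SHIFT_START_INT + 31:
        return ("int", code - SHIFT_START_INT)
    raise ValueError("not a valid shift character")
```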

