Class DefaultSimilarity
Expert: Default scoring implementation, which encodes norm values as a single
byte (see EncodeNormValue(Single)) before they are stored. At search time, the
norm byte is read from the index Directory and decoded back to a float norm
value (see DecodeNormValue(Int64)).
This encoding/decoding reduces index size, but at the price of
precision loss: it is not guaranteed that Decode(Encode(x)) = x. For
instance, Decode(Encode(0.89)) = 0.875.
Compression of norm values to a single byte saves memory at search time,
because once a field is referenced at search time, its norms - for all
documents - are maintained in memory.
The rationale for such lossy compression of norm values is that users find it
difficult to express their true information need precisely in a query, so
only big differences matter.
Finally, note that search time is too late to modify this norm part of
scoring, for example by using a different Similarity for search, because the
norms are encoded and stored at index time.
Inheritance
System.Object
Similarity
TFIDFSimilarity
DefaultSimilarity
Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
Assembly: Lucene.Net.dll
Syntax
public class DefaultSimilarity : TFIDFSimilarity
Constructors
DefaultSimilarity()
Sole constructor: parameter-free
Declaration
public DefaultSimilarity()
Fields
m_discountOverlaps
True if overlap tokens (tokens with a position increment of zero) are
discounted from the document's length.
Declaration
protected bool m_discountOverlaps
Field Value
Type | Description |
System.Boolean | |
Properties
DiscountOverlaps
Determines whether overlap tokens (tokens with a position increment of
zero) are ignored when computing the norm. By default this is true, meaning
overlap tokens do not count when computing norms.
This is a Lucene.NET EXPERIMENTAL API, use at your own risk
Declaration
public virtual bool DiscountOverlaps { get; set; }
Property Value
Type | Description |
System.Boolean | |
Methods
Coord(Int32, Int32)
Implemented as overlap / maxOverlap.
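The formula above can be sketched in a few lines. This is a standalone illustration, not the library source: the class name CoordSketch is hypothetical, and it assumes overlap is the number of query terms a document matched and maxOverlap is the total number of terms in the query.

```csharp
using System;

// Hypothetical standalone sketch of the coordination factor; not the library source.
public static class CoordSketch
{
    // Assumption: overlap = query terms matched by the document,
    // maxOverlap = total number of terms in the query.
    public static float Coord(int overlap, int maxOverlap)
    {
        // Cast one operand so the division is floating-point, not integer.
        return overlap / (float)maxOverlap;
    }

    public static void Main()
    {
        Console.WriteLine(Coord(1, 4)); // one of four query terms matched: 0.25
        Console.WriteLine(Coord(4, 4)); // every query term matched: 1
    }
}
```

Under these assumptions, a document that matches more of the query terms gets a larger factor, up to 1 when it matches all of them.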
Declaration
public override float Coord(int overlap, int maxOverlap)
Parameters
Type | Name | Description |
System.Int32 | overlap | |
System.Int32 | maxOverlap | |
Returns
Type | Description |
System.Single | |
Overrides
DecodeNormValue(Int64)
Decodes the norm value, assuming it is a single byte.
Declaration
public override sealed float DecodeNormValue(long norm)
Parameters
Type | Name | Description |
System.Int64 | norm | |
Returns
Type | Description |
System.Single | |
Overrides
EncodeNormValue(Single)
Encodes a normalization factor for storage in an index.
The encoding uses a three-bit mantissa, a five-bit exponent, and the
zero-exponent point at 15, thus representing values from around 7x10^9 to
2x10^-9 with about one significant decimal digit of accuracy. Zero is also
represented. Negative numbers are rounded up to zero. Values too large to
represent are rounded down to the largest representable value. Positive
values too small to represent are rounded up to the smallest positive
representable value.
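To make the rounding rules above concrete, here is a minimal standalone sketch of a single-byte float encoding. It is an assumption about the layout, not the library source: the class name NormByteSketch and its methods are hypothetical, and the sketch assumes the byte is obtained by shifting the IEEE 754 bit pattern right by 21 bits and subtracting an exponent offset of (63 - 15) << 3.

```csharp
using System;

// Hypothetical standalone sketch of a lossy float-to-byte norm encoding.
// Assumed layout: IEEE 754 bits shifted right by (24 - 3), zero-exponent point at 15.
public static class NormByteSketch
{
    private const int Shift = 24 - 3;
    private const int ExponentOffset = (63 - 15) << 3;

    public static byte Encode(float f)
    {
        // Reinterpret the float as its raw 32-bit pattern.
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
        int small = bits >> Shift;

        if (small <= ExponentOffset)
        {
            // Zero and negatives map to 0; tiny positives round up to the smallest code.
            return (byte)(bits <= 0 ? 0 : 1);
        }
        if (small >= ExponentOffset + 0x100)
        {
            // Too large to represent: clamp to the largest code.
            return 255;
        }
        return (byte)(small - ExponentOffset);
    }

    public static float Decode(byte b)
    {
        if (b == 0) return 0.0f;
        // Shift the byte back into position and restore the exponent offset.
        int bits = (b << Shift) + ((63 - 15) << 24);
        return BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    }

    public static void Main()
    {
        byte code = Encode(0.89f);
        Console.WriteLine(code);                    // 123
        Console.WriteLine(Decode(code));            // 0.875
        Console.WriteLine(Decode(Encode(1.0f)));    // 1
        Console.WriteLine(Encode(-3.0f));           // 0
    }
}
```

Under this assumed layout, 0.89f does not survive the round trip: it comes back as 0.875f, the nearest representable value below it, while 1.0f is represented exactly.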
Declaration
public override sealed long EncodeNormValue(float f)
Parameters
Type | Name | Description |
System.Single | f | |
Returns
Type | Description |
System.Int64 | |
Overrides
Idf(Int64, Int64)
Implemented as log(numDocs/(docFreq+1)) + 1.
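As a rough illustration of the formula above, the sketch below evaluates it for a rare and a common term. It is not the library source: the class name IdfSketch is hypothetical, and the sketch assumes log is the natural logarithm.

```csharp
using System;

// Hypothetical standalone sketch of the inverse document frequency formula.
public static class IdfSketch
{
    // Assumption: "log" in the formula is the natural logarithm.
    public static float Idf(long docFreq, long numDocs)
    {
        // The +1 in the denominator avoids division by zero when docFreq is 0.
        return (float)(Math.Log(numDocs / (double)(docFreq + 1)) + 1.0);
    }

    public static void Main()
    {
        Console.WriteLine(Idf(9, 100));  // rare term: ln(10) + 1, roughly 3.3
        Console.WriteLine(Idf(99, 100)); // term in almost every document: ln(1) + 1 = 1
    }
}
```

Under these assumptions, terms that occur in fewer documents receive a higher factor, and a term present in nearly every document contributes a factor close to 1.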
Declaration
public override float Idf(long docFreq, long numDocs)
Parameters
Type | Name | Description |
System.Int64 | docFreq | |
System.Int64 | numDocs | |
Returns
Type | Description |
System.Single | |
Overrides
LengthNorm(FieldInvertState)
Implemented as state.Boost * LengthNorm(numTerms), where numTerms is Length
if DiscountOverlaps is false, else it's Length - NumOverlap.
This is a Lucene.NET EXPERIMENTAL API, use at your own risk
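The effect of DiscountOverlaps on the term count can be sketched as follows. This is a standalone illustration, not the library source: the class name LengthNormSketch and its plain parameters (standing in for the FieldInvertState values) are hypothetical, and the sketch assumes the per-length factor is 1/sqrt(numTerms).

```csharp
using System;

// Hypothetical standalone sketch of the length normalization described above.
public static class LengthNormSketch
{
    // length, numOverlap and boost stand in for the FieldInvertState values.
    public static float LengthNorm(int length, int numOverlap, float boost, bool discountOverlaps)
    {
        // Overlap tokens are subtracted from the length only when discounting is on.
        int numTerms = discountOverlaps ? length - numOverlap : length;

        // Assumption: the per-length factor is 1 / sqrt(numTerms).
        return boost * (float)(1.0 / Math.Sqrt(numTerms));
    }

    public static void Main()
    {
        // Five tokens, one of them an overlap token (for example an injected synonym).
        Console.WriteLine(LengthNorm(5, 1, 1.0f, true));  // counts 4 terms: 0.5
        Console.WriteLine(LengthNorm(5, 1, 1.0f, false)); // counts 5 terms: smaller norm
    }
}
```

Under these assumptions, discounting overlap tokens keeps a field from being penalized as "longer" merely because extra tokens were stacked at the same position.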
Declaration
public override float LengthNorm(FieldInvertState state)
Parameters
Type | Name | Description |
FieldInvertState | state | |
Returns
Type | Description |
System.Single | |
Overrides
QueryNorm(Single)
Implemented as 1/sqrt(sumOfSquaredWeights).
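A small worked example may help show what this factor does. The sketch below is not the library source: the class name QueryNormSketch and the two example weights are hypothetical.

```csharp
using System;

// Hypothetical standalone sketch of the query normalization factor.
public static class QueryNormSketch
{
    public static float QueryNorm(float sumOfSquaredWeights)
    {
        return (float)(1.0 / Math.Sqrt(sumOfSquaredWeights));
    }

    public static void Main()
    {
        // Two illustrative term weights, 3 and 4: sum of squares is 25.
        float w1 = 3.0f, w2 = 4.0f;
        float norm = QueryNorm(w1 * w1 + w2 * w2);

        // Scaling each weight by the norm gives a weight vector of length 1.
        float n1 = w1 * norm, n2 = w2 * norm;
        Console.WriteLine(norm);
        Console.WriteLine(n1 * n1 + n2 * n2);
    }
}
```

In this example the weights are rescaled so that their squares sum to approximately 1; the ratio between the weights is unchanged.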
Declaration
public override float QueryNorm(float sumOfSquaredWeights)
Parameters
Type | Name | Description |
System.Single | sumOfSquaredWeights | |
Returns
Type | Description |
System.Single | |
Overrides
ScorePayload(Int32, Int32, Int32, BytesRef)
The default implementation returns 1.
Declaration
public override float ScorePayload(int doc, int start, int end, BytesRef payload)
Parameters
Type | Name | Description |
System.Int32 | doc | |
System.Int32 | start | |
System.Int32 | end | |
BytesRef | payload | |
Returns
Type | Description |
System.Single | |
Overrides
SloppyFreq(Int32)
Implemented as 1 / (distance + 1).
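The decay this formula produces can be seen by evaluating it for a few distances. The sketch below is a standalone illustration with a hypothetical class name, SloppyFreqSketch, not the library source.

```csharp
using System;

// Hypothetical standalone sketch of the sloppy-match frequency factor.
public static class SloppyFreqSketch
{
    public static float SloppyFreq(int distance)
    {
        // A float numerator keeps the division in floating point.
        return 1.0f / (distance + 1);
    }

    public static void Main()
    {
        // Exact matches (distance 0) score 1; the factor shrinks as distance grows.
        foreach (int distance in new[] { 0, 1, 3 })
        {
            Console.WriteLine(distance + " -> " + SloppyFreq(distance));
        }
    }
}
```

With this formula a distance of 0 yields 1, a distance of 1 yields 0.5, and a distance of 3 yields 0.25.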
Declaration
public override float SloppyFreq(int distance)
Parameters
Type | Name | Description |
System.Int32 | distance | |
Returns
Type | Description |
System.Single | |
Overrides
Tf(Single)
Implemented as Math.Sqrt(freq).
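Because the factor is a square root, repeated occurrences of a term add less and less. The sketch below illustrates this with a hypothetical standalone class, TfSketch; it is not the library source.

```csharp
using System;

// Hypothetical standalone sketch of the term frequency factor.
public static class TfSketch
{
    public static float Tf(float freq)
    {
        return (float)Math.Sqrt(freq);
    }

    public static void Main()
    {
        // Quadrupling the raw frequency only doubles the factor.
        Console.WriteLine(Tf(1.0f));  // 1
        Console.WriteLine(Tf(4.0f));  // 2
        Console.WriteLine(Tf(16.0f)); // 4
    }
}
```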
Declaration
public override float Tf(float freq)
Parameters
Type | Name | Description |
System.Single | freq | |
Returns
Type | Description |
System.Single | |
Overrides
ToString()
Declaration
public override string ToString()
Returns
Type | Description |
System.String | |
Overrides
System.Object.ToString()