Class LMSimilarity
Abstract superclass for language modeling Similarities. The following inner types are introduced:
- LMSimilarity.LMStats, which defines a new statistic, the probability that the collection language model generates the current term;
- LMSimilarity.ICollectionModel, which is a strategy interface for objects that compute the collection language model p(w|C);
- LMSimilarity.DefaultCollectionModel, an implementation of the former that computes the term probability as the number of occurrences of the term in the collection, divided by the total number of tokens.
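Read literally, the description of DefaultCollectionModel corresponds to the ratio below. The symbols cf(w, C) (occurrences of term w in the collection) and |C| (total number of tokens) are notation introduced here for illustration only; the shipped implementation may apply smoothing or other adjustments that this summary does not mention.

```latex
p(w \mid C) = \frac{\mathrm{cf}(w, C)}{|C|}
```

Under this reading, a term that occurs 3 times in a collection of 12 tokens would have p(w|C) = 3/12 = 0.25.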
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Search.Similarities
Assembly: Lucene.Net.dll
Syntax
public abstract class LMSimilarity : SimilarityBase
Constructors
LMSimilarity()
Creates a new instance with the default collection language model.
Declaration
protected LMSimilarity()
LMSimilarity(ICollectionModel)
Creates a new instance with the specified collection language model.
Declaration
protected LMSimilarity(LMSimilarity.ICollectionModel collectionModel)
Parameters
Type | Name | Description |
---|---|---|
LMSimilarity.ICollectionModel | collectionModel |
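The two protected constructors suggest a conventional strategy-injection pattern: the parameterless one falls back to a default model, the other accepts a caller-supplied one. The sketch below illustrates that pattern with stand-in types only. Every type and member name in it (ICollectionModelSketch, RatioModelSketch, AddOneModelSketch, LMSketchBase, DemoSimilaritySketch, CollectionProbability) is hypothetical and is not part of Lucene.Net; the real ICollectionModel members and the real BasicStats type may differ.

```csharp
using System;

// Hypothetical stand-in for a collection-model strategy; NOT the Lucene.Net interface.
public interface ICollectionModelSketch
{
    float ComputeProbability(long termOccurrences, long totalTokens);
}

// Plain ratio: occurrences of the term divided by the total number of tokens.
public sealed class RatioModelSketch : ICollectionModelSketch
{
    public float ComputeProbability(long termOccurrences, long totalTokens)
        => totalTokens > 0 ? (float)termOccurrences / totalTokens : 0f;
}

// An alternative strategy (add-one smoothing), included only to show swapping.
public sealed class AddOneModelSketch : ICollectionModelSketch
{
    public float ComputeProbability(long termOccurrences, long totalTokens)
        => (termOccurrences + 1f) / (totalTokens + 1f);
}

// Hypothetical base class mirroring the two-constructor shape described above.
public abstract class LMSketchBase
{
    protected readonly ICollectionModelSketch m_collectionModel;

    // Parameterless constructor: use the default strategy.
    protected LMSketchBase() : this(new RatioModelSketch()) { }

    // Constructor taking a caller-supplied strategy.
    protected LMSketchBase(ICollectionModelSketch collectionModel)
    {
        m_collectionModel = collectionModel
            ?? throw new ArgumentNullException(nameof(collectionModel));
    }

    public float CollectionProbability(long termOccurrences, long totalTokens)
        => m_collectionModel.ComputeProbability(termOccurrences, totalTokens);
}

// Concrete subclass exposing both constructors publicly.
public sealed class DemoSimilaritySketch : LMSketchBase
{
    public DemoSimilaritySketch() { }
    public DemoSimilaritySketch(ICollectionModelSketch collectionModel)
        : base(collectionModel) { }
}
```

Because the strategy is fixed at construction time and stored in a readonly field, a subclass can rely on the same model for the lifetime of the instance, which is the design the m_collectionModel field implies.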
Fields
m_collectionModel
The collection model.
Declaration
protected readonly LMSimilarity.ICollectionModel m_collectionModel
Field Value
Type | Description |
---|---|
LMSimilarity.ICollectionModel |
Methods
Explain(Explanation, BasicStats, int, float, float)
Subclasses should implement this method to explain the score. expl
already contains the score, the name of the class and the doc id, as well
as the term frequency and its explanation; subclasses can add additional
clauses to explain details of their scoring formulae.
The default implementation does nothing.
Declaration
protected override void Explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
Parameters
Type | Name | Description |
---|---|---|
Explanation | expl | the explanation to extend with details. |
BasicStats | stats | the corpus level statistics. |
int | doc | the document id. |
float | freq | the term frequency. |
float | docLen | the document length. |
Overrides
FillBasicStats(BasicStats, CollectionStatistics, TermStatistics)
Computes the collection probability of the current term in addition to the usual statistics.
Declaration
protected override void FillBasicStats(BasicStats stats, CollectionStatistics collectionStats, TermStatistics termStats)
Parameters
Type | Name | Description |
---|---|---|
BasicStats | stats | |
CollectionStatistics | collectionStats | |
TermStatistics | termStats |
Overrides
GetName()
Returns the name of the LM method. The values of the parameters should be included as well.
Used in ToString().
Declaration
public abstract string GetName()
Returns
Type | Description |
---|---|
string |
NewStats(string, float)
Factory method to return a custom stats object.
Declaration
protected override BasicStats NewStats(string field, float queryBoost)
Parameters
Type | Name | Description |
---|---|---|
string | field | |
float | queryBoost |
Returns
Type | Description |
---|---|
BasicStats |
Overrides
ToString()
Returns the name of the LM method. If a custom collection model strategy is used, its name is included as well.
Declaration
public override string ToString()
Returns
Type | Description |
---|---|
string |