Class LMJelinekMercerSimilarity
Language model based on the Jelinek-Mercer smoothing method. From Chengxiang Zhai and John Lafferty. 2001. A study of smoothing methods for language models applied to Ad Hoc information retrieval. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '01). ACM, New York, NY, USA, 334-342.
The model has a single parameter, λ. According to the paper, the optimal value depends on both the collection and the query: it is around 0.1 for title queries and around 0.7 for long queries.
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Search.Similarities
Assembly: Lucene.Net.dll
Syntax
public class LMJelinekMercerSimilarity : LMSimilarity
Constructors
LMJelinekMercerSimilarity(ICollectionModel, float)
Instantiates with the specified collectionModel and λ parameter.
Declaration
public LMJelinekMercerSimilarity(LMSimilarity.ICollectionModel collectionModel, float lambda)
Parameters
Type | Name | Description |
---|---|---|
LMSimilarity.ICollectionModel | collectionModel | The collection model. |
float | lambda | The λ parameter. |
LMJelinekMercerSimilarity(float)
Instantiates with the specified λ parameter.
Declaration
public LMJelinekMercerSimilarity(float lambda)
Parameters
Type | Name | Description |
---|---|---|
float | lambda | The λ parameter. |
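A minimal usage sketch for the two constructors, assuming the standard `IndexSearcher.Similarity` property from `Lucene.Net.Search`. The class `JelinekMercerExample` and its methods are hypothetical names introduced for illustration, and the λ values are the rough guidelines quoted in the class description, not tuned settings.

```csharp
using Lucene.Net.Search;
using Lucene.Net.Search.Similarities;

// Hypothetical helper class, not part of Lucene.NET.
public static class JelinekMercerExample
{
    // Picks λ from the rule of thumb in the class description:
    // about 0.1 for short (title) queries, about 0.7 for long queries.
    public static LMJelinekMercerSimilarity Create(bool longQueries)
    {
        float lambda = longQueries ? 0.7f : 0.1f;
        return new LMJelinekMercerSimilarity(lambda);
    }

    // Assigns the similarity to an existing searcher so that
    // subsequent searches are scored with Jelinek-Mercer smoothing.
    public static void Apply(IndexSearcher searcher, bool longQueries)
    {
        searcher.Similarity = Create(longQueries);
    }
}
```

The single-argument constructor is enough for most uses; the two-argument overload is only needed when a custom `LMSimilarity.ICollectionModel` should replace the default one.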
Properties
Lambda
Returns the λ parameter.
Declaration
public virtual float Lambda { get; }
Property Value
Type | Description |
---|---|
float | The λ parameter. |
Methods
Explain(Explanation, BasicStats, int, float, float)
Subclasses should implement this method to explain the score. expl
already contains the score, the name of the class and the doc id, as well
as the term frequency and its explanation; subclasses can add additional
clauses to explain details of their scoring formulae.
The default implementation does nothing.
Declaration
protected override void Explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
Parameters
Type | Name | Description |
---|---|---|
Explanation | expl | the explanation to extend with details. |
BasicStats | stats | the corpus level statistics. |
int | doc | the document id. |
float | freq | the term frequency. |
float | docLen | the document length. |
Overrides
LMSimilarity.Explain(Explanation, BasicStats, int, float, float)
GetName()
Returns the name of the LM method. The values of the parameters should be included as well.
Used in ToString().
Declaration
public override string GetName()
Returns
Type | Description |
---|---|
string | The name of the LM method, including its parameter values. |
Overrides
LMSimilarity.GetName()
Score(BasicStats, float, float)
Scores the document doc.
Subclasses must apply their scoring formula in this method.
Declaration
public override float Score(BasicStats stats, float freq, float docLen)
Parameters
Type | Name | Description |
---|---|---|
BasicStats | stats | the corpus level statistics. |
float | freq | the term frequency. |
float | docLen | the document length. |
Returns
Type | Description |
---|---|
float | the score. |
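For orientation, the Jelinek-Mercer score can be sketched as follows. This assumes the .NET port matches the upstream Lucene implementation. The symbols are labels chosen here, not API names: boost stands for the total boost carried by stats, freq and docLen are the method arguments, and p(w | C) is the collection probability of the term supplied by the collection model.

```latex
\[
\mathrm{score} = \mathrm{boost} \cdot
  \ln\!\left( 1 + \frac{(1 - \lambda)\,\mathrm{freq} / \mathrm{docLen}}
                       {\lambda \, p(w \mid C)} \right)
\]

% Worked example with illustrative values:
% \lambda = 0.7, freq = 2, docLen = 100, p(w | C) = 0.001, boost = 1
\[
\ln\!\left( 1 + \frac{0.3 \cdot 2 / 100}{0.7 \cdot 0.001} \right)
  = \ln\!\left( 1 + \frac{0.006}{0.0007} \right)
  \approx \ln(9.571) \approx 2.26
\]
```

A smaller λ puts more weight on the document's own term frequency, while a larger λ leans more on the collection statistics, which is consistent with the guidance that long queries benefit from a larger value.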