    Class IBSimilarity

    Provides a framework for the family of information-based models, as described in Stéphane Clinchant and Eric Gaussier. 2010. Information-based models for ad hoc IR. In Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '10). ACM, New York, NY, USA, 234-241.

    The retrieval function is of the form $RSV(q, d) = \sum -x^q_w \log \mathrm{Prob}(X_w \ge t^d_w \mid \lambda_w)$, where

    • $x^q_w$ is the query boost;
    • $X_w$ is a random variable that counts the occurrences of word w;
    • $t^d_w$ is the normalized term frequency;
    • $\lambda_w$ is a parameter.
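As a worked instance of the formula, assume the log-logistic distribution. The closed form of its survival function used here comes from the log-logistic model in the cited paper (shape parameter fixed to 1), not from this page, so treat it as an assumption. Here $x^q_w$ is the query boost, $t^d_w$ the normalized term frequency, and $\lambda_w$ the distribution parameter:

```latex
\mathrm{Prob}(X_w \ge t^d_w \mid \lambda_w) = \frac{\lambda_w}{t^d_w + \lambda_w}
\quad\Longrightarrow\quad
RSV(q, d) = \sum x^q_w \, \log\frac{t^d_w + \lambda_w}{\lambda_w}
```

Under this assumption, a query term that does not occur in the document ($t^d_w = 0$) contributes $\log 1 = 0$. The contribution grows with the normalized term frequency and is larger for rarer terms (smaller $\lambda_w$).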

    The framework described in the paper has many similarities to the DFR framework (see DFRSimilarity). It is possible that the two Similarities will be merged at some point.

    To construct an IBSimilarity, you must specify the implementations for all three components of the Information-Based model.

    Component: Implementations
    Distribution: Probabilistic distribution used to model term occurrence
    • DistributionLL: Log-logistic
    • DistributionSPL: Smoothed power-law
    Lambda: λw parameter of the probability distribution
    • LambdaDF: N_w/N, or the average number of documents where w occurs
    • LambdaTTF: F_w/N, or the average number of occurrences of w in the collection
    Normalization: Term frequency normalization
    • Any supported DFR normalization (listed in DFRSimilarity)
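The component choices above could be combined as in the following sketch. The factory class and its method names are hypothetical, and the specific pairings are arbitrary illustrations. `NormalizationH2` is one of the DFR normalizations, used here with its default parameter.

```csharp
using Lucene.Net.Search.Similarities;

// Hypothetical helper; only the component types come from Lucene.Net.
public static class IBSimilarityFactory
{
    // Log-logistic distribution, document-frequency lambda, H2 length normalization.
    public static IBSimilarity CreateLogLogistic()
        => new IBSimilarity(new DistributionLL(), new LambdaDF(), new NormalizationH2());

    // Smoothed power-law distribution, collection-frequency lambda, H2 length normalization.
    public static IBSimilarity CreateSmoothedPowerLaw()
        => new IBSimilarity(new DistributionSPL(), new LambdaTTF(), new NormalizationH2());
}
```

The resulting instance would typically be assigned to both `IndexWriterConfig.Similarity` and `IndexSearcher.Similarity`, so that index-time norms and search-time scoring use the same model.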
    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    Similarity
    SimilarityBase
    IBSimilarity
    Inherited Members
    SimilarityBase.DiscountOverlaps
    SimilarityBase.ComputeWeight(float, CollectionStatistics, params TermStatistics[])
    SimilarityBase.NewStats(string, float)
    SimilarityBase.FillBasicStats(BasicStats, CollectionStatistics, TermStatistics)
    SimilarityBase.Explain(BasicStats, int, Explanation, float)
    SimilarityBase.GetSimScorer(Similarity.SimWeight, AtomicReaderContext)
    SimilarityBase.ComputeNorm(FieldInvertState)
    SimilarityBase.DecodeNormValue(byte)
    SimilarityBase.EncodeNormValue(float, float)
    SimilarityBase.Log2(double)
    Similarity.Coord(int, int)
    Similarity.QueryNorm(float)
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    Namespace: Lucene.Net.Search.Similarities
    Assembly: Lucene.Net.dll
    Syntax
    public class IBSimilarity : SimilarityBase

    Constructors

    IBSimilarity(Distribution, Lambda, Normalization)

    Creates IBSimilarity from the three components.

    Note that null values are not allowed: if you want no normalization, instead pass Normalization.NoNormalization.
    Declaration
    public IBSimilarity(Distribution distribution, Lambda lambda, Normalization normalization)
    Parameters
    Type Name Description
    Distribution distribution

    probabilistic distribution modeling term occurrence

    Lambda lambda

    distribution's λw parameter

    Normalization normalization

    term frequency normalization

    See Also
    DFRSimilarity
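A minimal sketch of the no-null rule for this constructor: the hypothetical helper below substitutes `Normalization.NoNormalization` when the caller supplies no normalization. The helper name, its optional parameter, and its argument checks are illustrative assumptions. This page does not say how the constructor itself reacts to null.

```csharp
using System;
using Lucene.Net.Search.Similarities;

// Hypothetical helper, not part of Lucene.Net.
public static class IBSimilarityBuilder
{
    public static IBSimilarity Create(
        Distribution distribution,
        Lambda lambda,
        Normalization normalization = null)
    {
        // Fail early with a clear message instead of passing null through.
        if (distribution is null) throw new ArgumentNullException(nameof(distribution));
        if (lambda is null) throw new ArgumentNullException(nameof(lambda));

        // "No normalization" is expressed with an explicit object, never with null.
        return new IBSimilarity(
            distribution,
            lambda,
            normalization ?? new Normalization.NoNormalization());
    }
}
```

For example, `IBSimilarityBuilder.Create(new DistributionLL(), new LambdaDF())` would build a similarity that uses the raw term frequency.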

    Fields

    m_distribution

    The probabilistic distribution used to model term occurrence.

    Declaration
    protected readonly Distribution m_distribution
    Field Value
    Type Description
    Distribution
    See Also
    DFRSimilarity

    m_lambda

    The lambda (λw) parameter.

    Declaration
    protected readonly Lambda m_lambda
    Field Value
    Type Description
    Lambda
    See Also
    DFRSimilarity

    m_normalization

    The term frequency normalization.

    Declaration
    protected readonly Normalization m_normalization
    Field Value
    Type Description
    Normalization
    See Also
    DFRSimilarity

    Properties

    Distribution

    Returns the distribution.

    Declaration
    public virtual Distribution Distribution { get; }
    Property Value
    Type Description
    Distribution
    See Also
    DFRSimilarity

    Lambda

    Returns the distribution's lambda parameter.

    Declaration
    public virtual Lambda Lambda { get; }
    Property Value
    Type Description
    Lambda
    See Also
    DFRSimilarity

    Normalization

    Returns the term frequency normalization.

    Declaration
    public virtual Normalization Normalization { get; }
    Property Value
    Type Description
    Normalization
    See Also
    DFRSimilarity

    Methods

    Explain(Explanation, BasicStats, int, float, float)

    Subclasses should implement this method to explain the score. expl already contains the score, the name of the class and the doc id, as well as the term frequency and its explanation; subclasses can add additional clauses to explain details of their scoring formulae.

    The default implementation in SimilarityBase does nothing.

    Declaration
    protected override void Explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
    Parameters
    Type Name Description
    Explanation expl

    the explanation to extend with details.

    BasicStats stats

    the corpus level statistics.

    int doc

    the document id.

    float freq

    the term frequency.

    float docLen

    the document length.

    Overrides
    SimilarityBase.Explain(Explanation, BasicStats, int, float, float)
    See Also
    DFRSimilarity
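A subclass that adds a clause to the explanation might look like the sketch below. The class name and the extra clause are hypothetical. The sketch assumes `Explanation.AddDetail` and the `Explanation(float, string)` constructor from `Lucene.Net.Search`.

```csharp
using Lucene.Net.Search;
using Lucene.Net.Search.Similarities;

// Hypothetical subclass; the name and the added clause are illustrative only.
public class VerboseIBSimilarity : IBSimilarity
{
    public VerboseIBSimilarity(Distribution distribution, Lambda lambda, Normalization normalization)
        : base(distribution, lambda, normalization)
    {
    }

    protected override void Explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
    {
        // Keep whatever clauses the base class contributes.
        base.Explain(expl, stats, doc, freq, docLen);

        // Append one extra clause recording the document length used for scoring.
        expl.AddDetail(new Explanation(docLen, "document length passed to Score"));
    }
}
```

Calling `base.Explain` first keeps the existing clauses in place and appends the new one after them.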

    Score(BasicStats, float, float)

    Scores the document doc.

    Subclasses must apply their scoring formula in this method.

    Declaration
    public override float Score(BasicStats stats, float freq, float docLen)
    Parameters
    Type Name Description
    BasicStats stats

    the corpus level statistics.

    float freq

    the term frequency.

    float docLen

    the document length.

    Returns
    Type Description
    float

    the score.

    Overrides
    SimilarityBase.Score(BasicStats, float, float)
    See Also
    DFRSimilarity
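To illustrate how the three components could combine into a single-term score from `stats`, `freq`, and `docLen`, here is a self-contained sketch that does not call Lucene.Net. Everything in it is an assumption made for illustration: `TermStatsSketch` is a hypothetical stand-in for `BasicStats`, the normalization is an H2-style length rescaling, lambda is the plain N_w/N ratio, and the distribution is log-logistic. The library's implementations may differ, for example in smoothing.

```csharp
using System;

// Hypothetical stand-in for the statistics Score receives; not a Lucene.Net type.
public sealed class TermStatsSketch
{
    public double TotalBoost { get; set; } = 1.0;   // query boost
    public long NumberOfDocuments { get; set; }      // N
    public long DocFreq { get; set; }                // N_w
    public double AvgFieldLength { get; set; }       // average document length
}

public static class IBScoreSketch
{
    // Normalization step (assumed H2-style): rescale tf by relative document length.
    public static double NormalizeTf(TermStatsSketch stats, double freq, double docLen, double c = 1.0)
        => freq * Math.Log(1.0 + c * stats.AvgFieldLength / docLen, 2.0);

    // Lambda step (document-frequency variant): N_w / N.
    public static double LambdaDf(TermStatsSketch stats)
        => (double)stats.DocFreq / stats.NumberOfDocuments;

    // Distribution step (assumed log-logistic): -log Prob(X >= tfn | lambda).
    public static double LogLogisticInformation(double tfn, double lambda)
        => -Math.Log(lambda / (tfn + lambda));

    // Boost times the information carried by the normalized term frequency.
    public static double Score(TermStatsSketch stats, double freq, double docLen)
    {
        double tfn = NormalizeTf(stats, freq, docLen);
        double lambda = LambdaDf(stats);
        return stats.TotalBoost * LogLogisticInformation(tfn, lambda);
    }
}
```

In this sketch, a document of average length leaves the term frequency unchanged, longer documents shrink it, and rarer terms (smaller lambda) score higher for the same frequency.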

    ToString()

    The names of IB methods follow the pattern IB <distribution> <lambda><normalization>. The name of the distribution is the same as in the original paper; for the names of the lambda parameters, refer to the documentation of the Lambda classes.

    Declaration
    public override string ToString()
    Returns
    Type Description
    string
    Overrides
    SimilarityBase.ToString()
    See Also
    DFRSimilarity
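A small sketch of reading the generated name, for instance to label runs when comparing configurations. The helper class is hypothetical, and the exact string is left to the library rather than asserted here.

```csharp
using Lucene.Net.Search.Similarities;

// Hypothetical helper; it only surfaces what ToString() returns.
public static class IBNameDemo
{
    public static string Describe()
    {
        var similarity = new IBSimilarity(
            new DistributionLL(), new LambdaDF(), new NormalizationH2());

        // The returned name encodes the distribution, lambda, and normalization.
        return similarity.ToString();
    }
}
```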

    See Also

    DFRSimilarity
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.