
    Class DFRSimilarity

    Implements the divergence from randomness (DFR) framework introduced in Gianni Amati and Cornelis Joost Van Rijsbergen. 2002. Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst. 20, 4 (October 2002), 357-389.

    The DFR scoring formula is composed of three separate components: the basic model, the aftereffect and an additional normalization component, represented by the classes BasicModel, AfterEffect and Normalization, respectively. The names of these classes were chosen to match the names of their counterparts in the Terrier IR engine.
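
    The way the three components combine can be written as a short formula. This is a sketch of the general shape of a DFR weight as described in the cited paper, not a transcription of this class's source. The symbols tf, docLen, tfn and boost are notation chosen here, and the boost factor is an assumption about how query and field boosts enter the score. Each component may also depend on collection-level statistics, which are omitted for brevity.

```latex
\begin{aligned}
\mathit{tfn} &= \mathrm{Normalization}(\mathit{tf},\ \mathit{docLen}) \\
\mathrm{score}(t, d) &= \mathit{boost} \cdot \mathrm{BasicModel}(\mathit{tfn}) \cdot \mathrm{AfterEffect}(\mathit{tfn})
\end{aligned}
```

    Read this way, the second (length) normalization is applied first, to turn the raw term frequency into a normalized one, and the basic model and after-effect are then both evaluated on that normalized value.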

    To construct a DFRSimilarity, you must specify the implementations for all three components of DFR:

    Component: Implementations
    BasicModel: Basic model of information content:
    • BasicModelBE: Limiting form of Bose-Einstein
    • BasicModelG: Geometric approximation of Bose-Einstein
    • BasicModelP: Poisson approximation of the Binomial
    • BasicModelD: Divergence approximation of the Binomial
    • BasicModelIn: Inverse document frequency
    • BasicModelIne: Inverse expected document frequency [mixture of Poisson and IDF]
    • BasicModelIF: Inverse term frequency [approximation of I(ne)]
    AfterEffect: First normalization of information gain:
    • AfterEffectL: Laplace's law of succession
    • AfterEffectB: Ratio of two Bernoulli processes
    • AfterEffect.NoAfterEffect: No first normalization
    Normalization: Second (length) normalization:
    • NormalizationH1: Uniform distribution of term frequency
    • NormalizationH2: Term frequency density inversely related to length
    • NormalizationH3: Term frequency normalization provided by Dirichlet prior
    • NormalizationZ: Term frequency normalization provided by a Zipfian relation
    • Normalization.NoNormalization: No second normalization
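
    As an illustration of the table above, the following sketch picks one implementation per component and installs the resulting similarity for both indexing and searching. The class name DfrSetupExample and its method names are hypothetical, the choice of BasicModelIn, AfterEffectL and NormalizationH2 is arbitrary, and the use of StandardAnalyzer with LuceneVersion.LUCENE_48 assumes a Lucene.NET 4.8 setup with the analysis package referenced.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Search.Similarities;
using Lucene.Net.Util;

// Hypothetical helper class, for illustration only.
public static class DfrSetupExample
{
    // Picks one implementation from each row of the component table.
    public static DFRSimilarity CreateSimilarity()
    {
        return new DFRSimilarity(
            new BasicModelIn(),      // basic model: inverse document frequency
            new AfterEffectL(),      // first normalization: Laplace's law of succession
            new NormalizationH2());  // second normalization: tf density inversely related to length
    }

    // Index time: the similarity is set on the writer configuration.
    public static IndexWriterConfig CreateWriterConfig()
    {
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
        return new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
        {
            Similarity = CreateSimilarity()
        };
    }

    // Search time: an equivalent similarity is set on the searcher.
    public static IndexSearcher CreateSearcher(IndexReader reader)
    {
        return new IndexSearcher(reader) { Similarity = CreateSimilarity() };
    }
}
```

    Using the same component choices at index time and at search time keeps the length information written to the index consistent with what the scorer expects.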

    Note that qtf, the multiplicity of term-occurrence in the query, is not handled by this implementation.

    This is a Lucene.NET EXPERIMENTAL API; use at your own risk.
    Inheritance
    System.Object
    Similarity
    SimilarityBase
    DFRSimilarity
    Inherited Members
    SimilarityBase.DiscountOverlaps
    SimilarityBase.ComputeWeight(Single, CollectionStatistics, TermStatistics[])
    SimilarityBase.NewStats(String, Single)
    SimilarityBase.FillBasicStats(BasicStats, CollectionStatistics, TermStatistics)
    SimilarityBase.Explain(BasicStats, Int32, Explanation, Single)
    SimilarityBase.GetSimScorer(Similarity.SimWeight, AtomicReaderContext)
    SimilarityBase.ComputeNorm(FieldInvertState)
    SimilarityBase.DecodeNormValue(Byte)
    SimilarityBase.EncodeNormValue(Single, Single)
    SimilarityBase.Log2(Double)
    Similarity.Coord(Int32, Int32)
    Similarity.QueryNorm(Single)
    Namespace: Lucene.Net.Search.Similarities
    Assembly: Lucene.Net.dll
    Syntax
    public class DFRSimilarity : SimilarityBase

    Constructors


    DFRSimilarity(BasicModel, AfterEffect, Normalization)

    Creates DFRSimilarity from the three components.

    Note that null values are not allowed: if you want no normalization or after-effect, instead pass Normalization.NoNormalization or AfterEffect.NoAfterEffect respectively.

    Declaration
    public DFRSimilarity(BasicModel basicModel, AfterEffect afterEffect, Normalization normalization)
    Parameters
    Type             Name             Description
    BasicModel       basicModel       Basic model of information content
    AfterEffect      afterEffect      First normalization of information gain
    Normalization    normalization    Second (length) normalization
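
    The note about null values can be illustrated with a short sketch that switches off both normalizations by passing the dedicated no-op components instead of null. The class and method names below are hypothetical, and BasicModelIn is an arbitrary choice of basic model.

```csharp
using Lucene.Net.Search.Similarities;

// Hypothetical helper class, for illustration only.
public static class DfrNoOpComponentsExample
{
    // Builds a DFRSimilarity that relies on the basic model alone.
    public static DFRSimilarity CreateBasicModelOnly()
    {
        return new DFRSimilarity(
            new BasicModelIn(),                    // the only active component
            new AfterEffect.NoAfterEffect(),       // instead of null: no first normalization
            new Normalization.NoNormalization());  // instead of null: no second normalization
    }
}
```

    Because the no-op components are ordinary AfterEffect and Normalization instances, code that reads the AfterEffect or Normalization properties does not need a null check.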

    Fields


    m_afterEffect

    The first normalization of the information content.

    Declaration
    protected readonly AfterEffect m_afterEffect
    Field Value
    Type Description
    AfterEffect

    m_basicModel

    The basic model for information content.

    Declaration
    protected readonly BasicModel m_basicModel
    Field Value
    Type Description
    BasicModel

    m_normalization

    The term frequency normalization.

    Declaration
    protected readonly Normalization m_normalization
    Field Value
    Type Description
    Normalization

    Properties


    AfterEffect

    Returns the first normalization (the after-effect component).

    Declaration
    public virtual AfterEffect AfterEffect { get; }
    Property Value
    Type Description
    AfterEffect

    BasicModel

    Returns the basic model of information content.

    Declaration
    public virtual BasicModel BasicModel { get; }
    Property Value
    Type Description
    BasicModel

    Normalization

    Returns the second (length) normalization.

    Declaration
    public virtual Normalization Normalization { get; }
    Property Value
    Type Description
    Normalization
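
    The three read-only properties above make it possible to inspect how an existing DFRSimilarity was configured. The following is a minimal sketch; the DfrInspectExample class, its Describe method and the output format are hypothetical.

```csharp
using Lucene.Net.Search.Similarities;

// Hypothetical helper class, for illustration only.
public static class DfrInspectExample
{
    // Reports the runtime type of each component of the given similarity.
    public static string Describe(DFRSimilarity similarity)
    {
        return string.Format(
            "basic model: {0}, after effect: {1}, normalization: {2}",
            similarity.BasicModel.GetType().Name,
            similarity.AfterEffect.GetType().Name,
            similarity.Normalization.GetType().Name);
    }
}
```

    A call such as DfrInspectExample.Describe(new DFRSimilarity(new BasicModelIn(), new AfterEffectL(), new NormalizationH2())) would name the three concrete component classes, which can be useful when logging search configuration.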

    Methods


    Explain(Explanation, BasicStats, Int32, Single, Single)

    Declaration
    protected override void Explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
    Parameters
    Type Name Description
    Explanation expl
    BasicStats stats
    System.Int32 doc
    System.Single freq
    System.Single docLen
    Overrides
    SimilarityBase.Explain(Explanation, BasicStats, Int32, Single, Single)

    Score(BasicStats, Single, Single)

    Declaration
    public override float Score(BasicStats stats, float freq, float docLen)
    Parameters
    Type Name Description
    BasicStats stats
    System.Single freq
    System.Single docLen
    Returns
    Type Description
    System.Single
    Overrides
    SimilarityBase.Score(BasicStats, Single, Single)

    ToString()

    Declaration
    public override string ToString()
    Returns
    Type Description
    System.String
    Overrides
    SimilarityBase.ToString()

    See Also

    BasicModel
    AfterEffect
    Normalization
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)