    Class MergePolicy

    Expert: a MergePolicy determines the sequence of primitive merge operations.

    Whenever the segments in an index have been altered by IndexWriter, whether by the addition of a newly flushed segment, the addition of many segments from AddIndexes* calls, or a previous merge that may now need to cascade, IndexWriter invokes FindMerges(MergeTrigger, SegmentInfos) to give the MergePolicy a chance to pick merges that are now required. This method returns a MergePolicy.MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When ForceMerge(Int32) is called on the IndexWriter, the writer calls FindForcedMerges(SegmentInfos, Int32, IDictionary<SegmentCommitInfo, Nullable<Boolean>>) and the MergePolicy should then return the necessary merges.

    Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially; if it is using ConcurrentMergeScheduler, they will be run concurrently.

    The default MergePolicy is TieredMergePolicy.
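
As an illustration of how a policy and a scheduler are paired with a writer, here is a minimal sketch. It assumes Lucene.NET 4.8 with the Lucene.Net.Analysis.Common package referenced (for StandardAnalyzer), an in-memory RAMDirectory, and C# 9 top-level statements; none of these choices come from this page.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

// Assumption: Lucene.Net 4.8 and Lucene.Net.Analysis.Common are referenced.
var version = LuceneVersion.LUCENE_48;

var config = new IndexWriterConfig(version, new StandardAnalyzer(version))
{
    // TieredMergePolicy is already the default; it is set here only to make the choice explicit.
    MergePolicy = new TieredMergePolicy(),
    // Runs the merges selected by the policy one after another.
    MergeScheduler = new SerialMergeScheduler()
};

using (var directory = new RAMDirectory())
using (var writer = new IndexWriter(directory, config))
{
    // ... add documents here ...

    // Asks the policy (through FindForcedMerges) for the merges needed
    // to bring the index down to a single segment.
    writer.ForceMerge(1);
}
```

Swapping SerialMergeScheduler for ConcurrentMergeScheduler should be the only change needed to let the selected merges run concurrently.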

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    System.Object
    MergePolicy
    LogMergePolicy
    NoMergePolicy
    TieredMergePolicy
    UpgradeIndexMergePolicy
    Implements
    System.IDisposable
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Index
    Assembly: Lucene.Net.dll
    Syntax
    public abstract class MergePolicy : IDisposable
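
To show which members a subclass has to supply, the following sketch overrides only the four abstract members listed on this page. NeverMergePolicySketch is a hypothetical name, and returning null from the two forced variants is an assumption (this page documents null only for FindMerges); for real use, the built-in NoMergePolicy is the better choice.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;

// Hypothetical example class, not part of Lucene.NET.
public sealed class NeverMergePolicySketch : MergePolicy
{
    public override MergePolicy.MergeSpecification FindMerges(
        MergeTrigger mergeTrigger, SegmentInfos segmentInfos)
    {
        // null means "no merges are necessary".
        return null;
    }

    public override MergePolicy.MergeSpecification FindForcedMerges(
        SegmentInfos segmentInfos, int maxSegmentCount,
        IDictionary<SegmentCommitInfo, bool?> segmentsToMerge)
    {
        // Assumption: null is also accepted here to request no merges.
        return null;
    }

    public override MergePolicy.MergeSpecification FindForcedDeletesMerges(
        SegmentInfos segmentInfos)
    {
        // Assumption: null is also accepted here to request no merges.
        return null;
    }

    protected override void Dispose(bool disposing)
    {
        // This sketch holds no resources, so there is nothing to release.
    }
}
```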

    Constructors


    MergePolicy()

    Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call SetIndexWriter(IndexWriter).

    Declaration
    protected MergePolicy()

    MergePolicy(Double, Int64)

    Creates a new merge policy instance with default settings for m_noCFSRatio and m_maxCFSSegmentSize. This constructor should be used by subclasses whose defaults differ from those of MergePolicy.

    Declaration
    protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize)
    Parameters
    Type Name Description
    System.Double defaultNoCFSRatio
    System.Int64 defaultMaxCFSSegmentSize

    Fields


    DEFAULT_MAX_CFS_SEGMENT_SIZE

    Default max segment size in order to use compound file system. Set to System.Int64.MaxValue.

    Declaration
    protected static readonly long DEFAULT_MAX_CFS_SEGMENT_SIZE
    Field Value
    Type Description
    System.Int64

    DEFAULT_NO_CFS_RATIO

    Default ratio for compound file system usage. Set to 1.0, which means the compound file system is always used.

    Declaration
    protected static readonly double DEFAULT_NO_CFS_RATIO
    Field Value
    Type Description
    System.Double

    m_maxCFSSegmentSize

    If the size of the merged segment exceeds this value then it will not use compound file format.

    Declaration
    protected long m_maxCFSSegmentSize
    Field Value
    Type Description
    System.Int64

    m_noCFSRatio

    If the size of the merged segment exceeds this ratio of the total index size, then it will remain in non-compound format.

    Declaration
    protected double m_noCFSRatio
    Field Value
    Type Description
    System.Double

    m_writer

    IndexWriter that contains this instance.

    Declaration
    protected SetOnce<IndexWriter> m_writer
    Field Value
    Type Description
    SetOnce<IndexWriter>

    Properties


    MaxCFSSegmentSizeMB

    Gets or Sets the largest size allowed for a compound file segment.

    If a merged segment will be more than this value, leave the segment as non-compound file even if compound file is enabled. Set this to System.Double.PositiveInfinity (default) and NoCFSRatio to 1.0 to always use CFS regardless of merge size.

    Declaration
    public double MaxCFSSegmentSizeMB { get; set; }
    Property Value
    Type Description
    System.Double
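
MaxCFSSegmentSizeMB is a double expressed in megabytes, while the backing field m_maxCFSSegmentSize is an Int64. The sketch below shows one way such a conversion can be done so that System.Double.PositiveInfinity maps to System.Int64.MaxValue. MegabytesToBytes is a hypothetical helper written as a C# 9 top-level program, and the 1024 * 1024 factor and the clamping rule are assumptions rather than the library's confirmed implementation.

```csharp
using System;

// Hypothetical helper: converts a size in megabytes to a byte count,
// clamping values that do not fit in a long (including +infinity).
static long MegabytesToBytes(double megabytes)
{
    if (double.IsNaN(megabytes) || megabytes < 0.0)
    {
        throw new ArgumentOutOfRangeException(
            nameof(megabytes), "Size must be zero or positive.");
    }

    // Assumption: 1 MB is treated as 1024 * 1024 bytes.
    double bytes = megabytes * 1024 * 1024;

    // Casting an out-of-range double to long is not well defined, so clamp first.
    return bytes >= long.MaxValue ? long.MaxValue : (long)bytes;
}

Console.WriteLine(MegabytesToBytes(1.5));                                      // 1572864
Console.WriteLine(MegabytesToBytes(double.PositiveInfinity) == long.MaxValue); // True
```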

    NoCFSRatio

    Gets or Sets current m_noCFSRatio.

    If a merged segment will be more than this fraction of the total size of the index, leave the segment as non-compound file even if compound file is enabled. Set to 1.0 to always use CFS regardless of merge size.

    Declaration
    public double NoCFSRatio { get; set; }
    Property Value
    Type Description
    System.Double

    Methods


    Clone()

    Declaration
    public virtual object Clone()
    Returns
    Type Description
    System.Object

    Dispose()

    Release all resources for the policy.

    Declaration
    public void Dispose()

    Dispose(Boolean)

    Release all resources for the policy.

    Declaration
    protected abstract void Dispose(bool disposing)
    Parameters
    Type Name Description
    System.Boolean disposing

    FindForcedDeletesMerges(SegmentInfos)

    Determine what set of merge operations is necessary in order to expunge all deletes from the index.

    Declaration
    public abstract MergePolicy.MergeSpecification FindForcedDeletesMerges(SegmentInfos segmentInfos)
    Parameters
    Type Name Description
    SegmentInfos segmentInfos

    the total set of segments in the index

    Returns
    Type Description
    MergePolicy.MergeSpecification

    FindForcedMerges(SegmentInfos, Int32, IDictionary<SegmentCommitInfo, Nullable<Boolean>>)

    Determine what set of merge operations is necessary in order to merge the index down to at most the specified segment count. IndexWriter calls this when its ForceMerge(Int32, Boolean) method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.

    Declaration
    public abstract MergePolicy.MergeSpecification FindForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, IDictionary<SegmentCommitInfo, bool?> segmentsToMerge)
    Parameters
    Type Name Description
    SegmentInfos segmentInfos

    The total set of segments in the index

    System.Int32 maxSegmentCount

    Requested maximum number of segments in the index (currently this is always 1)

    System.Collections.Generic.IDictionary<SegmentCommitInfo, System.Nullable<System.Boolean>> segmentsToMerge

    Contains the specific SegmentCommitInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given SegmentCommitInfo, that means this segment was an original segment present in the to-be-merged index; otherwise, it was a segment produced by a cascaded merge.

    Returns
    Type Description
    MergePolicy.MergeSpecification

    FindMerges(MergeTrigger, SegmentInfos)

    Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.

    Declaration
    public abstract MergePolicy.MergeSpecification FindMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos)
    Parameters
    Type Name Description
    MergeTrigger mergeTrigger

    the event that triggered the merge

    SegmentInfos segmentInfos

    the total set of segments in the index

    Returns
    Type Description
    MergePolicy.MergeSpecification

    IsMerged(SegmentInfos, SegmentCommitInfo)

    Returns true if this single info is already fully merged (has no pending deletes, is in the same directory as the writer, and matches the current compound file setting).

    Declaration
    protected bool IsMerged(SegmentInfos infos, SegmentCommitInfo info)
    Parameters
    Type Name Description
    SegmentInfos infos
    SegmentCommitInfo info
    Returns
    Type Description
    System.Boolean

    SetIndexWriter(IndexWriter)

    Sets the IndexWriter to be used by this merge policy. This method may be called only once, and is usually called by IndexWriter. If it is called more than once, AlreadySetException is thrown.

    Declaration
    public virtual void SetIndexWriter(IndexWriter writer)
    Parameters
    Type Name Description
    IndexWriter writer
    See Also
    SetOnce<T>

    Size(SegmentCommitInfo)

    Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.

    Declaration
    protected virtual long Size(SegmentCommitInfo info)
    Parameters
    Type Name Description
    SegmentCommitInfo info
    Returns
    Type Description
    System.Int64
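
The pro-rating performed by Size(SegmentCommitInfo) can be illustrated with plain numbers. The sketch below is a standalone approximation, not the library code: ProRatedSize and its parameters are hypothetical, the byte size and document counts are passed in directly instead of being read from a SegmentCommitInfo, and it is written as a C# 9 top-level program.

```csharp
using System;

// Hypothetical helper: scales a segment's byte size by the share of
// documents that are still live (not deleted).
static long ProRatedSize(long sizeInBytes, int docCount, int deletedDocCount)
{
    // Assumption: an empty segment is reported at its raw size.
    if (docCount <= 0)
    {
        return sizeInBytes;
    }

    double deletedRatio = (double)deletedDocCount / docCount;
    return (long)(sizeInBytes * (1.0 - deletedRatio));
}

// A 1000-byte segment with 25 of its 100 documents deleted counts as 750 bytes.
Console.WriteLine(ProRatedSize(1000, 100, 25)); // 750
```

Pro-rating in this way makes segments with many deletions look smaller, which can make them more attractive merge candidates for a size-based policy.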

    UseCompoundFile(SegmentInfos, SegmentCommitInfo)

    Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true if and only if the size of the given mergedInfo is less than or equal to MaxCFSSegmentSizeMB and less than or equal to TotalIndexSize * NoCFSRatio; otherwise it returns false.

    Declaration
    public virtual bool UseCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo)
    Parameters
    Type Name Description
    SegmentInfos infos
    SegmentCommitInfo mergedInfo
    Returns
    Type Description
    System.Boolean
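
The rule described for the default UseCompoundFile implementation can be sketched with plain byte counts. UseCompoundFileSketch and its parameters are hypothetical: the real method takes SegmentInfos and SegmentCommitInfo and measures sizes itself, whereas this C# 9 top-level sketch takes the sizes directly and assumes they are all expressed in bytes.

```csharp
using System;
using System.Linq;

// Hypothetical helper: applies the two documented limits to a merged segment.
static bool UseCompoundFileSketch(
    long mergedSizeBytes,
    long[] segmentSizesBytes,
    double noCfsRatio,
    long maxCfsSegmentSizeBytes)
{
    // Total index size is taken as the sum of all segment sizes.
    long totalIndexSize = segmentSizesBytes.Sum();

    return mergedSizeBytes <= maxCfsSegmentSizeBytes
        && mergedSizeBytes <= totalIndexSize * noCfsRatio;
}

long[] sizes = { 100, 400, 500 }; // total index size: 1000 bytes

// 100 <= 0.25 * 1000, and there is no absolute cap: compound format is used.
Console.WriteLine(UseCompoundFileSketch(100, sizes, 0.25, long.MaxValue)); // True

// 500 > 0.25 * 1000: the segment stays in non-compound format.
Console.WriteLine(UseCompoundFileSketch(500, sizes, 0.25, long.MaxValue)); // False

// Within the ratio, but above the 50-byte absolute cap.
Console.WriteLine(UseCompoundFileSketch(100, sizes, 1.0, 50));             // False
```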

    Implements

    System.IDisposable
    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.