Class MergePolicy
Expert: a MergePolicy determines the sequence of primitive merge operations.
Whenever IndexWriter alters the segments in an index, whether by adding a newly flushed segment, adding many segments through AddIndexes* calls, or completing a merge that may now need to cascade, it invokes FindMerges(MergeTrigger, SegmentInfos) to give the MergePolicy a chance to pick the merges that are now required. That method returns a MergePolicy.MergeSpecification instance describing the set of merges that should be done, or null if no merges are necessary. When ForceMerge(int) is called, IndexWriter calls FindForcedMerges(SegmentInfos, int, IDictionary<SegmentCommitInfo, bool>), and the MergePolicy should then return the necessary merges.
Note that the policy can return more than one merge at a time. In this case, if the writer is using SerialMergeScheduler, the merges will be run sequentially, but if it is using ConcurrentMergeScheduler, they will be run concurrently.
The default MergePolicy is TieredMergePolicy.
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Index
Assembly: Lucene.Net.dll
Syntax
public abstract class MergePolicy : IDisposable
Constructors
MergePolicy()
Creates a new merge policy instance. Note that if you intend to use it without passing it to IndexWriter, you should call SetIndexWriter(IndexWriter).
Declaration
protected MergePolicy()
MergePolicy(double, long)
Creates a new merge policy instance with default settings for m_noCFSRatio and m_maxCFSSegmentSize. This constructor should be used by subclasses whose defaults differ from those of MergePolicy.
Declaration
protected MergePolicy(double defaultNoCFSRatio, long defaultMaxCFSSegmentSize)
Parameters
Type | Name | Description |
---|---|---|
double | defaultNoCFSRatio | |
long | defaultMaxCFSSegmentSize |
Fields
DEFAULT_MAX_CFS_SEGMENT_SIZE
Default maximum segment size for which the compound file system is used. Set to long.MaxValue.
Declaration
protected static readonly long DEFAULT_MAX_CFS_SEGMENT_SIZE
Field Value
Type | Description |
---|---|
long |
DEFAULT_NO_CFS_RATIO
Default ratio for compound file system usage. Set to 1.0, which always uses the compound file system.
Declaration
protected static readonly double DEFAULT_NO_CFS_RATIO
Field Value
Type | Description |
---|---|
double |
m_maxCFSSegmentSize
If the size of the merged segment exceeds this value then it will not use compound file format.
Declaration
protected long m_maxCFSSegmentSize
Field Value
Type | Description |
---|---|
long |
m_noCFSRatio
If the size of the merged segment exceeds this ratio of the total index size, then it will remain in non-compound format.
Declaration
protected double m_noCFSRatio
Field Value
Type | Description |
---|---|
double |
m_writer
IndexWriter that contains this instance.
Declaration
protected SetOnce<IndexWriter> m_writer
Field Value
Type | Description |
---|---|
SetOnce<IndexWriter> |
Properties
MaxCFSSegmentSizeMB
Gets or sets the largest size, in megabytes, allowed for a compound file segment.
If a merged segment will be larger than this value, it is left in non-compound format even if compound file is enabled. Set this to double.PositiveInfinity (the default) and NoCFSRatio to 1.0 to always use CFS regardless of merge size.
Declaration
public double MaxCFSSegmentSizeMB { get; set; }
Property Value
Type | Description |
---|---|
double |
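The property is expressed in megabytes, while the description of m_maxCFSSegmentSize suggests the backing field holds a byte count. The following is a minimal, self-contained sketch of such a conversion, not the library's actual code. The names SetMaxCfsSegmentSizeMB, GetMaxCfsSegmentSizeMB, and maxCfsSegmentSizeBytes are hypothetical, and clamping to long.MaxValue is an assumption about how an infinite setting could be stored in a long.

```csharp
using System;

// Hypothetical stand-in for the protected m_maxCFSSegmentSize field (assumed to be in bytes).
long maxCfsSegmentSizeBytes = long.MaxValue;

// Setter side: megabytes -> bytes. Values too large for a long (including
// PositiveInfinity) are clamped to long.MaxValue instead of overflowing.
void SetMaxCfsSegmentSizeMB(double megabytes)
{
    if (double.IsNaN(megabytes) || megabytes < 0.0)
        throw new ArgumentOutOfRangeException(nameof(megabytes), "must be >= 0");
    double bytes = megabytes * 1024 * 1024;
    maxCfsSegmentSizeBytes = bytes >= long.MaxValue ? long.MaxValue : (long)bytes;
}

// Getter side: bytes -> megabytes.
double GetMaxCfsSegmentSizeMB() => maxCfsSegmentSizeBytes / 1024 / 1024.0;

SetMaxCfsSegmentSizeMB(512);
Console.WriteLine(maxCfsSegmentSizeBytes);                  // 536870912
SetMaxCfsSegmentSizeMB(double.PositiveInfinity);
Console.WriteLine(maxCfsSegmentSizeBytes == long.MaxValue); // True
```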
NoCFSRatio
Gets or sets the current m_noCFSRatio.
If a merged segment will be larger than this fraction of the total index size, it is left in non-compound format even if compound file is enabled. Set to 1.0 to always use CFS regardless of merge size.
Declaration
public double NoCFSRatio { get; set; }
Property Value
Type | Description |
---|---|
double |
Methods
Clone()
Returns a copy of this merge policy. The copy starts with a fresh, unset m_writer, so it can be attached to a different IndexWriter.
Declaration
public virtual object Clone()
Returns
Type | Description |
---|---|
object |
Dispose()
Release all resources for the policy.
Declaration
public void Dispose()
Dispose(bool)
Release all resources for the policy.
Declaration
protected abstract void Dispose(bool disposing)
Parameters
Type | Name | Description |
---|---|---|
bool | disposing |
FindForcedDeletesMerges(SegmentInfos)
Determine what set of merge operations is necessary in order to expunge all deletes from the index.
Declaration
public abstract MergePolicy.MergeSpecification FindForcedDeletesMerges(SegmentInfos segmentInfos)
Parameters
Type | Name | Description |
---|---|---|
SegmentInfos | segmentInfos | the total set of segments in the index |
Returns
Type | Description |
---|---|
MergePolicy.MergeSpecification |
FindForcedMerges(SegmentInfos, int, IDictionary<SegmentCommitInfo, bool>)
Determine what set of merge operations is necessary in order to merge down to at most the specified segment count. IndexWriter calls this when its ForceMerge(int, bool) method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
Declaration
public abstract MergePolicy.MergeSpecification FindForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, IDictionary<SegmentCommitInfo, bool> segmentsToMerge)
Parameters
Type | Name | Description |
---|---|---|
SegmentInfos | segmentInfos | The total set of segments in the index |
int | maxSegmentCount | Requested maximum number of segments in the index (currently this is always 1) |
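IDictionary<SegmentCommitInfo, bool> | segmentsToMerge | Contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given SegmentInfo, that segment was an original segment present in the to-be-merged index; otherwise, it was produced by a cascaded merge. |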
Returns
Type | Description |
---|---|
MergePolicy.MergeSpecification |
FindMerges(MergeTrigger, SegmentInfos)
Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
Declaration
public abstract MergePolicy.MergeSpecification FindMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos)
Parameters
Type | Name | Description |
---|---|---|
MergeTrigger | mergeTrigger | the event that triggered the merge |
SegmentInfos | segmentInfos | the total set of segments in the index |
Returns
Type | Description |
---|---|
MergePolicy.MergeSpecification |
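To illustrate the contract described here (return a specification that may hold several merges, or null when nothing needs merging), the following toy sketch models segments as plain strings. It is not a real MergePolicy and not how TieredMergePolicy selects merges. FindMergesToy, mergeFactor, and the "group adjacent segments" rule are all hypothetical, chosen only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Toy rule (an assumption for illustration): group segments, in index order,
// into merges of exactly mergeFactor segments. Leftover segments stay unmerged.
// Returns null when no merge is needed, mirroring the documented contract.
List<List<string>> FindMergesToy(IReadOnlyList<string> segments, int mergeFactor)
{
    if (mergeFactor < 2) throw new ArgumentOutOfRangeException(nameof(mergeFactor));
    List<List<string>> spec = null;
    for (int start = 0; start + mergeFactor <= segments.Count; start += mergeFactor)
    {
        spec ??= new List<List<string>>();          // allocate only if a merge is found
        var merge = new List<string>();
        for (int i = start; i < start + mergeFactor; i++) merge.Add(segments[i]);
        spec.Add(merge);                            // one entry per independent merge
    }
    return spec;
}

var toySpec = FindMergesToy(new[] { "_0", "_1", "_2", "_3", "_4" }, 2);
foreach (var merge in toySpec) Console.WriteLine(string.Join(" + ", merge));
// _0 + _1
// _2 + _3
```

A scheduler that runs merges serially would process the two entries one after the other, while a concurrent one could run them in parallel because they share no segments.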
IsMerged(SegmentInfos, SegmentCommitInfo)
Returns true if this single info is already fully merged (has no pending deletes, is in the same dir as the writer, and matches the current compound file setting).
Declaration
protected bool IsMerged(SegmentInfos infos, SegmentCommitInfo info)
Parameters
Type | Name | Description |
---|---|---|
SegmentInfos | infos | |
SegmentCommitInfo | info |
Returns
Type | Description |
---|---|
bool |
SetIndexWriter(IndexWriter)
Sets the IndexWriter to use by this merge policy. This method is allowed to be called only once, and is usually set by IndexWriter. If it is called more than once, AlreadySetException is thrown.
Declaration
public virtual void SetIndexWriter(IndexWriter writer)
Parameters
Type | Name | Description |
---|---|---|
IndexWriter | writer |
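The set-once behavior can be pictured with the following self-contained sketch. It is only a model: a plain object stands in for IndexWriter, InvalidOperationException stands in for AlreadySetException, and unlike a production implementation it makes no attempt at thread safety.

```csharp
using System;

// Hypothetical stand-in for the m_writer slot; null means "not set yet".
object writer = null;

void SetIndexWriter(object w)
{
    if (w == null) throw new ArgumentNullException(nameof(w));
    if (writer != null)
        throw new InvalidOperationException("The writer has already been set."); // stands in for AlreadySetException
    writer = w;
}

SetIndexWriter(new object());        // first call succeeds
try
{
    SetIndexWriter(new object());    // second call is rejected
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message);    // The writer has already been set.
}
```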
Size(SegmentCommitInfo)
Returns the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
Declaration
protected virtual long Size(SegmentCommitInfo info)
Parameters
Type | Name | Description |
---|---|---|
SegmentCommitInfo | info |
Returns
Type | Description |
---|---|
long |
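The pro-rating can be read as scaling the raw byte size by the share of documents that are still live. A minimal sketch of that arithmetic follows, assuming the byte size, document count, and deleted-document count are already known. ProRatedSize and its parameters are hypothetical names, and the handling of an empty segment is an assumption.

```csharp
using System;

// byteSize: raw on-disk size; docCount: total documents; delCount: deleted documents.
long ProRatedSize(long byteSize, int docCount, int delCount)
{
    if (docCount <= 0) return byteSize;              // assumption: nothing to pro-rate
    double delRatio = (double)delCount / docCount;   // share of deleted documents
    return (long)(byteSize * (1.0 - delRatio));      // keep only the live share
}

Console.WriteLine(ProRatedSize(1000, 100, 25)); // 750
```

Under this model a segment with many deletes looks smaller than its file size, which makes it a cheaper merge candidate.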
UseCompoundFile(SegmentInfos, SegmentCommitInfo)
Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true iff the size of the given mergedInfo is less than or equal to MaxCFSSegmentSizeMB and less than or equal to TotalIndexSize * NoCFSRatio; otherwise it returns false.
Declaration
public virtual bool UseCompoundFile(SegmentInfos infos, SegmentCommitInfo mergedInfo)
Parameters
Type | Name | Description |
---|---|---|
SegmentInfos | infos | |
SegmentCommitInfo | mergedInfo |
Returns
Type | Description |
---|---|
bool |
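The two conditions in the description of the default implementation can be sketched as follows. This is a simplified model and not the library's code: segment sizes are passed in as plain byte counts, the function name UseCompoundFileSketch and its parameters are hypothetical, and the shortcut for a ratio of 1.0 or more is an assumption based on the statement that 1.0 always uses CFS.

```csharp
using System;
using System.Linq;

bool UseCompoundFileSketch(long[] segmentSizes, long mergedSize,
                           double noCfsRatio, long maxCfsSegmentSize)
{
    if (mergedSize > maxCfsSegmentSize) return false;  // too large for a compound file
    if (noCfsRatio >= 1.0) return true;                // assumption: 1.0 means "always CFS"
    long totalIndexSize = segmentSizes.Sum();
    return mergedSize <= noCfsRatio * totalIndexSize;  // small enough relative to the index
}

long[] sizes = { 100, 200, 700 };                      // total index size: 1000 bytes
Console.WriteLine(UseCompoundFileSketch(sizes, 80, 0.1, long.MaxValue));   // True
Console.WriteLine(UseCompoundFileSketch(sizes, 150, 0.1, long.MaxValue));  // False
Console.WriteLine(UseCompoundFileSketch(sizes, 80, 0.1, 50));              // False
```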