    Class SegmentInfos

    A collection of SegmentInfo objects with methods for operating on those segments in relation to the file system.

    The active segments in the index are stored in the segment info file, segments_N. There may be one or more segments_N files in the index; however, the one with the largest generation is the active one. Older segments_N files are present only when they temporarily cannot be deleted, a writer is in the process of committing, or a custom IndexDeletionPolicy is in use. This file lists each segment by name and has details about the codec and generation of deletes.

    There is also a file segments.gen. This file contains the current generation (the _N in segments_N) of the index. It is used only as a fallback in case the current generation cannot be accurately determined by directory listing alone (as is the case for some NFS clients with time-based directory cache expiration). The file contains a WriteInt32(int) version header (FORMAT_SEGMENTS_GEN_CURRENT), followed by the generation recorded as WriteInt64(long), written twice.

    Files:

    • segments.gen: GenHeader, Generation, Generation, Footer
    • segments_N: Header, Version, NameCounter, SegCount, <SegName, SegCodec, DelGen, DeletionCount, FieldInfosGen, UpdatesFiles>^SegCount, CommitUserData, Footer
    Data types:

    • Header --> WriteHeader(DataOutput, string, int)
    • GenHeader, NameCounter, SegCount, DeletionCount --> WriteInt32(int)
    • Generation, Version, DelGen, Checksum, FieldInfosGen --> WriteInt64(long)
    • SegName, SegCodec --> WriteString(string)
    • CommitUserData --> WriteStringStringMap(IDictionary<string, string>)
    • UpdatesFiles --> WriteStringSet(ISet<string>)
    • Footer --> WriteFooter(IndexOutput)
    Field Descriptions:

    • Version counts how often the index has been changed by adding or deleting documents.
    • NameCounter is used to generate names for new segment files.
    • SegName is the name of the segment, and is used as the file name prefix for all of the files that compose the segment's index.
    • DelGen is the generation count of the deletes file. If this is -1, there are no deletes. Anything above zero means there are deletes stored by LiveDocsFormat.
    • DeletionCount records the number of deleted documents in this segment.
    • SegCodec is the Name of the Codec that encoded this segment.
    • CommitUserData stores an optional user-supplied opaque IDictionary<string, string> that was passed to SetCommitData(IDictionary<string, string>).
    • FieldInfosGen is the generation count of the fieldInfos file. If this is -1, there are no updates to the fieldInfos in that segment. Anything above zero means there are updates to fieldInfos stored by FieldInfosFormat.
    • UpdatesFiles stores the list of files that were updated in that segment.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    SegmentInfos
    Implements
    IEnumerable<SegmentCommitInfo>
    IEnumerable
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Index
    Assembly: Lucene.Net.dll
    Syntax
    public sealed class SegmentInfos : IEnumerable<SegmentCommitInfo>, IEnumerable

    Constructors

    SegmentInfos()

    Sole constructor. Typically you call this and then use Read(Directory) or Read(Directory, string) to populate each SegmentCommitInfo. Alternatively, you can add/remove your own SegmentCommitInfos.

    Declaration
    public SegmentInfos()
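
    The construct-then-read pattern described above might look like the following minimal sketch. It assumes the first command-line argument is the path of an existing index, and it relies on members of other Lucene.Net types that are not documented on this page (FSDirectory.Open, SegmentCommitInfo.Info, SegmentCommitInfo.DelCount, SegmentInfo.Name, SegmentInfo.DocCount). The class name SegmentInfosDump is hypothetical.

```csharp
using System;
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class SegmentInfosDump
{
    public static void Main(string[] args)
    {
        // Assumption: args[0] is the path of an existing Lucene.NET index.
        using (Directory dir = FSDirectory.Open(args[0]))
        {
            var infos = new SegmentInfos();

            // Finds the latest commit (segments_N file) and loads its segments.
            infos.Read(dir);

            Console.WriteLine(
                $"file={infos.GetSegmentsFileName()} generation={infos.Generation} " +
                $"version={infos.Version} segments={infos.Count}");

            // SegmentInfos enumerates its SegmentCommitInfo entries in order.
            foreach (SegmentCommitInfo sci in infos)
            {
                Console.WriteLine(
                    $"{sci.Info.Name}: docs={sci.Info.DocCount} deleted={sci.DelCount}");
            }
        }
    }
}
```

    Because this API is marked experimental, treat the sketch as a diagnostic aid and not as a stable integration point.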

    Fields

    FORMAT_SEGMENTS_GEN_CURRENT

    Current format of segments.gen

    Declaration
    public static readonly int FORMAT_SEGMENTS_GEN_CURRENT
    Field Value
    Type Description
    int

    VERSION_40

    The file format version for the segments_N codec header, up to 4.5.

    Declaration
    public static readonly int VERSION_40
    Field Value
    Type Description
    int

    VERSION_46

    The file format version for the segments_N codec header, for 4.6 and later.

    Declaration
    public static readonly int VERSION_46
    Field Value
    Type Description
    int

    VERSION_48

    The file format version for the segments_N codec header, for 4.8 and later.

    Declaration
    public const int VERSION_48 = 2
    Field Value
    Type Description
    int

    Properties

    Count

    Returns number of SegmentCommitInfos.

    NOTE: This was size() in Lucene.
    Declaration
    public int Count { get; }
    Property Value
    Type Description
    int

    Counter

    Used to name new segments.

    Declaration
    public int Counter { get; set; }
    Property Value
    Type Description
    int

    DefaultGenLookaheadCount

    Gets or sets the default generation lookahead count.

    Advanced: set how many times to try incrementing the gen when loading the segments file. This only runs if the primary (listing directory) and secondary (opening segments.gen file) methods fail to find the segments file.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Declaration
    public static int DefaultGenLookaheadCount { get; set; }
    Property Value
    Type Description
    int

    Generation

    Returns current generation.

    Declaration
    public long Generation { get; }
    Property Value
    Type Description
    long

    InfoStream

    If non-null, information about retries when loading the segments file will be printed to this.

    Declaration
    public static TextWriter InfoStream { get; set; }
    Property Value
    Type Description
    TextWriter

    this[int]

    Returns SegmentCommitInfo at the provided index.

    This was info(int) in Lucene.
    Declaration
    public SegmentCommitInfo this[int index] { get; }
    Parameters
    Type Name Description
    int index
    Property Value
    Type Description
    SegmentCommitInfo

    LastGeneration

    Returns the last successfully read or written generation.

    Declaration
    public long LastGeneration { get; }
    Property Value
    Type Description
    long

    Segments

    Declaration
    public IList<SegmentCommitInfo> Segments { get; }
    Property Value
    Type Description
    IList<SegmentCommitInfo>

    TotalDocCount

    Returns the sum of the doc counts of all segments. Note that this does not account for deletions.

    Declaration
    public int TotalDocCount { get; }
    Property Value
    Type Description
    int

    UseLegacySegmentNames

    Setting this to true will generate the same file names that were used in 4.8.0-beta00001 through 4.8.0-beta00015. When writing more than 10 segments, these segment names were incompatible with prior versions of Lucene.NET and incompatible with Lucene 4.8.0.

    This is only for reading codecs from the affected 4.8.0 beta versions; it is not recommended for general use.

    This must be set prior to opening an index at application startup. When setting it at other times the behavior is undefined.

    Note that this property can also be enabled by setting the "useLegacySegmentNames" system property to "true" (for example, through the environment variable "lucene:useLegacySegmentNames"). System properties can also be injected by supplying an IConfigurationFactory at application startup through SetConfigurationFactory(IConfigurationFactory).
    Declaration
    public static bool UseLegacySegmentNames { get; set; }
    Property Value
    Type Description
    bool
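
    Since the property must be set before any index is opened, one option is to assign it in a startup routine that runs first. The sketch below only shows the assignment; the class and method names (LuceneStartup, Configure) are hypothetical, and where the routine is called from depends on the host application.

```csharp
using Lucene.Net.Index;

public static class LuceneStartup
{
    // Call once at application startup, before any index is opened.
    public static void Configure()
    {
        // Only intended for indexes written by the affected 4.8.0 beta versions.
        SegmentInfos.UseLegacySegmentNames = true;
    }
}
```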

    UserData

    Gets userData saved with this commit.

    Declaration
    public IDictionary<string, string> UserData { get; }
    Property Value
    Type Description
    IDictionary<string, string>
    See Also
    Commit()

    Version

    Version number when this SegmentInfos was generated.

    Declaration
    public long Version { get; }
    Property Value
    Type Description
    long

    Methods

    Add(SegmentCommitInfo)

    Appends the provided SegmentCommitInfo.

    Declaration
    public void Add(SegmentCommitInfo si)
    Parameters
    Type Name Description
    SegmentCommitInfo si

    AddAll(IEnumerable<SegmentCommitInfo>)

    Appends the provided SegmentCommitInfos.

    Declaration
    public void AddAll(IEnumerable<SegmentCommitInfo> sis)
    Parameters
    Type Name Description
    IEnumerable<SegmentCommitInfo> sis

    AsList()

    Returns all contained segments as an unmodifiable IList<SegmentCommitInfo> view.

    Declaration
    public IList<SegmentCommitInfo> AsList()
    Returns
    Type Description
    IList<SegmentCommitInfo>

    Changed()

    Call this before committing if changes have been made to the segments.

    Declaration
    public void Changed()

    Clear()

    Clear all SegmentCommitInfos.

    Declaration
    public void Clear()

    Clone()

    Returns a copy of this instance, also copying each SegmentInfo.

    Declaration
    public object Clone()
    Returns
    Type Description
    object

    GenerationFromSegmentsFileName(string)

    Parse the generation off the segments file name and return it.

    Declaration
    public static long GenerationFromSegmentsFileName(string fileName)
    Parameters
    Type Name Description
    string fileName
    Returns
    Type Description
    long
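
    To illustrate what parsing the generation involves, here is a self-contained sketch that does not call Lucene.Net. It assumes the suffix after "segments_" is the generation written in base 36 (digits 0-9, then a-z) and that the bare name "segments" denotes generation 0; this page does not state the encoding. The names SegmentsFileNames and ParseGeneration are hypothetical and are not part of the library.

```csharp
using System;

public static class SegmentsFileNames
{
    private const string Prefix = "segments_";
    private const string Digits = "0123456789abcdefghijklmnopqrstuvwxyz";

    // Parses N from "segments_N", reading N as a base-36 number (assumption).
    public static long ParseGeneration(string fileName)
    {
        if (fileName == "segments")
        {
            return 0; // bare name without a generation suffix
        }
        if (!fileName.StartsWith(Prefix, StringComparison.Ordinal)
            || fileName.Length == Prefix.Length)
        {
            throw new ArgumentException("Not a segments file name: " + fileName);
        }

        long generation = 0;
        foreach (char c in fileName.Substring(Prefix.Length))
        {
            int digit = Digits.IndexOf(char.ToLowerInvariant(c));
            if (digit < 0)
            {
                throw new FormatException("Invalid base-36 digit: " + c);
            }
            // checked: fail loudly on overflow instead of wrapping around.
            generation = checked(generation * 36 + digit);
        }
        return generation;
    }
}
```

    Under these assumptions, "segments_a" parses to 10 and "segments_10" parses to 36, which is why generation suffixes should not be read as decimal numbers.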

    GetEnumerator()

    Returns an unmodifiable IEnumerator<SegmentCommitInfo> of contained segments in order.

    Declaration
    public IEnumerator<SegmentCommitInfo> GetEnumerator()
    Returns
    Type Description
    IEnumerator<SegmentCommitInfo>

    GetFiles(Directory, bool)

    Returns all file names referenced by SegmentInfo instances matching the provided Directory (i.e., files associated with any "external" segments are skipped). The returned collection is recomputed on each invocation.

    Declaration
    public ICollection<string> GetFiles(Directory dir, bool includeSegmentsFile)
    Parameters
    Type Name Description
    Directory dir
    bool includeSegmentsFile
    Returns
    Type Description
    ICollection<string>

    GetLastCommitGeneration(Directory)

    Get the generation of the most recent commit to the index in this directory (N in the segments_N file).

    Declaration
    public static long GetLastCommitGeneration(Directory directory)
    Parameters
    Type Name Description
    Directory directory

    directory to search for the latest segments_N file

    Returns
    Type Description
    long

    GetLastCommitGeneration(string[])

    Get the generation of the most recent commit to the list of index files (N in the segments_N file).

    Declaration
    public static long GetLastCommitGeneration(string[] files)
    Parameters
    Type Name Description
    string[] files

    array of file names to check

    Returns
    Type Description
    long

    GetLastCommitSegmentsFileName(Directory)

    Get the filename of the segments_N file for the most recent commit to the index in this Directory.

    Declaration
    public static string GetLastCommitSegmentsFileName(Directory directory)
    Parameters
    Type Name Description
    Directory directory

    directory to search for the latest segments_N file

    Returns
    Type Description
    string

    GetLastCommitSegmentsFileName(string[])

    Get the filename of the segments_N file for the most recent commit in the list of index files.

    Declaration
    public static string GetLastCommitSegmentsFileName(string[] files)
    Parameters
    Type Name Description
    string[] files

    array of file names to check

    Returns
    Type Description
    string
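
    The string[] overloads can be used on a file listing obtained by other means, without opening a Directory. The sketch below passes a made-up listing to both static methods; the file names are illustrative only and the class name LastCommitLookup is hypothetical.

```csharp
using System;
using Lucene.Net.Index;

public static class LastCommitLookup
{
    public static void Main()
    {
        // Illustrative listing; a real one would come from the index directory.
        string[] files = { "_0.cfs", "_0.si", "segments_1", "segments_2", "segments.gen" };

        // Generation of the most recent commit among the listed files.
        long generation = SegmentInfos.GetLastCommitGeneration(files);

        // Name of the segments_N file that belongs to that commit.
        string fileName = SegmentInfos.GetLastCommitSegmentsFileName(files);

        Console.WriteLine($"generation={generation} file={fileName}");
    }
}
```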

    GetNextSegmentFileName()

    Get the next segments_N filename that will be written.

    Declaration
    public string GetNextSegmentFileName()
    Returns
    Type Description
    string

    GetSegmentsFileName()

    Get the segments_N filename in use by this segment infos.

    Declaration
    public string GetSegmentsFileName()
    Returns
    Type Description
    string

    Read(Directory)

    Find the latest commit (segments_N file) and load all SegmentCommitInfos.

    Declaration
    public void Read(Directory directory)
    Parameters
    Type Name Description
    Directory directory

    Read(Directory, string)

    Read a particular segmentFileName. Note that this may throw an IOException if a commit is in process.

    Declaration
    public void Read(Directory directory, string segmentFileName)
    Parameters
    Type Name Description
    Directory directory

    directory containing the segments file

    string segmentFileName

    segment file to load

    Exceptions
    Type Condition
    CorruptIndexException

    if the index is corrupt

    IOException

    if there is a low-level IO error

    Remove(SegmentCommitInfo)

    Remove the provided SegmentCommitInfo.

    WARNING: O(N) cost
    Declaration
    public void Remove(SegmentCommitInfo si)
    Parameters
    Type Name Description
    SegmentCommitInfo si

    ToString(Directory)

    Returns a readable description of this segment.

    Declaration
    public string ToString(Directory directory)
    Parameters
    Type Name Description
    Directory directory
    Returns
    Type Description
    string

    Write3xInfo(Directory, SegmentInfo, IOContext)

    Declaration
    [Obsolete]
    public static string Write3xInfo(Directory dir, SegmentInfo si, IOContext context)
    Parameters
    Type Name Description
    Directory dir
    SegmentInfo si
    IOContext context
    Returns
    Type Description
    string

    WriteSegmentsGen(Directory, long)

    A utility for writing the SEGMENTS_GEN file to a Directory.

    NOTE: this is an internal utility which is kept public so that it's accessible by code from other packages. You should avoid calling this method unless you're absolutely sure what you're doing!

    Note

    This API is for internal purposes only and might change in incompatible ways in the next release.

    Declaration
    public static void WriteSegmentsGen(Directory dir, long generation)
    Parameters
    Type Name Description
    Directory dir
    long generation

    Implements

    IEnumerable<T>
    IEnumerable
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.