
    Class MultiPassIndexSplitter

    This tool splits an input index into multiple equal parts. It uses Lucene.Net.Index.IndexWriter.AddIndexes(params Lucene.Net.Index.IndexReader[]), feeding it the input index with deletes artificially applied to the document IDs that fall outside the selected partition.

    Note 1: Deletes are applied only to a buffered list of deleted documents and do not affect the source index, so this tool also works with read-only indexes.

    Note 2: The disadvantage of this tool is that the source index must be read once for each part to be created, hence the name of this tool.

    NOTE: this tool is unaware of documents added atomically via AddDocuments(IEnumerable<IEnumerable<IIndexableField>>, Analyzer) or UpdateDocuments(Term, IEnumerable<IEnumerable<IIndexableField>>, Analyzer), which means it can easily break up such document groups.
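
    The partitioning idea described above (each pass keeps one partition and masks every other document ID as deleted) can be sketched without any Lucene.NET dependency. This is a minimal illustration, not the library's code: the names PartitionSketch and KeptDocIds are hypothetical, and the choice to let the last sequential part absorb any remainder is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class PartitionSketch
{
    // Returns the document IDs that pass number `part` would keep.
    // Every other ID in [0, maxDoc) would be masked as deleted for that pass.
    public static List<int> KeptDocIds(int maxDoc, int numParts, int part, bool seq)
    {
        var kept = new List<int>();
        int partLen = maxDoc / numParts;

        for (int docId = 0; docId < maxDoc; docId++)
        {
            bool keep;
            if (seq)
            {
                // Sequential: contiguous, increasing ranges of document IDs.
                int lo = part * partLen;
                // Assumption for this sketch: the last part takes the remainder.
                int hi = (part == numParts - 1) ? maxDoc : lo + partLen;
                keep = docId >= lo && docId < hi;
            }
            else
            {
                // Round-robin: document IDs are dealt out to the parts in turn.
                keep = docId % numParts == part;
            }

            if (keep) kept.Add(docId);
        }
        return kept;
    }

    public static void Main()
    {
        // One pass over the ID space per part, mirroring the multi-pass design.
        for (int part = 0; part < 3; part++)
        {
            Console.WriteLine($"seq part {part}: {string.Join(",", KeptDocIds(10, 3, part, true))}");
            Console.WriteLine($"rr  part {part}: {string.Join(",", KeptDocIds(10, 3, part, false))}");
        }
    }
}
```

    With 10 documents and 3 parts, the sequential mode in this sketch keeps 0,1,2 then 3,4,5 then 6,7,8,9, while round-robin keeps 0,3,6,9 then 1,4,7 then 2,5,8. Because neighboring IDs can land in different parts, the sketch also shows why atomically added document groups may be broken up.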
    Inheritance
    object
    MultiPassIndexSplitter
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Lucene.Net.Index
    Assembly: Lucene.Net.Misc.dll
    Syntax
    public class MultiPassIndexSplitter

    Methods

    Main(string[])

    LUCENENET specific: In the Java implementation, this Main method was intended to be called from the command line. In .NET, however, a method within a DLL can't be called directly from the command line, so we provide a .NET tool, lucene-cli, with a command that maps to this method: index split

    Declaration
    public static void Main(string[] args)
    Parameters
    Type Name Description
    string[] args
    Exceptions
    Type Condition
    ArgumentException

    Split(LuceneVersion, IndexReader, Directory[], bool)

    Splits the source index into multiple parts.

    Declaration
    public virtual void Split(LuceneVersion version, IndexReader @in, Directory[] outputs, bool seq)
    Parameters
    Type Name Description
    LuceneVersion version

    Lucene compatibility version.

    IndexReader in

    Source index. It can have deletions and can consist of multiple segments (or multiple readers).

    Directory[] outputs

    List of directories where the output parts will be stored.

    bool seq

    If true, the source index is split into equal, increasing ranges of document IDs. If false, source document IDs are assigned to the output splits in a deterministic round-robin fashion.

    Exceptions
    Type Condition
    IOException

    If there is a low-level I/O error
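
    A call to Split might look like the following sketch. The directory paths and the class name SplitExample are hypothetical and chosen only for illustration; the example assumes a project that references Lucene.Net and Lucene.Net.Misc (4.8) and that the source path already contains an index.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;
// Alias avoids ambiguity with System.IO.Directory.
using Directory = Lucene.Net.Store.Directory;

public static class SplitExample
{
    public static void Main()
    {
        // Hypothetical locations; replace with real paths.
        string sourcePath = "path/to/source-index";
        string[] partPaths = { "path/to/part-0", "path/to/part-1", "path/to/part-2" };

        // The source is only read, so a read-only index is acceptable.
        using Directory source = FSDirectory.Open(sourcePath);
        using IndexReader reader = DirectoryReader.Open(source);

        var outputs = new Directory[partPaths.Length];
        try
        {
            // One output directory per part to be created.
            for (int i = 0; i < partPaths.Length; i++)
            {
                outputs[i] = FSDirectory.Open(partPaths[i]);
            }

            var splitter = new MultiPassIndexSplitter();

            // seq: true requests contiguous document ID ranges;
            // pass false for round-robin assignment instead.
            splitter.Split(LuceneVersion.LUCENE_48, reader, outputs, seq: true);
        }
        finally
        {
            // Dispose any output directories that were opened.
            foreach (Directory output in outputs)
            {
                output?.Dispose();
            }
        }
    }
}
```

    The source index is read once per output directory, so three outputs mean three passes over the reader.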

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.