
    Class TermVectorsWriter

    Codec API for writing term vectors:

    1. For every document, StartDocument(Int32) is called, informing the Codec how many fields will be written.
    2. StartField(FieldInfo, Int32, Boolean, Boolean, Boolean) is called for each field in the document, informing the codec how many terms will be written for that field, and whether or not positions, offsets, or payloads are enabled.
    3. Within each field, StartTerm(BytesRef, Int32) is called for each term.
    4. If offsets and/or positions are enabled, then AddPosition(Int32, Int32, Int32, BytesRef) will be called for each term occurrence.
    5. After all documents have been written, Finish(FieldInfos, Int32) is called for verification and sanity checks.
    6. Finally, the writer is disposed (Dispose(Boolean)).
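
    The call sequence above can be sketched as a writer that records each callback instead of writing files. This is a minimal sketch, not how the shipped codecs are implemented. The class name RecordingTermVectorsWriter and its Calls list are hypothetical, and BytesRef.UTF8SortedAsUnicodeComparer, BytesRef.Utf8ToString(), and FieldInfo.Name are assumed from the wider Lucene.NET 4.8 API rather than taken from this page.

```csharp
using System.Collections.Generic;
using Lucene.Net.Codecs;
using Lucene.Net.Index;
using Lucene.Net.Util;

// Hypothetical writer that logs every callback it receives, in order.
public sealed class RecordingTermVectorsWriter : TermVectorsWriter
{
    // Hypothetical log; not part of the TermVectorsWriter API.
    public List<string> Calls { get; } = new List<string>();

    // Assumption: terms arrive in UTF-8 byte order.
    public override IComparer<BytesRef> Comparer => BytesRef.UTF8SortedAsUnicodeComparer;

    // Step 1: once per document.
    public override void StartDocument(int numVectorFields)
        => Calls.Add($"StartDocument({numVectorFields})");

    // Step 2: once per vector field in the document.
    public override void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)
        => Calls.Add($"StartField({info.Name}, terms={numTerms})");

    // Step 3: once per term in the field.
    public override void StartTerm(BytesRef term, int freq)
        => Calls.Add($"StartTerm({term.Utf8ToString()}, freq={freq})");

    // Step 4: once per occurrence when positions and/or offsets are enabled.
    public override void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)
        => Calls.Add($"AddPosition({position}, {startOffset}, {endOffset})");

    // Step 5: once, after all documents.
    public override void Finish(FieldInfos fis, int numDocs)
        => Calls.Add($"Finish({numDocs})");

    public override void Abort() => Calls.Add("Abort()");

    // Step 6: reached through the public Dispose().
    protected override void Dispose(bool disposing)
    {
        if (disposing) Calls.Add("Dispose(true)");
    }
}
```

    Driving an instance by hand in the documented order and then inspecting Calls is one way to check an understanding of the protocol before writing a real codec.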

    This is a Lucene.NET EXPERIMENTAL API; use at your own risk.
    Inheritance
    System.Object
    TermVectorsWriter
    CompressingTermVectorsWriter
    Lucene40TermVectorsWriter
    Namespace: Lucene.Net.Codecs
    Assembly: Lucene.Net.dll
    Syntax
    public abstract class TermVectorsWriter : IDisposable

    Constructors


    TermVectorsWriter()

    Sole constructor. (For invocation by subclass constructors, typically implicit.)

    Declaration
    protected TermVectorsWriter()

    Properties


    Comparer

    Returns the IComparer<BytesRef> used to sort terms before they are fed to this API.

    Declaration
    public abstract IComparer<BytesRef> Comparer { get; }
    Property Value
    Type Description
    IComparer<BytesRef>

    Methods


    Abort()

    Aborts writing entirely. The implementation should remove any partially written files and release related resources.

    Declaration
    public abstract void Abort()
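
    One way to approach Abort() is to dispose the writer and then delete whatever files it had started, ignoring failures along the way. The following is a sketch under stated assumptions: the class name, its constructor, and the fileNames list are hypothetical, and Directory.DeleteFile(string) is assumed from the Lucene.Net.Store API rather than documented on this page.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Codecs;
using Directory = Lucene.Net.Store.Directory;

// Hypothetical intermediate base class; the remaining abstract members
// are left for a concrete subclass to implement.
public abstract class CleanupOnAbortTermVectorsWriter : TermVectorsWriter
{
    private readonly Directory directory;
    private readonly IList<string> fileNames; // hypothetical: files this writer created

    protected CleanupOnAbortTermVectorsWriter(Directory directory, IList<string> fileNames)
    {
        this.directory = directory;
        this.fileNames = fileNames;
    }

    public override void Abort()
    {
        // Release open outputs first so the files can be deleted.
        try { Dispose(); } catch (Exception) { /* best effort */ }

        foreach (string name in fileNames)
        {
            // Keep going even if one file is missing or locked.
            try { directory.DeleteFile(name); } catch (Exception) { /* best effort */ }
        }
    }
}
```

    Swallowing exceptions here is a deliberate choice in this sketch: Abort() usually runs while another failure is already being handled, so a cleanup error should not mask the original one.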

    AddAllDocVectors(Fields, MergeState)

    Safe (but slowish) default method to write every vector field in the document.

    Declaration
    protected void AddAllDocVectors(Fields vectors, MergeState mergeState)
    Parameters
    Type Name Description
    Fields vectors
    MergeState mergeState

    AddPosition(Int32, Int32, Int32, BytesRef)

    Adds a term position and offsets.

    Declaration
    public abstract void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)
    Parameters
    Type Name Description
    System.Int32 position
    System.Int32 startOffset
    System.Int32 endOffset
    BytesRef payload

    AddProx(Int32, DataInput, DataInput)

    Called by IndexWriter when writing new segments.

    This is an expert API that allows the codec to consume positions and offsets directly from the indexer.

    The default implementation calls AddPosition(Int32, Int32, Int32, BytesRef), but subclasses can override this if they want to efficiently write all the positions, then all the offsets, for example.

    NOTE: this API is extremely expert and subject to change or removal!!!

    This is a Lucene.NET INTERNAL API; use at your own risk.
    Declaration
    public virtual void AddProx(int numProx, DataInput positions, DataInput offsets)
    Parameters
    Type Name Description
    System.Int32 numProx
    DataInput positions
    DataInput offsets

    Dispose()

    Disposes all resources used by this object.

    Declaration
    public void Dispose()

    Dispose(Boolean)

    Implementations must override this method and should dispose of all resources used by this instance.

    Declaration
    protected abstract void Dispose(bool disposing)
    Parameters
    Type Name Description
    System.Boolean disposing
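
    A Dispose(Boolean) override typically releases managed resources only when disposing is true and tolerates being called more than once. The sketch below assumes a hypothetical writer that owns a System.IO.Stream. Real codecs write through Lucene's own output types, so the field, constructor, and class name here are illustrative only.

```csharp
using System.IO;
using Lucene.Net.Codecs;

// Hypothetical intermediate base class; the remaining abstract members
// are left for a concrete subclass to implement.
public abstract class StreamBackedTermVectorsWriter : TermVectorsWriter
{
    private Stream output; // hypothetical resource owned by this writer

    protected StreamBackedTermVectorsWriter(Stream output)
    {
        this.output = output;
    }

    // Exposed so a concrete subclass can write to the stream.
    protected Stream Output => output;

    protected override void Dispose(bool disposing)
    {
        // Only touch managed resources on an explicit Dispose() call,
        // and clear the field so a second call does nothing.
        if (disposing && output != null)
        {
            output.Dispose();
            output = null;
        }
    }
}
```

    Callers never invoke this overload directly; they call the public Dispose(), usually through a using statement.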

    Finish(FieldInfos, Int32)

    Called before Dispose(Boolean), passing in the number of documents that were written. Note that this is intentionally redundant (equivalent to the number of calls to StartDocument(Int32)), but a Codec should check that this is the case to detect the bug described in LUCENE-1282.

    Declaration
    public abstract void Finish(FieldInfos fis, int numDocs)
    Parameters
    Type Name Description
    FieldInfos fis
    System.Int32 numDocs
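
    The redundancy check described for Finish(FieldInfos, Int32) can be sketched by counting StartDocument(Int32) calls and comparing the total with numDocs. This is a minimal sketch with a hypothetical class name; throwing InvalidOperationException is an assumption made for illustration, not necessarily what the shipped codecs do on a mismatch.

```csharp
using System;
using Lucene.Net.Codecs;
using Lucene.Net.Index;

// Hypothetical intermediate base class; the remaining abstract members
// are left for a concrete subclass to implement.
public abstract class DocCountCheckingTermVectorsWriter : TermVectorsWriter
{
    private int startedDocs; // number of StartDocument calls seen so far

    public override void StartDocument(int numVectorFields)
    {
        // Called for every document, even one with zero vector fields.
        startedDocs++;
    }

    public override void Finish(FieldInfos fis, int numDocs)
    {
        // numDocs should equal the number of StartDocument calls.
        if (startedDocs != numDocs)
        {
            throw new InvalidOperationException(
                $"Wrote {startedDocs} documents but Finish reported {numDocs}.");
        }
    }
}
```

    A subclass that overrides StartDocument or Finish for its own purposes would need to call the base implementation to keep the count accurate.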

    FinishDocument()

    Called after a doc and all its fields have been added.

    Declaration
    public virtual void FinishDocument()

    FinishField()

    Called after a field and all its terms have been added.

    Declaration
    public virtual void FinishField()

    FinishTerm()

    Called after a term and all its positions have been added.

    Declaration
    public virtual void FinishTerm()

    Merge(MergeState)

    Merges in the term vectors from the readers in mergeState. The default implementation skips over deleted documents and uses StartDocument(Int32), StartField(FieldInfo, Int32, Boolean, Boolean, Boolean), StartTerm(BytesRef, Int32), AddPosition(Int32, Int32, Int32, BytesRef), and Finish(FieldInfos, Int32), returning the number of documents that were written. Implementations can override this method for more sophisticated merging (bulk-byte copying, etc.).

    Declaration
    public virtual int Merge(MergeState mergeState)
    Parameters
    Type Name Description
    MergeState mergeState
    Returns
    Type Description
    System.Int32

    StartDocument(Int32)

    Called before writing the term vectors of the document. StartField(FieldInfo, Int32, Boolean, Boolean, Boolean) will be called numVectorFields times. Note that if term vectors are enabled, this is called even if the document has no vector fields; in that case numVectorFields will be zero.

    Declaration
    public abstract void StartDocument(int numVectorFields)
    Parameters
    Type Name Description
    System.Int32 numVectorFields

    StartField(FieldInfo, Int32, Boolean, Boolean, Boolean)

    Called before writing the terms of the field. StartTerm(BytesRef, Int32) will be called numTerms times.

    Declaration
    public abstract void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)
    Parameters
    Type Name Description
    FieldInfo info
    System.Int32 numTerms
    System.Boolean positions
    System.Boolean offsets
    System.Boolean payloads

    StartTerm(BytesRef, Int32)

    Adds a term and its term frequency freq. If this field has positions and/or offsets enabled, then AddPosition(Int32, Int32, Int32, BytesRef) will be called freq times for this term.

    Declaration
    public abstract void StartTerm(BytesRef term, int freq)
    Parameters
    Type Name Description
    BytesRef term
    System.Int32 freq
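
    The relationship between freq and the number of AddPosition calls can be checked with a small guard, sketched here as an intermediate base class. The class name and the choice to throw InvalidOperationException are hypothetical; the sketch relies only on members documented on this page.

```csharp
using System;
using Lucene.Net.Codecs;
using Lucene.Net.Index;
using Lucene.Net.Util;

// Hypothetical intermediate base class; the remaining abstract members
// are left for a concrete subclass to implement.
public abstract class FreqCheckingTermVectorsWriter : TermVectorsWriter
{
    private bool expectPositions; // true when the current field has positions and/or offsets
    private int expected;         // AddPosition calls expected for the current term
    private int seen;             // AddPosition calls received for the current term

    public override void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)
    {
        expectPositions = positions || offsets;
    }

    public override void StartTerm(BytesRef term, int freq)
    {
        // Without positions or offsets, no AddPosition calls are expected.
        expected = expectPositions ? freq : 0;
        seen = 0;
    }

    public override void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)
    {
        seen++;
    }

    public override void FinishTerm()
    {
        if (seen != expected)
        {
            throw new InvalidOperationException(
                $"Expected {expected} AddPosition calls but received {seen}.");
        }
        base.FinishTerm();
    }
}
```

    The guard only holds while AddProx(Int32, DataInput, DataInput) keeps its default behavior of forwarding to AddPosition; a subclass that overrides AddProx to consume positions directly would need to update the counter itself.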
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)