Class TermVectorsWriter

Codec API for writing term vectors:

For every document, StartDocument(Int32) is called, informing the Codec how many fields will be written.
StartField(FieldInfo, Int32, Boolean, Boolean, Boolean) is called for each field in the document, informing the codec how many terms will be written for that field, and whether or not positions, offsets, or payloads are enabled.
Within each field, StartTerm(BytesRef, Int32) is called for each term.
If offsets and/or positions are enabled, then AddPosition(Int32, Int32, Int32, BytesRef) will be called for each term occurrence.
After all documents have been written, Finish(FieldInfos, Int32) is called for verification/sanity-checks.
Finally, the writer is disposed (Dispose(Boolean)).

This is a Lucene.NET EXPERIMENTAL API; use at your own risk.
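
The call sequence described above can be made visible with a small subclass that only records each callback. This is a minimal sketch, not a usable codec component: LoggingTermVectorsWriter and its Log property are hypothetical names introduced for illustration, and returning BytesRef.UTF8SortedAsUnicodeComparer from Comparer is an assumption about the term order a real implementation would expect.

```csharp
using System.Collections.Generic;
using Lucene.Net.Codecs;
using Lucene.Net.Index;
using Lucene.Net.Util;

// Hypothetical example class, not part of Lucene.NET: it writes nothing and
// only records each callback so the order of calls can be inspected.
public sealed class LoggingTermVectorsWriter : TermVectorsWriter
{
    public List<string> Log { get; } = new List<string>();

    // Assumption: terms are fed in UTF-8 byte order.
    public override IComparer<BytesRef> Comparer => BytesRef.UTF8SortedAsUnicodeComparer;

    public override void StartDocument(int numVectorFields)
        => Log.Add($"StartDocument({numVectorFields})");

    // The FieldInfo argument is deliberately not read in this sketch.
    public override void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)
        => Log.Add($"StartField(numTerms={numTerms})");

    public override void StartTerm(BytesRef term, int freq)
        => Log.Add($"StartTerm({term.Utf8ToString()}, freq={freq})");

    public override void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)
        => Log.Add($"AddPosition({position}, {startOffset}, {endOffset})");

    public override void Finish(FieldInfos fis, int numDocs)
        => Log.Add($"Finish(numDocs={numDocs})");

    public override void Abort() => Log.Add("Abort()");

    protected override void Dispose(bool disposing)
    {
        // A real writer would release its output streams here.
        if (disposing) Log.Add("Dispose(true)");
    }
}
```

Driving this writer by hand for one document with one field and one term of frequency 2 would record StartDocument, StartField, StartTerm, two AddPosition entries, Finish, and finally Dispose(true), in that order.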

Inheritance

System.Object

TermVectorsWriter

CompressingTermVectorsWriter

Lucene40TermVectorsWriter

Namespace: Lucene.Net.Codecs

Assembly: Lucene.Net.dll

Syntax

public abstract class TermVectorsWriter : IDisposable

Constructors

TermVectorsWriter()

Sole constructor. (For invocation by subclass constructors, typically implicit.)

Declaration

protected TermVectorsWriter()

Properties

Comparer

Return the IComparer<BytesRef> used to sort terms before feeding to this API.

Declaration

public abstract IComparer<BytesRef> Comparer { get; }

Property Value

Type	Description
IComparer<BytesRef>

Methods

Abort()

Aborts writing entirely. Implementations should remove any partially written files and release related resources.

Declaration

public abstract void Abort()

AddAllDocVectors(Fields, MergeState)

Safe (but slowish) default method to write every vector field in the document.

Declaration

protected void AddAllDocVectors(Fields vectors, MergeState mergeState)

Parameters

Type	Name	Description
Fields	vectors
MergeState	mergeState

AddPosition(Int32, Int32, Int32, BytesRef)

Adds a term position and offsets.

Declaration

public abstract void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)

Parameters

Type	Name	Description
System.Int32	position
System.Int32	startOffset
System.Int32	endOffset
BytesRef	payload

AddProx(Int32, DataInput, DataInput)

Called by IndexWriter when writing new segments.

This is an expert API that allows the codec to consume positions and offsets directly from the indexer.

The default implementation calls AddPosition(Int32, Int32, Int32, BytesRef), but subclasses can override this if they want to efficiently write all the positions, then all the offsets, for example.

NOTE: This API is extremely expert and subject to change or removal.

This is a Lucene.NET INTERNAL API; use at your own risk.

Declaration

public virtual void AddProx(int numProx, DataInput positions, DataInput offsets)

Parameters

Type	Name	Description
System.Int32	numProx
DataInput	positions
DataInput	offsets

Dispose()

Disposes all resources used by this object.

Declaration

public void Dispose()

Dispose(Boolean)

Implementations must override and should dispose all resources used by this instance.

Declaration

protected abstract void Dispose(bool disposing)

Parameters

Type	Name	Description
System.Boolean	disposing

Finish(FieldInfos, Int32)

Called before Dispose(Boolean), passing in the number of documents that were written. Note that this is intentionally redundant (equivalent to the number of calls to StartDocument(Int32)), but a Codec should check that this is the case to detect the bug described in LUCENE-1282.

Declaration

public abstract void Finish(FieldInfos fis, int numDocs)

Parameters

Type	Name	Description
FieldInfos	fis
System.Int32	numDocs
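
The redundancy check described for Finish can be sketched as follows. DocCountCheckingWriter is a hypothetical class used only for illustration, and throwing InvalidOperationException on a mismatch is an assumption; the source does not say how a codec should report the problem.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Codecs;
using Lucene.Net.Index;
using Lucene.Net.Util;

// Hypothetical example class, not part of Lucene.NET.
public sealed class DocCountCheckingWriter : TermVectorsWriter
{
    private int startedDocs; // number of StartDocument calls seen so far

    public override void StartDocument(int numVectorFields)
    {
        startedDocs++;
    }

    public override void Finish(FieldInfos fis, int numDocs)
    {
        // numDocs should equal the number of StartDocument calls;
        // a mismatch means documents were lost or double counted.
        if (numDocs != startedDocs)
        {
            throw new InvalidOperationException(
                $"Finish reported {numDocs} documents, but StartDocument was called {startedDocs} times.");
        }
    }

    // Remaining members are no-ops in this sketch.
    public override IComparer<BytesRef> Comparer => BytesRef.UTF8SortedAsUnicodeComparer;
    public override void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads) { }
    public override void StartTerm(BytesRef term, int freq) { }
    public override void AddPosition(int position, int startOffset, int endOffset, BytesRef payload) { }
    public override void Abort() { }
    protected override void Dispose(bool disposing) { }
}
```

A real writer would typically perform this comparison alongside its own end-of-file bookkeeping rather than as the only work done in Finish.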

FinishDocument()

Called after a doc and all its fields have been added.

Declaration

public virtual void FinishDocument()

FinishField()

Called after a field and all its terms have been added.

Declaration

public virtual void FinishField()

FinishTerm()

Called after a term and all its positions have been added.

Declaration

public virtual void FinishTerm()

Merge(MergeState)

Merges in the term vectors from the readers in mergeState. The default implementation skips over deleted documents, and uses StartDocument(Int32), StartField(FieldInfo, Int32, Boolean, Boolean, Boolean), StartTerm(BytesRef, Int32), AddPosition(Int32, Int32, Int32, BytesRef), and Finish(FieldInfos, Int32), returning the number of documents that were written. Implementations can override this method for more sophisticated merging (bulk-byte copying, etc.).

Declaration

public virtual int Merge(MergeState mergeState)

Parameters

Type	Name	Description
MergeState	mergeState

Returns

Type	Description
System.Int32

StartDocument(Int32)

Called before writing the term vectors of the document. StartField(FieldInfo, Int32, Boolean, Boolean, Boolean) will be called numVectorFields times. Note that if term vectors are enabled, this is called even if the document has no vector fields; in this case numVectorFields will be zero.

Declaration

public abstract void StartDocument(int numVectorFields)

Parameters

Type	Name	Description
System.Int32	numVectorFields

StartField(FieldInfo, Int32, Boolean, Boolean, Boolean)

Called before writing the terms of the field. StartTerm(BytesRef, Int32) will be called numTerms times.

Declaration

public abstract void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)

Parameters

Type	Name	Description
FieldInfo	info
System.Int32	numTerms
System.Boolean	positions
System.Boolean	offsets
System.Boolean	payloads

StartTerm(BytesRef, Int32)

Adds a term and its term frequency freq. If this field has positions and/or offsets enabled, then AddPosition(Int32, Int32, Int32, BytesRef) will be called freq times for this term.

Declaration

public abstract void StartTerm(BytesRef term, int freq)

Parameters

Type	Name	Description
BytesRef	term
System.Int32	freq
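
The relationship between freq and the number of AddPosition calls can be checked per term, as in the sketch below. PositionCountCheckingWriter is a hypothetical class introduced for illustration; it assumes the default AddProx behavior, in which every occurrence reaches AddPosition, and throwing InvalidOperationException is likewise an assumption.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Codecs;
using Lucene.Net.Index;
using Lucene.Net.Util;

// Hypothetical example class, not part of Lucene.NET.
public sealed class PositionCountCheckingWriter : TermVectorsWriter
{
    private bool expectPositions; // true when the current field has positions and/or offsets
    private int expected;         // AddPosition calls expected for the current term
    private int seen;             // AddPosition calls received for the current term

    public override void StartField(FieldInfo info, int numTerms, bool positions, bool offsets, bool payloads)
    {
        expectPositions = positions || offsets;
    }

    public override void StartTerm(BytesRef term, int freq)
    {
        // Without positions or offsets, no AddPosition calls are expected.
        expected = expectPositions ? freq : 0;
        seen = 0;
    }

    public override void AddPosition(int position, int startOffset, int endOffset, BytesRef payload)
    {
        seen++;
    }

    public override void FinishTerm()
    {
        if (seen != expected)
        {
            throw new InvalidOperationException(
                $"Expected {expected} AddPosition calls for this term but received {seen}.");
        }
    }

    // Remaining members are no-ops in this sketch.
    public override IComparer<BytesRef> Comparer => BytesRef.UTF8SortedAsUnicodeComparer;
    public override void StartDocument(int numVectorFields) { }
    public override void Finish(FieldInfos fis, int numDocs) { }
    public override void Abort() { }
    protected override void Dispose(bool disposing) { }
}
```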