Namespace Lucene.Net.Codecs.BlockTerms
Pluggable term index / block terms dictionary implementations.
Classes
BlockTermsReader
Handles a terms dict, but decouples all details of doc/freqs/positions reading to an instance of Lucene.Net.Codecs.PostingsReaderBase. This class is reusable for codecs that use a different format for docs/freqs/positions (though codecs are also free to make their own terms dict impl).
This class also interacts with an instance of TermsIndexReaderBase, to abstract away the specific implementation of the terms dict index.
Note
This API is experimental and might change in incompatible ways in the next release.
BlockTermsWriter
Writes terms dict, block-encoding (column stride) each term's metadata for each set of terms between two index terms.
Note
This API is experimental and might change in incompatible ways in the next release.
FixedGapTermsIndexReader
TermsIndexReaderBase implementation for simple terms indexes that index every Nth term.
Note
This API is experimental and might change in incompatible ways in the next release.
FixedGapTermsIndexWriter
Selects every Nth term as an index term, and holds term bytes (mostly) fully expanded in memory. This terms index supports seeking by ord. See VariableGapTermsIndexWriter for a more memory-efficient terms index that does not support seeking by ord.
Note
This API is experimental and might change in incompatible ways in the next release.
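The fixed-gap policy can be illustrated with a minimal, language-neutral sketch. It does not use the Lucene.Net API; the function names, the sample terms, and the choice to index the first term are assumptions made for illustration only.

```python
def build_fixed_gap_index(sorted_terms, interval):
    """Keep every `interval`-th term (positions 0, interval, 2*interval, ...)."""
    return [sorted_terms[i] for i in range(0, len(sorted_terms), interval)]


def index_slot_for_ord(term_ord, interval):
    """Seeking by ord: the index entry covering a term ordinal is found by division."""
    return term_ord // interval


# Hypothetical sorted terms for one field.
terms = [b"apple", b"banana", b"cherry", b"date", b"elder", b"fig", b"grape"]
index = build_fixed_gap_index(terms, 3)

# The term with ord 4 (b"elder") lives in the block that starts at index slot 1.
slot = index_slot_for_ord(4, 3)
```

Because index terms sit at regular ordinal positions, an ordinal maps to its index slot with one division, which is why a fixed-gap index can support seeking by ord.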
TermsIndexReaderBase
BlockTermsReader interacts with an instance of this class to manage its terms index. The writer must accept indexed terms (many pairs of Lucene.Net.Util.BytesRef text + long fileOffset), and then this reader must be able to retrieve the nearest index term to a provided term text.
Note
This API is experimental and might change in incompatible ways in the next release.
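The reader contract described above can be sketched as a lookup over sorted (term, fileOffset) pairs. This is a conceptual sketch, not the Lucene.Net implementation: `seek_index` and the sample offsets are hypothetical, and "nearest" is assumed here to mean the greatest index term that is less than or equal to the target.

```python
from bisect import bisect_right


def seek_index(index_terms, file_offsets, target):
    """Return (index_term, file_offset) for the last index term <= target.

    Returns None when the target sorts before every index term.
    Python compares bytes objects in unsigned byte order.
    """
    pos = bisect_right(index_terms, target) - 1
    if pos < 0:
        return None
    return index_terms[pos], file_offsets[pos]


# Hypothetical index terms and their offsets into the terms dictionary file.
index_terms = [b"apple", b"date", b"grape"]
file_offsets = [0, 120, 260]

hit = seek_index(index_terms, file_offsets, b"elder")
miss = seek_index(index_terms, file_offsets, b"aardvark")
```

The returned offset tells the terms dictionary reader where to start scanning for the target term.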
TermsIndexReaderBase.FieldIndexEnum
Similar to Lucene.Net.Index.TermsEnum, except that the only "metadata" it reports for a given indexed term is the long fileOffset into the main terms dictionary file.
TermsIndexWriterBase
Base class for terms index implementations to plug into BlockTermsWriter.
Note
This API is experimental and might change in incompatible ways in the next release.
TermsIndexWriterBase.FieldWriter
Terms index API for a single field.
VariableGapTermsIndexReader
See VariableGapTermsIndexWriter.
Note
This API is experimental and might change in incompatible ways in the next release.
VariableGapTermsIndexWriter
Selects index terms according to the provided pluggable VariableGapTermsIndexWriter.IndexTermSelector, and stores them in a prefix trie, held as a Lucene.Net.Util.Fst.FST<T>, that is loaded entirely into RAM. This terms index only supports unsigned byte term sort order (Unicode code point order when the bytes are UTF-8).
Note
This API is experimental and might change in incompatible ways in the next release.
VariableGapTermsIndexWriter.EveryNOrDocFreqTermSelector
Sets an index term when docFreq >= docFreqThresh, or after every interval terms. This should reduce seek time to high docFreq terms.
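The selection rule can be sketched as follows. This is an illustrative sketch rather than the Lucene.Net code: `select_index_terms` and the sample data are hypothetical, and the counter is assumed to start at `interval` so that the first term of a field is always selected.

```python
def select_index_terms(terms_with_doc_freq, doc_freq_thresh, interval):
    """Select a term when its docFreq reaches the threshold, or when
    `interval` terms have passed since the last selected term."""
    selected = []
    since_last = interval  # assumption: forces the first term to be selected
    for term, doc_freq in terms_with_doc_freq:
        if doc_freq >= doc_freq_thresh or since_last >= interval:
            selected.append(term)
            since_last = 1
        else:
            since_last += 1
    return selected


# Hypothetical (term, docFreq) pairs in sorted term order.
field_terms = [("a", 1), ("b", 1), ("c", 50), ("d", 1), ("e", 1), ("f", 1), ("g", 1)]
chosen = select_index_terms(field_terms, doc_freq_thresh=10, interval=3)
```

In this sketch the high-docFreq term "c" is indexed immediately and also resets the counter, so the next interval-based selection is "f" rather than "d".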
VariableGapTermsIndexWriter.EveryNTermSelector
Selects every Nth term as an index term, the same policy as FixedGapTermsIndexWriter.
VariableGapTermsIndexWriter.IndexTermSelector
Hook for selecting which terms should be placed in the terms index.
NewField(FieldInfo) is called at the start of each new field, and IsIndexTerm(BytesRef, TermStats) for each term in that field.
Note
This API is experimental and might change in incompatible ways in the next release.