    Namespace Lucene.Net.Codecs.BlockTerms

    Pluggable term index / block terms dictionary implementations.

    Classes

    BlockTermsReader

    Handles a terms dict, but decouples all details of doc/freqs/positions reading to an instance of Lucene.Net.Codecs.PostingsReaderBase. This class is reusable for codecs that use a different format for docs/freqs/positions (though codecs are also free to make their own terms dict impl).

    This class also interacts with an instance of TermsIndexReaderBase, to abstract away the specific implementation of the terms dict index.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    BlockTermsWriter

    Writes the terms dict, block-encoding (column stride) each term's metadata for each set of terms between two index terms.

    Note

    This API is experimental and might change in incompatible ways in the next release.
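
    To make "block-encoding (column stride)" concrete, here is a minimal Python sketch. It is not the on-disk format used by BlockTermsWriter; the names TermEntry, encode_block and split_into_blocks are hypothetical, and the choice of metadata columns is an assumption made for illustration. It only shows the idea of grouping the terms between two index terms into a block and storing each kind of metadata as its own column.

```python
from typing import Callable, Dict, List, NamedTuple


class TermEntry(NamedTuple):
    """Hypothetical per-term record: the term text plus two statistics."""
    text: str
    doc_freq: int
    total_term_freq: int


def encode_block(entries: List[TermEntry]) -> Dict[str, list]:
    """Lay one block out column by column instead of term by term."""
    return {
        "terms": [e.text for e in entries],
        "doc_freqs": [e.doc_freq for e in entries],
        "total_term_freqs": [e.total_term_freq for e in entries],
    }


def split_into_blocks(
    entries: List[TermEntry],
    is_index_term: Callable[[int, TermEntry], bool],
) -> List[Dict[str, list]]:
    """Start a new block at every entry the predicate marks as an index term.

    The first entry always opens a block, so no term is left outside one.
    """
    groups: List[List[TermEntry]] = []
    for position, entry in enumerate(entries):
        if position == 0 or is_index_term(position, entry):
            groups.append([])
        groups[-1].append(entry)
    return [encode_block(group) for group in groups]
```

    With a predicate that marks every second term, five terms fall into blocks of two, two and one, and each block keeps its doc freqs together in one list.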

    FixedGapTermsIndexReader

    A TermsIndexReaderBase implementation for simple terms indexes that index every Nth term.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    FixedGapTermsIndexWriter

    Selects every Nth term as an index term, and holds term bytes (mostly) fully expanded in memory. This terms index supports seeking by ord. See VariableGapTermsIndexWriter for a more memory-efficient terms index that does not support seeking by ord.

    Note

    This API is experimental and might change in incompatible ways in the next release.
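
    One way to see why a fixed gap makes seeking by ord straightforward is the Python sketch below. It is an illustration of the every-Nth-term idea only, not the FixedGapTermsIndexWriter implementation; the class FixedGapIndex and its method are hypothetical names.

```python
from typing import List, Tuple


class FixedGapIndex:
    """Keeps every Nth term of a sorted term list as an index term."""

    def __init__(self, sorted_terms: List[str], interval: int) -> None:
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        # Terms at ordinals 0, N, 2N, ... become index terms.
        self.index_terms = sorted_terms[::interval]

    def index_term_for_ord(self, term_ord: int) -> Tuple[str, int]:
        """Return the index term at or before term_ord, and that term's ordinal.

        Because the gap is fixed, the slot is found with one integer division.
        """
        slot = term_ord // self.interval
        return self.index_terms[slot], slot * self.interval
```

    With a variable gap there is no such arithmetic relation between a term's ordinal and its index entry, which is consistent with the variable-gap index not supporting seeking by ord.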

    TermsIndexReaderBase

    BlockTermsReader interacts with an instance of this class to manage its terms index. The writer must accept indexed terms (many pairs of Lucene.Net.Util.BytesRef text + long fileOffset), and then this reader must be able to retrieve the nearest index term to a provided term text.

    Note

    This API is experimental and might change in incompatible ways in the next release.
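
    The writer/reader contract described above can be sketched in a few lines of Python. This is a simplified in-memory stand-in, not the TermsIndexReaderBase API: the class InMemoryTermsIndex and its methods are hypothetical, and "nearest index term" is assumed here to mean the greatest index term that is less than or equal to the target.

```python
import bisect
from typing import List, Optional, Tuple


class InMemoryTermsIndex:
    """Stores (term bytes, file offset) pairs and finds the floor entry for a target."""

    def __init__(self) -> None:
        self._terms: List[bytes] = []
        self._offsets: List[int] = []

    def add(self, term: bytes, file_offset: int) -> None:
        """Accept one indexed term; terms must arrive in strictly increasing order."""
        if self._terms and term <= self._terms[-1]:
            raise ValueError("index terms must be added in sorted order")
        self._terms.append(term)
        self._offsets.append(file_offset)

    def seek_floor(self, target: bytes) -> Optional[Tuple[bytes, int]]:
        """Return the greatest index term <= target with its offset, or None."""
        # bytes objects compare as unsigned bytes, lexicographically.
        position = bisect.bisect_right(self._terms, target) - 1
        if position < 0:
            return None
        return self._terms[position], self._offsets[position]
```

    A caller would then start scanning the main terms dictionary at the returned offset until it reaches the target term or passes it.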

    TermsIndexReaderBase.FieldIndexEnum

    Similar to Lucene.Net.Index.TermsEnum, except that the only "metadata" it reports for a given indexed term is the long fileOffset into the main terms dictionary file.

    TermsIndexWriterBase

    Base class for terms index implementations to plug into BlockTermsWriter.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    TermsIndexWriterBase.FieldWriter

    Terms index API for a single field.

    VariableGapTermsIndexReader

    See VariableGapTermsIndexWriter.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    VariableGapTermsIndexWriter

    Selects index terms according to a provided pluggable VariableGapTermsIndexWriter.IndexTermSelector, and stores them in a prefix trie that is loaded entirely into RAM as a Lucene.Net.Util.Fst.FST<T>. This terms index only supports unsigned byte term sort order (Unicode code point order when the bytes are UTF-8).

    Note

    This API is experimental and might change in incompatible ways in the next release.
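
    The remark about sort order can be checked with a small Python sketch: sorting by unsigned UTF-8 bytes gives the same result as sorting by Unicode code point, whereas sorting by UTF-16 code units does not once characters outside the Basic Multilingual Plane are involved. The helper names are hypothetical and nothing here touches the FST itself.

```python
from typing import List


def utf8_byte_order(terms: List[str]) -> List[str]:
    """Sort by the unsigned bytes of each term's UTF-8 encoding."""
    return sorted(terms, key=lambda term: term.encode("utf-8"))


def code_point_order(terms: List[str]) -> List[str]:
    """Sort by Unicode code point (Python compares str values this way)."""
    return sorted(terms)


def utf16_code_unit_order(terms: List[str]) -> List[str]:
    """Sort by UTF-16 code units, using big-endian bytes as the comparison key."""
    return sorted(terms, key=lambda term: term.encode("utf-16-be"))


# U+FF5E lies in the Basic Multilingual Plane; U+1F600 needs a surrogate pair.
SAMPLE_TERMS = ["\uff5e", "\U0001f600", "a", "\u00e9"]
```

    For SAMPLE_TERMS the first two orderings agree, while the UTF-16 ordering places U+1F600 before U+FF5E because its leading surrogate (0xD83D) is smaller than 0xFF5E.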

    VariableGapTermsIndexWriter.EveryNOrDocFreqTermSelector

    Marks a term as an index term when its docFreq >= docFreqThresh, or after every interval terms. This should reduce seek time to high-docFreq terms.
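
    The policy can be sketched as follows in Python. This is a sketch of the stated rule rather than the library's source: the class name and method signatures are hypothetical, and two details are assumptions, namely that the first term of each field is always selected and that the interval counter restarts after any selected term.

```python
class EveryNOrDocFreqSelector:
    """Selects a term if its doc freq reaches a threshold or a gap has elapsed."""

    def __init__(self, doc_freq_thresh: int, interval: int) -> None:
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.doc_freq_thresh = doc_freq_thresh
        self.interval = interval
        # Starting "full" means the first term seen is selected (assumption).
        self._since_last = interval

    def new_field(self) -> None:
        """Reset the counter so each field starts with an index term."""
        self._since_last = self.interval

    def is_index_term(self, doc_freq: int) -> bool:
        """Apply the rule to the next term of the current field."""
        if doc_freq >= self.doc_freq_thresh or self._since_last >= self.interval:
            self._since_last = 1
            return True
        self._since_last += 1
        return False
```

    Under these assumptions a high-docFreq term is always an index term itself, so a seek to it lands directly on its block rather than scanning forward from an earlier index term.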

    VariableGapTermsIndexWriter.EveryNTermSelector

    VariableGapTermsIndexWriter.IndexTermSelector

    Hook for selecting which terms should be placed in the terms index.

    NewField(FieldInfo) is called at the start of each new field, and IsIndexTerm(BytesRef, TermStats) for each term in that field.

    Note

    This API is experimental and might change in incompatible ways in the next release.
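
    The call sequence described above (one call per field, then one call per term of that field) can be sketched in Python. The abstract class below only mirrors the shape of the hook; its simplified signatures, the FirstTermPerFieldSelector example and the select_index_terms driver are hypothetical and are not part of Lucene.NET.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class IndexTermSelector(ABC):
    """Hook that decides which terms go into the terms index."""

    @abstractmethod
    def new_field(self, field_name: str) -> None:
        """Called once at the start of each field."""

    @abstractmethod
    def is_index_term(self, term: bytes, doc_freq: int) -> bool:
        """Called for each term of the current field, in order."""


class FirstTermPerFieldSelector(IndexTermSelector):
    """Toy selector: picks only the first term of every field."""

    def __init__(self) -> None:
        self._seen_term = False

    def new_field(self, field_name: str) -> None:
        self._seen_term = False  # per-field state is reset here

    def is_index_term(self, term: bytes, doc_freq: int) -> bool:
        first = not self._seen_term
        self._seen_term = True
        return first


def select_index_terms(
    fields: Dict[str, List[Tuple[bytes, int]]],
    selector: IndexTermSelector,
) -> Dict[str, List[bytes]]:
    """Drive the selector: new_field once per field, is_index_term once per term."""
    chosen: Dict[str, List[bytes]] = {}
    for field_name, terms in fields.items():
        selector.new_field(field_name)
        chosen[field_name] = [
            term for term, doc_freq in terms if selector.is_index_term(term, doc_freq)
        ]
    return chosen
```

    The per-field reset in new_field is the reason the hook has two methods: a selector that keeps a counter or other state needs to know when one field's terms end and the next field's begin.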

    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.