    Class BloomFilteringPostingsFormat

    A Lucene.Net.Codecs.PostingsFormat useful for low document-frequency fields such as primary keys. Bloom filters are maintained in a ".blm" file, which lets reads "fast-fail" in segments known to have no record of the key. A delegate Lucene.Net.Codecs.PostingsFormat of your choice records all other postings data.

    A BloomFilterFactory can be passed to tailor Bloom filter settings on a per-field basis. The default configuration is DefaultBloomFilterFactory, which allocates a ~8mb bitset and hashes values using MurmurHash2. This should be suitable for most purposes.
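
    To illustrate the "fast-fail" idea, the following is a minimal, self-contained sketch of a Bloom filter. It is not the implementation used by this class: SimpleBloomFilter, its parameters, and the seeded FNV-1a hash are assumptions chosen for brevity (the default factory is described above as using MurmurHash2). The point it shows is that a negative answer is definitive, so a segment can be skipped without consulting the delegate postings data, while a positive answer only means the key may be present.

```csharp
using System;
using System.Collections;
using System.Text;

// Hypothetical, simplified stand-in for a per-segment Bloom filter.
// It is NOT the filter type used by BloomFilteringPostingsFormat.
public sealed class SimpleBloomFilter
{
    private readonly BitArray bits;
    private readonly int hashCount;

    public SimpleBloomFilter(int bitCount, int hashCount)
    {
        if (bitCount <= 0) throw new ArgumentOutOfRangeException(nameof(bitCount));
        if (hashCount <= 0) throw new ArgumentOutOfRangeException(nameof(hashCount));
        bits = new BitArray(bitCount);
        this.hashCount = hashCount;
    }

    // Seeded FNV-1a: an illustrative hash only, not MurmurHash2.
    private static uint Hash(byte[] data, uint seed)
    {
        uint h = 2166136261u ^ seed;
        foreach (byte b in data)
        {
            h ^= b;
            h *= 16777619u; // wraps on overflow (unchecked by default)
        }
        return h;
    }

    // Maps the i-th hash of the key to a bit position.
    private int IndexFor(byte[] data, int i)
    {
        return (int)(Hash(data, (uint)i) % (uint)bits.Length);
    }

    public void Add(string key)
    {
        byte[] data = Encoding.UTF8.GetBytes(key);
        for (int i = 0; i < hashCount; i++)
        {
            bits[IndexFor(data, i)] = true;
        }
    }

    // false: the key was definitely never added (fast-fail).
    // true:  the key may have been added (false positives are possible).
    public bool MightContain(string key)
    {
        byte[] data = Encoding.UTF8.GetBytes(key);
        for (int i = 0; i < hashCount; i++)
        {
            if (!bits[IndexFor(data, i)]) return false;
        }
        return true;
    }
}
```

    In this sketch, a key passed to Add is always reported as possibly present afterwards, and an empty filter reports every key as absent. Larger bitsets lower the false-positive rate at the cost of memory, which is the kind of trade-off a BloomFilterFactory is meant to control.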

    The format of the .blm file is as follows:
    • BloomFilter (.blm) --> Header, DelegatePostingsFormatName, NumFilteredFields, Filter^NumFilteredFields, Footer
    • Filter --> FieldNumber, FuzzySet
    • FuzzySet --> See Serialize(DataOutput)
    • Header --> CodecHeader (WriteHeader(DataOutput, string, int))
    • DelegatePostingsFormatName --> String (WriteString(string)). The name of a ServiceProvider-registered Lucene.Net.Codecs.PostingsFormat
    • NumFilteredFields --> Uint32 (WriteInt32(int))
    • FieldNumber --> Uint32 (WriteInt32(int)) The number of the field in this segment
    • Footer --> CodecFooter (Lucene.Net.Codecs.CodecUtil.WriteFooter(Lucene.Net.Store.IndexOutput))

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Inheritance
    object
    PostingsFormat
    BloomFilteringPostingsFormat
    Inherited Members
    PostingsFormat.EMPTY
    PostingsFormat.SetPostingsFormatFactory(IPostingsFormatFactory)
    PostingsFormat.GetPostingsFormatFactory()
    PostingsFormat.Name
    PostingsFormat.ToString()
    PostingsFormat.ForName(string)
    PostingsFormat.AvailablePostingsFormats
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.ReferenceEquals(object, object)
    Namespace: Lucene.Net.Codecs.Bloom
    Assembly: Lucene.Net.Codecs.dll
    Syntax
    [PostingsFormatName("BloomFilter")]
    public sealed class BloomFilteringPostingsFormat : PostingsFormat

    Constructors

    BloomFilteringPostingsFormat()

    Used only by core Lucene at read time via Service Provider instantiation. Do not use at write time in application code.

    Declaration
    public BloomFilteringPostingsFormat()

    BloomFilteringPostingsFormat(PostingsFormat)

    Creates Bloom filters for a selection of fields in the index. The filters are recorded as a set of bitsets held as a segment summary in an additional ".blm" file. This Lucene.Net.Codecs.PostingsFormat delegates the encoding of all other postings data to the given delegate Lucene.Net.Codecs.PostingsFormat. This constructor uses DefaultBloomFilterFactory to configure the per-field Bloom filters.

    Declaration
    public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
    Parameters
    Type Name Description
    PostingsFormat delegatePostingsFormat

    The Lucene.Net.Codecs.PostingsFormat that records all non-Bloom-filter data, i.e. the postings info.

    BloomFilteringPostingsFormat(PostingsFormat, BloomFilterFactory)

    Creates Bloom filters for a selection of fields in the index. The filters are recorded as a set of bitsets held as a segment summary in an additional ".blm" file. This Lucene.Net.Codecs.PostingsFormat delegates the encoding of all other postings data to the given delegate Lucene.Net.Codecs.PostingsFormat.

    Declaration
    public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory)
    Parameters
    Type Name Description
    PostingsFormat delegatePostingsFormat

    The Lucene.Net.Codecs.PostingsFormat that records all non-Bloom-filter data, i.e. the postings info.

    BloomFilterFactory bloomFilterFactory

    The BloomFilterFactory responsible for sizing BloomFilters appropriately.
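
    As a rough sketch of how the delegate-taking constructor might be used at write time, the codec below applies the Bloom-filtered format to a single primary-key field and leaves all other fields on the default format. BloomIdCodec and the field name "id" are hypothetical, and the choice of Lucene46Codec and Lucene41PostingsFormat as base codec and delegate is an assumption about the 4.8 line of Lucene.NET; the snippet needs the Lucene.Net and Lucene.Net.Codecs assemblies and has not been checked against a specific release.

```csharp
using Lucene.Net.Codecs;
using Lucene.Net.Codecs.Bloom;
using Lucene.Net.Codecs.Lucene41;
using Lucene.Net.Codecs.Lucene46;

// Hypothetical codec: Bloom-filtered postings for one field only.
public class BloomIdCodec : Lucene46Codec
{
    // Assumed name of the primary-key field; not defined by Lucene.NET.
    private const string IdField = "id";

    // Wrap the delegate once; this overload uses DefaultBloomFilterFactory.
    private readonly PostingsFormat bloomFormat =
        new BloomFilteringPostingsFormat(new Lucene41PostingsFormat());

    // Route the key field to the Bloom-filtered format, everything else to the default.
    public override PostingsFormat GetPostingsFormatForField(string field)
    {
        return field == IdField ? bloomFormat : base.GetPostingsFormatForField(field);
    }
}
```

    An instance of such a codec would typically be assigned to IndexWriterConfig.Codec before the IndexWriter is created. Depending on how codec names are resolved in your version, a custom codec subclass may also need to be registered so that the index can be reopened; consult the codec documentation for your release before relying on this pattern.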

    Fields

    VERSION_CHECKSUM

    Declaration
    public static readonly int VERSION_CHECKSUM
    Field Value
    Type Description
    int

    VERSION_CURRENT

    Declaration
    public static readonly int VERSION_CURRENT
    Field Value
    Type Description
    int

    VERSION_START

    Declaration
    public static readonly int VERSION_START
    Field Value
    Type Description
    int

    Methods

    FieldsConsumer(SegmentWriteState)

    Writes a new segment.

    Declaration
    public override FieldsConsumer FieldsConsumer(SegmentWriteState state)
    Parameters
    Type Name Description
    SegmentWriteState state
    Returns
    Type Description
    FieldsConsumer
    Overrides
    Lucene.Net.Codecs.PostingsFormat.FieldsConsumer(Lucene.Net.Index.SegmentWriteState)

    FieldsProducer(SegmentReadState)

    Reads a segment. NOTE: by the time this call returns, it must hold open any files it will need to use; otherwise, those files may be deleted. Additionally, required files may be deleted during the execution of this call, before there is a chance to open them. In that case the implementation should throw an IOException. IOExceptions are expected and automatically cause a retry of the segment-opening logic with the newly revised segments.

    Declaration
    public override FieldsProducer FieldsProducer(SegmentReadState state)
    Parameters
    Type Name Description
    SegmentReadState state
    Returns
    Type Description
    FieldsProducer
    Overrides
    Lucene.Net.Codecs.PostingsFormat.FieldsProducer(Lucene.Net.Index.SegmentReadState)
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.