Class WAH8DocIdSet
  
  DocIdSet implementation based on word-aligned hybrid encoding on
words of 8 bits.
This implementation doesn't support random access but has a fast
DocIdSetIterator which can advance in logarithmic time thanks to
an index.
The compression scheme is simplistic and should work well with sparse and
very dense doc ID sets while being only slightly larger than a
FixedBitSet for incompressible sets (overhead < 2% in the worst
case) in spite of the index.
Format: The format is byte-aligned. An 8-bit word is either clean,
meaning composed only of zeros or only of ones, or dirty, meaning that it contains
between 1 and 7 set bits. The idea is to encode sequences of clean words
using run-length encoding and to leave sequences of dirty words as-is.
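As a rough illustration of the clean/dirty distinction, the sketch below classifies each byte of a bitmap. The helper names `IsClean` and `CountWords` are hypothetical and are not part of Lucene.NET.

```csharp
using System;

// Hypothetical helper: a word is clean when all 8 of its bits are equal.
static bool IsClean(byte word) => word == 0x00 || word == 0xFF;

// Hypothetical helper: counts the clean and dirty words of a bitmap.
static (int Clean, int Dirty) CountWords(byte[] words)
{
    int clean = 0, dirty = 0;
    foreach (byte w in words)
    {
        if (IsClean(w)) clean++; else dirty++;
    }
    return (clean, dirty);
}

// 0x00, 0x00 and 0xFF are clean; 0x12 and 0x80 are dirty.
var counts = CountWords(new byte[] { 0x00, 0x00, 0xFF, 0x12, 0x80 });
Console.WriteLine($"clean={counts.Clean}, dirty={counts.Dirty}");
```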
| Token | Clean length+ | Dirty length+ | Dirty words |
|---|---|---|---|
| 1 byte | 0-n bytes | 0-n bytes | 0-n bytes |
- Token encodes, in its first bit, whether clean means full of zeros or full
        of ones, the number of clean words minus 2 in the next 3 bits, and the
        number of dirty words in the last 4 bits. In each of these two length
        fields, the higher-order bit is a continuation bit, meaning that the
        number is incomplete and additional bytes need to be read.
- Clean length+: If clean length has its higher-order bit set,
        you need to read a vint (ReadVInt32()), shift it left by 3 bits
        and add it to the 3 bits which have been read in the token.
- Dirty length+ works the same way as Clean length+ but
        on 4 bits and for the length of dirty words.
- Dirty words are the dirty words; there are Dirty length
        of them.
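The token layout described above can be sketched as follows. This is only an illustration under stated assumptions: `DecodeToken` is a hypothetical helper, not a Lucene.NET API, and it assumes that "first bit" means the most significant bit of the byte. It does not read the optional vint extensions; it only reports whether each continuation bit is set.

```csharp
using System;

// Hypothetical helper, not part of Lucene.NET. Assumed bit layout:
//   bit 7     : 1 if clean words are full of ones, 0 if full of zeros
//   bits 6-4  : clean-length field (bit 6 is its continuation bit)
//   bits 3-0  : dirty-length field (bit 3 is its continuation bit)
static (bool CleanIsOnes, int CleanBits, bool CleanContinues, int DirtyBits, bool DirtyContinues)
    DecodeToken(byte token)
{
    bool cleanIsOnes = (token & 0x80) != 0;
    int cleanField = (token >> 4) & 0x07; // 3-bit field
    int dirtyField = token & 0x0F;        // 4-bit field
    return (cleanIsOnes,
            cleanField & 0x03, (cleanField & 0x04) != 0,
            dirtyField & 0x07, (dirtyField & 0x08) != 0);
}

// 0xA3 is 1 010 0011 in binary: clean words are ones,
// the clean field holds 2 and the dirty field holds 3.
var t = DecodeToken(0xA3);
Console.WriteLine($"ones={t.CleanIsOnes} clean={t.CleanBits} dirty={t.DirtyBits}");
```

Since neither continuation bit is set in `0xA3`, the stored clean value 2 would stand for 4 clean words in any sequence other than the first (the format stores the count minus 2), followed by 3 dirty words.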
This format cannot encode sequences of fewer than 2 clean words and 0 dirty
words. The reason is that if you find a single clean word, you should rather
encode it as a dirty word. This takes the same space as starting a new
sequence (since you need one byte for the token) but will be lighter to
decode. There is however an exception for the first sequence. Since the first
sequence may start directly with a dirty word, the clean length is encoded
directly, without subtracting 2.
There is an additional restriction on the format: the sequence of dirty
words is not allowed to contain two consecutive clean words. This restriction
exists to make sure no space is wasted and to make sure iterators can read
the next doc ID by reading at most 2 dirty words.
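The two restrictions above can be illustrated with a simplified splitter that cuts a bitmap into (clean run, dirty run) sequences. This is a sketch, not the builder used by Lucene.NET: `SplitSequences` and `IsCleanWord` are hypothetical names, and the code assumes that "two consecutive clean words" means two identical clean words in a row.

```csharp
using System;
using System.Collections.Generic;

static bool IsCleanWord(byte w) => w == 0x00 || w == 0xFF;

// Hypothetical helper: returns one (Ones, Clean, Dirty) entry per sequence,
// where Clean and Dirty are word counts.
static List<(bool Ones, int Clean, int Dirty)> SplitSequences(byte[] words)
{
    var result = new List<(bool Ones, int Clean, int Dirty)>();
    int i = 0;
    bool first = true;
    while (i < words.Length)
    {
        bool ones = false;
        int clean = 0;
        if (IsCleanWord(words[i]))
        {
            int run = 1;
            while (i + run < words.Length && words[i + run] == words[i]) run++;
            // Only the first sequence may carry fewer than 2 clean words;
            // elsewhere a lone clean word is stored as a dirty word.
            if (run >= 2 || first)
            {
                ones = words[i] == 0xFF;
                clean = run;
                i += run;
            }
        }
        int dirtyStart = i;
        // Extend the dirty run until two identical clean words appear back to back.
        while (i < words.Length)
        {
            bool startsCleanRun = IsCleanWord(words[i])
                && i + 1 < words.Length && words[i + 1] == words[i];
            if (startsCleanRun) break;
            i++;
        }
        result.Add((ones, clean, i - dirtyStart));
        first = false;
    }
    return result;
}

var seqs = SplitSequences(new byte[] { 0x00, 0x00, 0x00, 0x12, 0x00, 0x34, 0xFF, 0xFF, 0x56 });
foreach (var s in seqs)
    Console.WriteLine($"ones={s.Ones} clean={s.Clean} dirty={s.Dirty}");
```

For this input the sketch yields two sequences: 3 zero words followed by 3 dirty words (the lone `0x00` between `0x12` and `0x34` is kept as a dirty word), then 2 one words followed by 1 dirty word.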
This is a Lucene.NET EXPERIMENTAL API, use at your own risk
    Inheritance
    System.Object
    DocIdSet
    WAH8DocIdSet
    Inherited Members
      System.Object.Equals(System.Object)
      System.Object.Equals(System.Object, System.Object)
      System.Object.GetHashCode()
      System.Object.GetType()
      System.Object.MemberwiseClone()
      System.Object.ReferenceEquals(System.Object, System.Object)
      System.Object.ToString()
  Assembly: Lucene.Net.dll
  Syntax
  
    public sealed class WAH8DocIdSet : DocIdSet
   
  Fields
  
  
  
  DEFAULT_INDEX_INTERVAL
  
  
  Declaration
  
    public const int DEFAULT_INDEX_INTERVAL = 24
   
  Field Value
  
    
      
    | Type | Description |
    |---|---|
    | System.Int32 |  |
    
  
  Properties
  
  
  
  
  IsCacheable
  
  
  Declaration
  
    public override bool IsCacheable { get; }
   
  Property Value
  
    
      
    | Type | Description |
    |---|---|
    | System.Boolean |  |
    
  
  Overrides
    DocIdSet.IsCacheable
  
  Methods
  
  
  
  
  Cardinality()
  Return the number of documents in this DocIdSet in constant time.
  Declaration
  
    public int Cardinality()
  Returns
  
    
      
    | Type | Description |
    |---|---|
    | System.Int32 |  |
    
  
  
  
  
  GetIterator()
  
  
  Declaration
  
    public override DocIdSetIterator GetIterator()
   
  Returns
  
    | Type | Description |
    |---|---|
    | DocIdSetIterator |  |
  
  Overrides
    DocIdSet.GetIterator()
  
  
  
  
  Intersect(ICollection<WAH8DocIdSet>)
  
  
  Declaration
  
    public static WAH8DocIdSet Intersect(ICollection<WAH8DocIdSet> docIdSets)
   
  Parameters
  
    
      
    | Type | Name | Description |
    |---|---|---|
    | System.Collections.Generic.ICollection<WAH8DocIdSet> | docIdSets |  |
    
  
  Returns
  
    | Type | Description |
    |---|---|
    | WAH8DocIdSet |  |
  
  
  
  
  Intersect(ICollection<WAH8DocIdSet>, Int32)
  Compute the intersection of the provided sets. This method is much faster than
  computing the intersection manually since it operates directly at the byte level.
  Declaration
  
    public static WAH8DocIdSet Intersect(ICollection<WAH8DocIdSet> docIdSets, int indexInterval)
   
  Parameters
  
    
      
    | Type | Name | Description |
    |---|---|---|
    | System.Collections.Generic.ICollection<WAH8DocIdSet> | docIdSets |  |
    | System.Int32 | indexInterval |  |
    
  
  Returns
  
    | Type | Description |
    |---|---|
    | WAH8DocIdSet |  |
  
  
  
  
  RamBytesUsed()
  Return the memory usage of this class in bytes.
  Declaration
  
    public long RamBytesUsed()
   
  Returns
  
    
      
    | Type | Description |
    |---|---|
    | System.Int64 |  |
    
  
  
  
  
  Union(ICollection<WAH8DocIdSet>)
  
  
  Declaration
  
    public static WAH8DocIdSet Union(ICollection<WAH8DocIdSet> docIdSets)
   
  Parameters
  
    
      
    | Type | Name | Description |
    |---|---|---|
    | System.Collections.Generic.ICollection<WAH8DocIdSet> | docIdSets |  |
    
  
  Returns
  
    | Type | Description |
    |---|---|
    | WAH8DocIdSet |  |
  
  
  
  
  Union(ICollection<WAH8DocIdSet>, Int32)
  Compute the union of the provided sets. This method is much faster than
  computing the union manually since it operates directly at the byte level.
  Declaration
  
    public static WAH8DocIdSet Union(ICollection<WAH8DocIdSet> docIdSets, int indexInterval)
   
  Parameters
  
    
      
    | Type | Name | Description |
    |---|---|---|
    | System.Collections.Generic.ICollection<WAH8DocIdSet> | docIdSets |  |
    | System.Int32 | indexInterval |  |
    
  
  Returns
  
    | Type | Description |
    |---|---|
    | WAH8DocIdSet |  |