Class OpenBitSet
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet is used in Java.
Unlike
java.util.BitSet, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
| 1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
| 1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Assembly: Lucene.Net.dll
Syntax
public class OpenBitSet : DocIdSet, IBits
Constructors
OpenBitSet()
Constructor: allocates enough space for 64 bits.
Declaration
OpenBitSet(long)
Constructs an OpenBitSet large enough to hold numBits.
Declaration
public OpenBitSet(long numBits)
Parameters
| Type |
Name |
Description |
| long |
numBits |
|
OpenBitSet(long[], int)
Constructs an OpenBitSet from an existing long[].
The first 64 bits are in long[0], with bit index 0 at the least significant
bit, and bit index 63 at the most significant. Given a bit index, the word
containing it is long[index/64], and it is at bit number index%64 within
that word.
numWords are the number of elements in the array that contain set bits
(non-zero longs).
numWords should be <= bits.Length, and any existing
words in the array at position >= numWords should be zero.
Declaration
public OpenBitSet(long[] bits, int numWords)
Parameters
| Type |
Name |
Description |
| long[] |
bits |
|
| int |
numWords |
|
Fields
m_bits
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet is used in Java.
Unlike
java.util.BitSet, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
| 1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
| 1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Field Value
m_wlen
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet is used in Java.
Unlike
java.util.BitSet, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
| 1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
| 1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Field Value
Properties
Bits
Optionally provides a IBits interface for random access
to matching documents.
Declaration
public override IBits Bits { get; }
Property Value
| Type |
Description |
| IBits |
null, if this DocIdSet does not support random access.
In contrast to GetIterator(), a return value of nulldoes not imply that no documents match the filter!
The default implementation does not provide random access, so you
only need to implement this method if your DocIdSet can
guarantee random access to every docid in O(1) time without
external disk access (as IBits interface cannot throw
IOException). This is generally true for bit sets
like FixedBitSet, which return
itself if they are used as DocIdSet.
|
Overrides
Capacity
Returns the current capacity in bits (1 greater than the index of the last bit).
Declaration
public virtual long Capacity { get; }
Property Value
Cardinality
Gets the number of set bits.
Declaration
public virtual long Cardinality { get; }
Property Value
| Type |
Description |
| long |
The number of set bits.
|
IsCacheable
This DocIdSet implementation is cacheable.
Declaration
public override bool IsCacheable { get; }
Property Value
Overrides
IsEmpty
Returns true if there are no set bits
Declaration
public virtual bool IsEmpty { get; }
Property Value
Length
Returns the current capacity of this set. This is not equal to Cardinality.
NOTE: This is equivalent to size() or length() in Lucene.
Declaration
public virtual int Length { get; }
Property Value
NumWords
Expert: gets the number of longs in the array that are in use.
Declaration
public virtual int NumWords { get; }
Property Value
Methods
And(OpenBitSet)
Declaration
public virtual void And(OpenBitSet other)
Parameters
AndNot(OpenBitSet)
Declaration
public virtual void AndNot(OpenBitSet other)
Parameters
AndNotCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of "a and not b"
or "intersection(a, not(b))".
Neither set is modified.
Declaration
public static long AndNotCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Bits2words(long)
Returns the number of 64 bit words it would take to hold numBits.
Declaration
public static int Bits2words(long numBits)
Parameters
| Type |
Name |
Description |
| long |
numBits |
|
Returns
Clear(int, int)
Clears a range of bits. Clearing past the end does not change the size of the set.
Declaration
public virtual void Clear(int startIndex, int endIndex)
Parameters
| Type |
Name |
Description |
| int |
startIndex |
Lower index
|
| int |
endIndex |
One-past the last bit to clear
|
Clear(long)
Clears a bit, allowing access beyond the current set size without changing the size.
Declaration
public virtual void Clear(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Clear(long, long)
Clears a range of bits. Clearing past the end does not change the size of the set.
Declaration
public virtual void Clear(long startIndex, long endIndex)
Parameters
| Type |
Name |
Description |
| long |
startIndex |
Lower index
|
| long |
endIndex |
One-past the last bit to clear
|
Clone()
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet is used in Java.
Unlike
java.util.BitSet, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
| 1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
| 1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Returns
EnsureCapacity(long)
Ensure that the long[] is big enough to hold numBits, expanding it if
necessary.
Declaration
public virtual void EnsureCapacity(long numBits)
Parameters
| Type |
Name |
Description |
| long |
numBits |
|
EnsureCapacityWords(int)
Expand the long[] with the size given as a number of words (64 bit longs).
Declaration
public virtual void EnsureCapacityWords(int numWords)
Parameters
| Type |
Name |
Description |
| int |
numWords |
|
Equals(object)
Returns true if both sets have the same bits set.
Declaration
public override bool Equals(object o)
Parameters
| Type |
Name |
Description |
| object |
o |
|
Returns
Overrides
ExpandingWordNum(long)
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet is used in Java.
Unlike
java.util.BitSet, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
| 1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
|---|
| 50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
| 1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
protected virtual int ExpandingWordNum(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
FastClear(int)
Clears a bit.
The index should be less than the Length.
Declaration
public virtual void FastClear(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
FastClear(long)
Clears a bit.
The index should be less than the Length.
Declaration
public virtual void FastClear(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
FastFlip(int)
Flips a bit.
The index should be less than the Length.
Declaration
public virtual void FastFlip(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
FastFlip(long)
Flips a bit.
The index should be less than the Length.
Declaration
public virtual void FastFlip(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
FastGet(int)
Returns true or false for the specified bit index.
The index should be less than the Length.
Declaration
public virtual bool FastGet(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
FastGet(long)
Returns true or false for the specified bit index.
The index should be less than the Length.
Declaration
public virtual bool FastGet(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
FastSet(int)
Sets the bit at the specified index.
The index should be less than the Length.
Declaration
public virtual void FastSet(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
FastSet(long)
Sets the bit at the specified index.
The index should be less than the Length.
Declaration
public virtual void FastSet(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Flip(long)
Flips a bit, expanding the set size if necessary.
Declaration
public virtual void Flip(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Flip(long, long)
Flips a range of bits, expanding the set size if necessary.
Declaration
public virtual void Flip(long startIndex, long endIndex)
Parameters
| Type |
Name |
Description |
| long |
startIndex |
Lower index
|
| long |
endIndex |
One-past the last bit to flip
|
FlipAndGet(int)
Flips a bit and returns the resulting bit value.
The index should be less than the Length.
Declaration
public virtual bool FlipAndGet(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
FlipAndGet(long)
Flips a bit and returns the resulting bit value.
The index should be less than the Length.
Declaration
public virtual bool FlipAndGet(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
Get(int)
Returns true or false for the specified bit index.
Declaration
public virtual bool Get(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
Get(long)
Returns true or false for the specified bit index.
Declaration
public virtual bool Get(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
GetAndSet(int)
Sets a bit and returns the previous value.
The index should be less than the Length.
Declaration
public virtual bool GetAndSet(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
GetAndSet(long)
Sets a bit and returns the previous value.
The index should be less than the Length.
Declaration
public virtual bool GetAndSet(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
GetBit(int)
Returns 1 if the bit is set, 0 if not.
The index should be less than the Length.
Declaration
public virtual int GetBit(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
GetBits()
Expert: returns the long[] storing the bits.
Declaration
public virtual long[] GetBits()
Returns
GetHashCode()
Serves as the default hash function.
Declaration
public override int GetHashCode()
Returns
| Type |
Description |
| int |
A hash code for the current object.
|
Overrides
GetIterator()
Provides a DocIdSetIterator to access the set.
This implementation can return null if there
are no docs that match.
Declaration
public override DocIdSetIterator GetIterator()
Returns
Overrides
Intersect(OpenBitSet)
Declaration
public virtual void Intersect(OpenBitSet other)
Parameters
IntersectionCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the intersection of the two sets.
Neither set is modified.
Declaration
public static long IntersectionCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Intersects(OpenBitSet)
returns true if the sets have any elements in common.
Declaration
public virtual bool Intersects(OpenBitSet other)
Parameters
Returns
NextSetBit(int)
Returns the index of the first set bit starting at the index specified.
-1 is returned if there are no more set bits.
Declaration
public virtual int NextSetBit(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
NextSetBit(long)
Returns the index of the first set bit starting at the index specified.
-1 is returned if there are no more set bits.
Declaration
public virtual long NextSetBit(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
Or(OpenBitSet)
Declaration
public virtual void Or(OpenBitSet other)
Parameters
PrevSetBit(int)
Returns the index of the first set bit starting downwards at
the index specified.
-1 is returned if there are no more set bits.
Declaration
public virtual int PrevSetBit(int index)
Parameters
| Type |
Name |
Description |
| int |
index |
|
Returns
PrevSetBit(long)
Returns the index of the first set bit starting downwards at
the index specified.
-1 is returned if there are no more set bits.
Declaration
public virtual long PrevSetBit(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Returns
Remove(OpenBitSet)
Remove all elements set in other. this = this AND_NOT other.
Declaration
public virtual void Remove(OpenBitSet other)
Parameters
Set(long)
Sets a bit, expanding the set size if necessary.
Declaration
public virtual void Set(long index)
Parameters
| Type |
Name |
Description |
| long |
index |
|
Set(long, long)
Sets a range of bits, expanding the set size if necessary.
Declaration
public virtual void Set(long startIndex, long endIndex)
Parameters
| Type |
Name |
Description |
| long |
startIndex |
Lower index
|
| long |
endIndex |
One-past the last bit to set
|
TrimTrailingZeros()
Lowers numWords, the number of words in use,
by checking for trailing zero words.
Declaration
public virtual void TrimTrailingZeros()
Union(OpenBitSet)
Declaration
public virtual void Union(OpenBitSet other)
Parameters
UnionCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the union of the two sets.
Neither set is modified.
Declaration
public static long UnionCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Xor(OpenBitSet)
Declaration
public virtual void Xor(OpenBitSet other)
Parameters
XorCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the exclusive-or of the two sets.
Neither set is modified.
Declaration
public static long XorCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Implements