Class OpenBitSet
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet
is used in Java.
Unlike
java.util.BitSet
, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet
in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Assembly: Lucene.Net.dll
Syntax
public class OpenBitSet : DocIdSet, IBits
Constructors
OpenBitSet()
Constructor: allocates enough space for 64 bits.
Declaration
OpenBitSet(long)
Constructs an OpenBitSet large enough to hold numBits
.
Declaration
public OpenBitSet(long numBits)
Parameters
Type |
Name |
Description |
long |
numBits |
|
OpenBitSet(long[], int)
Constructs an OpenBitSet from an existing long[].
The first 64 bits are in long[0], with bit index 0 at the least significant
bit, and bit index 63 at the most significant. Given a bit index, the word
containing it is long[index/64], and it is at bit number index%64 within
that word.
numWords
are the number of elements in the array that contain set bits
(non-zero longs).
numWords
should be <= bits.Length, and any existing
words in the array at position >= numWords should be zero.
Declaration
public OpenBitSet(long[] bits, int numWords)
Parameters
Type |
Name |
Description |
long[] |
bits |
|
int |
numWords |
|
Fields
m_bits
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet
is used in Java.
Unlike
java.util.BitSet
, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet
in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Field Value
m_wlen
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet
is used in Java.
Unlike
java.util.BitSet
, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet
in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Field Value
Properties
Bits
Optionally provides a IBits interface for random access
to matching documents.
Declaration
public override IBits Bits { get; }
Property Value
Type |
Description |
IBits |
null , if this DocIdSet does not support random access.
In contrast to GetIterator(), a return value of null does not imply that no documents match the filter!
The default implementation does not provide random access, so you
only need to implement this method if your DocIdSet can
guarantee random access to every docid in O(1) time without
external disk access (as IBits interface cannot throw
IOException). This is generally true for bit sets
like FixedBitSet, which return
itself if they are used as DocIdSet.
|
Overrides
Capacity
Returns the current capacity in bits (1 greater than the index of the last bit).
Declaration
public virtual long Capacity { get; }
Property Value
Cardinality
Gets the number of set bits.
Declaration
public virtual long Cardinality { get; }
Property Value
Type |
Description |
long |
The number of set bits.
|
IsCacheable
This DocIdSet implementation is cacheable.
Declaration
public override bool IsCacheable { get; }
Property Value
Overrides
IsEmpty
Returns true
if there are no set bits
Declaration
public virtual bool IsEmpty { get; }
Property Value
Length
Returns the current capacity of this set. This is not equal to Cardinality.
NOTE: This is equivalent to size() or length() in Lucene.
Declaration
public virtual int Length { get; }
Property Value
NumWords
Expert: gets the number of longs in the array that are in use.
Declaration
public virtual int NumWords { get; }
Property Value
Methods
And(OpenBitSet)
Declaration
public virtual void And(OpenBitSet other)
Parameters
AndNot(OpenBitSet)
Declaration
public virtual void AndNot(OpenBitSet other)
Parameters
AndNotCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of "a and not b"
or "intersection(a, not(b))".
Neither set is modified.
Declaration
public static long AndNotCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Bits2words(long)
Returns the number of 64 bit words it would take to hold numBits
.
Declaration
public static int Bits2words(long numBits)
Parameters
Type |
Name |
Description |
long |
numBits |
|
Returns
Clear(int, int)
Clears a range of bits. Clearing past the end does not change the size of the set.
Declaration
public virtual void Clear(int startIndex, int endIndex)
Parameters
Type |
Name |
Description |
int |
startIndex |
Lower index
|
int |
endIndex |
One-past the last bit to clear
|
Clear(long)
Clears a bit, allowing access beyond the current set size without changing the size.
Declaration
public virtual void Clear(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Clear(long, long)
Clears a range of bits. Clearing past the end does not change the size of the set.
Declaration
public virtual void Clear(long startIndex, long endIndex)
Parameters
Type |
Name |
Description |
long |
startIndex |
Lower index
|
long |
endIndex |
One-past the last bit to clear
|
Clone()
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet
is used in Java.
Unlike
java.util.BitSet
, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet
in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
Returns
EnsureCapacity(long)
Ensure that the long[] is big enough to hold numBits, expanding it if
necessary.
Declaration
public virtual void EnsureCapacity(long numBits)
Parameters
Type |
Name |
Description |
long |
numBits |
|
EnsureCapacityWords(int)
Expand the long[] with the size given as a number of words (64 bit longs).
Declaration
public virtual void EnsureCapacityWords(int numWords)
Parameters
Type |
Name |
Description |
int |
numWords |
|
Equals(object)
Returns true
if both sets have the same bits set.
Declaration
public override bool Equals(object o)
Parameters
Type |
Name |
Description |
object |
o |
|
Returns
Overrides
ExpandingWordNum(long)
An "open" BitSet implementation that allows direct access to the array of words
storing the bits.
NOTE: This can be used in .NET any place where a
java.util.BitSet
is used in Java.
Unlike
java.util.BitSet
, the fact that bits are packed into an array of longs
is part of the interface. This allows efficient implementation of other algorithms
by someone other than the author. It also allows one to efficiently implement
alternate serialization or interchange formats.
OpenBitSet is faster than
java.util.BitSet
in most operations
and *much* faster at calculating cardinality of sets and results of set operations.
It can also handle sets of larger cardinality (up to 64 * 2**32-1)
The goals of
OpenBitSet are the fastest implementation possible, and
maximum code reuse. Extra safety and encapsulation
may always be built on top, but if that's built in, the cost can never be removed (and
hence people re-implement their own version in order to get better performance).
Performance Results
Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 3.36 | 3.96 | 1.44 | 1.46 | 1.99 | 1.58 |
1% full | 3.31 | 3.90 | | 1.04 | | 0.99 |
Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M
BitSet size = 1,000,000
Results are java.util.BitSet time divided by OpenBitSet time.
| cardinality | IntersectionCount | Union | NextSetBit | Get | GetIterator |
---|
50% full | 2.50 | 3.50 | 1.00 | 1.03 | 1.12 | 1.25 |
1% full | 2.51 | 3.49 | | 1.00 | | 1.02 |
Declaration
protected virtual int ExpandingWordNum(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
FastClear(int)
Clears a bit.
The index
should be less than the Length.
Declaration
public virtual void FastClear(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
FastClear(long)
Clears a bit.
The index
should be less than the Length.
Declaration
public virtual void FastClear(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
FastFlip(int)
Flips a bit.
The index
should be less than the Length.
Declaration
public virtual void FastFlip(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
FastFlip(long)
Flips a bit.
The index
should be less than the Length.
Declaration
public virtual void FastFlip(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
FastGet(int)
Returns true
or false
for the specified bit index
.
The index should be less than the Length.
Declaration
public virtual bool FastGet(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
FastGet(long)
Returns true
or false
for the specified bit index
.
The index should be less than the Length.
Declaration
public virtual bool FastGet(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
FastSet(int)
Sets the bit at the specified index
.
The index
should be less than the Length.
Declaration
public virtual void FastSet(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
FastSet(long)
Sets the bit at the specified index
.
The index
should be less than the Length.
Declaration
public virtual void FastSet(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Flip(long)
Flips a bit, expanding the set size if necessary.
Declaration
public virtual void Flip(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Flip(long, long)
Flips a range of bits, expanding the set size if necessary.
Declaration
public virtual void Flip(long startIndex, long endIndex)
Parameters
Type |
Name |
Description |
long |
startIndex |
Lower index
|
long |
endIndex |
One-past the last bit to flip
|
FlipAndGet(int)
Flips a bit and returns the resulting bit value.
The index
should be less than the Length.
Declaration
public virtual bool FlipAndGet(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
FlipAndGet(long)
Flips a bit and returns the resulting bit value.
The index
should be less than the Length.
Declaration
public virtual bool FlipAndGet(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
Get(int)
Returns true
or false
for the specified bit index
.
Declaration
public virtual bool Get(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
Get(long)
Returns true
or false
for the specified bit index
.
Declaration
public virtual bool Get(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
GetAndSet(int)
Sets a bit and returns the previous value.
The index
should be less than the Length.
Declaration
public virtual bool GetAndSet(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
GetAndSet(long)
Sets a bit and returns the previous value.
The index
should be less than the Length.
Declaration
public virtual bool GetAndSet(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
GetBit(int)
Returns 1 if the bit is set, 0 if not.
The index
should be less than the Length.
Declaration
public virtual int GetBit(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
GetBits()
Expert: returns the long[] storing the bits.
Declaration
public virtual long[] GetBits()
Returns
GetHashCode()
Serves as the default hash function.
Declaration
public override int GetHashCode()
Returns
Type |
Description |
int |
A hash code for the current object.
|
Overrides
GetIterator()
Provides a DocIdSetIterator to access the set.
This implementation can return null
if there
are no docs that match.
Declaration
public override DocIdSetIterator GetIterator()
Returns
Overrides
Intersect(OpenBitSet)
Declaration
public virtual void Intersect(OpenBitSet other)
Parameters
IntersectionCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the intersection of the two sets.
Neither set is modified.
Declaration
public static long IntersectionCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Intersects(OpenBitSet)
returns true
if the sets have any elements in common.
Declaration
public virtual bool Intersects(OpenBitSet other)
Parameters
Returns
NextSetBit(int)
Returns the index of the first set bit starting at the index
specified.
-1 is returned if there are no more set bits.
Declaration
public virtual int NextSetBit(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
NextSetBit(long)
Returns the index of the first set bit starting at the index
specified.
-1 is returned if there are no more set bits.
Declaration
public virtual long NextSetBit(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
Or(OpenBitSet)
Declaration
public virtual void Or(OpenBitSet other)
Parameters
PrevSetBit(int)
Returns the index of the first set bit starting downwards at
the index
specified.
-1 is returned if there are no more set bits.
Declaration
public virtual int PrevSetBit(int index)
Parameters
Type |
Name |
Description |
int |
index |
|
Returns
PrevSetBit(long)
Returns the index of the first set bit starting downwards at
the index
specified.
-1 is returned if there are no more set bits.
Declaration
public virtual long PrevSetBit(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Returns
Remove(OpenBitSet)
Remove all elements set in other. this = this AND_NOT other.
Declaration
public virtual void Remove(OpenBitSet other)
Parameters
Set(long)
Sets a bit, expanding the set size if necessary.
Declaration
public virtual void Set(long index)
Parameters
Type |
Name |
Description |
long |
index |
|
Set(long, long)
Sets a range of bits, expanding the set size if necessary.
Declaration
public virtual void Set(long startIndex, long endIndex)
Parameters
Type |
Name |
Description |
long |
startIndex |
Lower index
|
long |
endIndex |
One-past the last bit to set
|
TrimTrailingZeros()
Lowers numWords, the number of words in use,
by checking for trailing zero words.
Declaration
public virtual void TrimTrailingZeros()
Union(OpenBitSet)
Declaration
public virtual void Union(OpenBitSet other)
Parameters
UnionCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the union of the two sets.
Neither set is modified.
Declaration
public static long UnionCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Xor(OpenBitSet)
Declaration
public virtual void Xor(OpenBitSet other)
Parameters
XorCount(OpenBitSet, OpenBitSet)
Returns the popcount or cardinality of the exclusive-or of the two sets.
Neither set is modified.
Declaration
public static long XorCount(OpenBitSet a, OpenBitSet b)
Parameters
Returns
Implements