Class CompactLabelToOrdinal
This is a very efficient LabelToOrdinal implementation that uses a CharBlockArray to store all labels and a configurable number of CompactLabelToOrdinal.HashArrays to reference the labels.
Since the CompactLabelToOrdinal.HashArrays don't handle collisions, a CollisionMap is used to store the colliding labels.
This data structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds loadFactor * GetMaxOrdinal().
Growing also includes reinserting all colliding labels into the CompactLabelToOrdinal.HashArrays to possibly reduce the number of collisions.
For setting the loadFactor see CompactLabelToOrdinal(int, float, int).
This data structure has a much lower memory footprint (~30%) compared to a Java HashMap<String, Integer>. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. Ingestion was also ~50% faster than with a HashMap for 3M unique labels.
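The growth rule described above can be illustrated with a small toy model. This is a sketch under simplifying assumptions, not the Lucene.NET implementation: labels are plain strings, the collision map is an ordinary Dictionary, each new hash array is assumed to be twice the size of the newest one, and every name (ToyLabelToOrdinal, HashArrayCount, InvalidOrdinal, and so on) is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Toy model, NOT the Lucene.NET implementation: labels are plain strings and
// all type and member names here are hypothetical.
public sealed class ToyLabelToOrdinal
{
    public const int InvalidOrdinal = -2; // assumed "not found" sentinel

    // One fixed-size table. A slot holds at most one label, so the table
    // cannot resolve collisions on its own.
    private sealed class HashArray
    {
        public readonly string[] Labels;
        public readonly int[] Ordinals;

        public HashArray(int capacity)
        {
            Labels = new string[capacity];
            Ordinals = new int[capacity];
        }

        public int SlotFor(string label) =>
            (label.GetHashCode() & int.MaxValue) % Labels.Length;
    }

    private readonly float loadFactor;
    private readonly List<HashArray> hashArrays = new List<HashArray>();
    private readonly Dictionary<string, int> collisionMap = new Dictionary<string, int>();
    private int maxOrdinal; // largest ordinal added so far

    public ToyLabelToOrdinal(int initialCapacity, float loadFactor, int numHashArrays)
    {
        if (initialCapacity < 1 || numHashArrays < 1)
            throw new ArgumentOutOfRangeException();
        this.loadFactor = loadFactor;
        for (int i = 0; i < numHashArrays; i++)
            hashArrays.Add(new HashArray(initialCapacity));
    }

    public int HashArrayCount => hashArrays.Count;
    public int CollisionCount => collisionMap.Count;

    public int GetOrdinal(string label)
    {
        foreach (HashArray array in hashArrays)
        {
            int slot = array.SlotFor(label);
            if (array.Labels[slot] == label) return array.Ordinals[slot];
        }
        return collisionMap.TryGetValue(label, out int ordinal) ? ordinal : InvalidOrdinal;
    }

    public void AddLabel(string label, int ordinal)
    {
        int existing = GetOrdinal(label);
        if (existing != InvalidOrdinal)
        {
            if (existing != ordinal)
                throw new ArgumentException("Label already mapped to ordinal " + existing);
            return; // same label, same ordinal: nothing to do
        }
        // Growth rule from the description: grow once the number of
        // collisions exceeds loadFactor * max ordinal.
        if (collisionMap.Count > loadFactor * maxOrdinal) Grow();
        Insert(label, ordinal);
        maxOrdinal = Math.Max(maxOrdinal, ordinal);
    }

    private void Insert(string label, int ordinal)
    {
        foreach (HashArray array in hashArrays)
        {
            int slot = array.SlotFor(label);
            if (array.Labels[slot] == null)
            {
                array.Labels[slot] = label;
                array.Ordinals[slot] = ordinal;
                return;
            }
        }
        collisionMap[label] = ordinal; // every candidate slot was taken
    }

    private void Grow()
    {
        // Assumption: the new array is twice the size of the newest one.
        int newCapacity = hashArrays[hashArrays.Count - 1].Labels.Length * 2;
        hashArrays.Add(new HashArray(newCapacity));

        // Reinsert the colliding labels; some may now find a free slot.
        var colliding = new List<KeyValuePair<string, int>>(collisionMap);
        collisionMap.Clear();
        foreach (KeyValuePair<string, int> entry in colliding)
            Insert(entry.Key, entry.Value);
    }
}
```

With an initial capacity of 1 and a single hash array, the second label is forced into the collision map, and adding a third label then triggers a grow step that moves the colliding label into the new array. A lower load factor in this model trades memory (more hash arrays) for fewer entries in the collision map.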
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Facet.Taxonomy.WriterCache
Assembly: Lucene.Net.Facet.dll
Syntax
public class CompactLabelToOrdinal : LabelToOrdinal
Constructors
CompactLabelToOrdinal(int, float, int)
Sole constructor.
Declaration
public CompactLabelToOrdinal(int initialCapacity, float loadFactor, int numHashArrays)
Parameters
Type | Name | Description |
---|---|---|
int | initialCapacity | |
float | loadFactor | |
int | numHashArrays | |
Fields
DefaultLoadFactor
Default maximum load factor.
Declaration
public const float DefaultLoadFactor = 0.15f
Field Value
Type | Description |
---|---|
float |
TERMINATOR_CHAR
Declaration
public const char TERMINATOR_CHAR = '\uffff'
Field Value
Type | Description |
---|---|
char |
Properties
SizeOfMap
The number of labels.
Declaration
public virtual int SizeOfMap { get; }
Property Value
Type | Description |
---|---|
int |
Methods
AddLabel(FacetLabel, int)
Adds a new label if it is not yet in the table. Throws an ArgumentException if the same label was previously added to this table with a different ordinal.
Declaration
public override void AddLabel(FacetLabel label, int ordinal)
Parameters
Type | Name | Description |
---|---|---|
FacetLabel | label | |
int | ordinal | |
Overrides
LabelToOrdinal.AddLabel(FacetLabel, int)
GetOrdinal(FacetLabel)
Returns the ordinal assigned to the given label, or INVALID_ORDINAL if the label cannot be found in this table.
Declaration
public override int GetOrdinal(FacetLabel label)
Parameters
Type | Name | Description |
---|---|---|
FacetLabel | label | |
Returns
Type | Description |
---|---|
int |
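A minimal usage sketch of the constructor together with AddLabel and GetOrdinal follows. It assumes the Lucene.Net.Facet package is referenced and that FacetLabel can be built from its path components. The wrapper class name, the capacity and hash array count, and the label values are arbitrary illustrations, not recommended settings.

```csharp
using System;
using Lucene.Net.Facet.Taxonomy;
using Lucene.Net.Facet.Taxonomy.WriterCache;

// Hypothetical wrapper class; only the Lucene.NET members named in this
// page (constructor, DefaultLoadFactor, AddLabel, GetOrdinal) are used.
public static class CompactLabelToOrdinalExample
{
    // Returns { ordinal of a known label, ordinal of an unknown label }.
    public static int[] Run()
    {
        // Arbitrary illustrative values: 1024 initial slots, 3 hash arrays.
        var map = new CompactLabelToOrdinal(
            1024, CompactLabelToOrdinal.DefaultLoadFactor, 3);

        // Assumption: FacetLabel accepts the path components as strings.
        var twain = new FacetLabel("Author", "Mark Twain");
        var austen = new FacetLabel("Author", "Jane Austen");

        map.AddLabel(twain, 1);
        map.AddLabel(austen, 2);

        // Adding the same label with the same ordinal again is not an error;
        // a different ordinal would throw an ArgumentException.
        map.AddLabel(twain, 1);

        int known = map.GetOrdinal(twain);
        // Never added, so this should be INVALID_ORDINAL per the description.
        int unknown = map.GetOrdinal(new FacetLabel("Author", "Unknown"));

        return new[] { known, unknown };
    }
}
```

Callers would typically compare the result of GetOrdinal against INVALID_ORDINAL before using it as an ordinal.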