Namespace Lucene.Net.Facet.Taxonomy.WriterCache
Improves indexing time by caching a map from each CategoryPath to its ordinal.
Classes
Cl2oTaxonomyWriterCache
ITaxonomy
CollisionMap
HashMap to store colliding labels. See Compact
CompactLabelToOrdinal
This is a very efficient Label
Since the Lucene.
This data structure grows by adding a new HashArray whenever the number of
collisions in the CollisionGetMaxOrdinal().
Growing also includes reinserting all colliding
labels into the Lucene.
For setting the Lucene.
This data structure has a much lower memory footprint (~30%) compared to a Java HashMap<String, Integer>. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. Ingestion speed was also ~50% faster compared to a HashMap for 3M unique labels.
LabelToOrdinal
Abstract class for storing Label->Ordinal mappings in a taxonomy.
LruTaxonomyWriterCache
LRU ITaxonomy
NameHashInt32CacheLru
An LRU cache that maps names to ints. Used to cache the ordinals of category paths. The key is a hash of the path rather than the path itself. This way the cache takes less RAM, but correctness depends on the assumption that no two paths hash to the same value.
NOTE: this was NameHashIntCacheLRU in Lucene
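The trade-off described above can be illustrated with a minimal, standalone sketch. It is not the Lucene.NET implementation: the class name HashKeyedLruSketch is hypothetical, and Java's String.hashCode() stands in for whatever hash the real cache uses. It only shows how an LRU cache keyed by a hash saves the cost of storing the key, and how a collision silently returns the wrong ordinal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Lucene.NET class.
class HashKeyedLruSketch {
    // Keys are hashes of the names, so the names themselves are never stored.
    private final LinkedHashMap<Integer, Integer> map;

    HashKeyedLruSketch(int maxSize) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.map = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > maxSize;
            }
        };
    }

    void put(String name, int ordinal) {
        // Assumption: String.hashCode() stands in for the real hash function.
        map.put(name.hashCode(), ordinal);
    }

    // Returns the cached ordinal, or null if the hash is not in the cache.
    Integer get(String name) {
        return map.get(name.hashCode());
    }

    int size() {
        return map.size();
    }
}
```

In Java, the strings "Aa" and "BB" have the same hashCode(), so after put("Aa", 7) a lookup of "BB" in this sketch also returns 7. A cache keyed by the full name avoids this at the cost of keeping every name in memory.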
NameInt32CacheLru
An LRU cache that maps names to ints. Used to cache the ordinals of category paths.
NOTE: This was NameIntCacheLRU in Lucene
Interfaces
INameInt32CacheLru
Public members of the Name
ITaxonomyWriterCache
ITaxonomy
It basically has Put(Facet
However, if it does so, it should clear out large parts of the cache at once,
because the user will typically need to work hard to recover from every cache
cleanup (see Put(Facet
NOTE: The cache may be accessed concurrently by multiple threads, so cache implementations should be written with this in mind.
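The advice to clear large parts of the cache at once, rather than one entry at a time, can be sketched as follows. This is a hedged, standalone illustration and not the Lucene.NET interface: the class name BulkEvictingCacheSketch, the boolean return value of put, the -1 miss value, and the two-thirds retention fraction are all assumptions made for the example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical sketch; names and eviction fraction are assumptions.
class BulkEvictingCacheSketch {
    private final int maxSize;
    // accessOrder = true iterates from least to most recently used.
    private final LinkedHashMap<String, Integer> map =
            new LinkedHashMap<>(16, 0.75f, true);

    BulkEvictingCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns true if entries were dropped as a result of this call, so the
    // caller knows it can no longer rely on the cache being complete.
    // synchronized because the cache may be used by several threads.
    synchronized boolean put(String path, int ordinal) {
        map.put(path, ordinal);
        if (map.size() <= maxSize) {
            return false;
        }
        // Clear a large part in one sweep: keep roughly two thirds of the limit.
        int target = (2 * maxSize) / 3;
        Iterator<String> it = map.keySet().iterator();
        while (map.size() > target && it.hasNext()) {
            it.next();
            it.remove(); // removes the least recently used entry
        }
        return true;
    }

    // Returns the cached ordinal, or -1 if the path is not cached.
    synchronized int get(String path) {
        Integer ordinal = map.get(path);
        return ordinal == null ? -1 : ordinal;
    }

    synchronized int size() {
        return map.size();
    }
}
```

Evicting in bulk means the expensive recovery step on the caller's side happens once per sweep instead of once per insertion after the cache first fills up.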
Enums
LruTaxonomyWriterCache.LRUType
Determines the cache type. For guaranteed correctness that does not rely on the hash function being collision-free, use LRU_STRING.