Class CachingCollector
Caches all docs, and optionally also scores, coming from a search, and is then able to replay them to another collector. You specify the max RAM this class may use. Once the collection is done, call IsCached. If this returns true, you can use Replay(ICollector) against a new collector. If it returns false, this means too much RAM was required and you must instead re-run the original search.
NOTE: this class consumes 4 bytes (or 8 bytes, if scores are cached) per collected document. If the result set is large, this can easily amount to a very substantial amount of RAM!
NOTE: this class caches at least 128 documents before checking RAM limits.
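To make the per-document cost above concrete, the following sketch estimates how many documents fit in a given RAM budget. It is an illustration only: the names CacheBudget and MaxDocsToCache are hypothetical, and the conversion assumes 1 MB = 1024 * 1024 bytes and ignores array and object overhead. It is not necessarily how the library derives its internal limit.

```csharp
using System;

public static class CacheBudget
{
    // Hypothetical helper: estimates how many documents fit in the budget,
    // assuming 4 bytes per doc ID plus 4 bytes per score when scores are cached.
    public static int MaxDocsToCache(double maxRamMB, bool cacheScores)
    {
        int bytesPerDoc = sizeof(int) + (cacheScores ? sizeof(float) : 0);
        // Assumption: 1 MB = 1024 * 1024 bytes; array overhead is ignored.
        return (int)(maxRamMB * 1024 * 1024 / bytesPerDoc);
    }
}
```

Under these assumptions, a 64 MB budget holds roughly 16.7 million doc IDs, or half that many when scores are cached as well.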
See the Lucene modules/grouping module for more details, including a full code example.
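The collect-then-replay workflow described above can be sketched as follows. This is a minimal illustration assuming Lucene.NET 4.8, not the grouping module's example: the class CachingCollectorExample, the method SearchTwice, and the choice of a 10-hit TopScoreDocCollector are hypothetical, and the caller is assumed to supply an open IndexSearcher and a Query.

```csharp
using Lucene.Net.Search;

public static class CachingCollectorExample
{
    // Hypothetical helper: runs the search once through a caching collector,
    // then feeds a second collector either from the cache or by re-running
    // the search when the cache could not hold all hits.
    public static TopDocs[] SearchTwice(IndexSearcher searcher, Query query, double maxRamMB)
    {
        // First pass: collect the top hits while caching doc IDs and scores.
        TopScoreDocCollector first = TopScoreDocCollector.Create(10, true);
        CachingCollector cache = CachingCollector.Create(first, true, maxRamMB);
        searcher.Search(query, cache);

        // Second pass: replay from the cache if it fit within the RAM limit.
        TopScoreDocCollector second = TopScoreDocCollector.Create(10, true);
        if (cache.IsCached)
        {
            cache.Replay(second);
        }
        else
        {
            // Too much RAM was required, so the original search must be re-run.
            searcher.Search(query, second);
        }

        return new[] { first.GetTopDocs(), second.GetTopDocs() };
    }
}
```

Both collectors here are created as in-order collectors, which avoids the ArgumentException that Replay(ICollector) throws when the replay target does not accept out-of-order documents but the wrapped collector does.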
Namespace: Lucene.Net.Search
Assembly: Lucene.Net.dll
Syntax
public abstract class CachingCollector : ICollector
Fields
m_base
Declaration
protected int m_base
Field Value
Type | Description |
---|---|
System.Int32 |
m_cachedDocs
Declaration
protected readonly IList<int[]> m_cachedDocs
Field Value
Type | Description |
---|---|
System.Collections.Generic.IList<System.Int32[]> |
m_curDocs
Declaration
protected int[] m_curDocs
Field Value
Type | Description |
---|---|
System.Int32[] |
m_lastDocBase
Declaration
protected int m_lastDocBase
Field Value
Type | Description |
---|---|
System.Int32 |
m_maxDocsToCache
Declaration
protected readonly int m_maxDocsToCache
Field Value
Type | Description |
---|---|
System.Int32 |
m_other
Declaration
protected readonly ICollector m_other
Field Value
Type | Description |
---|---|
ICollector |
m_upto
Declaration
protected int m_upto
Field Value
Type | Description |
---|---|
System.Int32 |
Properties
AcceptsDocsOutOfOrder
Declaration
public virtual bool AcceptsDocsOutOfOrder { get; }
Property Value
Type | Description |
---|---|
System.Boolean |
IsCached
Declaration
public virtual bool IsCached { get; }
Property Value
Type | Description |
---|---|
System.Boolean |
Methods
Collect(Int32)
Called once for every document matching a query, with the unbased document number.
Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException. In this case, the last docs of the current AtomicReaderContext will be skipped and IndexSearcher will swallow the exception and continue collection with the next leaf.
Note: this is called in an inner search loop. For good search performance, implementations of this method should not call Doc(Int32) or Document(Int32) on every hit. Doing so can slow searches by an order of magnitude or more.
Declaration
public abstract void Collect(int doc)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | doc |
Create(ICollector, Boolean, Double)
Create a new CachingCollector that wraps the given collector and caches documents and scores up to the specified RAM threshold.
Declaration
public static CachingCollector Create(ICollector other, bool cacheScores, double maxRAMMB)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other | The ICollector to wrap and delegate calls to. |
System.Boolean | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
System.Double | maxRAMMB | The maximum RAM in MB to consume for caching the documents and scores. If the collector exceeds the threshold, no documents and scores are cached. |
Returns
Type | Description |
---|---|
CachingCollector |
Create(ICollector, Boolean, Int32)
Create a new CachingCollector that wraps the given collector and caches documents and scores up to the specified max docs threshold.
Declaration
public static CachingCollector Create(ICollector other, bool cacheScores, int maxDocsToCache)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other | The ICollector to wrap and delegate calls to. |
System.Boolean | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
System.Int32 | maxDocsToCache | The maximum number of documents to cache, along with their scores if score caching is enabled. If the collector exceeds the threshold, no documents and scores are cached. |
Returns
Type | Description |
---|---|
CachingCollector |
Create(Boolean, Boolean, Double)
Creates a CachingCollector which does not wrap another collector. The cached documents and scores can later be replayed (Replay(ICollector)).
Declaration
public static CachingCollector Create(bool acceptDocsOutOfOrder, bool cacheScores, double maxRAMMB)
Parameters
Type | Name | Description |
---|---|---|
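System.Boolean | acceptDocsOutOfOrder | Whether documents are allowed to be collected out of order. |
System.Boolean | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
System.Double | maxRAMMB | The maximum RAM in MB to consume for caching the documents and scores. If the collector exceeds the threshold, no documents and scores are cached. |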
Returns
Type | Description |
---|---|
CachingCollector |
Replay(ICollector)
Replays the cached doc IDs (and scores) to the given ICollector. If this instance does not cache scores, then other.SetScorer(Scorer) is not called and scores are not replayed.
Declaration
public abstract void Replay(ICollector other)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other |
Exceptions
Type | Condition |
---|---|
System.InvalidOperationException | If this collector is not cached (i.e., if the RAM limits were too low for the number of documents + scores to cache). |
System.ArgumentException | If the given collector does not support out-of-order collection, while the collector wrapped by this instance does. |
SetNextReader(AtomicReaderContext)
Declaration
public virtual void SetNextReader(AtomicReaderContext context)
Parameters
Type | Name | Description |
---|---|---|
AtomicReaderContext | context |
SetScorer(Scorer)
Called before successive calls to Collect(Int32). Implementations that need the score of the current document (passed-in to Collect(Int32)) should save the passed-in Scorer and call scorer.GetScore() when needed.
Declaration
public abstract void SetScorer(Scorer scorer)
Parameters
Type | Name | Description |
---|---|---|
Scorer | scorer |