Class CachingCollector
Caches all docs, and optionally also scores, coming from a search, and is then able to replay them to another collector. You specify the maximum RAM this class may use.
Once the collection is done, call IsCached. If it returns true, you can use Replay(ICollector) against a new collector. If it returns false, too much RAM was required and you must instead re-run the original search.
See the Lucene modules/grouping module for more details, including a full code example.
Note
This API is experimental and might change in incompatible ways in the next release.
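The workflow described above can be sketched as follows. This is a minimal illustration, not the example from the grouping module: the class and method names (`CachingCollectorExample`, `SearchTwice`), the hit count of 10, and the 4 MB budget are assumptions chosen for the sketch, and the caller is assumed to supply an already opened `IndexSearcher` and a `Query`.

```csharp
using Lucene.Net.Search;

public static class CachingCollectorExample
{
    // Hypothetical helper: runs one search, then feeds a second collector
    // from the cache if possible, or from a second search otherwise.
    public static TopDocs SearchTwice(IndexSearcher searcher, Query query)
    {
        // First pass: wrap a regular collector so hits are cached while
        // they are collected. Scores are cached too, because the second
        // collector below needs them during replay.
        TopScoreDocCollector first = TopScoreDocCollector.Create(10, true);
        CachingCollector cache = CachingCollector.Create(first, true, 4.0);
        searcher.Search(query, cache);

        // Second pass: replay from the cache if everything fit in RAM,
        // otherwise fall back to re-running the original search.
        TopScoreDocCollector second = TopScoreDocCollector.Create(10, true);
        if (cache.IsCached)
        {
            cache.Replay(second);
        }
        else
        {
            searcher.Search(query, second);
        }
        return second.GetTopDocs();
    }
}
```

Both collectors here are created as in-order collectors, so the out-of-order check performed by Replay(ICollector) is not expected to fail.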
Namespace: Lucene.Net.Search
Assembly: Lucene.Net.dll
Syntax
public abstract class CachingCollector : ICollector
Fields
m_base
Declaration
protected int m_base
Field Value
Type | Description |
---|---|
int |
m_cachedDocs
Declaration
protected readonly IList<int[]> m_cachedDocs
Field Value
Type | Description |
---|---|
IList<int[]> |
m_curDocs
Declaration
protected int[] m_curDocs
Field Value
Type | Description |
---|---|
int[] |
m_lastDocBase
Declaration
protected int m_lastDocBase
Field Value
Type | Description |
---|---|
int |
m_maxDocsToCache
Declaration
protected readonly int m_maxDocsToCache
Field Value
Type | Description |
---|---|
int |
m_other
Declaration
protected readonly ICollector m_other
Field Value
Type | Description |
---|---|
ICollector |
m_upto
Declaration
protected int m_upto
Field Value
Type | Description |
---|---|
int |
Properties
AcceptsDocsOutOfOrder
Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collect(int).
Most Lucene Query implementations will visit matching docIDs in order. However, some queries (currently limited to certain cases of BooleanQuery) can achieve faster searching if the ICollector allows them to deliver the docIDs out of order.
Many collectors don't mind getting docIDs out of order, so it's important to return true here.
Declaration
public virtual bool AcceptsDocsOutOfOrder { get; }
Property Value
Type | Description |
---|---|
bool |
IsCached
Indicates whether the collected documents (and scores, if requested) were fully cached. If this returns true, you can use Replay(ICollector) against a new collector. If it returns false, too much RAM was required and you must instead re-run the original search.
Declaration
public virtual bool IsCached { get; }
Property Value
Type | Description |
---|---|
bool |
Methods
Collect(int)
Called once for every document matching a query, with the unbased document number.
Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException. In this case, the last docs of the current AtomicReaderContext will be skipped and IndexSearcher will swallow the exception and continue collection with the next leaf.
Note: This is called in an inner search loop. For good search performance, implementations of this method should not call IndexSearcher.Doc(int) or IndexReader.Document(int) on every hit. Doing so can slow searches by an order of magnitude or more.
Declaration
public abstract void Collect(int doc)
Parameters
Type | Name | Description |
---|---|---|
int | doc |
Create(ICollector, bool, double)
Create a new CachingCollector that wraps the given collector and caches documents and scores up to the specified RAM threshold.
Declaration
public static CachingCollector Create(ICollector other, bool cacheScores, double maxRAMMB)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other | The ICollector to wrap and delegate calls to. |
bool | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
double | maxRAMMB | The maximum RAM in MB to consume for caching the documents and scores. If the collector exceeds the threshold, no documents and scores are cached. |
Returns
Type | Description |
---|---|
CachingCollector |
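To get a feel for how a RAM budget translates into a document limit, the sketch below does the arithmetic under one assumption that this page does not state: each cached document costs 4 bytes for its int doc ID, plus 4 bytes for a float score when scores are cached. The `CacheBudget` class and `EstimateMaxDocs` method are hypothetical names for illustration and are not part of Lucene.NET.

```csharp
using System;

public static class CacheBudget
{
    // Assumption: 4 bytes per int doc ID, plus 4 bytes per float score.
    // The real accounting inside CachingCollector may differ.
    public static int EstimateMaxDocs(bool cacheScores, double maxRamMB)
    {
        int bytesPerDoc = sizeof(int);
        if (cacheScores)
        {
            bytesPerDoc += sizeof(float);
        }
        return (int)((maxRamMB * 1024 * 1024) / bytesPerDoc);
    }
}
```

Under that assumption, a 1 MB budget covers 262144 documents without scores and 131072 with scores, which illustrates why caching scores increases the RAM consumed per doc.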
Create(ICollector, bool, int)
Create a new CachingCollector that wraps the given collector and caches documents and scores up to the specified max docs threshold.
Declaration
public static CachingCollector Create(ICollector other, bool cacheScores, int maxDocsToCache)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other | The ICollector to wrap and delegate calls to. |
bool | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
int | maxDocsToCache | The maximum number of documents to cache, along with their scores if score caching is enabled. If the collector exceeds the threshold, no documents and scores are cached. |
Returns
Type | Description |
---|---|
CachingCollector |
Create(bool, bool, double)
Creates a CachingCollector that does not wrap another collector. The cached documents and scores can later be replayed with Replay(ICollector).
Declaration
public static CachingCollector Create(bool acceptDocsOutOfOrder, bool cacheScores, double maxRAMMB)
Parameters
Type | Name | Description |
---|---|---|
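bool | acceptDocsOutOfOrder | Whether documents are allowed to be collected out of order. |
bool | cacheScores | Whether to cache scores in addition to document IDs. Note that this increases the RAM consumed per doc. |
double | maxRAMMB | The maximum RAM in MB to consume for caching the documents and scores. If the collector exceeds the threshold, no documents and scores are cached. |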
Returns
Type | Description |
---|---|
CachingCollector |
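A possible use of this overload is to run the search first and decide on the real collector afterwards. The sketch below assumes an existing `IndexSearcher` and `Query`; the names `DeferredCollectionExample` and `CollectLater`, the 5-hit limit, and the 4 MB budget are illustrative assumptions only.

```csharp
using Lucene.Net.Search;

public static class DeferredCollectionExample
{
    // Hypothetical helper: caches hits without a wrapped collector, then
    // replays them into a collector chosen after the search has finished.
    // Returns null when the hits did not fit into the RAM budget.
    public static TopDocs CollectLater(IndexSearcher searcher, Query query)
    {
        // In-order collection, scores cached, at most 4 MB of cache.
        CachingCollector cache = CachingCollector.Create(false, true, 4.0);
        searcher.Search(query, cache);

        if (!cache.IsCached)
        {
            return null; // caller must re-run the original search instead
        }

        TopScoreDocCollector top = TopScoreDocCollector.Create(5, true);
        cache.Replay(top);
        return top.GetTopDocs();
    }
}
```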
Replay(ICollector)
Replays the cached doc IDs (and scores) to the given ICollector. If this instance does not cache scores, then no Scorer is set via other.SetScorer(Scorer) and scores are not replayed.
Declaration
public abstract void Replay(ICollector other)
Parameters
Type | Name | Description |
---|---|---|
ICollector | other |
Exceptions
Type | Condition |
---|---|
InvalidOperationException | If this collector is not cached (i.e., if the RAM limits were too low for the number of documents + scores to cache). |
ArgumentException | If the given collector does not support out-of-order collection, while the collector wrapped at creation does. |
SetNextReader(AtomicReaderContext)
Called before collecting from each AtomicReaderContext. All doc IDs in Collect(int) will correspond to AtomicReaderContext.Reader.
Add AtomicReaderContext.DocBase to the current reader's internal document ID to re-base IDs in Collect(int).
Declaration
public virtual void SetNextReader(AtomicReaderContext context)
Parameters
Type | Name | Description |
---|---|---|
AtomicReaderContext | context | next atomic reader context |
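The re-basing described above can be illustrated with a small custom collector, for example one that could be passed to Replay(ICollector). This is a sketch: `GlobalDocIdCollector` and its `GlobalDocIds` property are hypothetical names, not part of Lucene.NET.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical collector that records index-wide document IDs.
public sealed class GlobalDocIdCollector : ICollector
{
    private int docBase;

    public List<int> GlobalDocIds { get; } = new List<int>();

    // The order of hits does not matter for a plain list of IDs.
    public bool AcceptsDocsOutOfOrder => true;

    public void SetScorer(Scorer scorer)
    {
        // Scores are not needed, so the scorer is ignored.
    }

    public void SetNextReader(AtomicReaderContext context)
    {
        // Remember the offset of the segment that is about to be collected.
        docBase = context.DocBase;
    }

    public void Collect(int doc)
    {
        // doc is relative to the current segment; add the base to re-base it.
        GlobalDocIds.Add(docBase + doc);
    }
}
```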
SetScorer(Scorer)
Called before successive calls to Collect(int). Implementations that need the score of the current document (passed in to Collect(int)) should save the passed-in Scorer and call scorer.GetScore() when needed.
Declaration
public abstract void SetScorer(Scorer scorer)
Parameters
Type | Name | Description |
---|---|---|
Scorer | scorer |