Namespace Lucene.Net.Spatial.Prefix

Prefix Tree Strategy

Classes

AbstractPrefixTreeFilter

Base class for Lucene Filters on SpatialPrefixTree fields.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

AbstractPrefixTreeFilter.BaseTermsEnumTraverser

Holds transient state and docid collecting utility methods as part of traversing a TermsEnum.

AbstractVisitingPrefixTreeFilter

Traverses a SpatialPrefixTree indexed field, using the template & visitor design patterns for subclasses to guide the traversal and collect matching documents.

Subclasses implement GetDocIdSet(AtomicReaderContext, IBits) by instantiating a custom AbstractVisitingPrefixTreeFilter.VisitorTemplate subclass (i.e. an anonymous inner class) and implementing the required methods.

This is a Lucene.NET INTERNAL API, use at your own risk

AbstractVisitingPrefixTreeFilter.VisitorTemplate

An abstract class designed to make it easy to implement predicates or other operations on a SpatialPrefixTree indexed field. An instance of this class is not designed to be reused across AtomicReaderContext instances, so simply create a new one for each call to, say, GetDocIdSet(AtomicReaderContext, IBits).

The GetDocIdSet() method here starts the work. It first checks that there are indexed terms; if not, it quickly returns null. Then it calls Start() so a subclass can set up a return value, such as a FixedBitSet. Then it starts the traversal process, calling FindSubCellsToVisit(Cell), which by default finds the top cells that intersect queryShape. If there isn't an indexed cell for a corresponding cell returned from this method, then it's short-circuited until it finds one, at which point Visit(Cell) is called. At some depths of the tree, the algorithm switches to a scanning mode that calls VisitScanned(Cell) for each leaf cell found.

This is a Lucene.NET INTERNAL API, use at your own risk
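
The traversal described for VisitorTemplate can be illustrated with a small, language-neutral sketch. This is not the Lucene.NET implementation: the quadtree over the unit square, the "ABCD" cell tokens, the `IntersectsVisitor` class, and all helper names are assumptions made up for illustration, and the scanning mode is left out. It only shows the shape of the template: return early when nothing is indexed, set up a result, find sub-cells that intersect the query, skip sub-trees with no indexed terms, and let a visit step decide whether to collect or descend.

```python
from bisect import bisect_left

def cell_bounds(token):
    """Rectangle (min_x, min_y, max_x, max_y) of a toy quadtree cell."""
    x0, y0, side = 0.0, 0.0, 1.0
    for ch in token:
        side /= 2
        i = "ABCD".index(ch)
        x0 += (i % 2) * side
        y0 += (i // 2) * side
    return (x0, y0, x0 + side, y0 + side)

def intersects(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def within(a, b):
    """True if rectangle a lies inside rectangle b."""
    return a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2] and a[3] <= b[3]

class IntersectsVisitor:
    """Hypothetical stand-in for a VisitorTemplate subclass."""

    def __init__(self, index, query, max_level):
        self.index = index            # leaf token -> set of doc ids
        self.tokens = sorted(index)   # stand-in for the sorted terms of a field
        self.query = query
        self.max_level = max_level

    def first_at_or_after(self, prefix):
        return bisect_left(self.tokens, prefix)

    def has_indexed_prefix(self, prefix):
        i = self.first_at_or_after(prefix)
        return i < len(self.tokens) and self.tokens[i].startswith(prefix)

    def docs_under(self, prefix):
        docs, i = set(), self.first_at_or_after(prefix)
        while i < len(self.tokens) and self.tokens[i].startswith(prefix):
            docs |= self.index[self.tokens[i]]
            i += 1
        return docs

    def get_doc_id_set(self):
        if not self.tokens:
            return None               # no indexed terms: return early
        self.results = set()          # the "start" step: set up the return value
        stack = [""]                  # "" is the world cell
        while stack:
            cell = stack.pop()
            for sub in self.find_sub_cells_to_visit(cell):
                if not self.has_indexed_prefix(sub):
                    continue          # nothing indexed below this cell: skip it
                if self.visit(sub):
                    stack.append(sub)
        return self.results

    def find_sub_cells_to_visit(self, cell):
        subs = [cell + ch for ch in "ABCD"]
        return [s for s in subs if intersects(cell_bounds(s), self.query)]

    def visit(self, cell):
        """Collect matches; return True to keep descending."""
        if within(cell_bounds(cell), self.query) or len(cell) == self.max_level:
            self.results |= self.docs_under(cell)
            return False
        return True

# Three points indexed at level 3 of the toy tree.
index = {"AAA": {1}, "ADA": {3}, "DDD": {2}}
matches = IntersectsVisitor(index, (0.0, 0.0, 0.4, 0.4), 3).get_doc_id_set()
```

In this sketch a leaf cell at the maximum level counts as a match as soon as it intersects the query, which mirrors the grid-approximate nature of prefix tree matching.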

AbstractVisitingPrefixTreeFilter.VNode

A visitor node/cell found via the query shape for AbstractVisitingPrefixTreeFilter.VisitorTemplate. Sometimes these are reset to a different cell (reset(cell)). It's like a LinkedList node but forms a tree.

This is a Lucene.NET INTERNAL API, use at your own risk

ContainsPrefixTreeFilter

Finds documents whose indexed shape Contains the query shape. For use on RecursivePrefixTreeStrategy.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

IntersectsPrefixTreeFilter

A Filter matching documents that have an Intersects (i.e. not DISJOINT) relationship with a provided query shape.

This is a Lucene.NET INTERNAL API, use at your own risk

PointPrefixTreeFieldCacheProvider

Implementation of ShapeFieldCacheProvider<T> designed for PrefixTreeStrategy implementations.

Note: because these strategies store shapes in a fragmented representation, this implementation can only retrieve the central Point of each original Shape.

This is a Lucene.NET INTERNAL API, use at your own risk

PrefixTreeStrategy

An abstract SpatialStrategy based on SpatialPrefixTree. The two subclasses are RecursivePrefixTreeStrategy and TermQueryPrefixTreeStrategy. This strategy is most effective as a fast approximate spatial search filter.

Characteristics:

Can index any shape; however only RecursivePrefixTreeStrategy can effectively search non-point shapes.
Can index a variable number of shapes per field value. This strategy can do it via multiple calls to CreateIndexableFields(IShape) for a document or by giving it some sort of Shape aggregate (e.g. NTS WKT MultiPoint). The shape's boundary is approximated to a grid precision.
Can query with any shape. The shape's boundary is approximated to a grid precision.
Only Intersects is supported. If only points are indexed then this is effectively equivalent to IsWithin.
The strategy supports MakeDistanceValueSource(IPoint, Double) even for multi-valued data, so long as the indexed data is all points; the behavior is undefined otherwise. However, it will likely be removed in the future in favor of another strategy with a more scalable implementation. Use of this call is the only circumstance in which a cache is used. The cache is simple, and as a result it neither scales to large numbers of points nor is friendly to real-time search.

Implementation:

The SpatialPrefixTree does most of the work, for example returning a list of terms representing grids of various sizes for a supplied shape. An important configuration item is DistErrPct, which balances shape precision against scalability. See the DistErrPct documentation for details.

This is a Lucene.NET INTERNAL API, use at your own risk
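
To make the relationship between terms, grid sizes, and DistErrPct more concrete, here is a minimal sketch under stated assumptions. It is not the Lucene.NET API: the unit-square quadtree, the "ABCD" tokens, and the functions `allowed_error`, `level_for_distance`, and `point_terms` are hypothetical, and the rule "allowed error = percentage times half the bounding-box diagonal" is a simplified stand-in for whatever the library computes. The point is only that a larger error percentage permits a coarser grid level, and that a shape is represented by cell tokens at several levels.

```python
import math

def allowed_error(bbox, dist_err_pct):
    """Assumed rule: tolerated error is a fraction of half the bbox diagonal."""
    min_x, min_y, max_x, max_y = bbox
    half_diagonal = math.hypot(max_x - min_x, max_y - min_y) / 2
    return dist_err_pct * half_diagonal

def level_for_distance(dist, max_levels):
    """Coarsest level whose cell side is no larger than dist (capped)."""
    if dist <= 0:
        return max_levels             # zero tolerance: use the finest level
    level, side = 1, 0.5
    while side > dist and level < max_levels:
        side /= 2
        level += 1
    return level

def point_terms(x, y, level):
    """Tokens of the cells containing (x, y), from level 1 down to `level`."""
    terms, token = [], ""
    x0, y0, side = 0.0, 0.0, 1.0
    for _ in range(level):
        side /= 2
        col = 1 if x >= x0 + side else 0
        row = 1 if y >= y0 + side else 0
        token += "ABCD"[row * 2 + col]
        x0 += col * side
        y0 += row * side
        terms.append(token)
    return terms

shape_bbox = (0.0, 0.0, 0.5, 0.5)
coarse = level_for_distance(allowed_error(shape_bbox, 0.5), 11)
fine = level_for_distance(allowed_error(shape_bbox, 0.025), 11)
terms = point_terms(0.3, 0.3, coarse)
```

Each extra level halves the cell side, so tightening the error percentage increases the number of levels, and with it the number of terms needed to cover a non-point shape.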

RecursivePrefixTreeStrategy

A PrefixTreeStrategy which uses AbstractVisitingPrefixTreeFilter. This strategy has support for searching non-point shapes (note: not tested). Even a query shape with distErrPct=0 (fully precise to the grid) should have good performance for typical data, unless there is a lot of indexed data coincident with the shape's edge.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

TermQueryPrefixTreeStrategy

A basic implementation of PrefixTreeStrategy using a large TermsFilter of all the cells from GetCells(IShape, Int32, Boolean, Boolean). It only supports the search of indexed Point shapes.

The precision of query shapes (DistErrPct) is an important factor in using this Strategy. If the precision is too fine, the query shape expands into many terms, which makes the query slower.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk
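
The effect of precision on term count can be sketched with a toy grid. The code below is an illustration only, not the strategy's actual cell selection: `covering_cells` is a hypothetical helper over a unit-square quadtree, and it ignores any simplification a real prefix tree might apply to cells fully inside the shape. It lists every cell at a given level that intersects a rectangular query, which is the set of terms a terms-based filter would have to match.

```python
def covering_cells(query, level, token="", bounds=(0.0, 0.0, 1.0, 1.0)):
    """Tokens of the cells at `level` that intersect rectangle `query`."""
    min_x, min_y, max_x, max_y = bounds
    q_min_x, q_min_y, q_max_x, q_max_y = query
    overlaps = (min_x < q_max_x and q_min_x < max_x and
                min_y < q_max_y and q_min_y < max_y)
    if not overlaps:
        return []
    if len(token) == level:
        return [token]
    mid_x, mid_y = (min_x + max_x) / 2, (min_y + max_y) / 2
    quadrants = {
        "A": (min_x, min_y, mid_x, mid_y),
        "B": (mid_x, min_y, max_x, mid_y),
        "C": (min_x, mid_y, mid_x, max_y),
        "D": (mid_x, mid_y, max_x, max_y),
    }
    cells = []
    for ch, sub_bounds in quadrants.items():
        cells += covering_cells(query, level, token + ch, sub_bounds)
    return cells

query = (0.1, 0.1, 0.6, 0.6)
# Number of terms needed to cover the query at levels 1 through 4.
term_counts = [len(covering_cells(query, level)) for level in range(1, 5)]
```

For this query the counts are 4, 9, 25, and 81: each additional level of precision multiplies the number of terms several times over, which is why an overly fine DistErrPct slows the query.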

WithinPrefixTreeFilter

Finds documents whose indexed shape IsWithin the query shape. It works by looking at cells outside of the query shape to ensure documents there are excluded. By default, it will examine all cells, and it's fairly slow. If you know that the indexed shapes are never composed of multiple disjoint parts (which also means the field is not multi-valued), then you can pass SpatialPrefixTree.GetDistanceForLevel(maxLevels) as the queryBuffer constructor parameter so that the filter looks only this minimal distance beyond the query shape's edge. Even if the indexed shapes are sometimes composed of multiple disjoint parts, you might want to use this option with a large buffer as a faster approximation with few false positives.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk
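
The trade-off behind queryBuffer can be shown with a small sketch. It is a simplified model, not the filter's real logic: `within_filter`, the rectangle-based index, and the use of `None` to mean "examine everything" are all assumptions for illustration, and cells that straddle the query edge are simply treated as outside. Documents are kept if they have a cell inside the query and no examined cell outside it; cells beyond the buffered query are never examined.

```python
import math

def intersects(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def rect_within(a, b):
    """True if rectangle a lies inside rectangle b."""
    return a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2] and a[3] <= b[3]

def within_filter(index, query, query_buffer=None):
    """index maps doc id -> list of cell rectangles (min_x, min_y, max_x, max_y)."""
    if query_buffer is None:
        # Assumed default for this sketch: examine every cell.
        examined = (-math.inf, -math.inf, math.inf, math.inf)
    else:
        examined = (query[0] - query_buffer, query[1] - query_buffer,
                    query[2] + query_buffer, query[3] + query_buffer)
    inside, outside = set(), set()
    for doc_id, cells in index.items():
        for cell in cells:
            if not intersects(cell, examined):
                continue              # beyond the buffer: never looked at
            if rect_within(cell, query):
                inside.add(doc_id)
            else:
                outside.add(doc_id)   # has a part outside the query shape
    return inside - outside

index = {
    1: [(0.1, 0.1, 0.2, 0.2)],                        # single part, inside
    2: [(0.1, 0.1, 0.2, 0.2), (0.8, 0.8, 0.9, 0.9)],  # two disjoint parts
    3: [(0.45, 0.1, 0.55, 0.2)],                      # straddles the query edge
}
query = (0.0, 0.0, 0.5, 0.5)
exact = within_filter(index, query)
buffered = within_filter(index, query, query_buffer=0.1)
```

With everything examined, only document 1 matches. With the small buffer, document 2 also matches because its far-away second part is never inspected, which is the kind of false positive a buffer can introduce for shapes with disjoint parts.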