Namespace Lucene.Net.QueryParsers.Flexible.Standard.Processors

Lucene Query Node Processors.

The package org.apache.lucene.queryparser.flexible.standard.processors contains every processor needed to assemble a pipeline that modifies the query node tree according to the actual Lucene queries.

These processors are already assembled correctly in the StandardQueryNodeProcessorPipeline.

Classes

AllowLeadingWildcardProcessor

This processor checks whether ALLOW_LEADING_WILDCARD is defined in the QueryConfigHandler. If it is and leading wildcards are not allowed, it looks for every WildcardQueryNode contained in the query node tree and throws an exception if any of them has a leading wildcard ('*' or '?').
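
The check described above can be sketched in a few lines. This is only a toy model in Python, not the Lucene.NET implementation: the function names has_leading_wildcard and check_wildcard_terms are hypothetical, terms are plain strings rather than WildcardQueryNode objects, and escaped wildcard characters are not considered.

```python
def has_leading_wildcard(text):
    """Return True when the term text starts with '*' or '?'."""
    # Slicing keeps this safe for an empty string.
    return text[:1] in ("*", "?")


def check_wildcard_terms(terms, allow_leading_wildcard):
    """Raise ValueError for the first term with a leading wildcard.

    Hypothetical stand-in for walking every WildcardQueryNode in the tree;
    nothing is checked when leading wildcards are allowed.
    """
    if allow_leading_wildcard:
        return
    for term in terms:
        if has_leading_wildcard(term):
            raise ValueError(f"leading wildcard is not allowed: {term!r}")
```

With allow_leading_wildcard set to False, a term such as "?ucene" is rejected, while "luc*ne" passes because its wildcard is not in the leading position.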

AnalyzerQueryNodeProcessor

This processor checks whether ANALYZER is defined in the QueryConfigHandler. If it is and the analyzer is not null, it looks for every FieldQueryNode in the query node tree that is not a WildcardQueryNode, FuzzyQueryNode or IRangeQueryNode, then applies the analyzer to that FieldQueryNode object.

If the analyzer returns only one term, the returned term is set on the FieldQueryNode and the node is returned.

If the analyzer returns more than one term, a TokenizedPhraseQueryNode or a MultiPhraseQueryNode is created, depending on whether any position holds more than one term, and it is returned.

If no term is returned by the analyzer, a NoTokenFoundQueryNode object is returned.
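
The three outcomes above can be summarized as a small decision function. This is a simplified sketch of the rules as stated here, not the actual processor: classify_analyzed_field is a hypothetical name, and the analyzer output is assumed to be a list of (term, position) pairs.

```python
def classify_analyzed_field(tokens):
    """Name the node type that the rules above would produce.

    tokens: list of (term, position) pairs, a hypothetical stand-in
    for the analyzer's output.
    """
    if not tokens:
        return "NoTokenFoundQueryNode"
    if len(tokens) == 1:
        return "FieldQueryNode"
    positions = [position for _, position in tokens]
    # Two terms sharing a position (for example synonyms) need a multi-phrase.
    if len(set(positions)) < len(positions):
        return "MultiPhraseQueryNode"
    return "TokenizedPhraseQueryNode"
```

For instance, [("quick", 0), ("fast", 0), ("fox", 1)] has two terms at position 0, so the sketch reports MultiPhraseQueryNode.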

BooleanQuery2ModifierNodeProcessor

This processor is used to apply the correct ModifierQueryNode to the children of BooleanQueryNodes. This is a variant of BooleanModifiersQueryNodeProcessor which ignores precedence.

The StandardSyntaxParser knows the rules of precedence, but Lucene does not. For example,

(A AND B OR C AND D)

is treated like

(+A +B +C +D)

This processor walks through the query node tree looking for BooleanQueryNodes. If an AndQueryNode is found, every child that is not a ModifierQueryNode, or whose ModifierQueryNode is MOD_NONE, becomes MOD_REQ. For a default BooleanQueryNode, it checks whether the default operator is AND; if it is, the same operation is applied as for an AndQueryNode. Each BooleanQueryNode whose direct parent is also a BooleanQueryNode is removed (to ignore the rules of precedence).
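
The (A AND B OR C AND D) example can be reproduced with a toy model. The sketch below is an illustration under stated assumptions, not the real processor: the tree is encoded as nested (operator, children) tuples with plain strings as terms, flatten_with_modifiers is a hypothetical name, and the default-operator case is left out.

```python
def flatten_with_modifiers(node):
    """Flatten nested boolean nodes and mark children of AND as required.

    node is either a term (str) or a tuple (op, children) with op in
    {"AND", "OR"}. This encoding is hypothetical and only for illustration.
    """
    if isinstance(node, str):
        return [node]
    op, children = node
    result = []
    for child in children:
        if isinstance(child, str):
            # A term directly under AND becomes required ("+").
            result.append("+" + child if op == "AND" else child)
        else:
            # A boolean node whose parent is also boolean is dissolved
            # into the parent, which discards the precedence grouping.
            result.extend(flatten_with_modifiers(child))
    return result
```

Applied to ("OR", [("AND", ["A", "B"]), ("AND", ["C", "D"])]), which is how a precedence-aware parser would group the example, the sketch returns ["+A", "+B", "+C", "+D"].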

BooleanSingleChildOptimizationQueryNodeProcessor

This processor removes every BooleanQueryNode that contains only one child and returns this child, unless the child is a ModifierQueryNode that was defined by the user. A modifier is not defined by the user when it is a BooleanModifierNode.

BoostQueryNodeProcessor

This processor iterates the query node tree looking for every IFieldableNode that has BOOST in its config. If one is found, the boost is applied to that IFieldableNode.

DefaultPhraseSlopQueryNodeProcessor

This processor checks whether PHRASE_SLOP is defined in the QueryConfigHandler. If it is, it looks for every TokenizedPhraseQueryNode and MultiPhraseQueryNode that does not have a SlopQueryNode applied to it, creates a SlopQueryNode and applies it to that node. The new SlopQueryNode has the same slop value as defined in the configuration.

FuzzyQueryNodeProcessor

This processor iterates the query node tree looking for every FuzzyQueryNode. When such a node is found, it looks up FUZZY_CONFIG in the query configuration, reads the fuzzy prefix length and default similarity from it, and sets them on the fuzzy node. For more information about the fuzzy prefix length, see FuzzyQuery.

GroupQueryNodeProcessor

The ISyntaxParser generates query node trees that take boolean operator precedence into account, but the current Lucene syntax does not support boolean precedence, so this processor removes all precedence and applies the equivalent modifier according to the boolean operation defined on a specific query node.

If there is a GroupQueryNode in the query node tree, the query node tree is not merged with the one above it.

LowercaseExpandedTermsQueryNodeProcessor

This processor checks whether LOWERCASE_EXPANDED_TERMS is defined in the QueryConfigHandler. If it is and the expanded terms should be lower-cased, it looks for every WildcardQueryNode, FuzzyQueryNode and child of an IRangeQueryNode and lower-cases its term.

MatchAllDocsQueryNodeProcessor

This processor converts every WildcardQueryNode that is "*:*" to MatchAllDocsQueryNode.

MultiFieldQueryNodeProcessor

This processor is used to expand terms so the query looks for the same term in different fields. It also boosts a query based on its field.

This processor looks for every IFieldableNode contained in the query node tree. If an IFieldableNode is found, it checks whether MULTI_FIELDS is defined in the QueryConfigHandler. If it is, the IFieldableNode is cloned N times and the clones are added to a BooleanQueryNode together with the original node. N is defined by the number of fields that it will be expanded to. The BooleanQueryNode is returned.
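
As a rough illustration of the expansion, the sketch below produces one (field, term) pair per configured field. It is a toy model only: expand_to_fields is a hypothetical name, the returned list stands in for the children of the resulting BooleanQueryNode, and per-field boosts are not modeled.

```python
def expand_to_fields(term, fields):
    """Return one (field, term) pair per field.

    The list is a hypothetical stand-in for the BooleanQueryNode that
    holds the original node and its clones.
    """
    return [(field, term) for field in fields]
```

For example, expanding "lucene" over the fields "title" and "body" yields [("title", "lucene"), ("body", "lucene")].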

MultiTermRewriteMethodProcessor

This processor instates the default Lucene.Net.Search.MultiTermQuery.RewriteMethod, CONSTANT_SCORE_AUTO_REWRITE_DEFAULT, for multi-term query nodes.

NumericQueryNodeProcessor

This processor is used to convert FieldQueryNodes to NumericRangeQueryNodes. It looks for NUMERIC_CONFIG set in the FieldConfig of every FieldQueryNode found. If NUMERIC_CONFIG is found, it treats that FieldQueryNode as a numeric query and converts it to a NumericRangeQueryNode whose lower and upper bounds are both inclusive and both equal to the FieldQueryNode's value, converted to a System.Object representing a .NET numeric type. This means that field:1 is converted to field:[1 TO 1].

Note that FieldQueryNodes that are children of an IRangeQueryNode are ignored.
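
The field:1 to field:[1 TO 1] conversion can be sketched as follows. This is a minimal model, not the Lucene.NET code: to_point_range and render are hypothetical names, a dict stands in for NumericRangeQueryNode, and int is assumed as the numeric parser.

```python
def to_point_range(field, text, parse=int):
    """Turn a single numeric value into an inclusive range on that value."""
    value = parse(text)
    return {
        "field": field,
        "lower": value,
        "upper": value,
        "lower_inclusive": True,
        "upper_inclusive": True,
    }


def render(range_node):
    """Format the range in query syntax; '[' and ']' mark inclusive bounds."""
    left = "[" if range_node["lower_inclusive"] else "{"
    right = "]" if range_node["upper_inclusive"] else "}"
    return (f"{range_node['field']}:{left}{range_node['lower']}"
            f" TO {range_node['upper']}{right}")
```

Calling render(to_point_range("field", "1")) gives "field:[1 TO 1]".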

NumericRangeQueryNodeProcessor

This processor is used to convert TermRangeQueryNodes to NumericRangeQueryNodes. It looks for NUMERIC_CONFIG set in the FieldConfig of every TermRangeQueryNode found. If NUMERIC_CONFIG is found, it treats that TermRangeQueryNode as a numeric range query and converts it to a NumericRangeQueryNode.

OpenRangeQueryNodeProcessor

Processes TermRangeQuerys with open ranges.

PhraseSlopQueryNodeProcessor

This processor removes invalid SlopQueryNode objects in the query node tree. A SlopQueryNode is invalid if its child is neither a TokenizedPhraseQueryNode nor a MultiPhraseQueryNode.

RemoveEmptyNonLeafQueryNodeProcessor

This processor removes every IQueryNode that is not a leaf and has no children. If, after processing the entire tree, the root node is not a leaf and has no children, a MatchNoDocsQueryNode object is returned.

This processor is used at the end of a pipeline to avoid invalid query node tree structures like a GroupQueryNode or ModifierQueryNode with no children.
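
The pruning rule can be illustrated with a toy tree. The sketch below is an assumption-laden model, not the real processor: non-leaf nodes are Python lists, leaves are strings, and the names prune, remove_empty_non_leaves and MATCH_NO_DOCS are hypothetical.

```python
MATCH_NO_DOCS = "MatchNoDocsQueryNode"  # hypothetical marker for the result


def prune(node):
    """Recursively drop non-leaf nodes (lists) that end up with no children."""
    if not isinstance(node, list):
        return node  # leaves are kept unchanged
    kept = []
    for child in node:
        pruned = prune(child)
        if isinstance(pruned, list) and not pruned:
            continue  # an empty non-leaf is removed
        kept.append(pruned)
    return kept


def remove_empty_non_leaves(root):
    """Prune the tree; an empty non-leaf root becomes MATCH_NO_DOCS."""
    pruned = prune(root)
    if isinstance(pruned, list) and not pruned:
        return MATCH_NO_DOCS
    return pruned
```

Children are pruned before their parent is inspected, so a non-leaf that only contained empty non-leaves is removed as well.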

StandardQueryNodeProcessorPipeline

This pipeline has all the processors needed to process a query node tree, generated by StandardSyntaxParser, already assembled.

The order in which they are assembled affects the results.

This processor pipeline was designed to work with StandardQueryConfigHandler.

The resulting query node tree can be used to build a Lucene.Net.Search.Query object using StandardQueryTreeBuilder.

TermRangeQueryNodeProcessor

This processor processes TermRangeQueryNodes. It reads the lower and upper bound values from the TermRangeQueryNode object and tries to parse them using a dateFormat. If the values cannot be parsed as dates, it creates the TermRangeQueryNode using the unparsed values.

If a LOCALE is defined in the QueryConfigHandler, it will be used to parse the date; otherwise System.Globalization.CultureInfo.CurrentCulture will be used.

If a DATE_RESOLUTION is defined and the Lucene.Net.Documents.DateTools.Resolution is not null, it will also be used to parse the date value.
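
The parse-or-fall-back behavior for range bounds can be sketched like this. It is a simplified illustration only: parse_bound is a hypothetical name, the "%Y-%m-%d" format is an assumption made for the example, and locale handling and date resolution are not modeled.

```python
from datetime import datetime


def parse_bound(text, date_format="%Y-%m-%d"):
    """Try to read a range bound as a date; keep the raw text on failure.

    The date format is an assumption for this sketch.
    """
    try:
        return datetime.strptime(text, date_format)
    except ValueError:
        return text  # not a date, so the unparsed value is used
```

A bound such as "2020-01-31" comes back as a datetime, while "apple" is returned unchanged.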

WildcardQueryNodeProcessor

The StandardSyntaxParser creates PrefixWildcardQueryNode nodes whose values contain the prefixed wildcard. However, a Lucene PrefixQuery cannot contain the prefixed wildcard, so this processor removes the prefixed wildcard from the PrefixWildcardQueryNode value.
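
As a small illustration, removing the wildcard from a prefix term amounts to dropping the trailing '*'. The function name strip_prefix_wildcard is hypothetical, and the sketch assumes the wildcard is a single '*' at the end of the value.

```python
def strip_prefix_wildcard(text):
    """Drop one trailing '*' so the value can serve as a plain prefix."""
    return text[:-1] if text.endswith("*") else text
```

For example, "luc*" becomes "luc", and a value without a trailing '*' is left as it is.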