Class RegExp
Regular Expression extension to Automaton.
Regular expressions are built from the following abstract syntax:
| Nonterminal | | Production | Meaning | Flag |
|---|---|---|---|---|
| regexp | ::= | unionexp | | |
| | \| | | | |
| unionexp | ::= | interexp \| unionexp | (union) | |
| | \| | interexp | | |
| interexp | ::= | concatexp & interexp | (intersection) | [OPTIONAL] |
| | \| | concatexp | | |
| concatexp | ::= | repeatexp concatexp | (concatenation) | |
| | \| | repeatexp | | |
| repeatexp | ::= | repeatexp ? | (zero or one occurrence) | |
| | \| | repeatexp * | (zero or more occurrences) | |
| | \| | repeatexp + | (one or more occurrences) | |
| | \| | repeatexp {n} | (n occurrences) | |
| | \| | repeatexp {n,} | (n or more occurrences) | |
| | \| | repeatexp {n,m} | (n to m occurrences, including both) | |
| | \| | complexp | | |
| complexp | ::= | ~ complexp | (complement) | [OPTIONAL] |
| | \| | charclassexp | | |
| charclassexp | ::= | [ charclasses ] | (character class) | |
| | \| | [^ charclasses ] | (negated character class) | |
| | \| | simpleexp | | |
| charclasses | ::= | charclass charclasses | | |
| | \| | charclass | | |
| charclass | ::= | charexp - charexp | (character range, including end-points) | |
| | \| | charexp | | |
| simpleexp | ::= | charexp | | |
| | \| | . | (any single character) | |
| | \| | # | (the empty language) | [OPTIONAL] |
| | \| | @ | (any string) | [OPTIONAL] |
| | \| | " <Unicode string without double-quotes> " | (a string) | |
| | \| | ( ) | (the empty string) | |
| | \| | ( unionexp ) | (precedence override) | |
| | \| | < <identifier> > | (named automaton) | [OPTIONAL] |
| | \| | <n-m> | (numerical interval) | [OPTIONAL] |
| charexp | ::= | <Unicode character> | (a single non-reserved character) | |
| | \| | \ <Unicode character> | (a single character) | |
The productions marked [OPTIONAL] are only allowed if specified by the syntax flags passed to the RegExp constructor. The reserved characters used in the (enabled) syntax must be escaped with backslash (\) or double-quotes ("..."). (In contrast to other regexp syntaxes, this is also required in character classes.) Be aware that dash (-) has a special meaning in charclass expressions. An identifier is a string not containing right angle bracket (>) or dash (-). Numerical intervals are specified by non-negative decimal integers and include both end points, and if n and m have the same number of digits, then the conforming strings must have that length (i.e. shorter numbers are prefixed by 0's).
Note
This API is experimental and might change in incompatible ways in the next release.
Namespace: Lucene.Net.Util.Automaton
Assembly: Lucene.Net.dll
Syntax
public class RegExp
Constructors
RegExp(String)
Constructs new RegExp from a string. Same as RegExp(s, RegExpSyntax.ALL).
Declaration
public RegExp(string s)
Parameters
Type | Name | Description |
---|---|---|
System.String | s | Regexp string. |
Exceptions
Type | Condition |
---|---|
System.ArgumentException | If an error occurred while parsing the regular expression. |
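As a minimal sketch of this constructor and of the escaping rules from the class description, the example below compiles a few patterns and checks whether a whole input string is accepted. The helper name `Accepts` is hypothetical, and the use of `BasicOperations.Run` from the same namespace to test acceptance is an assumption about the surrounding library, not something stated on this page.

```csharp
using System;
using Lucene.Net.Util.Automaton;

public static class RegExpBasics
{
    // Hypothetical helper: compiles the pattern with the default syntax
    // (RegExpSyntax.ALL) and tests whether the whole input is accepted.
    public static bool Accepts(string pattern, string input)
    {
        Automaton automaton = new RegExp(pattern).ToAutomaton();
        // Assumption: BasicOperations.Run(Automaton, string) tests acceptance.
        return BasicOperations.Run(automaton, input);
    }

    public static void Main()
    {
        Console.WriteLine(Accepts("ab+", "abbb"));      // the entire input must match
        Console.WriteLine(Accepts("a.b", "axb"));       // '.' stands for any single character
        Console.WriteLine(Accepts(@"a\.b", "axb"));     // backslash makes '.' a literal
        Console.WriteLine(Accepts("\"a.b\"", "a.b"));   // double-quotes make the string literal

        try
        {
            new RegExp("(ab");                          // unbalanced parenthesis
        }
        catch (ArgumentException e)
        {
            Console.WriteLine("Parse error: " + e.Message);
        }
    }
}
```

Acceptance here is whole-string: there is no implicit substring search, so anchors such as `^` and `$` are not needed.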
RegExp(String, RegExpSyntax)
Constructs new RegExp from a string.
Declaration
public RegExp(string s, RegExpSyntax syntax_flags)
Parameters
Type | Name | Description |
---|---|---|
System.String | s | Regexp string. |
RegExpSyntax | syntax_flags | Bitwise 'or' of the optional RegExpSyntax constructs to be enabled. |
Exceptions
Type | Condition |
---|---|
System.ArgumentException | If an error occurred while parsing the regular expression. |
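The following sketch shows how the syntax flags might be used to enable only selected [OPTIONAL] productions, using the numerical interval as the example. The enum member names `INTERVAL` and `COMPLEMENT` are assumed to follow the same naming as `RegExpSyntax.ALL`; the helper `Accepts` and the use of `BasicOperations.Run` are likewise assumptions made for illustration.

```csharp
using System;
using Lucene.Net.Util.Automaton;

public static class RegExpFlagsDemo
{
    // Hypothetical helper: compiles the pattern with the given flags and
    // tests whether the whole input is accepted.
    public static bool Accepts(string pattern, RegExpSyntax flags, string input)
    {
        Automaton automaton = new RegExp(pattern, flags).ToAutomaton();
        return BasicOperations.Run(automaton, input);
    }

    public static void Main()
    {
        // Enable only the numerical-interval production <n-m>.
        RegExpSyntax intervalOnly = RegExpSyntax.INTERVAL;

        // Both bounds have two digits, so conforming strings must have two digits.
        Console.WriteLine(Accepts("<01-12>", intervalOnly, "07"));
        Console.WriteLine(Accepts("<01-12>", intervalOnly, "7"));
        Console.WriteLine(Accepts("<01-12>", intervalOnly, "13"));

        // Flags are combined with bitwise 'or'.
        RegExpSyntax both = RegExpSyntax.INTERVAL | RegExpSyntax.COMPLEMENT;
        Console.WriteLine(Accepts("~(<01-12>)", both, "13"));
    }
}
```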
Methods
GetIdentifiers()
Returns the set of automaton identifiers that occur in this regular expression.
Declaration
public virtual ISet<string> GetIdentifiers()
Returns
Type | Description |
---|---|
System.Collections.Generic.ISet<System.String> |
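A short sketch of how GetIdentifiers() might be used to discover which named automata a pattern needs before building it. The pattern and the helper name `NamesIn` are made up for illustration; the iteration order of the returned set should not be relied on.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Util.Automaton;

public static class IdentifierDemo
{
    // Hypothetical helper: parses the pattern with the default syntax
    // (named automata enabled) and returns the identifiers it references.
    public static ISet<string> NamesIn(string pattern)
    {
        return new RegExp(pattern).GetIdentifiers();
    }

    public static void Main()
    {
        // The pattern references two named automata: "letter" and "digit".
        foreach (string name in NamesIn("<letter>(<letter>|<digit>)*"))
        {
            Console.WriteLine(name);
        }
    }
}
```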
SetAllowMutate(Boolean)
Sets or resets the allow-mutate flag. If this flag is set, then automata construction uses mutable automata, which is slightly faster but not thread safe. By default, the flag is not set.
Declaration
public virtual bool SetAllowMutate(bool flag)
Parameters
Type | Name | Description |
---|---|---|
System.Boolean | flag | If true, the flag is set. |
Returns
Type | Description |
---|---|
System.Boolean | Previous value of the flag. |
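Because the method returns the previous value of the flag, one way to use it is to enable mutation only around a single construction and then restore the earlier setting. This is a sketch of that pattern under the assumption that no other thread builds automata at the same time; the helper name `BuildWithMutation` is hypothetical.

```csharp
using System;
using Lucene.Net.Util.Automaton;

public static class AllowMutateDemo
{
    // Hypothetical helper: builds the automaton with the allow-mutate flag
    // set, then restores whatever value the flag had before.
    public static Automaton BuildWithMutation(string pattern)
    {
        RegExp regExp = new RegExp(pattern);
        bool previous = regExp.SetAllowMutate(true);
        try
        {
            return regExp.ToAutomaton();
        }
        finally
        {
            // Restore the earlier setting even if construction throws.
            regExp.SetAllowMutate(previous);
        }
    }

    public static void Main()
    {
        Automaton automaton = BuildWithMutation("(ab|cd)*e");
        Console.WriteLine(BasicOperations.Run(automaton, "abcde"));
    }
}
```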
ToAutomaton()
Constructs new Automaton from this RegExp. Same as ToAutomaton(null) (empty automaton map).
Declaration
public virtual Automaton ToAutomaton()
Returns
Type | Description |
---|---|
Automaton |
ToAutomaton(IAutomatonProvider)
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Declaration
public virtual Automaton ToAutomaton(IAutomatonProvider automaton_provider)
Parameters
Type | Name | Description |
---|---|---|
IAutomatonProvider | automaton_provider | Provider of automata for named identifiers. |
Returns
Type | Description |
---|---|
Automaton |
Exceptions
Type | Condition |
---|---|
System.ArgumentException | If this regular expression uses a named identifier that is not available from the automaton provider. |
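The sketch below shows one way a provider could resolve a named identifier. It assumes that `IAutomatonProvider` declares a single method `GetAutomaton(string name)` and that returning null is how a provider reports an unknown identifier; both are assumptions, as is the use of `BasicAutomata.MakeCharRange` to build the digit automaton. The class names are hypothetical.

```csharp
using System;
using Lucene.Net.Util.Automaton;

// Hypothetical provider that resolves the single identifier "digit".
public sealed class DigitProvider : IAutomatonProvider
{
    // Assumption: this is the only member the interface requires.
    public Automaton GetAutomaton(string name)
    {
        // Assumption: null means "identifier not available".
        return name == "digit" ? BasicAutomata.MakeCharRange('0', '9') : null;
    }
}

public static class ProviderDemo
{
    // Accepts exactly four characters, each taken from the "digit" automaton.
    public static bool IsFourDigits(string input)
    {
        Automaton automaton = new RegExp("<digit>{4}").ToAutomaton(new DigitProvider());
        return BasicOperations.Run(automaton, input);
    }

    public static void Main()
    {
        Console.WriteLine(IsFourDigits("2024"));
        Console.WriteLine(IsFourDigits("20a4"));
    }
}
```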
ToAutomaton(IDictionary<String, Automaton>)
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Declaration
public virtual Automaton ToAutomaton(IDictionary<string, Automaton> automata)
Parameters
Type | Name | Description |
---|---|---|
System.Collections.Generic.IDictionary<System.String, Automaton> | automata | A map from automaton identifiers to automata (of type Automaton). |
Returns
Type | Description |
---|---|
Automaton |
Exceptions
Type | Condition |
---|---|
System.ArgumentException | If this regular expression uses a named identifier that does not occur in the automaton map. |
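For a fixed set of named automata, a dictionary can play the same role as a provider. The following sketch assumes `BasicAutomata.MakeCharRange` and `BasicOperations.Run` exist with the signatures used here; the identifier names and the helper `Build` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Util.Automaton;

public static class AutomatonMapDemo
{
    // Hypothetical helper: resolves <upper> and <digit> from a fixed map.
    public static Automaton Build(string pattern)
    {
        var automata = new Dictionary<string, Automaton>
        {
            ["upper"] = BasicAutomata.MakeCharRange('A', 'Z'),
            ["digit"] = BasicAutomata.MakeCharRange('0', '9'),
        };
        return new RegExp(pattern).ToAutomaton(automata);
    }

    public static void Main()
    {
        // Two upper-case letters followed by one or more digits.
        Automaton code = Build("<upper>{2}<digit>+");
        Console.WriteLine(BasicOperations.Run(code, "AB123"));

        try
        {
            Build("<lower>+");   // "lower" is not in the map
        }
        catch (ArgumentException e)
        {
            Console.WriteLine("Unknown identifier: " + e.Message);
        }
    }
}
```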
ToString()
Constructs string from parsed regular expression.
Declaration
public override string ToString()
Returns
Type | Description |
---|---|
System.String |
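The returned string is built from the parsed expression, so it is not necessarily identical to the input pattern. The sketch below prints it and checks, for one input, that re-parsing the string gives the same answer as the original. That the output can always be parsed again is an assumption here, not something this page states; the helper name `RoundTripAgrees` is hypothetical.

```csharp
using System;
using Lucene.Net.Util.Automaton;

public static class ToStringDemo
{
    // Hypothetical check: does the re-parsed ToString() output agree with
    // the original pattern on the given input?
    public static bool RoundTripAgrees(string pattern, string input)
    {
        RegExp original = new RegExp(pattern);
        RegExp reparsed = new RegExp(original.ToString());
        return BasicOperations.Run(original.ToAutomaton(), input)
            == BasicOperations.Run(reparsed.ToAutomaton(), input);
    }

    public static void Main()
    {
        // The printed form may add escapes, parentheses, or explicit repeat counts.
        Console.WriteLine(new RegExp("ab+c").ToString());
        Console.WriteLine(RoundTripAgrees("ab+c", "abbc"));
    }
}
```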