Class PatternTokenizer
This tokenizer uses regex pattern matching to construct distinct tokens for the input stream. It takes two arguments: "pattern" and "group".
- "pattern" is the regular expression.
- "group" says which group to extract into tokens.
group=-1 (the default) is equivalent to "split". In this case, the tokens will be equivalent to the output (without empty tokens) of: System.Text.RegularExpressions.Regex.Split(System.String)
Using group >= 0 selects the matching group as the token.  For example, if you have:
 pattern = \'([^\']+)\'
 group = 0
 input = aaa 'bbb' 'ccc'
the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be: bbb and ccc (no ' marks).
NOTE: This Lucene.Net.Analysis.Tokenizer does not output tokens that are of zero length.
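The group semantics described above can be approximated with plain .NET regular expressions. The following is a minimal sketch, not the tokenizer's actual implementation: the class and method names (`GroupSemanticsSketch`, `SplitTokens`, `GroupTokens`) are hypothetical, and the split case uses a pattern without capturing groups because `Regex.Split` also returns captured text when the pattern contains them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper that mimics the documented behavior with System.Text.RegularExpressions only.
public static class GroupSemanticsSketch
{
    // Approximates group = -1: split on the pattern and drop zero-length pieces.
    public static List<string> SplitTokens(string input, Regex pattern) =>
        pattern.Split(input).Where(t => t.Length > 0).ToList();

    // Approximates group >= 0: take the requested group of every match,
    // again dropping zero-length values.
    public static List<string> GroupTokens(string input, Regex pattern, int group) =>
        pattern.Matches(input)
               .Cast<Match>()
               .Select(m => m.Groups[group].Value)
               .Where(t => t.Length > 0)
               .ToList();

    public static void Main()
    {
        var quoted = new Regex(@"'([^']+)'");
        const string input = "aaa 'bbb' 'ccc'";

        Console.WriteLine(string.Join(" | ", GroupTokens(input, quoted, 0))); // 'bbb' | 'ccc'
        Console.WriteLine(string.Join(" | ", GroupTokens(input, quoted, 1))); // bbb | ccc

        // Split mode with a group-free delimiter pattern; the empty piece between ",," is dropped.
        var comma = new Regex(@"\s*,\s*");
        Console.WriteLine(string.Join(" | ", SplitTokens("a, b,,c", comma))); // a | b | c
    }
}
```

The filter on `t.Length > 0` mirrors the note that zero-length tokens are never emitted.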
Namespace: Lucene.Net.Analysis.Pattern
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class PatternTokenizer : Tokenizer, IDisposable
Constructors
PatternTokenizer(AttributeSource.AttributeFactory, TextReader, Regex, Int32)
Creates a new PatternTokenizer returning tokens from group (-1 for split functionality).
Declaration
public PatternTokenizer(AttributeSource.AttributeFactory factory, TextReader input, Regex pattern, int group)
Parameters
| Type | Name | Description | 
|---|---|---|
| Lucene.Net.Util.AttributeSource.AttributeFactory | factory | |
| System.IO.TextReader | input | |
| System.Text.RegularExpressions.Regex | pattern | |
| System.Int32 | group | |
PatternTokenizer(TextReader, Regex, Int32)
Creates a new PatternTokenizer returning tokens from group (-1 for split functionality).
Declaration
public PatternTokenizer(TextReader input, Regex pattern, int group)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.IO.TextReader | input | |
| System.Text.RegularExpressions.Regex | pattern | |
| System.Int32 | group | |
Methods
End()
Declaration
public override void End()
Overrides
IncrementToken()
Declaration
public override bool IncrementToken()
Returns
| Type | Description | 
|---|---|
| System.Boolean | |
Overrides
Reset()
Declaration
public override void Reset()
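A typical way to consume the tokenizer combines the constructor and the three methods listed above. This is a hedged sketch rather than an official usage pattern: it assumes the Lucene.Net.Analysis.Common package is referenced, that `ICharTermAttribute` lives in the `Lucene.Net.Analysis.TokenAttributes` namespace, and that the attribute is obtained through `AddAttribute<T>()`. The `PatternTokenizerUsageSketch` class and its `Tokenize` helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using Lucene.Net.Analysis.Pattern;
using Lucene.Net.Analysis.TokenAttributes; // assumed namespace for ICharTermAttribute

public static class PatternTokenizerUsageSketch
{
    // Hypothetical helper: collects every token the tokenizer emits for the given text.
    public static List<string> Tokenize(string text, Regex pattern, int group)
    {
        var tokens = new List<string>();

        // The class implements IDisposable, so a using block releases the reader.
        using (var tokenizer = new PatternTokenizer(new StringReader(text), pattern, group))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                  // must be called before the first IncrementToken()
            while (tokenizer.IncrementToken())  // returns false once the input is exhausted
            {
                tokens.Add(term.ToString());
            }
            tokenizer.End();                    // signals end-of-stream after the last token
        }

        return tokens;
    }

    public static void Main()
    {
        // Same pattern and input as the class description, using group = 1.
        foreach (string token in Tokenize("aaa 'bbb' 'ccc'", new Regex(@"'([^']+)'"), 1))
        {
            Console.WriteLine(token);
        }
    }
}
```

According to the class description, this input with group=1 should yield the tokens bbb and ccc.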