Class WhitespaceTokenizer
A WhitespaceTokenizer is a tokenizer that divides text at whitespace. Adjacent sequences of non-whitespace characters form tokens.
You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating WhitespaceTokenizer:
- As of 3.1, CharTokenizer uses an int-based API to normalize and detect token characters. See IsTokenChar(int) and Normalize(int) for details.
Namespace: Lucene.Net.Analysis.Core
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class WhitespaceTokenizer : CharTokenizer, IDisposable
Constructors
WhitespaceTokenizer(LuceneVersion, AttributeFactory, TextReader)
Construct a new WhitespaceTokenizer using a given Lucene.Net.Util.AttributeSource.AttributeFactory.
Declaration
public WhitespaceTokenizer(LuceneVersion matchVersion, AttributeSource.AttributeFactory factory, TextReader @in)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene.Net.Util.LuceneVersion to match |
AttributeSource.AttributeFactory | factory | the attribute factory to use for this Lucene.Net.Analysis.Tokenizer |
TextReader | in | the input to split up into tokens |
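As a minimal sketch of calling this overload, the helper below passes an explicit attribute factory. The class and method names (`FactoryConstructorExample`, `Create`) are hypothetical, and the sketch assumes that `LuceneVersion.LUCENE_48` and `AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY` are available in the referenced Lucene.Net assemblies.

```csharp
using System.IO;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Util;

public static class FactoryConstructorExample
{
    // Hypothetical helper: builds a tokenizer over the given reader with an
    // explicitly supplied attribute factory.
    public static WhitespaceTokenizer Create(TextReader input)
    {
        // Assumption: the default factory is used here only for illustration;
        // a custom AttributeSource.AttributeFactory could be passed instead.
        AttributeSource.AttributeFactory factory =
            AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY;

        return new WhitespaceTokenizer(LuceneVersion.LUCENE_48, factory, input);
    }
}
```

Because WhitespaceTokenizer implements IDisposable, the caller is expected to dispose the returned instance when finished.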
WhitespaceTokenizer(LuceneVersion, TextReader)
Construct a new WhitespaceTokenizer.
Declaration
public WhitespaceTokenizer(LuceneVersion matchVersion, TextReader @in)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene.Net.Util.LuceneVersion to match |
TextReader | in | the input to split up into tokens |
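The following sketch shows one way to consume the tokens produced by a tokenizer built with this constructor. The wrapper names (`WhitespaceTokenizerExample`, `Tokenize`) are hypothetical. It assumes `LuceneVersion.LUCENE_48` as the match version and the usual token stream workflow of Reset, IncrementToken, and End, with the term text read through `ICharTermAttribute`.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class WhitespaceTokenizerExample
{
    // Hypothetical helper: returns every token emitted for the given text.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();

        // The tokenizer is IDisposable, so wrap it in a using block.
        using (var tokenizer = new WhitespaceTokenizer(
            LuceneVersion.LUCENE_48, new StringReader(text)))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                 // must be called before the first IncrementToken()
            while (tokenizer.IncrementToken()) // advances to the next token, false at end of input
            {
                tokens.Add(term.ToString());
            }
            tokenizer.End();                   // signals that the stream has been fully consumed
        }

        return tokens;
    }
}
```

Runs of spaces, tabs, or newlines between words act as a single separator, so no empty tokens are produced.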
Methods
IsTokenChar(int)
Collects only characters that do not satisfy IsWhiteSpace(char). Whitespace characters therefore act as token separators and never appear in a token.
Declaration
protected override bool IsTokenChar(int c)
Parameters
Type | Name | Description |
---|---|---|
int | c | the character to test, passed as an int |
Returns
Type | Description |
---|---|
bool | true if c is not a whitespace character; otherwise, false |
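To illustrate how such a predicate drives tokenization, here is a self-contained sketch in plain C# that does not use Lucene.Net. The names `WhitespaceSplitSketch`, `IsTokenChar`, and `Split` are hypothetical, and the code is not the library's implementation: it ignores offsets, buffering, and other details that CharTokenizer handles, and it assumes that `char.IsWhiteSpace` is the whitespace test referred to above.

```csharp
using System.Collections.Generic;
using System.Text;

public static class WhitespaceSplitSketch
{
    // Illustrative predicate: a character belongs to a token when it is not whitespace.
    // Values above char.MaxValue are treated as token characters, since no
    // Unicode whitespace character lies outside the Basic Multilingual Plane.
    public static bool IsTokenChar(int c)
    {
        return c > char.MaxValue || !char.IsWhiteSpace((char)c);
    }

    // Groups adjacent token characters into tokens; whitespace ends the current token.
    public static List<string> Split(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();

        foreach (char ch in text)
        {
            if (IsTokenChar(ch))
            {
                current.Append(ch);
            }
            else if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Clear();
            }
        }

        // Flush the final token if the text does not end in whitespace.
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }

        return tokens;
    }
}
```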