Class LowerCaseTokenizer
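LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts the resulting tokens to lower case. While it is functionally equivalent to the combination of LetterTokenizer and LowerCaseFilter, there is a performance advantage to doing the two tasks at once, hence this (redundant) implementation.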
Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces.
You must specify the required LuceneVersion compatibility when creating LowerCaseTokenizer:
- As of 3.1, CharTokenizer uses an int based API to normalize and detect token characters. See IsTokenChar(Int32) and Normalize(Int32) for details.
Namespace: Lucene.Net.Analysis.Core
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class LowerCaseTokenizer : LetterTokenizer, IDisposable
Constructors
LowerCaseTokenizer(LuceneVersion, AttributeSource.AttributeFactory, TextReader)
Construct a new LowerCaseTokenizer using the given AttributeSource.AttributeFactory.
Declaration
public LowerCaseTokenizer(LuceneVersion matchVersion, AttributeSource.AttributeFactory factory, TextReader in)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | LuceneVersion to match |
AttributeSource.AttributeFactory | factory | the attribute factory to use for this Tokenizer |
TextReader | in | the input to split up into tokens |
LowerCaseTokenizer(LuceneVersion, TextReader)
Construct a new LowerCaseTokenizer.
Declaration
public LowerCaseTokenizer(LuceneVersion matchVersion, TextReader in)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | LuceneVersion to match |
TextReader | in | the input to split up into tokens |
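As a minimal usage sketch of the two-argument constructor: the example below assumes the Lucene.Net and Lucene.Net.Analysis.Common 4.8 packages are referenced and that `LuceneVersion.LUCENE_48` is the compatibility version you want. The class name `LowerCaseTokenizerDemo` and the sample input are made up for illustration.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical demo class; not part of Lucene.NET.
public static class LowerCaseTokenizerDemo
{
    public static void Main()
    {
        using (var reader = new StringReader("Hello World"))
        using (var tokenizer = new LowerCaseTokenizer(LuceneVersion.LUCENE_48, reader))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            // Reset() must be called before the first IncrementToken().
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                // Each token is a run of letters, already lower-cased.
                Console.WriteLine(term.ToString());
            }
            tokenizer.End();
        }
    }
}
```

The `using` blocks rely on the tokenizer implementing IDisposable, as shown in the class syntax above.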
Methods
Normalize(Int32)
Converts char to lower case
Declaration
protected override int Normalize(int c)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | c | |
Returns
Type | Description |
---|---|
System.Int32 | |
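To illustrate what an int based normalization step looks like, here is a standalone sketch that lower-cases a single Unicode code point using only the .NET base class library. It is not the library's implementation: `NormalizeSketch` and `ToLowerCodePoint` are hypothetical names, and the use of invariant-culture lower-casing is an assumption made for the example.

```csharp
using System;

// Hypothetical helper; illustrates the idea of normalizing an int code point.
public static class NormalizeSketch
{
    public static int ToLowerCodePoint(int codePoint)
    {
        // char.ConvertFromUtf32 rejects lone surrogates, so pass them through unchanged.
        if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
        {
            return codePoint;
        }

        // Convert the code point to a string (one or two UTF-16 units),
        // lower-case it, and read the first code point back.
        string text = char.ConvertFromUtf32(codePoint);
        string lower = text.ToLowerInvariant();
        return char.ConvertToUtf32(lower, 0);
    }

    public static void Main()
    {
        int lower = ToLowerCodePoint('A');
        Console.WriteLine(char.ConvertFromUtf32(lower)); // prints "a"
    }
}
```

Working with an `int` rather than a `char` lets the same method signature cover characters outside the Basic Multilingual Plane, which do not fit in a single UTF-16 code unit.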