Class LowerCaseTokenizer

LowerCaseTokenizer performs the functions of LetterTokenizer and LowerCaseFilter together. It divides text at non-letters and converts the resulting tokens to lower case. Although it is functionally equivalent to the combination of LetterTokenizer and LowerCaseFilter, doing the two tasks in a single pass has a performance advantage, hence this (redundant) implementation.

Note: this works reasonably well for most European languages, but poorly for some Asian languages, where words are not separated by spaces.
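The behavior described above can be approximated without Lucene.NET at all. The following is a minimal sketch, not the library's implementation: the class `LowerCaseTokenizerSketch` and its `Tokenize` helper are hypothetical names introduced here for illustration, and the sketch ignores details such as offsets and any maximum token length the real tokenizer may enforce. It also shows why text without spaces between words comes out as one long token.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration only; this is NOT the Lucene.NET implementation.
public static class LowerCaseTokenizerSketch
{
    // Splits text at non-letters and lower-cases each resulting token.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        int i = 0;
        while (i < text.Length)
        {
            // Step by whole code points so surrogate pairs stay together.
            int width = char.IsSurrogatePair(text, i) ? 2 : 1;
            if (char.IsLetter(text, i))
            {
                current.Append(text, i, width);
            }
            else if (current.Length > 0)
            {
                // A non-letter ends the current token.
                tokens.Add(current.ToString().ToLowerInvariant());
                current.Clear();
            }
            i += width;
        }
        if (current.Length > 0)
        {
            tokens.Add(current.ToString().ToLowerInvariant());
        }
        return tokens;
    }

    public static void Main()
    {
        // Digits and punctuation are not letters, so they act as separators.
        Console.WriteLine(string.Join("|", Tokenize("Hello, World-42 fooBar"))); // hello|world|foobar

        // Every character here is a letter and there are no spaces,
        // so the whole sentence becomes a single token.
        Console.WriteLine(Tokenize("我爱北京天安门").Count); // 1
    }
}
```

The second call illustrates the limitation in the note: letter-based splitting has no way to find word boundaries when the script does not mark them.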

You must specify the required LuceneVersion compatibility when creating LowerCaseTokenizer:

As of 3.1, CharTokenizer uses an int-based (code point) API to normalize and detect token characters. See IsTokenChar(Int32) and Normalize(Int32) for details.

Inheritance

System.Object

AttributeSource

TokenStream

Tokenizer

CharTokenizer

LetterTokenizer

LowerCaseTokenizer

Implements

IDisposable

Inherited Members

LetterTokenizer.IsTokenChar(Int32)

CharTokenizer.IncrementToken()

CharTokenizer.End()

CharTokenizer.Reset()

Tokenizer.m_input

Tokenizer.Dispose(Boolean)

Tokenizer.CorrectOffset(Int32)

Tokenizer.SetReader(TextReader)

TokenStream.Dispose()

AttributeSource.GetAttributeFactory()

AttributeSource.GetAttributeClassesEnumerator()

AttributeSource.GetAttributeImplsEnumerator()

AttributeSource.AddAttributeImpl(Attribute)

AttributeSource.AddAttribute<T>()

AttributeSource.HasAttributes

AttributeSource.HasAttribute<T>()

AttributeSource.GetAttribute<T>()

AttributeSource.ClearAttributes()

AttributeSource.CaptureState()

AttributeSource.RestoreState(AttributeSource.State)

AttributeSource.GetHashCode()

AttributeSource.Equals(Object)

AttributeSource.ReflectAsString(Boolean)

AttributeSource.ReflectWith(IAttributeReflector)

AttributeSource.CloneAttributes()

AttributeSource.CopyTo(AttributeSource)

AttributeSource.ToString()

Namespace: Lucene.Net.Analysis.Core

Assembly: Lucene.Net.Analysis.Common.dll

Syntax

public sealed class LowerCaseTokenizer : LetterTokenizer, IDisposable

Constructors

LowerCaseTokenizer(LuceneVersion, AttributeSource.AttributeFactory, TextReader)

Construct a new LowerCaseTokenizer using a given AttributeSource.AttributeFactory.

Declaration

public LowerCaseTokenizer(LuceneVersion matchVersion, AttributeSource.AttributeFactory factory, TextReader @in)

Parameters

Type	Name	Description
LuceneVersion	matchVersion	LuceneVersion to match
AttributeSource.AttributeFactory	factory	the attribute factory to use for this Tokenizer
TextReader	in	the input to split up into tokens

LowerCaseTokenizer(LuceneVersion, TextReader)

Construct a new LowerCaseTokenizer.

Declaration

public LowerCaseTokenizer(LuceneVersion matchVersion, TextReader @in)

Parameters

Type	Name	Description
LuceneVersion	matchVersion	LuceneVersion to match
TextReader	in	the input to split up into tokens
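As a usage sketch for the two-argument constructor, the example below builds a tokenizer over a string and prints each token. It assumes the Lucene.Net and Lucene.Net.Analysis.Common packages (4.8 line) are referenced, that `LuceneVersion.LUCENE_48` is the desired compatibility version, and that the standard TokenStream workflow (Reset, IncrementToken, End, Dispose) applies. The class name `LowerCaseTokenizerDemo` and the sample text are made up for illustration.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Illustrative only: the class name and input text are assumptions.
public static class LowerCaseTokenizerDemo
{
    public static void Main()
    {
        using (var reader = new StringReader("Hello, World-42 fooBar"))
        using (var tokenizer = new LowerCaseTokenizer(LuceneVersion.LUCENE_48, reader))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                  // must be called before the first IncrementToken()
            while (tokenizer.IncrementToken())  // advances to the next token, false at end of input
            {
                Console.WriteLine(term.ToString());
            }
            tokenizer.End();                    // signals end-of-stream before disposal
        }
    }
}
```

The overload that also takes an AttributeSource.AttributeFactory is used the same way; it only changes how attribute instances are created.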

Methods

Normalize(Int32)

Converts the given character (code point) to lower case.

Declaration

protected override int Normalize(int c)

Parameters

Type	Name	Description
System.Int32	c

Returns

Type	Description
System.Int32

Overrides

CharTokenizer.Normalize(Int32)
