Class CapitalizationFilter

A filter that applies normal capitalization rules to tokens: it capitalizes the first letter and lowercases the rest.

This filter is particularly useful for building nice-looking facet parameters. It is not appropriate if you intend to use a prefix query.

Inheritance

object

AttributeSource

TokenStream

TokenFilter

CapitalizationFilter

Implements

IDisposable

Inherited Members

TokenFilter.End()

TokenFilter.Reset()

TokenStream.Dispose()

AttributeSource.GetAttributeFactory()

AttributeSource.GetAttributeClassesEnumerator()

AttributeSource.GetAttributeImplsEnumerator()

AttributeSource.AddAttributeImpl(Attribute)

AttributeSource.AddAttribute<T>()

AttributeSource.HasAttributes

AttributeSource.HasAttribute<T>()

AttributeSource.GetAttribute<T>()

AttributeSource.ClearAttributes()

AttributeSource.CaptureState()

AttributeSource.RestoreState(AttributeSource.State)

AttributeSource.GetHashCode()

AttributeSource.Equals(object)

AttributeSource.ReflectAsString(bool)

AttributeSource.ReflectWith(IAttributeReflector)

AttributeSource.CloneAttributes()

AttributeSource.CopyTo(AttributeSource)

AttributeSource.ToString()

object.Equals(object, object)

object.GetType()

object.ReferenceEquals(object, object)

Namespace: Lucene.Net.Analysis.Miscellaneous

Assembly: Lucene.Net.Analysis.Common.dll

Syntax

public sealed class CapitalizationFilter : TokenFilter, IDisposable

Constructors

CapitalizationFilter(TokenStream)

Creates a CapitalizationFilter with the default parameters using the invariant culture.

Calls CapitalizationFilter(in, true, null, true, null, 0, DEFAULT_MAX_WORD_COUNT, DEFAULT_MAX_TOKEN_LENGTH, null)

Declaration

public CapitalizationFilter(TokenStream @in)

Parameters

Type	Name	Description
TokenStream	in	The input token stream.
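
As a minimal sketch of this constructor in use, the helper below wraps a KeywordTokenizer (which emits the whole input as a single token) in a CapitalizationFilter with default parameters and collects the resulting terms. The class and method names (CapitalizationBasics, Capitalize) are hypothetical, and the code assumes the Lucene.Net.Analysis.Common 4.8 package is referenced.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical helper class, for illustration only.
public static class CapitalizationBasics
{
    public static List<string> Capitalize(string text)
    {
        var results = new List<string>();

        // KeywordTokenizer treats the entire input as one token.
        using (TokenStream stream = new CapitalizationFilter(
            new KeywordTokenizer(new StringReader(text))))
        {
            // Obtain the term attribute once, before consuming the stream.
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();
            while (stream.IncrementToken())
            {
                results.Add(term.ToString());
            }
            stream.End();
        }

        return results;
    }
}
```

With the defaults (onlyFirstWord is true), only the first word of a multi-word token is expected to be capitalized and the remaining words lowercased.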

CapitalizationFilter(TokenStream, bool, CharArraySet, bool, ICollection<char[]>, int, int, int)

Creates a CapitalizationFilter with the specified parameters using the invariant culture.

Declaration

public CapitalizationFilter(TokenStream @in, bool onlyFirstWord, CharArraySet keep, bool forceFirstLetter, ICollection<char[]> okPrefix, int minWordLength, int maxWordCount, int maxTokenLength)

Parameters

Type	Name	Description
TokenStream	in	The input token stream.
bool	onlyFirstWord	If true, only the first word of the token is capitalized; if false, every word is capitalized.
CharArraySet	keep	A set of words whose capitalization should be left unchanged.
bool	forceFirstLetter	Force the first letter to be capitalized even if it is in the keep list.
ICollection<char[]>	okPrefix	Do not change a word's capitalization if the word begins with an entry in this list.
int	minWordLength	How long a word needs to be for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
int	maxWordCount	If the token contains more than maxWordCount words, the capitalization is assumed to be correct.
int	maxTokenLength	The maximum length for an individual token. Tokens that exceed this length will not have the capitalization operation performed.
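
The parameters above can be combined as in the following sketch, which capitalizes every word of a title while leaving small words alone. The class name TitleCapitalizer, the keep words, and the "Mc" prefix are illustrative assumptions, not part of the library; the code assumes Lucene.Net.Analysis.Common 4.8.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper class, for illustration only.
public static class TitleCapitalizer
{
    public static string Capitalize(string text)
    {
        // Words to leave as-is (case-insensitive match).
        var keep = new CharArraySet(
            LuceneVersion.LUCENE_48, new List<string> { "of", "the" }, true);

        // Words starting with these prefixes are left untouched.
        ICollection<char[]> okPrefix = new List<char[]> { "Mc".ToCharArray() };

        using (TokenStream stream = new CapitalizationFilter(
            new KeywordTokenizer(new StringReader(text)),
            false,      // onlyFirstWord: capitalize every word
            keep,
            true,       // forceFirstLetter: capitalize the first word even if kept
            okPrefix,
            0,          // minWordLength: no minimum
            CapitalizationFilter.DEFAULT_MAX_WORD_COUNT,
            CapitalizationFilter.DEFAULT_MAX_TOKEN_LENGTH))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            string result = string.Empty;

            stream.Reset();
            if (stream.IncrementToken())
            {
                result = term.ToString();
            }
            stream.End();
            return result;
        }
    }
}
```

Passing a lowercase title such as "lord of the rings" is a reasonable way to see how the keep list interacts with onlyFirstWord set to false.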

CapitalizationFilter(TokenStream, bool, CharArraySet, bool, ICollection<char[]>, int, int, int, CultureInfo)

Creates a CapitalizationFilter with the specified parameters and the specified culture.

Declaration

public CapitalizationFilter(TokenStream @in, bool onlyFirstWord, CharArraySet keep, bool forceFirstLetter, ICollection<char[]> okPrefix, int minWordLength, int maxWordCount, int maxTokenLength, CultureInfo culture)

Parameters

Type	Name	Description
TokenStream	in	The input token stream.
bool	onlyFirstWord	If true, only the first word of the token is capitalized; if false, every word is capitalized.
CharArraySet	keep	A set of words whose capitalization should be left unchanged.
bool	forceFirstLetter	Force the first letter to be capitalized even if it is in the keep list.
ICollection<char[]>	okPrefix	Do not change a word's capitalization if the word begins with an entry in this list.
int	minWordLength	How long a word needs to be for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
int	maxWordCount	If the token contains more than maxWordCount words, the capitalization is assumed to be correct.
int	maxTokenLength	The maximum length for an individual token. Tokens that exceed this length will not have the capitalization operation performed.
CultureInfo	culture	The culture to use for the casing operation. If null, InvariantCulture will be used.

CapitalizationFilter(TokenStream, CultureInfo)

Creates a CapitalizationFilter with the default parameters and the specified culture.

Calls CapitalizationFilter(in, true, null, true, null, 0, DEFAULT_MAX_WORD_COUNT, DEFAULT_MAX_TOKEN_LENGTH, culture)

Declaration

public CapitalizationFilter(TokenStream @in, CultureInfo culture)

Parameters

Type	Name	Description
TokenStream	in	The input token stream.
CultureInfo	culture	The culture to use for the casing operation. If null, InvariantCulture will be used.
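
A culture matters when casing rules differ from the invariant ones. The sketch below passes a caller-supplied CultureInfo to this constructor; the class name CultureCapitalizer is hypothetical, and the code assumes Lucene.Net.Analysis.Common 4.8 and a runtime with culture data available (not invariant-globalization mode).

```csharp
using System.Globalization;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical helper class, for illustration only.
public static class CultureCapitalizer
{
    public static string Capitalize(string text, CultureInfo culture)
    {
        using (TokenStream stream = new CapitalizationFilter(
            new KeywordTokenizer(new StringReader(text)), culture))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            string result = string.Empty;

            stream.Reset();
            if (stream.IncrementToken())
            {
                result = term.ToString();
            }
            stream.End();
            return result;
        }
    }
}
```

For instance, calling `CultureCapitalizer.Capitalize("istanbul", new CultureInfo("tr-TR"))` would be expected to follow Turkish casing rules for the letter i, whereas `CultureInfo.InvariantCulture` would not.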

Fields

DEFAULT_MAX_TOKEN_LENGTH

The default value for maxTokenLength, used by the constructors that take default parameters.

Declaration

public static readonly int DEFAULT_MAX_TOKEN_LENGTH

Field Value

Type	Description
int

DEFAULT_MAX_WORD_COUNT

The default value for maxWordCount, used by the constructors that take default parameters.

Declaration

public static readonly int DEFAULT_MAX_WORD_COUNT

Field Value

Type	Description
int

Methods

IncrementToken()

Consumers (e.g., Lucene.Net.Index.IndexWriter) use this method to advance the stream to the next token. Implementing classes must implement this method and update the appropriate Lucene.Net.Util.IAttributes with the attributes of the next token.

The producer must make no assumptions about the attributes after the method has returned: the caller may arbitrarily change them. If the producer needs to preserve the state for subsequent calls, it can use Lucene.Net.Util.AttributeSource.CaptureState() to create a copy of the current attribute state.

This method is called for every token of a document, so an efficient implementation is crucial for good performance. To avoid calls to Lucene.Net.Util.AttributeSource.AddAttribute<T>() and Lucene.Net.Util.AttributeSource.GetAttribute<T>(), references to all Lucene.Net.Util.IAttributes that this stream uses should be retrieved during instantiation.

To ensure that filters and consumers know which attributes are available, the attributes must be added during instantiation. Filters and consumers are not required to check for availability of attributes in Lucene.Net.Analysis.TokenStream.IncrementToken().
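
The guidance above (retrieve attribute references during instantiation, then only mutate them in IncrementToken()) can be sketched with a small custom filter. FirstLetterUpperFilter is a hypothetical class written for illustration; it is not the implementation of CapitalizationFilter. The code assumes Lucene.Net 4.8, where TokenFilter exposes the wrapped stream as the protected field m_input.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical filter, for illustration only.
public sealed class FirstLetterUpperFilter : TokenFilter
{
    // Retrieved once in the constructor, not on every token.
    private readonly ICharTermAttribute termAtt;

    public FirstLetterUpperFilter(TokenStream input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        // Advance the wrapped stream; false signals end of stream.
        if (!m_input.IncrementToken())
        {
            return false;
        }

        // Modify the shared term buffer in place.
        if (termAtt.Length > 0)
        {
            char[] buffer = termAtt.Buffer;
            buffer[0] = char.ToUpperInvariant(buffer[0]);
        }
        return true;
    }
}
```

Because the attribute is added in the constructor, downstream filters and consumers can rely on it being present without checking inside IncrementToken().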

Declaration

public override bool IncrementToken()

Returns

Type	Description
bool	false for end of stream; true otherwise

Overrides

Lucene.Net.Analysis.TokenStream.IncrementToken()
