Class MockTokenizer
Tokenizer for testing.
This tokenizer is a replacement for the WHITESPACE, SIMPLE, and KEYWORD
tokenizers. If you are writing a component such as a TokenFilter, it's a good idea to test
it by wrapping this tokenizer instead, for the extra checks. This tokenizer has the following behavior:
- An internal state machine checks consumer consistency. These checks can
  be disabled with EnableChecks.
- For convenience, it optionally lowercases the terms that it outputs.
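The consumer workflow that the state machine checks can be illustrated with a minimal sketch. It assumes the Lucene.Net 4.8 namespaces (Lucene.Net.Analysis and Lucene.Net.Analysis.TokenAttributes); the MockTokenizerWorkflow class and its Tokenize helper are hypothetical names used only for illustration. A TokenFilter under test would wrap the tokenizer and be consumed the same way.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

public static class MockTokenizerWorkflow
{
    // Hypothetical helper: runs the full consumer workflow and collects the terms.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();
        // WHITESPACE splitting with lowercasing enabled.
        using (var tokenizer = new MockTokenizer(
            new StringReader(text), MockTokenizer.WHITESPACE, true))
        {
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
            tokenizer.Reset();                  // before the first IncrementToken()
            while (tokenizer.IncrementToken())  // false once the input is exhausted
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();                    // only after IncrementToken() returned false
        }                                       // Dispose() runs here, after End()
        return terms;
    }
}
```

Calling the steps out of order (for example, IncrementToken() before Reset()) is the kind of consumer mistake the checks are meant to surface.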
Inheritance
System.Object
MockTokenizer
Implements
System.IDisposable
Inherited Members
System.Object.Equals(System.Object, System.Object)
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
Assembly: Lucene.Net.TestFramework.dll
Syntax
public class MockTokenizer : Tokenizer, IDisposable
Constructors
MockTokenizer(AttributeSource.AttributeFactory, TextReader)
Calls MockTokenizer(AttributeFactory, TextReader, WHITESPACE, true).
Declaration
public MockTokenizer(AttributeSource.AttributeFactory factory, TextReader input)
Parameters
Type | Name | Description
AttributeSource.AttributeFactory | factory |
System.IO.TextReader | input |
MockTokenizer(AttributeSource.AttributeFactory, TextReader, CharacterRunAutomaton, Boolean)
Declaration
public MockTokenizer(AttributeSource.AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase)
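Parameters
Type | Name | Description
AttributeSource.AttributeFactory | factory |
System.IO.TextReader | input |
CharacterRunAutomaton | runAutomaton |
System.Boolean | lowerCase |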
MockTokenizer(AttributeSource.AttributeFactory, TextReader, CharacterRunAutomaton, Boolean, Int32)
Declaration
public MockTokenizer(AttributeSource.AttributeFactory factory, TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength)
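Parameters
Type | Name | Description
AttributeSource.AttributeFactory | factory |
System.IO.TextReader | input |
CharacterRunAutomaton | runAutomaton |
System.Boolean | lowerCase |
System.Int32 | maxTokenLength |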
MockTokenizer(TextReader)
Calls MockTokenizer(TextReader, WHITESPACE, true).
Declaration
public MockTokenizer(TextReader input)
Parameters
Type | Name | Description
System.IO.TextReader | input |
MockTokenizer(TextReader, CharacterRunAutomaton, Boolean)
Declaration
public MockTokenizer(TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase)
Parameters
Type | Name | Description
System.IO.TextReader | input |
CharacterRunAutomaton | runAutomaton |
System.Boolean | lowerCase |
MockTokenizer(TextReader, CharacterRunAutomaton, Boolean, Int32)
Declaration
public MockTokenizer(TextReader input, CharacterRunAutomaton runAutomaton, bool lowerCase, int maxTokenLength)
Parameters
Type | Name | Description
System.IO.TextReader | input |
CharacterRunAutomaton | runAutomaton |
System.Boolean | lowerCase |
System.Int32 | maxTokenLength |
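The delegation described for MockTokenizer(TextReader) can be checked with a short sketch. It assumes the Lucene.Net 4.8 namespaces; MockTokenizerDefaults, Collect, and DefaultsMatch are hypothetical names introduced for illustration only.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

public static class MockTokenizerDefaults
{
    // Hypothetical helper: consumes a tokenizer with the normal workflow.
    private static List<string> Collect(MockTokenizer tokenizer)
    {
        var terms = new List<string>();
        using (tokenizer)
        {
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();
        }
        return terms;
    }

    // Compares the one-argument constructor with the explicit
    // (WHITESPACE, lowerCase: true) form it is documented to call.
    public static bool DefaultsMatch(string text)
    {
        List<string> viaShortCtor = Collect(
            new MockTokenizer(new StringReader(text)));
        List<string> viaFullCtor = Collect(
            new MockTokenizer(new StringReader(text), MockTokenizer.WHITESPACE, true));
        return viaShortCtor.SequenceEqual(viaFullCtor);
    }
}
```

Spelling out the automaton and the lowerCase flag in a test makes its intent explicit; the shorter overload is convenient when the defaults are what the test wants.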
Fields
DEFAULT_MAX_TOKEN_LENGTH
Declaration
public static readonly int DEFAULT_MAX_TOKEN_LENGTH
Field Value
Type | Description
System.Int32 |
KEYWORD
Acts similarly to KeywordTokenizer.
TODO: Keyword returns an "empty" token for an empty reader...
Declaration
public static readonly CharacterRunAutomaton KEYWORD
Field Value
Type | Description
CharacterRunAutomaton |
SIMPLE
Declaration
public static readonly CharacterRunAutomaton SIMPLE
Field Value
Type | Description
CharacterRunAutomaton |
WHITESPACE
Declaration
public static readonly CharacterRunAutomaton WHITESPACE
Field Value
Type | Description
CharacterRunAutomaton |
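The three automaton fields are passed to the constructor to select how the input is split. The sketch below assumes the Lucene.Net 4.8 namespaces (including Lucene.Net.Util.Automaton for CharacterRunAutomaton); MockTokenizerModes and its Tokenize helper are hypothetical names. Lowercasing is turned off so that only the splitting behavior differs between runs.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util.Automaton;

public static class MockTokenizerModes
{
    // Hypothetical helper: tokenizes text with the given automaton, no lowercasing.
    public static List<string> Tokenize(string text, CharacterRunAutomaton mode)
    {
        var terms = new List<string>();
        using (var tokenizer = new MockTokenizer(new StringReader(text), mode, false))
        {
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();
        }
        return terms;
    }
}
```

For example, `MockTokenizerModes.Tokenize("Foo-Bar baz", MockTokenizer.KEYWORD)` should return the whole input as a single token, in line with the KeywordTokenizer comparison above, whereas passing MockTokenizer.WHITESPACE should split at the space.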
Properties
EnableChecks
Toggles consumer workflow checking. If your test consumes token streams normally, you
should leave this enabled.
Declaration
public virtual bool EnableChecks { get; set; }
Property Value
Type | Description
System.Boolean |
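A test whose consumer intentionally departs from the normal workflow is the case where disabling the checks makes sense. The sketch below assumes the Lucene.Net 4.8 namespaces; EarlyExitConsumer and FirstToken are hypothetical names. The consumer reads one token and disposes the stream without calling End(), which is why it opts out of checking.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

public static class EarlyExitConsumer
{
    // Hypothetical helper: reads only the first token and abandons the rest.
    public static string FirstToken(string text)
    {
        var tokenizer = new MockTokenizer(new StringReader(text));
        // This consumer skips End(), so it deliberately disables workflow checking.
        tokenizer.EnableChecks = false;
        try
        {
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
            tokenizer.Reset();
            return tokenizer.IncrementToken() ? termAtt.ToString() : null;
        }
        finally
        {
            tokenizer.Dispose();
        }
    }
}
```

Tests that consume the stream to completion and call End() before disposing should keep the default setting, so that workflow mistakes in the component under test are still caught.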
Methods
Dispose(Boolean)
Declaration
protected override void Dispose(bool disposing)
Parameters
Type | Name | Description
System.Boolean | disposing |
Overrides
End()
Declaration
public override void End()
Overrides
IncrementToken()
Declaration
public override sealed bool IncrementToken()
Returns
Type | Description
System.Boolean |
Overrides
IsTokenChar(Int32)
Declaration
protected virtual bool IsTokenChar(int c)
Parameters
Type | Name | Description
System.Int32 | c |
Returns
Type | Description
System.Boolean |
Normalize(Int32)
Declaration
protected virtual int Normalize(int c)
Parameters
Type | Name | Description
System.Int32 | c |
Returns
Type | Description
System.Int32 |
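Because IsTokenChar(Int32) and Normalize(Int32) are protected virtual, a test can subclass MockTokenizer to adjust per-character behavior. The sketch below rests on an assumption that this page does not state: that Normalize is applied to each code point before it is appended to the term. DigitMaskingTokenizer and DigitMaskingDemo are hypothetical names, and the Lucene.Net 4.8 namespaces are assumed.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical subclass: replaces ASCII digits with '#' in emitted terms.
public class DigitMaskingTokenizer : MockTokenizer
{
    public DigitMaskingTokenizer(TextReader input)
        : base(input, MockTokenizer.WHITESPACE, false)
    {
    }

    // Assumption: called once per code point that becomes part of a term.
    protected override int Normalize(int c)
    {
        return (c >= '0' && c <= '9') ? '#' : c;
    }
}

public static class DigitMaskingDemo
{
    // Hypothetical helper: consumes the subclass with the normal workflow.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();
        using (var tokenizer = new DigitMaskingTokenizer(new StringReader(text)))
        {
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();
        }
        return terms;
    }
}
```

IncrementToken() is sealed, so these hooks are the intended extension points; overriding IsTokenChar in the same way would change which characters are allowed inside a token.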
ReadChar()
Declaration
protected virtual int ReadChar()
Returns
Type | Description
System.Int32 |
ReadCodePoint()
Declaration
protected virtual int ReadCodePoint()
Returns
Type | Description
System.Int32 |
Reset()
Declaration
public override void Reset()
Overrides