Class Tokenizer
A Tokenizer is a TokenStream whose input is a TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call ClearAttributes() before setting attributes.
Implements
Inherited Members
Namespace: Lucene.Net.Analysis
Assembly: Lucene.Net.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
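The subclassing contract described above can be sketched as follows. This is a minimal illustration, not code from the library: the class name WholeInputTokenizer and its single-token behavior are assumptions made for the example, and it assumes the ICharTermAttribute interface from Lucene.Net.Analysis.TokenAttributes. It reads all of m_input and emits it as one token, calling ClearAttributes() before setting any attribute.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example class: emits the entire input as a single token.
public sealed class WholeInputTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private bool done;

    public WholeInputTokenizer(TextReader input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        if (done)
        {
            return false;
        }

        // Required by the contract: clear before setting attributes.
        ClearAttributes();

        // m_input is the protected TextReader supplied to the constructor
        // or to SetReader(TextReader).
        string text = m_input.ReadToEnd();
        termAtt.SetEmpty().Append(text);
        done = true;
        return true;
    }

    public override void Reset()
    {
        // Always call the base implementation first.
        base.Reset();
        done = false;
    }
}
```

The class is sealed because token stream implementations are generally expected to be sealed or to seal IncrementToken(); treat that as a convention to verify against the version in use.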
Constructors
Tokenizer(AttributeFactory, TextReader)
Construct a token stream processing the given input using the given AttributeSource.AttributeFactory.
Declaration
protected Tokenizer(AttributeSource.AttributeFactory factory, TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| AttributeSource.AttributeFactory | factory | |
| TextReader | input | |
Tokenizer(TextReader)
Construct a token stream processing the given input.
Declaration
protected Tokenizer(TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| TextReader | input | |
Fields
m_input
The text source for this Tokenizer.
Declaration
protected TextReader m_input
Field Value
| Type | Description |
|---|---|
| TextReader | |
Methods
CorrectOffset(int)
Returns the corrected offset. If m_input is a CharFilter subclass,
this method calls CharFilter.CorrectOffset(int); otherwise it returns currentOff unchanged.
Declaration
protected int CorrectOffset(int currentOff)
Parameters
| Type | Name | Description |
|---|---|---|
| int | currentOff | offset as seen in the output |
Returns
| Type | Description |
|---|---|
| int | corrected offset based on the input |
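As a hedged sketch of where CorrectOffset(int) is typically used, the example below passes raw offsets through it before storing them in an offset attribute. The class name OffsetAwareTokenizer is hypothetical, and the example assumes the IOffsetAttribute and ICharTermAttribute interfaces from Lucene.Net.Analysis.TokenAttributes. With a plain TextReader the offsets come back unchanged; with a CharFilter as input they are mapped back to the original text.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example class: one token covering the whole input,
// with offsets passed through CorrectOffset(int).
public sealed class OffsetAwareTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly IOffsetAttribute offsetAtt;
    private bool done;
    private int finalOffset;

    public OffsetAwareTokenizer(TextReader input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        if (done)
        {
            return false;
        }

        ClearAttributes();
        string text = m_input.ReadToEnd();
        termAtt.SetEmpty().Append(text);

        // Offsets as seen by this tokenizer are corrected before being stored.
        finalOffset = CorrectOffset(text.Length);
        offsetAtt.SetOffset(CorrectOffset(0), finalOffset);
        done = true;
        return true;
    }

    public override void End()
    {
        base.End();
        // Report the final offset once the stream is exhausted.
        offsetAtt.SetOffset(finalOffset, finalOffset);
    }

    public override void Reset()
    {
        base.Reset();
        done = false;
        finalOffset = 0;
    }
}
```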
See Also
Dispose(bool)
Releases resources associated with this stream.
If you override this method, always call base.Dispose(disposing), otherwise
some internal state will not be correctly reset (e.g., Tokenizer will
throw InvalidOperationException on reuse).
Declaration
protected override void Dispose(bool disposing)
Parameters
| Type | Name | Description |
|---|---|---|
| bool | disposing | |
Overrides
Remarks
NOTE:
The default implementation closes the input TextReader, so
be sure to call base.Dispose(disposing) when overriding this method.
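One way to honor this requirement is to wrap the subclass cleanup in try/finally so the base call always runs. The sketch below is illustrative only: ChunkTokenizer, its four-character buffer, and the IsOpen flag are hypothetical names introduced for the example, standing in for whatever extra state a real subclass holds.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example class: emits the input in chunks of up to four
// characters and tracks whether the stream is currently open.
public sealed class ChunkTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly char[] buffer = new char[4];

    // Hypothetical subclass state that must be cleaned up on dispose.
    public bool IsOpen { get; private set; }

    public ChunkTokenizer(TextReader input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        ClearAttributes();
        // A reader may return fewer characters than requested,
        // so chunks can be shorter than the buffer.
        int read = m_input.Read(buffer, 0, buffer.Length);
        if (read <= 0)
        {
            return false;
        }
        termAtt.CopyBuffer(buffer, 0, read);
        return true;
    }

    public override void Reset()
    {
        base.Reset();
        IsOpen = true;
    }

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                IsOpen = false; // release subclass state first
            }
        }
        finally
        {
            // Closes the input TextReader and resets internal state.
            base.Dispose(disposing);
        }
    }
}
```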
Reset()
This method is called by a consumer before it begins consumption using IncrementToken().
Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh. If you override this method, always call base.Reset(), otherwise
some internal state will not be correctly reset (e.g., Tokenizer will
throw InvalidOperationException on further usage).
Declaration
public override void Reset()
Overrides
SetReader(TextReader)
Expert: Sets a new reader on the Tokenizer. Typically, an analyzer (in its GetTokenStream method) will use this to reuse a previously created tokenizer.
Declaration
public void SetReader(TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| TextReader | input | |
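The reuse pattern can be sketched from the consumer's side as follows. This is an illustration under stated assumptions, not the analyzer's actual implementation: it assumes WhitespaceTokenizer and LuceneVersion.LUCENE_48 from the separate Lucene.Net.Analysis.Common package, and it assumes that, as documented above, Dispose() is the call that resets the tokenizer so that SetReader(TextReader) can be used again. The names TokenizerReuseExample and Consume are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class TokenizerReuseExample
{
    // Hypothetical helper: runs the consumer workflow
    // Reset() -> IncrementToken()* -> End() -> Dispose().
    public static List<string> Consume(Tokenizer tokenizer)
    {
        var terms = new List<string>();
        // AddAttribute returns the existing instance if one is already registered.
        ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();
        try
        {
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();
        }
        finally
        {
            tokenizer.Dispose();
        }
        return terms;
    }

    public static void Main()
    {
        Tokenizer tokenizer = new WhitespaceTokenizer(
            LuceneVersion.LUCENE_48, new StringReader("first document"));
        List<string> first = Consume(tokenizer);

        // Reuse the same instance with a new reader instead of allocating a new tokenizer.
        tokenizer.SetReader(new StringReader("second text"));
        List<string> second = Consume(tokenizer);

        Console.WriteLine(string.Join(", ", first));
        Console.WriteLine(string.Join(", ", second));
    }
}
```

Calling SetReader(TextReader) before the previous use has been disposed is expected to fail with the InvalidOperationException mentioned under Dispose(bool) and Reset(), which is why Consume always disposes in a finally block.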