Class Tokenizer
A Tokenizer is a TokenStream whose input is a System.IO.TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call ClearAttributes() before setting attributes.
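The contract above can be illustrated with a minimal sketch of a subclass. `SimpleWhitespaceTokenizer` is a hypothetical class written for this page, not part of Lucene.Net; it assumes the 4.8-era attribute API (`ICharTermAttribute`, `IOffsetAttribute` in `Lucene.Net.Analysis.TokenAttributes`) and, for brevity, omits an `End()` override that would set the final offset.

```csharp
using System.IO;
using System.Text;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example: splits the input on whitespace, one token per word.
public sealed class SimpleWhitespaceTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly IOffsetAttribute offsetAtt;
    private int position; // number of characters read from m_input so far

    public SimpleWhitespaceTokenizer(TextReader input) : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        // Required by the note above: clear attributes before setting them.
        ClearAttributes();

        var sb = new StringBuilder();
        int start = -1;
        int c;
        while ((c = m_input.Read()) != -1)
        {
            position++;
            if (char.IsWhiteSpace((char)c))
            {
                if (sb.Length > 0) break; // whitespace ends the current token
                continue;                 // skip leading whitespace
            }
            if (sb.Length == 0) start = position - 1;
            sb.Append((char)c);
        }

        if (sb.Length == 0) return false; // end of input, no more tokens

        termAtt.SetEmpty().Append(sb.ToString());
        offsetAtt.SetOffset(CorrectOffset(start), CorrectOffset(start + sb.Length));
        return true;
    }

    public override void Reset()
    {
        base.Reset(); // lets the base class make the pending reader current
        position = 0;
    }
}
```

A consumer would call Reset() once, then IncrementToken() until it returns false, then End() and Dispose().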
Implements
System.IDisposable
Namespace: Lucene.Net.Analysis
Assembly: Lucene.Net.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
Constructors
Tokenizer(AttributeSource.AttributeFactory, TextReader)
Construct a token stream processing the given input using the given AttributeSource.AttributeFactory.
Declaration
protected Tokenizer(AttributeSource.AttributeFactory factory, TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| AttributeSource.AttributeFactory | factory | |
| System.IO.TextReader | input | |
Tokenizer(TextReader)
Construct a token stream processing the given input.
Declaration
protected Tokenizer(TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| System.IO.TextReader | input | |
Fields
m_input
The text source for this Tokenizer.
Declaration
protected TextReader m_input
Field Value
| Type | Description |
|---|---|
| System.IO.TextReader | |
Methods
CorrectOffset(Int32)
Returns the corrected offset. If m_input is a CharFilter subclass,
this method calls CharFilter.CorrectOffset(Int32) on it; otherwise it returns currentOff unchanged.
Declaration
protected int CorrectOffset(int currentOff)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.Int32 | currentOff | offset as seen in the output | 
Returns
| Type | Description | 
|---|---|
| System.Int32 | corrected offset based on the input | 
Dispose(Boolean)
Releases resources associated with this stream.
If you override this method, always call base.Dispose(disposing), otherwise
some internal state will not be correctly reset (e.g., Tokenizer will
throw System.InvalidOperationException on reuse).
Declaration
protected override void Dispose(bool disposing)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Boolean | disposing | |
Overrides
TokenStream.Dispose(Boolean)
Remarks
NOTE:
The default implementation closes the input System.IO.TextReader, so
be sure to call base.Dispose(disposing) when overriding this method.
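The advice above could look like the following sketch. `BufferingTokenizer` and its `pending` queue are hypothetical names used only for illustration; the point is the `try`/`finally` shape, which ensures base.Dispose(disposing) runs even if the subclass cleanup throws.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example: a tokenizer that keeps some per-stream state.
public sealed class BufferingTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    // Assumption for this sketch: tokens are queued here by code not shown.
    private readonly Queue<string> pending = new Queue<string>();

    public BufferingTokenizer(TextReader input) : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        ClearAttributes();
        if (pending.Count == 0) return false;
        termAtt.SetEmpty().Append(pending.Dequeue());
        return true;
    }

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                pending.Clear(); // subclass-specific cleanup
            }
        }
        finally
        {
            // Always delegate: the base implementation closes the input
            // reader and resets the tokenizer's internal state.
            base.Dispose(disposing);
        }
    }
}
```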
Reset()
Declaration
public override void Reset()
Overrides
TokenStream.Reset()
SetReader(TextReader)
Expert: Sets a new reader on the Tokenizer. Typically, an analyzer (in its tokenStream method) uses this to reuse a previously created tokenizer.
Declaration
public void SetReader(TextReader input)
Parameters
| Type | Name | Description |
|---|---|---|
| System.IO.TextReader | input | |
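As a rough sketch of the reuse described for SetReader(TextReader), the code below runs one tokenizer instance over two inputs. `WholeInputTokenizer` and `ReuseDemo` are hypothetical names, and the sequence Reset(), IncrementToken(), End(), Dispose(), SetReader() is an assumption based on the API version this page documents, where Dispose() is the step that releases the previous reader; other versions may expect a different call before reuse.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example: emits the whole input as a single token.
public sealed class WholeInputTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private bool done;

    public WholeInputTokenizer(TextReader input) : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        ClearAttributes();
        if (done) return false;
        done = true;
        string text = m_input.ReadToEnd();
        if (text.Length == 0) return false;
        termAtt.SetEmpty().Append(text);
        return true;
    }

    public override void Reset()
    {
        base.Reset(); // picks up the reader passed to the constructor or SetReader
        done = false;
    }
}

public static class ReuseDemo
{
    // Consumes the stream fully, then releases the current reader.
    public static List<string> Consume(Tokenizer tokenizer)
    {
        var terms = new List<string>();
        var termAtt = tokenizer.GetAttribute<ICharTermAttribute>();
        tokenizer.Reset();
        while (tokenizer.IncrementToken())
        {
            terms.Add(termAtt.ToString());
        }
        tokenizer.End();
        tokenizer.Dispose();
        return terms;
    }

    public static List<string> Run()
    {
        var tokenizer = new WholeInputTokenizer(new StringReader("first"));
        var all = Consume(tokenizer);

        // Reuse the same instance with a new reader instead of allocating another.
        tokenizer.SetReader(new StringReader("second"));
        all.AddRange(Consume(tokenizer));
        return all;
    }
}
```

Reusing the instance avoids re-creating the tokenizer and its attributes for every document, which is the motivation the description above gives for analyzers.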