Class Tokenizer
A Tokenizer is a TokenStream whose input is a System.IO.TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call ClearAttributes() before setting attributes.
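The contract above can be sketched as a minimal subclass. This is an illustrative sketch, not part of Lucene.Net: the class name WhitespaceSplitTokenizer and its offset field are hypothetical, and it assumes the Lucene.Net 4.8 attribute interfaces in Lucene.Net.Analysis.TokenAttributes. It splits the input on whitespace and calls ClearAttributes() before setting any attribute.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example class: splits the input on whitespace.
public sealed class WhitespaceSplitTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly IOffsetAttribute offsetAtt;
    private int offset; // number of characters consumed from m_input so far

    public WhitespaceSplitTokenizer(TextReader input) : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        // Required by the note above: clear state left over from the previous token.
        ClearAttributes();

        int c;
        // Skip leading whitespace.
        while ((c = m_input.Read()) != -1 && char.IsWhiteSpace((char)c))
        {
            offset++;
        }
        if (c == -1)
        {
            return false; // end of input, no more tokens
        }

        int start = offset;
        do
        {
            termAtt.Append((char)c);
            offset++;
        } while ((c = m_input.Read()) != -1 && !char.IsWhiteSpace((char)c));

        if (c != -1)
        {
            offset++; // account for the whitespace character that ended the token
        }

        offsetAtt.SetOffset(CorrectOffset(start), CorrectOffset(start + termAtt.Length));
        return true;
    }

    public override void Reset()
    {
        base.Reset(); // lets the base class install the pending reader
        offset = 0;
    }
}
```

A consumer would call Reset(), loop on IncrementToken() while reading the attributes, then call End() and Dispose().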
Implements
Inherited Members
Namespace: Lucene.Net.Analysis
Assembly: Lucene.Net.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
Constructors
Tokenizer(AttributeSource.AttributeFactory, TextReader)
Construct a token stream processing the given input using the given AttributeSource.AttributeFactory.
Declaration
protected Tokenizer(AttributeSource.AttributeFactory factory, TextReader input)
Parameters
Type | Name | Description |
---|---|---|
AttributeSource.AttributeFactory | factory | |
System.IO.TextReader | input | |
Tokenizer(TextReader)
Construct a token stream processing the given input.
Declaration
protected Tokenizer(TextReader input)
Parameters
Type | Name | Description |
---|---|---|
System.IO.TextReader | input | |
Fields
m_input
The text source for this Tokenizer.
Declaration
protected TextReader m_input
Field Value
Type | Description |
---|---|
System.IO.TextReader | |
Methods
CorrectOffset(Int32)
Returns the corrected offset. If m_input is a CharFilter subclass, this method calls CharFilter.CorrectOffset(Int32) on it; otherwise it returns currentOff unchanged.
Declaration
protected int CorrectOffset(int currentOff)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | currentOff | offset as seen in the output |
Returns
Type | Description |
---|---|
System.Int32 | corrected offset based on the input |
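The rule described for CorrectOffset(Int32) can be restated in a few lines. The sketch below is an assumption-laden illustration rather than the library's source: OffsetProbeTokenizer, Probe, and ProbeByHand are hypothetical names introduced only to make the protected method observable from outside the class.

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Hypothetical example class used only to observe CorrectOffset.
public sealed class OffsetProbeTokenizer : Tokenizer
{
    public OffsetProbeTokenizer(TextReader input) : base(input) { }

    // Hypothetical public wrapper around the protected method.
    public int Probe(int currentOff) => CorrectOffset(currentOff);

    // Hand-written restatement of the documented rule: delegate to the
    // CharFilter when the input is one, otherwise return the offset as-is.
    public int ProbeByHand(int currentOff) =>
        m_input is CharFilter filter ? filter.CorrectOffset(currentOff) : currentOff;

    public override bool IncrementToken()
    {
        ClearAttributes();
        return false; // token logic omitted; this class only demonstrates offsets
    }
}
```

With a plain System.IO.StringReader as input (not a CharFilter), both methods return the offset they were given once Reset() has been called.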
See Also
Dispose(Boolean)
Releases resources associated with this stream.
If you override this method, always call base.Dispose(disposing); otherwise, some internal state will not be correctly reset (e.g., Tokenizer will throw System.InvalidOperationException on reuse).
Declaration
protected override void Dispose(bool disposing)
Parameters
Type | Name | Description |
---|---|---|
System.Boolean | disposing | |
Overrides
Remarks
NOTE: The default implementation closes the input System.IO.TextReader, so be sure to call base.Dispose(disposing) when overriding this method.
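One way to follow this advice is to release subclass resources inside a try block and call the base implementation in finally, so the base call happens even if the subclass cleanup throws. The sketch below is illustrative only: DictionaryBackedTokenizer, its dictionary stream, and the DictionaryReleased property are hypothetical.

```csharp
using System.IO;
using Lucene.Net.Analysis;

// Hypothetical example class that owns one extra resource besides the input reader.
public sealed class DictionaryBackedTokenizer : Tokenizer
{
    private Stream dictionary; // hypothetical extra resource owned by this tokenizer

    public DictionaryBackedTokenizer(TextReader input, Stream dictionary) : base(input)
    {
        this.dictionary = dictionary;
    }

    // Hypothetical property, present only so the cleanup can be observed.
    public bool DictionaryReleased => dictionary == null;

    public override bool IncrementToken()
    {
        ClearAttributes();
        return false; // token logic omitted; this class only demonstrates disposal
    }

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                dictionary?.Dispose();
                dictionary = null;
            }
        }
        finally
        {
            // Always let the base class close the input and reset its state.
            base.Dispose(disposing);
        }
    }
}
```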
Reset()
Declaration
public override void Reset()
Overrides
SetReader(TextReader)
Expert: Set a new reader on the Tokenizer. Typically, an analyzer (in its tokenStream method) will use this to re-use a previously created tokenizer.
Declaration
public void SetReader(TextReader input)
Parameters
Type | Name | Description |
---|---|---|
System.IO.TextReader | input | |
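The reuse pattern can be sketched as follows. This is a hedged illustration, assuming the API version documented on this page, in which Dispose() is the call that resets the tokenizer's input state before a new reader is set. WholeInputTokenizer and ReuseDemo are hypothetical names; the tokenizer simply emits its entire input as a single token.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example class: emits the whole input as one token.
public sealed class WholeInputTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private bool done;

    public WholeInputTokenizer(TextReader input) : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
    }

    public override bool IncrementToken()
    {
        ClearAttributes();
        if (done)
        {
            return false;
        }
        done = true;
        termAtt.Append(m_input.ReadToEnd());
        return termAtt.Length > 0;
    }

    public override void Reset()
    {
        base.Reset();
        done = false;
    }
}

public static class ReuseDemo
{
    // Runs one full consume cycle: Reset, IncrementToken, End, Dispose.
    private static string Consume(Tokenizer tokenizer, ICharTermAttribute term)
    {
        tokenizer.Reset();
        string text = tokenizer.IncrementToken() ? term.ToString() : string.Empty;
        tokenizer.End();
        tokenizer.Dispose(); // closes the current reader and resets input state
        return text;
    }

    public static string[] Run()
    {
        var tokenizer = new WholeInputTokenizer(new StringReader("first"));
        var term = tokenizer.GetAttribute<ICharTermAttribute>();

        string a = Consume(tokenizer, term);

        // Reuse the same tokenizer instance with a new reader.
        tokenizer.SetReader(new StringReader("second"));
        string b = Consume(tokenizer, term);

        return new[] { a, b };
    }
}
```

Calling SetReader(TextReader) before the previous cycle has been disposed is a contract violation, which is why Consume ends each cycle with Dispose().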