Class Tokenizer
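A Tokenizer is a TokenStream whose input is a TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call ClearAttributes() before setting attributes.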
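The subclassing contract can be illustrated with a minimal sketch. The class name `WholeInputTokenizer` and its single-token behavior are hypothetical and chosen only for illustration; the attribute interfaces are assumed to come from `Lucene.Net.Analysis.TokenAttributes`. It clears attributes before setting them and passes offsets through `CorrectOffset`.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes;

// Hypothetical example: emits the entire input as a single token.
// The class is sealed so that IncrementToken cannot be overridden further.
public sealed class WholeInputTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly IOffsetAttribute offsetAtt;
    private bool done;

    public WholeInputTokenizer(TextReader input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        if (done)
        {
            return false;
        }
        done = true;

        // Clear attribute state before setting anything for the new token.
        ClearAttributes();

        string text = m_input.ReadToEnd();
        if (text.Length == 0)
        {
            return false; // nothing to emit for empty input
        }

        termAtt.SetEmpty().Append(text);
        // Map offsets back to the original input in case m_input is a CharFilter.
        offsetAtt.SetOffset(CorrectOffset(0), CorrectOffset(text.Length));
        return true;
    }

    public override void Reset()
    {
        base.Reset(); // lets the base class prepare the input for reading
        done = false;
    }
}
```

A consumer would typically call `Reset()`, loop on `IncrementToken()`, then call `End()` and `Dispose()`.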
Implements
System.IDisposable
Namespace: Lucene.Net.Analysis
Assembly: Lucene.Net.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
Constructors
Tokenizer(AttributeSource.AttributeFactory, TextReader)
Construct a token stream processing the given input using the given AttributeSource.AttributeFactory.
Declaration
protected Tokenizer(AttributeSource.AttributeFactory factory, TextReader input)
Parameters
Type | Name | Description |
---|---|---|
AttributeSource.AttributeFactory | factory | |
TextReader | input | |
Tokenizer(TextReader)
Construct a token stream processing the given input.
Declaration
protected Tokenizer(TextReader input)
Parameters
Type | Name | Description |
---|---|---|
TextReader | input | |
Fields
m_input
The text source for this Tokenizer.
Declaration
protected TextReader m_input
Field Value
Type | Description |
---|---|
TextReader | |
Methods
CorrectOffset(Int32)
Return the corrected offset. If m_input is a CharFilter subclass, this method calls CharFilter.CorrectOffset(Int32); otherwise it returns currentOff.
Declaration
protected int CorrectOffset(int currentOff)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | currentOff | offset as seen in the output |
Returns
Type | Description |
---|---|
System.Int32 | corrected offset based on the input |
See Also
CharFilter.CorrectOffset(Int32)
Dispose(Boolean)
Releases resources associated with this stream.
If you override this method, always call base.Dispose(disposing); otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw InvalidOperationException on reuse).
Declaration
protected override void Dispose(bool disposing)
Parameters
Type | Name | Description |
---|---|---|
System.Boolean | disposing | |
Overrides
TokenStream.Dispose(Boolean)
Remarks
NOTE: The default implementation closes the input TextReader, so be sure to call base.Dispose(disposing) when overriding this method.
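As a rough sketch of the override pattern described above, the following hypothetical subclass (`BufferingTokenizer`, not part of Lucene.Net) clears its own state and then always delegates to the base implementation. The `try`/`finally` is one possible way to make sure `base.Dispose(disposing)` runs even if the subclass cleanup throws.

```csharp
using System.IO;
using System.Text;
using Lucene.Net.Analysis;

// Hypothetical example: a tokenizer that keeps some per-use state of its own.
public sealed class BufferingTokenizer : Tokenizer
{
    // Illustrative private state; a real tokenizer would fill it in IncrementToken.
    private readonly StringBuilder buffer = new StringBuilder();

    public BufferingTokenizer(TextReader input)
        : base(input)
    {
    }

    // Emits no tokens; the point of this sketch is the Dispose override below.
    public override bool IncrementToken() => false;

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                buffer.Clear(); // subclass-specific cleanup
            }
        }
        finally
        {
            // Always delegate so the base class can close the input
            // and reset its internal state.
            base.Dispose(disposing);
        }
    }
}
```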
Reset()
Declaration
public override void Reset()
Overrides
TokenStream.Reset()
SetReader(TextReader)
Expert: Set a new reader on the Tokenizer. Typically, an analyzer (in its GetTokenStream method) will use this to reuse a previously created tokenizer.
Declaration
public void SetReader(TextReader input)
Parameters
Type | Name | Description |
---|---|---|
TextReader | input | |
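A minimal sketch of reusing one tokenizer for two inputs follows. It assumes the `WhitespaceTokenizer` class and `LuceneVersion.LUCENE_48` constant from the separate Lucene.Net.Analysis.Common package, and it assumes the behavior documented on this page, where `Dispose` closes the current reader and a new reader can then be supplied with `SetReader`. Other Lucene.Net versions may handle reuse differently, and the helper name `ReadAll` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class TokenizerReuseExample
{
    // Hypothetical helper: consumes the tokenizer once and collects its terms.
    public static List<string> ReadAll(Tokenizer tokenizer)
    {
        var terms = new List<string>();
        ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();

        tokenizer.Reset();                  // must be called before IncrementToken
        while (tokenizer.IncrementToken())
        {
            terms.Add(termAtt.ToString());
        }
        tokenizer.End();
        tokenizer.Dispose();                // closes the current reader
        return terms;
    }

    public static void Main()
    {
        Tokenizer tokenizer = new WhitespaceTokenizer(
            LuceneVersion.LUCENE_48, new StringReader("alpha beta"));
        Console.WriteLine(string.Join(",", ReadAll(tokenizer)));

        // Reuse the same instance with a new reader instead of constructing a new tokenizer.
        tokenizer.SetReader(new StringReader("gamma delta epsilon"));
        Console.WriteLine(string.Join(",", ReadAll(tokenizer)));
    }
}
```

In typical use an Analyzer performs this reuse internally, so calling `SetReader` directly is mostly relevant when a tokenizer is driven by hand.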