Class OpenNLPTokenizer
Run OpenNLP SentenceDetector and Tokenizer.
The last token in each sentence is marked by setting the EOS_FLAG_BIT in the IFlagsAttribute.
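The end-of-sentence marking described above can be illustrated with a minimal, library-free sketch. It assumes a downstream consumer sees each token as a term plus an integer flags value and tests the end-of-sentence bit to group tokens by sentence. The names `EosFlagSketch`, `EosFlagBit`, and `GroupBySentence` are hypothetical, and the value `1` is only a placeholder. Real code would read the flags from the token stream's flags attribute and test them against `OpenNLPTokenizer.EOS_FLAG_BIT`.

```csharp
using System;
using System.Collections.Generic;

public static class EosFlagSketch
{
    // Hypothetical stand-in for OpenNLPTokenizer.EOS_FLAG_BIT.
    // The value 1 is an assumption made for this sketch only.
    public const int EosFlagBit = 1;

    // Groups (term, flags) pairs into sentences, closing the current
    // sentence whenever a token has the end-of-sentence bit set.
    public static List<List<string>> GroupBySentence(
        IEnumerable<(string Term, int Flags)> tokens)
    {
        var sentences = new List<List<string>>();
        var current = new List<string>();

        foreach (var (term, flags) in tokens)
        {
            current.Add(term);

            // Bitwise test: other bits in the flags value are ignored.
            if ((flags & EosFlagBit) != 0)
            {
                sentences.Add(current);
                current = new List<string>();
            }
        }

        // Keep trailing tokens that were never followed by an EOS mark.
        if (current.Count > 0)
        {
            sentences.Add(current);
        }

        return sentences;
    }

    public static void Main()
    {
        var tokens = new (string Term, int Flags)[]
        {
            ("Hello", 0), ("world", 0), (".", EosFlagBit),
            ("Bye", 0), (".", EosFlagBit),
        };

        foreach (var sentence in GroupBySentence(tokens))
        {
            Console.WriteLine(string.Join(" ", sentence));
        }
    }
}
```

Testing the bit with a bitwise AND rather than comparing the whole flags value keeps the check valid if other components set additional bits on the same token.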
Inheritance
System.Object
OpenNLPTokenizer
Implements
IDisposable
Inherited Members
Namespace: Lucene.Net.Analysis.OpenNlp
Assembly: Lucene.Net.Analysis.OpenNLP.dll
Syntax
public sealed class OpenNLPTokenizer : SegmentingTokenizerBase, IDisposable
Constructors
OpenNLPTokenizer(AttributeSource.AttributeFactory, TextReader, NLPSentenceDetectorOp, NLPTokenizerOp)
Declaration
public OpenNLPTokenizer(AttributeSource.AttributeFactory factory, TextReader reader, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp)
Parameters
Type | Name | Description |
---|---|---|
AttributeSource.AttributeFactory | factory | |
TextReader | reader | |
NLPSentenceDetectorOp | sentenceOp | |
NLPTokenizerOp | tokenizerOp | |
OpenNLPTokenizer(TextReader, NLPSentenceDetectorOp, NLPTokenizerOp)
Creates a new OpenNLPTokenizer.
Declaration
public OpenNLPTokenizer(TextReader reader, NLPSentenceDetectorOp sentenceOp, NLPTokenizerOp tokenizerOp)
Parameters
Type | Name | Description |
---|---|---|
TextReader | reader | |
NLPSentenceDetectorOp | sentenceOp | |
NLPTokenizerOp | tokenizerOp | |
Fields
EOS_FLAG_BIT
Declaration
public static int EOS_FLAG_BIT
Field Value
Type | Description |
---|---|
System.Int32 | |
Methods
Dispose(Boolean)
Declaration
protected override void Dispose(bool disposing)
Parameters
Type | Name | Description |
---|---|---|
System.Boolean | disposing | |
IncrementWord()
Declaration
protected override bool IncrementWord()
Returns
Type | Description |
---|---|
System.Boolean | |
Overrides
Reset()
Declaration
public override void Reset()
Overrides
SetNextSentence(Int32, Int32)
Declaration
protected override void SetNextSentence(int sentenceStart, int sentenceEnd)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | sentenceStart | |
System.Int32 | sentenceEnd | |
Overrides