Class OpenNLPLemmatizerFilter
Runs OpenNLP dictionary-based and/or MaxEnt lemmatizers.
Both a dictionary-based lemmatizer and a MaxEnt lemmatizer are supported, via the "dictionary" and "lemmatizerModel" params, respectively. If both are configured, the dictionary-based lemmatizer is tried first, and then the MaxEnt lemmatizer is consulted for out-of-vocabulary tokens.
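The lookup order described above can be sketched in isolation. This is a minimal illustration, not the filter's actual implementation: `LemmaLookupSketch`, its `Lemmatize` method, and the `maxEntFallback` delegate are hypothetical names standing in for the dictionary-based and MaxEnt lemmatizers, and returning the word unchanged when no fallback is configured is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Lucene.NET: illustrates the lookup order only.
public static class LemmaLookupSketch
{
    // Returns the dictionary lemma when (word, posTag) is known;
    // otherwise defers to the fallback delegate (a stand-in for a MaxEnt model).
    public static string Lemmatize(
        string word,
        string posTag,
        IReadOnlyDictionary<(string, string), string> dictionary,
        Func<string, string, string> maxEntFallback)
    {
        if (dictionary != null && dictionary.TryGetValue((word, posTag), out string lemma))
        {
            return lemma;
        }

        // Out-of-vocabulary token: consult the fallback if one is configured.
        // Assumption: with no fallback, the word is treated as its own lemma.
        return maxEntFallback != null ? maxEntFallback(word, posTag) : word;
    }
}
```

Passing `null` for either argument models a configuration in which only one of the two lemmatizers is set.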
The dictionary file must be encoded as UTF-8, with one entry per line, in the form word[tab]lemma[tab]part-of-speech.
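To make the file format concrete, here is a small sketch that reads such a file into memory. `LemmaDictionarySketch` is a hypothetical helper, not part of Lucene.NET or OpenNLP. The column order follows the description above and is worth checking against the dictionary you actually use; skipping blank lines and rejecting malformed lines are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Lucene.NET or OpenNLP.
public static class LemmaDictionarySketch
{
    // Parses lines of the form word<TAB>lemma<TAB>part-of-speech
    // into a map keyed by (word, part-of-speech).
    public static Dictionary<(string, string), string> Parse(TextReader reader)
    {
        var entries = new Dictionary<(string, string), string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0)
            {
                continue; // assumption: blank lines are skipped
            }

            string[] fields = line.Split('\t');
            if (fields.Length != 3)
            {
                throw new FormatException(
                    "Expected word<TAB>lemma<TAB>part-of-speech: " + line);
            }

            // fields[0] = word, fields[1] = lemma, fields[2] = part-of-speech
            entries[(fields[0], fields[2])] = fields[1];
        }
        return entries;
    }

    // Opens a dictionary file as UTF-8, as the format requires.
    public static Dictionary<(string, string), string> Load(string path)
    {
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            return Parse(reader);
        }
    }
}
```

Keying on both the word and its part-of-speech tag lets the same surface form map to different lemmas depending on how it was tagged.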
Implements
System.IDisposable
Namespace: Lucene.Net.Analysis.OpenNlp
Assembly: Lucene.Net.Analysis.OpenNLP.dll
Syntax
public class OpenNLPLemmatizerFilter : TokenFilter, IDisposable
Constructors
OpenNLPLemmatizerFilter(TokenStream, NLPLemmatizerOp)
Declaration
public OpenNLPLemmatizerFilter(TokenStream input, NLPLemmatizerOp lemmatizerOp)
Parameters
Type | Name | Description |
---|---|---|
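TokenStream | input | The token stream whose tokens are lemmatized |
NLPLemmatizerOp | lemmatizerOp | The lemmatizer operation that runs the configured dictionary-based and/or MaxEnt lemmatizer |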
Methods
IncrementToken()
Declaration
public override sealed bool IncrementToken()
Returns
Type | Description |
---|---|
System.Boolean | true if the stream advanced to a new token; false once the stream is exhausted |
Overrides
TokenStream.IncrementToken()
Reset()
Declaration
public override void Reset()
Overrides
TokenFilter.Reset()