Class HMMChineseTokenizerFactory
Factory for HMMChineseTokenizer.
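Note: this class currently emits tokens for punctuation, so you should either add a WordDelimiterFilter after it to remove them (with concatenate off), or use the SmartChinese stoplist with a StopFilterFactory via:
words="org/apache/lucene/analysis/cn/smart/stopwords.txt"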
This is a Lucene.NET EXPERIMENTAL API, use at your own risk
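One way to act on the punctuation note in code is to wrap the tokenizer in a StopFilter. The following is a minimal sketch, not the library's prescribed setup. It assumes Lucene.NET 4.8 with the Lucene.Net.Analysis.SmartCn and Lucene.Net.Analysis.Common packages referenced, and it assumes that `SmartChineseAnalyzer.DefaultStopSet` exposes the bundled stoplist in your version. The class name `PunctuationFreeChain` is hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cn.Smart;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Util;

// Hypothetical helper, not part of Lucene.NET.
public static class PunctuationFreeChain
{
    // Builds tokenizer -> StopFilter so that tokens found in the stoplist
    // (which is assumed to include punctuation) are dropped from the stream.
    public static TokenStream Build(TextReader reader)
    {
        // The factory takes a mutable argument dictionary; none are supplied here.
        var factory = new HMMChineseTokenizerFactory(new Dictionary<string, string>());

        Tokenizer tokenizer = factory.Create(
            AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY, reader);

        // Assumption: DefaultStopSet is the SmartChinese stoplist shipped with the assembly.
        return new StopFilter(
            LuceneVersion.LUCENE_48, tokenizer, SmartChineseAnalyzer.DefaultStopSet);
    }
}
```

The returned stream is consumed like any other TokenStream: call Reset(), loop on IncrementToken(), then End() and Dispose().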
Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()
Namespace: Lucene.Net.Analysis.Cn.Smart
Assembly: Lucene.Net.Analysis.SmartCn.dll
Syntax
public sealed class HMMChineseTokenizerFactory : TokenizerFactory
Constructors
HMMChineseTokenizerFactory(IDictionary<String, String>)
Creates a new HMMChineseTokenizerFactory.
Declaration
public HMMChineseTokenizerFactory(IDictionary<string, string> args)
Parameters
Type | Name | Description |
---|---|---|
System.Collections.Generic.IDictionary<System.String, System.String> | args | |
Methods
Create(AttributeSource.AttributeFactory, TextReader)
Declaration
public override Tokenizer Create(AttributeSource.AttributeFactory factory, TextReader reader)
Parameters
Type | Name | Description |
---|---|---|
AttributeSource.AttributeFactory | factory | |
System.IO.TextReader | reader | |
Returns
Type | Description |
---|---|
Tokenizer |
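
As an illustration of how the constructor and Create fit together, the sketch below builds a factory, creates a tokenizer over a string, and collects the emitted terms. It assumes Lucene.NET 4.8 with the Lucene.Net.Analysis.SmartCn package referenced. `HmmTokenizerDemo` and `Tokenize` are hypothetical names, and the sample sentence is arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cn.Smart;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical demo class, not part of Lucene.NET.
public static class HmmTokenizerDemo
{
    public static List<string> Tokenize(string text)
    {
        // Pass a mutable dictionary; this sketch supplies no factory arguments.
        var factory = new HMMChineseTokenizerFactory(new Dictionary<string, string>());
        var terms = new List<string>();

        using (var reader = new StringReader(text))
        using (Tokenizer tokenizer = factory.Create(
            AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY, reader))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                 // required before the first IncrementToken()
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();                   // finalizes end-of-stream state
        }
        return terms;
    }

    public static void Main()
    {
        // Punctuation may appear among the printed tokens, as the class note warns.
        foreach (string term in Tokenize("我是中国人。"))
        {
            Console.WriteLine(term);
        }
    }
}
```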