Class ChineseTokenizer
Tokenizes Chinese text into individual Chinese characters.
ChineseTokenizer and CJKTokenizer differ in their token parsing logic.
For example, if the Chinese text "C1C2C3C4" is to be indexed:
- The tokens returned from ChineseTokenizer are C1, C2, C3, C4.
- The tokens returned from the CJKTokenizer are C1C2, C2C3, C3C4.
Therefore the index created by CJKTokenizer is much larger.
The trade-off is that searches for C1, C1C2, C1C3, C4C2, C1C2C3, ... work with ChineseTokenizer but do not work with CJKTokenizer.
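The two token shapes described above can be sketched without Lucene.NET at all. The following is only an illustration of single-character tokens versus overlapping two-character tokens; the class and helper names (`TokenShapeSketch`, `Unigrams`, `Bigrams`) are hypothetical, and the real tokenizers also handle offsets, non-Chinese characters, and buffering, which this sketch ignores.

```csharp
using System;
using System.Collections.Generic;

public static class TokenShapeSketch
{
    // Hypothetical helper: one token per character,
    // the shape described for ChineseTokenizer.
    public static List<string> Unigrams(string text)
    {
        var tokens = new List<string>();
        foreach (char c in text)
        {
            tokens.Add(c.ToString());
        }
        return tokens;
    }

    // Hypothetical helper: overlapping two-character tokens,
    // the shape described for CJKTokenizer.
    public static List<string> Bigrams(string text)
    {
        var tokens = new List<string>();
        for (int i = 0; i + 1 < text.Length; i++)
        {
            tokens.Add(text.Substring(i, 2));
        }
        return tokens;
    }

    public static void Main()
    {
        const string text = "中文分词"; // stands in for C1C2C3C4

        Console.WriteLine(string.Join(", ", Unigrams(text))); // 中, 文, 分, 词
        Console.WriteLine(string.Join(", ", Bigrams(text)));  // 中文, 文分, 分词
    }
}
```

A single-character query such as C1 has a matching entry in the unigram list but no matching entry in the bigram list, which is the mismatch the comparison above points at.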
Inheritance
System.Object
    Lucene.Net.Util.AttributeSource
    Lucene.Net.Analysis.TokenStream
    Lucene.Net.Analysis.Tokenizer
    ChineseTokenizer
Implements
    System.IDisposable
Inherited Members
    Lucene.Net.Analysis.Tokenizer.m_input
    Lucene.Net.Analysis.TokenStream.Dispose()
    Lucene.Net.Util.AttributeSource.GetAttributeFactory()
    Lucene.Net.Util.AttributeSource.GetAttributeClassesEnumerator()
    Lucene.Net.Util.AttributeSource.GetAttributeImplsEnumerator()
    Lucene.Net.Util.AttributeSource.AddAttributeImpl(Lucene.Net.Util.Attribute)
    Lucene.Net.Util.AttributeSource.AddAttribute<T>()
    Lucene.Net.Util.AttributeSource.HasAttributes
    Lucene.Net.Util.AttributeSource.HasAttribute<T>()
    Lucene.Net.Util.AttributeSource.GetAttribute<T>()
    Lucene.Net.Util.AttributeSource.ClearAttributes()
    Lucene.Net.Util.AttributeSource.CaptureState()
    Lucene.Net.Util.AttributeSource.RestoreState(Lucene.Net.Util.AttributeSource.State)
    Lucene.Net.Util.AttributeSource.GetHashCode()
    Lucene.Net.Util.AttributeSource.ReflectWith(Lucene.Net.Util.IAttributeReflector)
    Lucene.Net.Util.AttributeSource.CloneAttributes()
    Lucene.Net.Util.AttributeSource.CopyTo(Lucene.Net.Util.AttributeSource)
    Lucene.Net.Util.AttributeSource.ToString()
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
Namespace: Lucene.Net.Analysis.Cn
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Obsolete("(3.1) Use StandardTokenizer instead, which has the same functionality.")]
public sealed class ChineseTokenizer : Tokenizer, IDisposable
Constructors
ChineseTokenizer(AttributeSource.AttributeFactory, TextReader)
Declaration
public ChineseTokenizer(AttributeSource.AttributeFactory factory, TextReader in)
Parameters
| Type | Name | Description |
|---|---|---|
| Lucene.Net.Util.AttributeSource.AttributeFactory | factory | |
| System.IO.TextReader | in | |
ChineseTokenizer(TextReader)
Declaration
public ChineseTokenizer(TextReader in)
Parameters
| Type | Name | Description |
|---|---|---|
| System.IO.TextReader | in | |
Methods
End()
Declaration
public override sealed void End()
Overrides
Lucene.Net.Analysis.TokenStream.End()
IncrementToken()
Declaration
public override bool IncrementToken()
Returns
| Type | Description |
|---|---|
| System.Boolean | |
Overrides
Lucene.Net.Analysis.TokenStream.IncrementToken()
Reset()
Declaration
public override void Reset()
Overrides
Lucene.Net.Analysis.Tokenizer.Reset()
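The Reset(), IncrementToken(), and End() overrides listed above are normally called in the usual TokenStream consumer order: Reset() once, IncrementToken() until it returns false, then End(), then Dispose(). The sketch below shows that order under a few assumptions: it targets the Lucene.NET 4.8 line, where `ICharTermAttribute` lives in `Lucene.Net.Analysis.TokenAttributes`, and the wrapper class `ChineseTokenizerUsageSketch` and its `Tokenize` helper are hypothetical names introduced only for illustration. Because the class is marked obsolete, the obsolete-usage warning is suppressed locally.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.Cn;
using Lucene.Net.Analysis.TokenAttributes; // assumed namespace (Lucene.NET 4.8)

public static class ChineseTokenizerUsageSketch
{
    // Hypothetical helper, not part of Lucene.NET:
    // collects the term text of every token the tokenizer emits.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();

#pragma warning disable CS0618 // ChineseTokenizer is marked [Obsolete]
        using (var tokenizer = new ChineseTokenizer(new StringReader(text)))
#pragma warning restore CS0618
        {
            // Register the attribute that exposes the current token's text.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();                  // must be called before the first IncrementToken()
            while (tokenizer.IncrementToken())  // returns false once the input is exhausted
            {
                tokens.Add(term.ToString());
            }
            tokenizer.End();                    // performs end-of-stream operations
        }                                       // Dispose() releases the reader

        return tokens;
    }

    public static void Main()
    {
        foreach (string token in Tokenize("中文分词"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Since the Obsolete attribute on this class recommends StandardTokenizer, new code would likely apply the same consumption loop to that tokenizer instead.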
Implements
    System.IDisposable