    Class NGramTokenFilter

    Tokenizes the input into n-grams of the given size(s).

    You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating an NGramTokenFilter. As of Lucene 4.4, this token filter:

    • handles supplementary characters correctly,
    • emits all n-grams for the same token at the same position,
    • does not modify offsets,
    • sorts n-grams by their offset in the original token first, then increasing length (meaning that "abc" will give "a", "ab", "abc", "b", "bc", "c").

    You can make this filter use the old behavior by providing a version < LUCENE_44 in the constructor, but this is not recommended, as it will lead to broken Lucene.Net.Analysis.TokenStreams that cause highlighting bugs.

    If you were using this Lucene.Net.Analysis.TokenFilter to perform partial highlighting, that will no longer work, since this filter does not update offsets. You should modify your analysis chain to use NGramTokenizer, and potentially override IsTokenChar(Int32) to perform pre-tokenization.
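
    The following is a minimal sketch of how this filter is typically consumed. KeywordTokenizer, LuceneVersion.LUCENE_48, and the 1-3 gram range are illustrative choices for the example, not requirements of the API.

    using System;
    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Core;
    using Lucene.Net.Analysis.NGram;
    using Lucene.Net.Analysis.TokenAttributes;
    using Lucene.Net.Util;

    // Emit 1- to 3-grams of the single token "abc".
    TokenStream stream = new KeywordTokenizer(new StringReader("abc"));
    stream = new NGramTokenFilter(LuceneVersion.LUCENE_48, stream, minGram: 1, maxGram: 3);

    ICharTermAttribute termAtt = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();
    while (stream.IncrementToken())
    {
        Console.WriteLine(termAtt.ToString()); // a, ab, abc, b, bc, c
    }
    stream.End();
    stream.Dispose();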

    Inheritance
    System.Object
    Lucene.Net.Util.AttributeSource
    Lucene.Net.Analysis.TokenStream
    Lucene.Net.Analysis.TokenFilter
    NGramTokenFilter
    Implements
    System.IDisposable
    Inherited Members
    Lucene.Net.Analysis.TokenFilter.m_input
    Lucene.Net.Analysis.TokenFilter.End()
    Lucene.Net.Analysis.TokenFilter.Dispose(Boolean)
    Lucene.Net.Analysis.TokenStream.Dispose()
    Lucene.Net.Util.AttributeSource.GetAttributeFactory()
    Lucene.Net.Util.AttributeSource.GetAttributeClassesEnumerator()
    Lucene.Net.Util.AttributeSource.GetAttributeImplsEnumerator()
    Lucene.Net.Util.AttributeSource.AddAttributeImpl(Lucene.Net.Util.Attribute)
    Lucene.Net.Util.AttributeSource.AddAttribute<T>()
    Lucene.Net.Util.AttributeSource.HasAttributes
    Lucene.Net.Util.AttributeSource.HasAttribute<T>()
    Lucene.Net.Util.AttributeSource.GetAttribute<T>()
    Lucene.Net.Util.AttributeSource.ClearAttributes()
    Lucene.Net.Util.AttributeSource.CaptureState()
    Lucene.Net.Util.AttributeSource.RestoreState(Lucene.Net.Util.AttributeSource.State)
    Lucene.Net.Util.AttributeSource.GetHashCode()
    Lucene.Net.Util.AttributeSource.Equals(Object)
    Lucene.Net.Util.AttributeSource.ReflectAsString(Boolean)
    Lucene.Net.Util.AttributeSource.ReflectWith(Lucene.Net.Util.IAttributeReflector)
    Lucene.Net.Util.AttributeSource.CloneAttributes()
    Lucene.Net.Util.AttributeSource.CopyTo(Lucene.Net.Util.AttributeSource)
    Lucene.Net.Util.AttributeSource.ToString()
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    Namespace: Lucene.Net.Analysis.NGram
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public sealed class NGramTokenFilter : TokenFilter, IDisposable

    Constructors


    NGramTokenFilter(LuceneVersion, TokenStream)

    Creates an NGramTokenFilter with the default minimum and maximum n-gram sizes.

    Declaration
    public NGramTokenFilter(LuceneVersion version, TokenStream input)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion version

    Lucene version to enable correct position increments. See NGramTokenFilter for details.

    Lucene.Net.Analysis.TokenStream input

    Lucene.Net.Analysis.TokenStream holding the input to be tokenized


    NGramTokenFilter(LuceneVersion, TokenStream, Int32, Int32)

    Creates an NGramTokenFilter with the given minimum and maximum n-gram sizes.

    Declaration
    public NGramTokenFilter(LuceneVersion version, TokenStream input, int minGram, int maxGram)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion version

    Lucene version to enable correct position increments. See NGramTokenFilter for details.

    Lucene.Net.Analysis.TokenStream input

    Lucene.Net.Analysis.TokenStream holding the input to be tokenized

    System.Int32 minGram

    the smallest n-gram to generate

    System.Int32 maxGram

    the largest n-gram to generate
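
    A hedged sketch of the two overloads; LuceneVersion.LUCENE_48 and the variable input (an already-constructed Lucene.Net.Analysis.TokenStream) are assumptions made for illustration:

    // Uses the defaults: DEFAULT_MIN_NGRAM_SIZE (1) and DEFAULT_MAX_NGRAM_SIZE (2).
    TokenStream unigramsAndBigrams = new NGramTokenFilter(LuceneVersion.LUCENE_48, input);

    // Explicit sizes: emits only 2-grams and 3-grams.
    TokenStream bigramsAndTrigrams = new NGramTokenFilter(LuceneVersion.LUCENE_48, input, minGram: 2, maxGram: 3);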

    Fields


    DEFAULT_MAX_NGRAM_SIZE

    Declaration
    public const int DEFAULT_MAX_NGRAM_SIZE = 2
    Field Value
    Type Description
    System.Int32

    DEFAULT_MIN_NGRAM_SIZE

    Declaration
    public const int DEFAULT_MIN_NGRAM_SIZE = 1
    Field Value
    Type Description
    System.Int32

    Methods


    IncrementToken()

    Advances to the next token in the stream; returns true if a token was produced, or false at end of stream.

    Declaration
    public override sealed bool IncrementToken()
    Returns
    Type Description
    System.Boolean
    Overrides
    Lucene.Net.Analysis.TokenStream.IncrementToken()

    Reset()

    Declaration
    public override void Reset()
    Overrides
    Lucene.Net.Analysis.TokenFilter.Reset()
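
    A hedged sketch of the standard consume cycle these methods participate in; the helper name CollectNGrams is hypothetical. The stream must be Reset() before the first IncrementToken() call, drained until IncrementToken() returns false, and then End() and Dispose() are called.

    using System.Collections.Generic;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.TokenAttributes;

    // Illustrative helper: drains a token stream and returns the emitted n-grams.
    static IList<string> CollectNGrams(TokenStream stream)
    {
        var result = new List<string>();
        ICharTermAttribute termAtt = stream.AddAttribute<ICharTermAttribute>();
        stream.Reset();                    // required before the first IncrementToken()
        while (stream.IncrementToken())    // false signals end of stream
        {
            result.Add(termAtt.ToString());
        }
        stream.End();                      // finalizes offsets and state
        stream.Dispose();
        return result;
    }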

    Implements

    System.IDisposable