    Class EdgeNGramTokenizer

    Tokenizes the input from an edge into n-grams of given size(s).

    This Lucene.Net.Analysis.Tokenizer creates n-grams from the beginning edge or ending edge of an input token.

    As of Lucene 4.4, this tokenizer

    • can handle maxGram larger than 1024 chars, but beware that this will result in increased memory usage,
    • doesn't trim the input,
    • sets position increments equal to 1 instead of 1 for the first token and 0 for all other ones,
    • doesn't support backward n-grams anymore,
    • supports IsTokenChar(Int32) pre-tokenization,
    • correctly handles supplementary characters.

    Although highly discouraged, it is still possible to use the old behavior through Lucene43EdgeNGramTokenizer.
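
    Examples

    The following is a minimal usage sketch rather than part of the library documentation; it assumes Lucene.Net 4.8 and the LuceneVersion.LUCENE_48 match constant, and it prints the leading n-grams of a single word.

    using System;
    using System.IO;
    using Lucene.Net.Analysis.NGram;
    using Lucene.Net.Analysis.TokenAttributes;
    using Lucene.Net.Util;

    // Emit the edge n-grams of "Lucene" with sizes 1..3: "L", "Lu", "Luc".
    using var tokenizer = new EdgeNGramTokenizer(
        LuceneVersion.LUCENE_48, new StringReader("Lucene"), minGram: 1, maxGram: 3);

    // Attributes must be obtained before the stream is consumed.
    var termAtt = tokenizer.AddAttribute<ICharTermAttribute>();

    tokenizer.Reset();                         // standard TokenStream contract: Reset, IncrementToken*, End
    while (tokenizer.IncrementToken())
    {
        Console.WriteLine(termAtt.ToString()); // one n-gram per call
    }
    tokenizer.End();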

    Inheritance
    System.Object
    Lucene.Net.Util.AttributeSource
    Lucene.Net.Analysis.TokenStream
    Lucene.Net.Analysis.Tokenizer
    NGramTokenizer
    EdgeNGramTokenizer
    Implements
    System.IDisposable
    Inherited Members
    NGramTokenizer.DEFAULT_MIN_NGRAM_SIZE
    NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE
    NGramTokenizer.IncrementToken()
    NGramTokenizer.IsTokenChar(Int32)
    NGramTokenizer.End()
    NGramTokenizer.Reset()
    Lucene.Net.Analysis.Tokenizer.m_input
    Tokenizer.Dispose(Boolean)
    Tokenizer.CorrectOffset(Int32)
    Tokenizer.SetReader(TextReader)
    Lucene.Net.Analysis.TokenStream.Dispose()
    Lucene.Net.Util.AttributeSource.GetAttributeFactory()
    Lucene.Net.Util.AttributeSource.GetAttributeClassesEnumerator()
    Lucene.Net.Util.AttributeSource.GetAttributeImplsEnumerator()
    Lucene.Net.Util.AttributeSource.AddAttributeImpl(Lucene.Net.Util.Attribute)
    Lucene.Net.Util.AttributeSource.AddAttribute<T>()
    Lucene.Net.Util.AttributeSource.HasAttributes
    Lucene.Net.Util.AttributeSource.HasAttribute<T>()
    Lucene.Net.Util.AttributeSource.GetAttribute<T>()
    Lucene.Net.Util.AttributeSource.ClearAttributes()
    Lucene.Net.Util.AttributeSource.CaptureState()
    Lucene.Net.Util.AttributeSource.RestoreState(Lucene.Net.Util.AttributeSource.State)
    Lucene.Net.Util.AttributeSource.GetHashCode()
    AttributeSource.Equals(Object)
    AttributeSource.ReflectAsString(Boolean)
    Lucene.Net.Util.AttributeSource.ReflectWith(Lucene.Net.Util.IAttributeReflector)
    Lucene.Net.Util.AttributeSource.CloneAttributes()
    Lucene.Net.Util.AttributeSource.CopyTo(Lucene.Net.Util.AttributeSource)
    Lucene.Net.Util.AttributeSource.ToString()
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    Namespace: Lucene.Net.Analysis.NGram
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public class EdgeNGramTokenizer : NGramTokenizer, IDisposable

    Constructors


    EdgeNGramTokenizer(LuceneVersion, AttributeSource.AttributeFactory, TextReader, Int32, Int32)

    Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range.

    Declaration
    public EdgeNGramTokenizer(LuceneVersion version, AttributeSource.AttributeFactory factory, TextReader input, int minGram, int maxGram)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion version

    the Lucene match version - See Lucene.Net.Util.LuceneVersion

    Lucene.Net.Util.AttributeSource.AttributeFactory factory

    Lucene.Net.Util.AttributeSource.AttributeFactory to use

    System.IO.TextReader input

    System.IO.TextReader holding the input to be tokenized

    System.Int32 minGram

    the smallest n-gram to generate

    System.Int32 maxGram

    the largest n-gram to generate
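
    Examples

    An illustrative sketch (assuming Lucene.Net 4.8) of calling this overload with the default attribute factory; the input text and gram sizes below are arbitrary.

    using System.IO;
    using Lucene.Net.Analysis.NGram;
    using Lucene.Net.Util;

    // The factory only controls how attribute instances are created for the
    // underlying AttributeSource; behavior is otherwise the same as the
    // overload without a factory.
    var tokenizer = new EdgeNGramTokenizer(
        LuceneVersion.LUCENE_48,
        AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY,
        new StringReader("search"),
        minGram: 2,
        maxGram: 4);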


    EdgeNGramTokenizer(LuceneVersion, TextReader, Int32, Int32)

    Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range.

    Declaration
    public EdgeNGramTokenizer(LuceneVersion version, TextReader input, int minGram, int maxGram)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion version

    the Lucene match version - See Lucene.Net.Util.LuceneVersion

    System.IO.TextReader input

    System.IO.TextReader holding the input to be tokenized

    System.Int32 minGram

    the smallest n-gram to generate

    System.Int32 maxGram

    the largest n-gram to generate
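
    Examples

    A sketch of wiring this tokenizer into an Analyzer for prefix ("search-as-you-type") style indexing. It assumes Lucene.Net 4.8 and its .NET-specific Analyzer.NewAnonymous helper; the gram sizes are arbitrary.

    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.NGram;
    using Lucene.Net.Util;

    // Every token stream produced by this analyzer consists of the
    // 2..5 character prefixes of each run of token characters.
    Analyzer analyzer = Analyzer.NewAnonymous(createComponents: (fieldName, reader) =>
    {
        Tokenizer source = new EdgeNGramTokenizer(LuceneVersion.LUCENE_48, reader, 2, 5);
        return new TokenStreamComponents(source);
    });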

    Fields


    DEFAULT_MAX_GRAM_SIZE

    Declaration
    public const int DEFAULT_MAX_GRAM_SIZE = 1
    Field Value
    Type Description
    System.Int32

    DEFAULT_MIN_GRAM_SIZE

    Declaration
    public const int DEFAULT_MIN_GRAM_SIZE = 1
    Field Value
    Type Description
    System.Int32
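
    Both defaults are 1, so a tokenizer constructed from these constants emits only the single leading character of each run of token characters. A brief sketch, again assuming Lucene.Net 4.8:

    using System.IO;
    using Lucene.Net.Analysis.NGram;
    using Lucene.Net.Util;

    // minGram = maxGram = 1: the only gram produced for "search" is "s".
    var tokenizer = new EdgeNGramTokenizer(
        LuceneVersion.LUCENE_48,
        new StringReader("search"),
        EdgeNGramTokenizer.DEFAULT_MIN_GRAM_SIZE,
        EdgeNGramTokenizer.DEFAULT_MAX_GRAM_SIZE);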

    Implements

    System.IDisposable
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)