
    Class TokenStreamToAutomaton

    Consumes a TokenStream and creates an Automaton whose transition labels are UTF-8 bytes (or Unicode code points if UnicodeArcs is true) taken from the ITermToBytesRefAttribute. POS_SEP is inserted between tokens, and HOLE is inserted for each hole (a position with no token).
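To make the label layout concrete, here is a minimal, self-contained sketch that models a single linear path of tokens without using Lucene.NET itself. The `ArcLabelSketch` class and its `Labels` method are hypothetical names for illustration only. The values 0x001f and 0x001e mirror POS_SEP and HOLE as defined in Lucene's source; the real class builds an Automaton and also handles token graphs, which this flat list does not attempt.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ArcLabelSketch
{
    public const int POS_SEP = 0x001f; // separator arc between adjacent positions
    public const int HOLE = 0x001e;    // arc standing in for a missing position

    // Hypothetical helper: flattens (term, position increment) pairs into the
    // sequence of arc labels along one path, using UTF-8 bytes as labels.
    public static List<int> Labels(IEnumerable<(string Term, int PosInc)> tokens)
    {
        var labels = new List<int>();
        bool first = true;
        foreach (var (term, posInc) in tokens)
        {
            if (!first) labels.Add(POS_SEP);

            // Every skipped position contributes a HOLE arc plus a separator.
            for (int skipped = 1; skipped < posInc; skipped++)
            {
                labels.Add(HOLE);
                labels.Add(POS_SEP);
            }

            foreach (byte b in Encoding.UTF8.GetBytes(term)) labels.Add(b);
            first = false;
        }
        return labels;
    }
}
```

For the tokens `a` and `c`, where `c` carries a position increment of 2 (for example because a stop word between them was removed), the sketch produces the byte of `a`, POS_SEP, HOLE, POS_SEP, and then the byte of `c`.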

    This is a Lucene.NET EXPERIMENTAL API; use at your own risk.
    Inheritance
    System.Object
    TokenStreamToAutomaton
    Namespace: Lucene.Net.Analysis
    Assembly: Lucene.Net.dll
    Syntax
    public class TokenStreamToAutomaton : object

    Constructors

    TokenStreamToAutomaton()

    Sole constructor.

    Declaration
    public TokenStreamToAutomaton()

    Fields

    HOLE

    We add this arc to represent a hole.

    Declaration
    public const int HOLE = 0x001e
    Field Value
    Type Description
    System.Int32

    POS_SEP

    We add this transition between two adjacent tokens.

    Declaration
    public const int POS_SEP = 0x001f
    Field Value
    Type Description
    System.Int32

    Properties

    PreservePositionIncrements

    Whether to generate holes in the automaton for missing positions. Defaults to true.

    Declaration
    public virtual bool PreservePositionIncrements { get; set; }
    Property Value
    Type Description
    System.Boolean
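The effect of this flag can be modeled with a short, self-contained sketch. It assumes that, when the flag is false, any position increment greater than 1 is treated as 1, so no positions are skipped and no HOLE arcs are needed. `PositionIncrementSketch` and its methods are hypothetical names, and the model covers only a linear token stream.

```csharp
using System;

public static class PositionIncrementSketch
{
    // Hypothetical helper: the increment effectively used for a token.
    public static int EffectiveIncrement(int posInc, bool preservePositionIncrements)
    {
        // With the flag off, gaps collapse to a single step.
        return (!preservePositionIncrements && posInc > 1) ? 1 : posInc;
    }

    // Number of skipped positions (candidate HOLE arcs) in front of a token.
    public static int HolesBefore(int posInc, bool preservePositionIncrements)
    {
        return Math.Max(EffectiveIncrement(posInc, preservePositionIncrements) - 1, 0);
    }
}
```

Under this model, a token with a position increment of 3 is preceded by two holes when the flag is true and by none when it is false.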

    UnicodeArcs

    Whether to make transition labels Unicode code points instead of UTF-8 bytes. Defaults to false.

    Declaration
    public virtual bool UnicodeArcs { get; set; }
    Property Value
    Type Description
    System.Boolean
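The difference between the two labeling modes shows up as soon as a term contains non-ASCII characters. The following self-contained sketch uses only the .NET base class library; `TermLabelSketch` and `Labels` are hypothetical names, and the input is assumed to be well-formed UTF-16 (no unpaired surrogates).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TermLabelSketch
{
    // Hypothetical helper: the labels one term would contribute under each setting.
    public static List<int> Labels(string term, bool unicodeArcs)
    {
        var labels = new List<int>();
        if (unicodeArcs)
        {
            // One label per Unicode code point.
            for (int i = 0; i < term.Length; i++)
            {
                labels.Add(char.ConvertToUtf32(term, i));
                if (char.IsHighSurrogate(term[i])) i++; // skip the low surrogate
            }
        }
        else
        {
            // One label per UTF-8 byte.
            foreach (byte b in Encoding.UTF8.GetBytes(term)) labels.Add(b);
        }
        return labels;
    }
}
```

For the term `"\u00e9"` (é), byte labels give two arcs (0xC3, 0xA9), while code point labels give a single arc (0xE9).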

    Methods

    ChangeToken(BytesRef)

    Subclass and override this method if you need to change the token (for example, to escape certain bytes) before it is turned into a graph.

    Declaration
    protected virtual BytesRef ChangeToken(BytesRef @in)
    Parameters
    Type Name Description
    BytesRef in
    Returns
    Type Description
    BytesRef
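One plausible reason to change a token is to keep bytes inside a term from colliding with the reserved POS_SEP and HOLE labels. The sketch below shows only the byte-level transformation such an override might apply, on a plain `byte[]`. `EscapeSketch`, `Escape`, and the choice of backslash as the escape byte are assumptions made for illustration; a real override would read the bytes from the incoming BytesRef and return a new BytesRef, which is omitted here.

```csharp
using System;
using System.Collections.Generic;

public static class EscapeSketch
{
    public const byte POS_SEP = 0x1f; // reserved separator label
    public const byte HOLE = 0x1e;    // reserved hole label
    public const byte ESCAPE = 0x5c;  // backslash, chosen arbitrarily for this sketch

    // Hypothetical transformation: prefixes each reserved byte (and the escape
    // byte itself) with ESCAPE so the original term can still be recovered.
    public static byte[] Escape(byte[] term)
    {
        var result = new List<byte>(term.Length);
        foreach (byte b in term)
        {
            if (b == POS_SEP || b == HOLE || b == ESCAPE) result.Add(ESCAPE);
            result.Add(b);
        }
        return result.ToArray();
    }
}
```

Escaping the escape byte itself keeps the mapping reversible, which matters if terms are later decoded from accepted paths.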

    ToAutomaton(TokenStream)

    Pulls the graph (including IPositionLengthAttribute) from the provided TokenStream and creates the corresponding automaton, where arcs are bytes (or Unicode code points if UnicodeArcs is true) from each term.

    Declaration
    public virtual Automaton ToAutomaton(TokenStream @in)
    Parameters
    Type Name Description
    TokenStream in
    Returns
    Type Description
    Automaton
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)