    Class PatternTokenizerFactory

    Factory for PatternTokenizer. This tokenizer uses regex pattern matching to construct distinct tokens for the input stream. It takes two arguments: "pattern" and "group".

    • "pattern" is the regular expression.
    • "group" says which group to extract into tokens.

    group=-1 (the default) is equivalent to "split". In this case, the tokens are the same as the output of a regex split on the pattern, with empty tokens removed.
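
As a rough illustration of the "split" behavior, the sketch below approximates group=-1 using only System.Text.RegularExpressions. It is not the tokenizer's actual implementation; the class name SplitModeSketch and the method SplitTokens are hypothetical, and the sketch assumes a pattern without capturing groups (Regex.Split would otherwise include the captured text in its output).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Lucene.NET.
public static class SplitModeSketch
{
    // Approximates group=-1: split the input on the pattern,
    // then drop zero-length pieces.
    // Assumes the pattern contains no capturing groups.
    public static string[] SplitTokens(string input, string pattern)
    {
        return Regex.Split(input, pattern)
                    .Where(piece => piece.Length > 0)
                    .ToArray();
    }
}
```

For example, SplitModeSketch.SplitTokens("a, b,, c", @",\s*") returns a, b and c; the empty piece between the two adjacent commas is dropped.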

    Using group >= 0 selects the matching group as the token. For example, if you have:

        pattern = \'([^\']+)\'
        group = 0
        input = aaa 'bbb' 'ccc'

    the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be bbb and ccc (no ' marks).
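
The group selection described above can be reproduced with plain .NET regular expressions. This is a minimal sketch of the semantics, not the tokenizer's own code; GroupModeSketch and GroupTokens are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Lucene.NET.
public static class GroupModeSketch
{
    // Approximates group >= 0: every match of the pattern contributes
    // the text of the selected group as one token.
    // Zero-length results are skipped, mirroring the tokenizer's rule.
    public static string[] GroupTokens(string input, string pattern, int group)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(match => match.Groups[group].Value)
                    .Where(token => token.Length > 0)
                    .ToArray();
    }
}
```

With the input aaa 'bbb' 'ccc' and the pattern \'([^\']+)\', calling GroupTokens with group 0 gives 'bbb' and 'ccc', and with group 1 gives bbb and ccc.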

    NOTE: This Tokenizer does not output tokens that are of zero length.

    <fieldType name="text_ptn" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.PatternTokenizerFactory" pattern="\'([^\']+)\'" group="1"/>
      </analyzer>
    </fieldType>

    Since: Solr 1.2

    Inheritance
    System.Object
    AbstractAnalysisFactory
    TokenizerFactory
    PatternTokenizerFactory
    Inherited Members
    TokenizerFactory.ForName(String, IDictionary<String, String>)
    TokenizerFactory.LookupClass(String)
    TokenizerFactory.AvailableTokenizers
    TokenizerFactory.ReloadTokenizers()
    TokenizerFactory.Create(TextReader)
    AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM
    AbstractAnalysisFactory.m_luceneMatchVersion
    AbstractAnalysisFactory.OriginalArgs
    AbstractAnalysisFactory.AssureMatchVersion()
    AbstractAnalysisFactory.LuceneMatchVersion
    AbstractAnalysisFactory.Require(IDictionary<String, String>, String)
    AbstractAnalysisFactory.Require(IDictionary<String, String>, String, ICollection<String>)
    AbstractAnalysisFactory.Require(IDictionary<String, String>, String, ICollection<String>, Boolean)
    AbstractAnalysisFactory.Get(IDictionary<String, String>, String, String)
    AbstractAnalysisFactory.Get(IDictionary<String, String>, String, ICollection<String>)
    AbstractAnalysisFactory.Get(IDictionary<String, String>, String, ICollection<String>, String)
    AbstractAnalysisFactory.Get(IDictionary<String, String>, String, ICollection<String>, String, Boolean)
    AbstractAnalysisFactory.RequireInt32(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetInt32(IDictionary<String, String>, String, Int32)
    AbstractAnalysisFactory.RequireBoolean(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetBoolean(IDictionary<String, String>, String, Boolean)
    AbstractAnalysisFactory.RequireSingle(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetSingle(IDictionary<String, String>, String, Single)
    AbstractAnalysisFactory.RequireChar(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetChar(IDictionary<String, String>, String, Char)
    AbstractAnalysisFactory.GetSet(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetPattern(IDictionary<String, String>, String)
    AbstractAnalysisFactory.GetCulture(IDictionary<String, String>, String, CultureInfo)
    AbstractAnalysisFactory.GetWordSet(IResourceLoader, String, Boolean)
    AbstractAnalysisFactory.GetLines(IResourceLoader, String)
    AbstractAnalysisFactory.GetSnowballWordSet(IResourceLoader, String, Boolean)
    AbstractAnalysisFactory.SplitFileNames(String)
    AbstractAnalysisFactory.GetClassArg()
    AbstractAnalysisFactory.IsExplicitLuceneMatchVersion
    Namespace: Lucene.Net.Analysis.Pattern
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public class PatternTokenizerFactory : TokenizerFactory

    Constructors

    PatternTokenizerFactory(IDictionary<String, String>)

    Creates a new PatternTokenizerFactory

    Declaration
    public PatternTokenizerFactory(IDictionary<string, string> args)
    Parameters
    Type Name Description
    IDictionary<System.String, System.String> args
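
A possible way to use this constructor from code is sketched below. It assumes the Lucene.Net.Analysis.Common package is referenced, that "pattern" and "group" are the only arguments needed (no luceneMatchVersion entry), and that the dictionary should be mutable because the factory may remove the entries it reads. The class name PatternTokenizerFactoryDemo is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Pattern;
using Lucene.Net.Analysis.TokenAttributes;

public static class PatternTokenizerFactoryDemo
{
    public static void Main()
    {
        // Same settings as the XML configuration shown earlier on this page.
        // Assumption: a mutable dictionary, since the factory may consume entries.
        var args = new Dictionary<string, string>
        {
            { "pattern", @"\'([^\']+)\'" },
            { "group", "1" }
        };
        var factory = new PatternTokenizerFactory(args);

        // Create(TextReader) is inherited from TokenizerFactory.
        using (Tokenizer tokenizer = factory.Create(new StringReader("aaa 'bbb' 'ccc'")))
        {
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            // Standard TokenStream consumption: Reset, IncrementToken loop, End.
            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            tokenizer.End();
        }
    }
}
```

According to the class description, this configuration should emit the tokens bbb and ccc.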

    Fields

    GROUP

    Declaration
    public const string GROUP = "group"
    Field Value
    Type Description
    System.String

    m_group

    Declaration
    protected readonly int m_group
    Field Value
    Type Description
    System.Int32

    m_pattern

    Declaration
    protected readonly Regex m_pattern
    Field Value
    Type Description
    Regex

    PATTERN

    Declaration
    public const string PATTERN = "pattern"
    Field Value
    Type Description
    System.String

    Methods

    Create(AttributeSource.AttributeFactory, TextReader)

    Creates a tokenizer that splits the input using the configured pattern.

    Declaration
    public override Tokenizer Create(AttributeSource.AttributeFactory factory, TextReader input)
    Parameters
    Type Name Description
    AttributeSource.AttributeFactory factory
    TextReader input
    Returns
    Type Description
    Tokenizer
    Overrides
    TokenizerFactory.Create(AttributeSource.AttributeFactory, TextReader)

    See Also

    PatternTokenizer
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)