    Class AnalyzerFactoryTask

    Analyzer factory construction task. The name given to the constructed factory may be given to NewAnalyzerTask, which will call Create().

    Inheritance
    object
    PerfTask
    AnalyzerFactoryTask
    Implements
    IDisposable
    Inherited Members
    PerfTask.m_logStep
    PerfTask.m_params
    PerfTask.NEW_LINE
    PerfTask.SetRunInBackground(int)
    PerfTask.RunInBackground
    PerfTask.BackgroundDeltaPriority
    PerfTask.Stop
    PerfTask.StopNow()
    PerfTask.Clone()
    PerfTask.Dispose()
    PerfTask.Dispose(bool)
    PerfTask.RunAndMaybeStats(bool)
    PerfTask.GetName()
    PerfTask.SetName(string)
    PerfTask.RunData
    PerfTask.Depth
    PerfTask.ToString()
    PerfTask.GetLogMessage(int)
    PerfTask.ShouldNeverLogAtStart
    PerfTask.ShouldNotRecordStats
    PerfTask.Setup()
    PerfTask.TearDown()
    PerfTask.Params
    PerfTask.DisableCounting
    PerfTask.AlgLineNum
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    Namespace: Lucene.Net.Benchmarks.ByTask.Tasks
    Assembly: Lucene.Net.Benchmark.dll
    Syntax
    public class AnalyzerFactoryTask : PerfTask, IDisposable
    Remarks

    Params take the form argname:argvalue, argname:"argvalue", or argname:'argvalue'. Inside a quoted value, use a backslash to escape the quotation mark ('"' or "'") that encloses the value.

    Specify params in a comma-separated list of the following, in order:
    1. Analyzer arguments:
      • Required: name:analyzer-factory-name
      • Optional: positionIncrementGap:int value (default: 0)
      • Optional: offsetGap:int value (default: 1)
    2. zero or more CharFilterFactory instances, followed by
    3. exactly one TokenizerFactory, followed by
    4. zero or more TokenFilterFactory instances

    Each component analysis factory may specify luceneMatchVersion (defaults to Lucene.Net.Util.LuceneVersion.LUCENE_CURRENT) and any of the args understood by the specified *Factory class, in the param format described above.

    Example:
    -AnalyzerFactory(name:'strip html, fold to ascii, whitespace tokenize, max 10k tokens',
                     positionIncrementGap:100,
                     HTMLStripCharFilter,
                     MappingCharFilter(mapping:'mapping-FoldToASCII.txt'),
                     WhitespaceTokenizer(luceneMatchVersion:LUCENE_43),
                     TokenLimitFilter(maxTokenCount:10000, consumeAllTokens:false))
    [...]
    -NewAnalyzer('strip html, fold to ascii, whitespace tokenize, max 10k tokens')

    AnalyzerFactory will direct analysis component factories to look for resources under the directory specified in the "work.dir" property.
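
    The quoting rule for a single argname:argvalue param can be illustrated with a minimal sketch. ParamSketch and ParseArg are hypothetical names introduced for illustration only; this is not the parser used by AnalyzerFactoryTask, and it handles one param at a time rather than a full comma-separated list.

```csharp
using System;
using System.Text;

public static class ParamSketch
{
    // Hypothetical helper (not part of Lucene.Net): splits "argname:argvalue"
    // at the first colon and removes enclosing quotes from the value.
    public static (string Name, string Value) ParseArg(string arg)
    {
        int colon = arg.IndexOf(':');
        if (colon < 0)
            throw new FormatException("Expected argname:argvalue but got: " + arg);

        string name = arg.Substring(0, colon).Trim();
        string raw = arg.Substring(colon + 1).Trim();

        // Unquoted value: return it unchanged.
        bool quoted = raw.Length >= 2
            && (raw[0] == '"' || raw[0] == '\'')
            && raw[raw.Length - 1] == raw[0];
        if (!quoted)
            return (name, raw);

        char quote = raw[0];
        var value = new StringBuilder();
        for (int i = 1; i < raw.Length - 1; i++)
        {
            // A backslash followed by the enclosing quote yields a literal quote.
            if (raw[i] == '\\' && i + 1 < raw.Length - 1 && raw[i + 1] == quote)
            {
                value.Append(quote);
                i++;
            }
            else
            {
                value.Append(raw[i]);
            }
        }
        return (name, value.ToString());
    }
}
```

    Under these assumptions, ParseArg("mapping:'mapping-FoldToASCII.txt'") yields the name mapping and the value mapping-FoldToASCII.txt with the quotes removed.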

    Constructors

    AnalyzerFactoryTask(PerfRunData)

    Initializes a new instance of AnalyzerFactoryTask with the given run data.

    Declaration
    public AnalyzerFactoryTask(PerfRunData runData)
    Parameters
    Type Name Description
    PerfRunData runData

    Properties

    SupportsParams

    Gets a value indicating whether this task accepts params. AnalyzerFactoryTask returns true, because the analysis pipeline is specified through SetParams(string).

    Declaration
    public override bool SupportsParams { get; }
    Property Value
    Type Description
    bool
    Overrides
    PerfTask.SupportsParams

    Methods

    DoLogic()

    Perform the task once (ignoring repetitions specification). Return number of work items done by this task. For indexing that can be number of docs added. For warming that can be number of scanned items, etc.

    Declaration
    public override int DoLogic()
    Returns
    Type Description
    int

    Number of work items done by this task.

    Overrides
    PerfTask.DoLogic()

    GetLineNumber(StreamTokenizer)

    Returns the current line number in the algorithm file.

    Declaration
    public virtual int GetLineNumber(StreamTokenizer stok)
    Parameters
    Type Name Description
    StreamTokenizer stok
    Returns
    Type Description
    int

    LookupAnalysisClass(string, Type)

    Looks up a class by its fully qualified name (FQN), by its short (simple) class name, or by a namespace suffix, assuming "Lucene.Net.Analysis." as the namespace prefix (e.g. "standard.ClassicTokenizerFactory" -> "Lucene.Net.Analysis.Standard.ClassicTokenizerFactory").

    Declaration
    public virtual Type LookupAnalysisClass(string className, Type expectedType)
    Parameters
    Type Name Description
    string className

    The namespace qualified name or the short name of the class.

    Type expectedType

    The superclass className is expected to extend.

    Returns
    Type Description
    Type

    The loaded type.

    Remarks

    If className contains a period, the class is first looked up as-is, assuming that it is an FQN. If this fails, lookup is retried after prepending the Lucene analysis package prefix to the class name.

    If className does not contain a period, the analysis SPI *Factory.LookupClass() methods are used to find the class.
    Exceptions
    Type Condition
    TypeLoadException

    If lookup fails.
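
    The lookup order for dotted names described in the remarks can be sketched as follows. LookupSketch, CandidateNames, and Resolve are hypothetical names introduced for illustration; this is not the Lucene.Net implementation. The sketch assumes the type lives in an already loaded assembly, and it matches names case-insensitively so that a lower-case suffix such as "standard." can match the "Standard" namespace, which is also an assumption. Short names without a period are left to the analysis SPI lookup and are not handled here.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class LookupSketch
{
    private const string AnalysisPrefix = "Lucene.Net.Analysis.";

    // Hypothetical helper: names to try, in order, for a dotted class name.
    public static IReadOnlyList<string> CandidateNames(string className)
    {
        if (className.IndexOf('.') < 0)
            return Array.Empty<string>(); // short names go through the SPI lookup instead

        // First the name as-is (assumed FQN), then with the analysis prefix prepended.
        return new[] { className, AnalysisPrefix + className };
    }

    // Hypothetical helper: resolves the first candidate that is assignable to expectedType.
    public static Type Resolve(string className, Type expectedType)
    {
        foreach (string candidate in CandidateNames(className))
        {
            foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
            {
                // Case-insensitive match is an assumption of this sketch.
                Type type = assembly.GetType(candidate, throwOnError: false, ignoreCase: true);
                if (type != null && expectedType.IsAssignableFrom(type))
                    return type;
            }
        }
        throw new TypeLoadException("Could not resolve " + className);
    }
}
```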

    SetParams(string)

    Sets the params. Analysis component factory names may optionally include the "Factory" suffix.

    Declaration
    public override void SetParams(string @params)
    Parameters
    Type Name Description
    string params

    Analysis pipeline specification: name, (optional) positionIncrementGap, (optional) offsetGap, zero or more CharFilterFactory instances, exactly one TokenizerFactory, and zero or more TokenFilterFactory instances.

    Overrides
    PerfTask.SetParams(string)
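
    Because the "Factory" suffix is optional, a component may be written as either WhitespaceTokenizer or WhitespaceTokenizerFactory. One way to treat both spellings alike is to strip the suffix before comparing names, as in this minimal sketch. FactoryNameSketch and StripFactorySuffix are hypothetical names, and the way SetParams itself handles the suffix is not shown here.

```csharp
using System;

public static class FactoryNameSketch
{
    private const string Suffix = "Factory";

    // Hypothetical helper: returns the component name without a trailing "Factory".
    public static string StripFactorySuffix(string name)
    {
        return name.EndsWith(Suffix, StringComparison.Ordinal)
            ? name.Substring(0, name.Length - Suffix.Length)
            : name;
    }
}
```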

    Implements

    IDisposable
    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.