Class AnalyzerFactoryTask
Analyzer factory construction task. The name given to the constructed factory may be given to NewAnalyzerTask, which will call Create().
Namespace: Lucene.Net.Benchmarks.ByTask.Tasks
Assembly: Lucene.Net.Benchmark.dll
Syntax
public class AnalyzerFactoryTask : PerfTask, IDisposable
Remarks
Params are in the form argname:argvalue, argname:"argvalue", or argname:'argvalue'. Inside a quoted value, use a backslash to escape the quote character that encloses it.
Specify params as a comma-separated list of the following, in order:
- Required: name:analyzer-factory-name
- Optional: positionIncrementGap:int value (default: 0)
- Optional: offsetGap:int value (default: 1)
- Required, in this order:
  - zero or more CharFilterFactory instances, followed by
  - exactly one TokenizerFactory, followed by
  - zero or more TokenFilterFactory instances
Example:
-AnalyzerFactory(name:'strip html, fold to ascii, whitespace tokenize, max 10k tokens',
                 positionIncrementGap:100,
                 HTMLStripCharFilter,
                 MappingCharFilter(mapping:'mapping-FoldToASCII.txt'),
                 WhitespaceTokenizer(luceneMatchVersion:LUCENE_43),
                 TokenLimitFilter(maxTokenCount:10000, consumeAllTokens:false))
[...]
-NewAnalyzer('strip html, fold to ascii, whitespace tokenize, max 10k tokens')
AnalyzerFactory directs analysis component factories to look for resources under the directory specified in the "work.dir" property.
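The quoting and escaping rules above can be illustrated with a small standalone sketch. This is not the parser AnalyzerFactoryTask uses; `ParamSketch` and `Parse` are hypothetical names, and the sketch only handles a flat list of argname:argvalue pairs (such as the contents of one component's parentheses), not nested component specifications.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of Lucene.NET: splits a flat
// "argname:argvalue" list and applies the quoting rules described above.
public static class ParamSketch
{
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var result = new List<KeyValuePair<string, string>>();
        int i = 0;
        while (i < input.Length)
        {
            // Skip commas and whitespace between params.
            while (i < input.Length && (input[i] == ',' || char.IsWhiteSpace(input[i]))) i++;
            if (i >= input.Length) break;

            // The argument name runs up to the next colon.
            int colon = input.IndexOf(':', i);
            if (colon < 0) throw new FormatException("Expected ':' after argument name at index " + i);
            string name = input.Substring(i, colon - i).Trim();
            i = colon + 1;

            var value = new StringBuilder();
            bool quoted = i < input.Length && (input[i] == '"' || input[i] == '\'');
            if (quoted)
            {
                char quote = input[i++];
                bool closed = false;
                while (i < input.Length)
                {
                    char c = input[i++];
                    if (c == '\\' && i < input.Length && input[i] == quote)
                    {
                        // A backslash escapes only the enclosing quote character.
                        value.Append(quote);
                        i++;
                    }
                    else if (c == quote) { closed = true; break; }
                    else value.Append(c);
                }
                if (!closed) throw new FormatException("Unterminated quoted value for '" + name + "'");
            }
            else
            {
                // An unquoted value runs up to the next comma.
                while (i < input.Length && input[i] != ',') value.Append(input[i++]);
            }
            string text = quoted ? value.ToString() : value.ToString().Trim();
            result.Add(new KeyValuePair<string, string>(name, text));
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (var pair in ParamSketch.Parse("name:'it\\'s a test', positionIncrementGap:100"))
            Console.WriteLine(pair.Key + " = " + pair.Value);
        // Prints:
        // name = it's a test
        // positionIncrementGap = 100
    }
}
```

Commas inside a quoted value are kept as part of the value, which is why the analyzer factory name in the example above can contain them.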
Constructors
AnalyzerFactoryTask(PerfRunData)
Analyzer factory construction task. The name given to the constructed factory may be given to NewAnalyzerTask, which will call Create().
Declaration
public AnalyzerFactoryTask(PerfRunData runData)
Parameters
Type | Name | Description |
---|---|---|
PerfRunData | runData |
Properties
SupportsParams
Returns true: this task accepts params (see SetParams(string)).
Declaration
public override bool SupportsParams { get; }
Property Value
Type | Description |
---|---|
bool |
Methods
DoLogic()
Perform the task once (ignoring repetitions specification). Return number of work items done by this task. For indexing that can be number of docs added. For warming that can be number of scanned items, etc.
Declaration
public override int DoLogic()
Returns
Type | Description |
---|---|
int | Number of work items done by this task. |
GetLineNumber(StreamTokenizer)
Returns the current line in the algorithm file.
Declaration
public virtual int GetLineNumber(StreamTokenizer stok)
Parameters
Type | Name | Description |
---|---|---|
StreamTokenizer | stok |
Returns
Type | Description |
---|---|
int |
LookupAnalysisClass(string, Type)
Looks up a class by its fully qualified name (FQN), by its short name (the simple class name), or by a package suffix, assuming "Lucene.Net.Analysis." as the namespace prefix (e.g. "standard.ClassicTokenizerFactory" -> "Lucene.Net.Analysis.Standard.ClassicTokenizerFactory").
Declaration
public virtual Type LookupAnalysisClass(string className, Type expectedType)
Parameters
Type | Name | Description |
---|---|---|
string | className | The namespace qualified name or the short name of the class. |
Type | expectedType | The superclass that className is expected to extend. |
Returns
Type | Description |
---|---|
Type | The loaded type. |
Remarks
If className contains a period, the class is first looked up as-is, assuming that it is an FQN. If this fails, lookup is retried after prepending the Lucene analysis namespace prefix to the class name.
If className does not contain a period, the analysis SPI *Factory.LookupClass() methods are used to find the class.
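The lookup order described above can be sketched as follows. This is an illustration under stated assumptions, not the Lucene.NET implementation: `LookupSketch` is a hypothetical name, the two resolver delegates stand in for type loading and for the SPI short-name lookup, and the check against expectedType is omitted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the lookup order; the resolvers are supplied by
// the caller and return null when nothing is found.
public static class LookupSketch
{
    private const string AnalysisPrefix = "Lucene.Net.Analysis.";

    public static Type Lookup(
        string className,
        Func<string, Type> resolveByFullName,
        Func<string, Type> resolveByShortName)
    {
        Type found;
        if (className.Contains("."))
        {
            // Try the name as-is (assumed FQN), then retry with the prefix.
            found = resolveByFullName(className)
                    ?? resolveByFullName(AnalysisPrefix + className);
        }
        else
        {
            // No period: defer to the short-name (SPI-style) lookup.
            found = resolveByShortName(className);
        }
        if (found == null) throw new TypeLoadException("Cannot find class: " + className);
        return found;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in registry; typeof(object) is a placeholder for a real factory type.
        // The case-insensitive comparer is an assumption made for this demo.
        var known = new Dictionary<string, Type>(StringComparer.OrdinalIgnoreCase)
        {
            { "Lucene.Net.Analysis.Standard.ClassicTokenizerFactory", typeof(object) }
        };
        Func<string, Type> byFullName = n =>
        {
            Type t;
            return known.TryGetValue(n, out t) ? t : null;
        };
        Func<string, Type> byShortName = n => null;

        Type result = LookupSketch.Lookup("standard.ClassicTokenizerFactory", byFullName, byShortName);
        Console.WriteLine(result != null);
    }
}
```

Passing the resolvers as delegates keeps the sketch independent of how types are actually loaded, so only the ordering of the attempts is shown.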
Exceptions
Type | Condition |
---|---|
TypeLoadException | If lookup fails. |
SetParams(string)
Sets the params. Analysis component factory names may optionally include the "Factory" suffix.
Declaration
public override void SetParams(string @params)
Parameters
Type | Name | Description |
---|---|---|
string | params | Analysis pipeline specification: name, (optional) positionIncrementGap, (optional) offsetGap, zero or more CharFilterFactory instances, exactly one TokenizerFactory, and zero or more TokenFilterFactory instances. |
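The component ordering that the params must follow (char filters, then exactly one tokenizer, then token filters) can be checked with a short sketch. `ComponentKind` and `PipelineOrderSketch` are hypothetical names introduced for illustration; how SetParams itself validates and reports ordering errors is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical classification of the components named in the params.
public enum ComponentKind { CharFilter, Tokenizer, TokenFilter }

public static class PipelineOrderSketch
{
    // Returns true when the sequence is: zero or more char filters,
    // exactly one tokenizer, then zero or more token filters.
    public static bool IsValidOrder(IReadOnlyList<ComponentKind> kinds)
    {
        int i = 0;
        while (i < kinds.Count && kinds[i] == ComponentKind.CharFilter) i++;
        if (i >= kinds.Count || kinds[i] != ComponentKind.Tokenizer) return false;
        i++;
        while (i < kinds.Count && kinds[i] == ComponentKind.TokenFilter) i++;
        return i == kinds.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        var ok = new[]
        {
            ComponentKind.CharFilter, ComponentKind.CharFilter,
            ComponentKind.Tokenizer, ComponentKind.TokenFilter
        };
        var bad = new[] { ComponentKind.TokenFilter, ComponentKind.Tokenizer };
        Console.WriteLine(PipelineOrderSketch.IsValidOrder(ok));  // True
        Console.WriteLine(PipelineOrderSketch.IsValidOrder(bad)); // False
    }
}
```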