Class StandardAnalyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating StandardAnalyzer:
- As of 3.4, Hiragana and Han characters are no longer wrongly split from their combining characters. If you use a previous version number, you get the exact broken behavior for backwards compatibility.
- As of 3.1, StandardTokenizer implements Unicode text segmentation, and StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords. ClassicTokenizer and ClassicAnalyzer are the pre-3.1 implementations of StandardTokenizer and StandardAnalyzer.
- As of 2.9, StopFilter preserves position increments.
- As of 2.4, Lucene.Net.Analysis.Token instances incorrectly identified as acronyms are corrected (see LUCENE-1068).
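For example, the following sketch (assuming the Lucene.Net 4.8.0 package and the LUCENE_48 compatibility version) builds a StandardAnalyzer and prints the tokens it produces; the stop word "the" is dropped and the remaining terms are lowercased:

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Minimal sketch: tokenize a string with StandardAnalyzer (LUCENE_48 assumed).
using var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
using (TokenStream stream = analyzer.GetTokenStream("body", "The Quick Brown Fox"))
{
    var termAtt = stream.AddAttribute<ICharTermAttribute>();
    stream.Reset();
    while (stream.IncrementToken())
    {
        Console.WriteLine(termAtt.ToString()); // quick, brown, fox ("the" is a stop word)
    }
    stream.End();
}
```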
Implements
IDisposable
Namespace: Lucene.Net.Analysis.Standard
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public sealed class StandardAnalyzer : StopwordAnalyzerBase, IDisposable
Constructors
StandardAnalyzer(LuceneVersion)
Builds an analyzer with the default stop words (STOP_WORDS_SET).
Declaration
public StandardAnalyzer(LuceneVersion matchVersion)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene compatibility version - See StandardAnalyzer |
StandardAnalyzer(LuceneVersion, CharArraySet)
Builds an analyzer with the given stop words.
Declaration
public StandardAnalyzer(LuceneVersion matchVersion, CharArraySet stopWords)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene compatibility version - See StandardAnalyzer |
CharArraySet | stopWords | stop words |
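As a sketch (the words are illustrative and the LUCENE_48 compatibility version is assumed), a custom stop word set can be built with CharArraySet and passed to this constructor:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Illustrative stop words; the final argument enables case-insensitive matching.
var stopWords = new CharArraySet(LuceneVersion.LUCENE_48, new[] { "lorem", "ipsum" }, true);
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, stopWords);
```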
StandardAnalyzer(LuceneVersion, TextReader)
Builds an analyzer with the stop words from the given reader.
Declaration
public StandardAnalyzer(LuceneVersion matchVersion, TextReader stopwords)
Parameters
Type | Name | Description |
---|---|---|
LuceneVersion | matchVersion | Lucene compatibility version - See StandardAnalyzer |
TextReader | stopwords | TextReader to read stop words from |
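As a sketch (assuming the usual word-list format of one stop word per line, as read by the underlying word list loader), stop words can also come from any TextReader, such as a StringReader or a file:

```csharp
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;

// A StringReader stands in for a stop word file here; one word per line.
using (TextReader stopwords = new StringReader("the\nand\nof"))
{
    var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, stopwords);
    // use the analyzer for indexing or querying...
}
```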
Fields
DEFAULT_MAX_TOKEN_LENGTH
Default maximum allowed token length
Declaration
public const int DEFAULT_MAX_TOKEN_LENGTH = 255
Field Value
Type | Description |
---|---|
int |
STOP_WORDS_SET
An unmodifiable set containing some common English words that are usually not useful for searching.
Declaration
public static readonly CharArraySet STOP_WORDS_SET
Field Value
Type | Description |
---|---|
CharArraySet |
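For illustration, the set can be inspected or passed explicitly to the stop-word constructor (a minimal sketch, assuming the LUCENE_48 compatibility version):

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;

bool isStopWord = StandardAnalyzer.STOP_WORDS_SET.Contains("the"); // true for common English stop words
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48, StandardAnalyzer.STOP_WORDS_SET);
```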
Properties
MaxTokenLength
Gets or sets the maximum allowed token length. If a token is seen that exceeds this length, it is discarded. This setting only takes effect the next time GetTokenStream(string, TextReader) or GetTokenStream(string, string) is called.
Declaration
public int MaxTokenLength { get; set; }
Property Value
Type | Description |
---|---|
int |
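For example (a sketch assuming the LUCENE_48 compatibility version), the limit can be lowered via the property after construction:

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;

var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48)
{
    MaxTokenLength = 100 // tokens longer than 100 characters are discarded
};
```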
Methods
CreateComponents(string, TextReader)
Creates a new Lucene.Net.Analysis.TokenStreamComponents instance for this analyzer.
Declaration
protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
Parameters
Type | Name | Description |
---|---|---|
string | fieldName | the name of the field whose content is passed to the Lucene.Net.Analysis.TokenStreamComponents sink as a reader |
TextReader | reader | the reader passed to the Lucene.Net.Analysis.Tokenizer constructor |
Returns
Type | Description |
---|---|
TokenStreamComponents | the Lucene.Net.Analysis.TokenStreamComponents for this analyzer. |
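StandardAnalyzer is sealed, so this method cannot be overridden further, but the chain it wires up is approximately the one below. This is a simplified sketch using Analyzer.NewAnonymous (it ignores MaxTokenLength handling and other internal details), not the actual implementation:

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;

// Approximate equivalent of StandardAnalyzer's component chain (sketch only).
Analyzer standardLike = Analyzer.NewAnonymous((fieldName, reader) =>
{
    var tokenizer = new StandardTokenizer(LuceneVersion.LUCENE_48, reader);
    TokenStream stream = new StandardFilter(LuceneVersion.LUCENE_48, tokenizer);
    stream = new LowerCaseFilter(LuceneVersion.LUCENE_48, stream);
    stream = new StopFilter(LuceneVersion.LUCENE_48, stream, StandardAnalyzer.STOP_WORDS_SET);
    return new TokenStreamComponents(tokenizer, stream);
});
```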