    Namespace Lucene.Net.Analysis.En

    Analyzer for English.

    Classes

    EnglishAnalyzer

    Lucene.Net.Analysis.Analyzer for English.

    EnglishMinimalStemFilter

    A Lucene.Net.Analysis.TokenFilter that applies EnglishMinimalStemmer to stem English words.

    To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute before this Lucene.Net.Analysis.TokenStream.
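
    As an illustration of protecting terms from the stemmer, the following sketch places a SetKeywordMarkerFilter ahead of EnglishMinimalStemFilter. It assumes the Lucene.Net.Analysis.Common 4.8 package is referenced; the class ProtectedTermsExample and its Analyze method are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.En;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class ProtectedTermsExample
{
    public static List<string> Analyze(string text, params string[] protectedTerms)
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Terms in this set are flagged as keywords, so the stemmer skips them.
        var keywords = new CharArraySet(version, protectedTerms, false);

        Tokenizer source = new StandardTokenizer(version, new StringReader(text));
        TokenStream stream = new LowerCaseFilter(version, source);
        stream = new SetKeywordMarkerFilter(stream, keywords); // must come before the stem filter
        stream = new EnglishMinimalStemFilter(stream);

        var terms = new List<string>();
        using (stream)
        {
            ICharTermAttribute termAtt = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            stream.End();
        }
        return terms;
    }
}
```

    Calling ProtectedTermsExample.Analyze("Books Windows", "windows") should leave the protected term unstemmed while still stemming the other token. The keyword set is matched after lowercasing, so protected terms are listed in lower case here.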

    EnglishMinimalStemFilterFactory

    Factory for EnglishMinimalStemFilter.

    <fieldType name="text_enminstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>

    EnglishMinimalStemmer

    Minimal plural stemmer for English.

    This stemmer implements the "S-Stemmer" from "How Effective Is Suffixing?" by Donna Harman.
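
    The three plural-suffix rules commonly attributed to the S-stemmer can be sketched as follows. This is a minimal, self-contained illustration and not the Lucene.Net implementation; the class name SStemmerSketch is hypothetical, it assumes lowercased input, and its handling of very short words may differ from EnglishMinimalStemmer.

```csharp
using System;

// Hypothetical sketch of the S-stemmer rules; not the library's code.
public static class SStemmerSketch
{
    private static bool Ends(string word, string suffix)
    {
        return word.EndsWith(suffix, StringComparison.Ordinal);
    }

    // Only the first rule whose suffix matches is considered;
    // if that rule's exception applies, the word is returned unchanged.
    public static string Stem(string word)
    {
        if (word.Length < 3 || !Ends(word, "s"))
        {
            return word;
        }

        // Rule 1: "ies" -> "y", unless preceded by 'a' or 'e'.
        if (Ends(word, "ies"))
        {
            if (Ends(word, "aies") || Ends(word, "eies"))
            {
                return word;
            }
            return word.Substring(0, word.Length - 3) + "y";
        }

        // Rule 2: "es" -> "e", unless preceded by 'a', 'e', or 'o'.
        if (Ends(word, "es"))
        {
            if (Ends(word, "aes") || Ends(word, "ees") || Ends(word, "oes"))
            {
                return word;
            }
            return word.Substring(0, word.Length - 1);
        }

        // Rule 3: drop a final "s", unless the word ends in "us" or "ss".
        if (Ends(word, "us") || Ends(word, "ss"))
        {
            return word;
        }
        return word.Substring(0, word.Length - 1);
    }
}
```

    With these rules, SStemmerSketch.Stem("ponies") gives "pony" and SStemmerSketch.Stem("horses") gives "horse", while "glass", "corpus", and "goes" are returned unchanged.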

    EnglishPossessiveFilter

    TokenFilter that removes possessives (trailing 's) from words.

    You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating EnglishPossessiveFilter:

    • As of 3.6, U+2019 RIGHT SINGLE QUOTATION MARK and U+FF07 FULLWIDTH APOSTROPHE are also treated as quotation marks.
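
    A minimal sketch of passing the version when constructing the filter is shown below. It assumes the Lucene.Net.Analysis.Common 4.8 package is referenced and uses LuceneVersion.LUCENE_48 as the compatibility version; the class PossessiveExample and its Analyze method are hypothetical names for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.En;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class PossessiveExample
{
    public static List<string> Analyze(string text)
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        Tokenizer source = new StandardTokenizer(version, new StringReader(text));
        // The version argument selects the compatibility behavior described above.
        TokenStream stream = new EnglishPossessiveFilter(version, source);

        var terms = new List<string>();
        using (stream)
        {
            ICharTermAttribute termAtt = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            stream.End();
        }
        return terms;
    }
}
```

    For example, PossessiveExample.Analyze("Sarah's book") should yield the terms "Sarah" and "book". No lowercasing is applied in this sketch, so the original casing is preserved.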

    EnglishPossessiveFilterFactory

    Factory for EnglishPossessiveFilter.

    <fieldType name="text_enpossessive" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
      </analyzer>
    </fieldType>

    KStemFilter

    A high-performance KStem filter for English.

    See "Viewing Morphology as an Inference Process" (Krovetz, R., Proceedings of the Sixteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 191-203, 1993).

    All terms must already be lowercased for this filter to work correctly.

    Note: This filter is aware of the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute. To prevent certain terms from being passed to the stemmer, IsKeyword should be set to true in a previous Lucene.Net.Analysis.TokenStream.

    Note: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
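
    One way to keep the original term alongside its stem in code is sketched below. It assumes the Lucene.Net.Analysis.Common 4.8 package is referenced and builds the chain with KeywordRepeatFilter and RemoveDuplicatesTokenFilter directly, as an assumed stand-in for the factory named in the note. The class KStemWithOriginalsExample is a hypothetical name for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.En;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class KStemWithOriginalsExample
{
    public static List<string> Analyze(string text)
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        Tokenizer source = new StandardTokenizer(version, new StringReader(text));
        TokenStream stream = new LowerCaseFilter(version, source); // KStemFilter expects lowercased terms
        stream = new KeywordRepeatFilter(stream);          // emits each token twice, one copy marked as keyword
        stream = new KStemFilter(stream);                  // stems only the copy not marked as keyword
        stream = new RemoveDuplicatesTokenFilter(stream);  // drops the second copy if stemming changed nothing

        var terms = new List<string>();
        using (stream)
        {
            ICharTermAttribute termAtt = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            stream.End();
        }
        return terms;
    }
}
```

    The returned list contains every lowercased original term, plus a stemmed form wherever the stemmer produced something different.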

    KStemFilterFactory

    Factory for KStemFilter.

    <fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KStemFilterFactory"/>
      </analyzer>
    </fieldType>

    KStemmer

    This class implements the KStem algorithm.

    PorterStemFilter

    Transforms the token stream as per the Porter stemming algorithm.

    Note: The input to the stemming filter must already be in lower case, so you will need to use LowerCaseFilter or LowerCaseTokenizer earlier in the analysis chain for this to work properly.

    To use this filter with other analyzers, you'll want to write an Analyzer class that sets up the TokenStream chain as you want it. To use this with LowerCaseTokenizer, for example, you'd write an analyzer like this:

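    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Core;
    using Lucene.Net.Analysis.En;
    using Lucene.Net.Util;

    class MyAnalyzer : Analyzer {
      protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader) {
        Tokenizer source = new LowerCaseTokenizer(LuceneVersion.LUCENE_48, reader); // lowercases while tokenizing
        return new TokenStreamComponents(source, new PorterStemFilter(source));
      }
    }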

    Note: This filter is aware of the Lucene.Net.Analysis.TokenAttributes.KeywordAttribute. To prevent certain terms from being passed to the stemmer, IsKeyword should be set to true in a previous Lucene.Net.Analysis.TokenStream.

    Note: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.

    PorterStemFilterFactory

    Factory for PorterStemFilter.

    <fieldType name="text_porterstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>

    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.