    Namespace Lucene.Net.Analysis.De

    Analyzer for German.

    Classes

    GermanAnalyzer

    Lucene.Net.Analysis.Analyzer for the German language.

    Supports an external list of stopwords (words that will not be indexed at all) and an external list of exclusions (words that will be indexed but not stemmed). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.

    You must specify the required Lucene.Net.Util.LuceneVersion compatibility when creating GermanAnalyzer.

    NOTE: This class uses the same Lucene.Net.Util.LuceneVersion dependent settings as StandardAnalyzer.

    GermanLightStemFilter

    A Lucene.Net.Analysis.TokenFilter that applies GermanLightStemmer to stem German words.

    To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the KeywordAttribute before this Lucene.Net.Analysis.TokenStream in the analysis chain.

    GermanLightStemFilterFactory

    Factory for GermanLightStemFilter.

    <fieldType name="text_delgtstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.GermanLightStemFilterFactory"/>
      </analyzer>
    </fieldType>
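
    As a hedged sketch of the keyword-protection advice given for GermanLightStemFilter, the variant below adds solr.KeywordMarkerFilterFactory ahead of the stemmer. The field type name text_delgtstem_protected and the file name protwords.txt are placeholders assumed for illustration, not part of this API; the file would list one protected term per line.

```xml
<fieldType name="text_delgtstem_protected" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Terms listed in protwords.txt (assumed file name) are flagged as keywords -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- The stemmer leaves keyword-flagged terms unchanged -->
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

    Because the marker runs after lowercasing in this arrangement, entries in the protected list would need to be lowercase, or the factory's ignoreCase attribute would need to be set to true.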

    GermanLightStemmer

    Light Stemmer for German.

    This stemmer implements the "UniNE" algorithm described in "Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages" by Jacques Savoy.

    GermanMinimalStemFilter

    A Lucene.Net.Analysis.TokenFilter that applies GermanMinimalStemmer to stem German words.

    To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the KeywordAttribute before this Lucene.Net.Analysis.TokenStream in the analysis chain.

    GermanMinimalStemFilterFactory

    Factory for GermanMinimalStemFilter.

    <fieldType name="text_deminstem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.GermanMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>

    GermanMinimalStemmer

    Minimal Stemmer for German.

    This stemmer implements the algorithm described in "Morphologie et recherche d'information" by Jacques Savoy.

    GermanNormalizationFilter

    Normalizes German characters according to the heuristics of the German2 snowball algorithm (http://snowball.tartarus.org/algorithms/german2/stemmer.html). It allows for the fact that ä, ö and ü are sometimes written as ae, oe and ue.

    This is useful if you want this normalization without using the German2 stemmer, or perhaps no stemming at all.

    GermanNormalizationFilterFactory

    Factory for GermanNormalizationFilter.

    <fieldType name="text_denorm" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
      </analyzer>
    </fieldType>
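
    The point about using this normalization without the German2 stemmer can be illustrated by pairing it with a different stemmer. The sketch below, under a hypothetical field type name text_denorm_lgtstem, runs GermanNormalizationFilterFactory before GermanLightStemFilterFactory. Treat it as one plausible arrangement, not a recommended configuration.

```xml
<fieldType name="text_denorm_lgtstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Normalize umlaut spellings first, so variants reach the stemmer in the same form -->
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <!-- Light stemming in place of the German2 snowball stemmer -->
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```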

    GermanStemFilter

    A Lucene.Net.Analysis.TokenFilter that stems German words.

    It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a GermanStemmer).

    To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter or a custom Lucene.Net.Analysis.TokenFilter that sets the KeywordAttribute before this Lucene.Net.Analysis.TokenStream in the analysis chain.

    GermanStemFilterFactory

    Factory for GermanStemFilter.

    <fieldType name="text_destem" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.GermanStemFilterFactory"/>
      </analyzer>
    </fieldType>

    GermanStemmer

    A stemmer for German words.

    The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns at isst.fhg.de).

    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.