Lucene.Net  3.0.3
Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users.
Lucene.Net.Analysis.De.GermanStemmer Class Reference

A stemmer for German words. The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns@isst.fhg.de).

Inherited by Lucene.Net.Analysis.De.GermanDIN2Stemmer.

Protected Member Functions

virtual void Substitute (StringBuilder buffer)
 Applies substitutions to the term to reduce overstemming.
 

Protected Attributes

int substCount = 0
 Number of characters removed by Substitute() while stemming.
 

Detailed Description

A stemmer for German words. The algorithm is based on the report "A Fast and Simple Stemming Algorithm for German Words" by Jörg Caumanns (joerg.caumanns@isst.fhg.de).

Definition at line 34 of file GermanStemmer.cs.

Member Function Documentation

virtual void Lucene.Net.Analysis.De.GermanStemmer.Substitute ( StringBuilder  buffer)
protected virtual

Applies the following substitutions to the term to reduce overstemming:

  • Substitute umlauts with their corresponding vowel: äöü -> aou; "ß" is substituted by "ss"
  • Substitute the second character of a pair of equal characters with an asterisk: ?? -> ?*
  • Substitute some common character combinations with a token: sch/ch/ei/ie/ig/st -> $/§/%/&/#/!

Reimplemented in Lucene.Net.Analysis.De.GermanDIN2Stemmer.

Definition at line 184 of file GermanStemmer.cs.
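
The substitution rules documented for Substitute() above can be illustrated with a small standalone sketch. This is not the code in GermanStemmer.cs: the class SubstitutionSketch and its Apply method are hypothetical names introduced here, the input is assumed to be lowercased already, and the order in which the rules are checked is an assumption. Unlike the documented method, the sketch returns a new string rather than modifying a StringBuilder in place and does not maintain substCount.

```csharp
using System;
using System.Text;

// Hypothetical illustration of the documented substitution rules.
// Not the library's implementation; names and rule order are assumptions.
public static class SubstitutionSketch
{
    // Character combinations and the tokens that mask them, in the documented order.
    private static readonly string[] Combos = { "sch", "ch", "ei", "ie", "ig", "st" };
    private static readonly char[] Tokens = { '$', '§', '%', '&', '#', '!' };

    public static string Apply(string term)
    {
        var result = new StringBuilder(term.Length);
        int i = 0;
        while (i < term.Length)
        {
            char c = term[i];

            // Rule: the second character of a pair of equal characters becomes '*'.
            if (result.Length > 0 && result[result.Length - 1] == c)
            {
                result.Append('*');
                i++;
                continue;
            }

            // Rule: common character combinations are replaced by a single token.
            int match = FindCombo(term, i);
            if (match >= 0)
            {
                result.Append(Tokens[match]);
                i += Combos[match].Length;
                continue;
            }

            // Rule: umlauts become their base vowel, and 'ß' becomes "ss".
            switch (c)
            {
                case 'ä': result.Append('a'); break;
                case 'ö': result.Append('o'); break;
                case 'ü': result.Append('u'); break;
                case 'ß': result.Append("ss"); break;
                default: result.Append(c); break;
            }
            i++;
        }
        return result.ToString();
    }

    // Returns the index of the combination that starts at position i, or -1 if none does.
    private static int FindCombo(string term, int i)
    {
        for (int k = 0; k < Combos.Length; k++)
        {
            if (string.CompareOrdinal(term, i, Combos[k], 0, Combos[k].Length) == 0)
            {
                return k;
            }
        }
        return -1;
    }
}
```

Under these assumptions, SubstitutionSketch.Apply("schiff") yields "$if*" and SubstitutionSketch.Apply("müller") yields "mul*er". Masking a multi-character combination with one token shortens the term, which is the kind of length change that substCount is documented to record.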

Member Data Documentation

int Lucene.Net.Analysis.De.GermanStemmer.substCount = 0
protected

Number of characters removed by Substitute() while stemming.

Definition at line 44 of file GermanStemmer.cs.


The documentation for this class was generated from the following file: