Namespace Lucene.Net.Analysis.Lv

Analyzer for Latvian.

Classes

LatvianAnalyzer

Analyzer for Latvian.

LatvianStemFilter

A TokenFilter that applies LatvianStemmer to stem Latvian words.

To prevent terms from being stemmed, use an instance of SetKeywordMarkerFilter, or a custom TokenFilter that sets the KeywordAttribute, before this TokenStream.

LatvianStemFilterFactory

Factory for LatvianStemFilter.

<fieldType name="text_lvstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LatvianStemFilterFactory"/>
  </analyzer>
</fieldType>
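
Terms can be protected from stemming in a factory-based configuration by placing a keyword marker ahead of the stem filter. The fragment below is a minimal sketch, not a recommended setup: it assumes KeywordMarkerFilterFactory is available alongside the other factories, and the field type name text_lvstem_protected and the file name protwords.txt are hypothetical placeholders.

```xml
<fieldType name="text_lvstem_protected" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: protwords.txt is a hypothetical file listing one protected term per line -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- Terms marked as keywords above pass through the stemmer unchanged -->
    <filter class="solr.LatvianStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The marker is placed after the lowercase filter so that protected terms are matched in the same lowercased form the stemmer would otherwise see.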

LatvianStemmer

Light stemmer for Latvian.

This is a light version of the algorithm in Karlis Kreslin's PhD thesis "A stemming algorithm for Latvian", with the following modifications:

Only explicitly stems noun and adjective morphology.
Applies stricter length/vowel checks to the resulting stems (suffix stripping for verbs, etc. is removed).
Removes only the primary inflectional suffixes: case and number for nouns; case, number, gender, and definiteness for adjectives.
Handles palatalization only when a declension II, V, or VI noun suffix is removed.