[Missing <summary> documentation for "N:Lucene.Net.Analysis.Ru"]

Classes

  Class  Description
Public class RussianAnalyzer
Analyzer for the Russian language. Supports an external list of stopwords (words that are not indexed at all). A default set of stopwords is used unless an alternative list is specified.
Public class RussianCharsets
Contains encoding schemes (charsets) and a ToLowerCase() implementation for Russian characters in Unicode, KOI8 and CP1251. Each encoding scheme lists lowercase characters at positions 0-31 and uppercase characters at positions 32-63. Other encoding schemes (such as ISO-8859-5 or a custom one) can be supported by adding a new charset and extending ToLowerCase() to handle it.
Public class RussianLetterTokenizer
A tokenizer that extends LetterTokenizer by additionally looking up letters in a given "russian charset". LetterTokenizer alone is not enough because it relies on the Character.isLetter() method, which cannot detect letters in encodings such as CP1251 and KOI8 (the well-known problems with the 0xD7 and 0xF7 characters).
Public class RussianLowerCaseFilter
Normalizes token text to lowercase using the given ("russian") charset.
Public class RussianStemFilter
A filter that stems Russian words. The implementation was inspired by GermanStemFilter. The input should pass through RussianLowerCaseFilter before it reaches RussianStemFilter, because RussianStemFilter only works with the lowercase part of a "russian" charset.
Public class RussianStemmer
Russian stemming algorithm implementation (see http://snowball.sourceforge.net for a detailed description).
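
Example

The charset layout described for RussianCharsets (lowercase at positions 0-31, uppercase at positions 32-63) and the 0xD7/0xF7 problem that motivates RussianLetterTokenizer can be illustrated with a small standalone sketch. It does not call Lucene.Net: CharsetSketch and all of its members are hypothetical names, and the two tables are assumptions built from the contiguous Unicode ranges U+0430-U+044F / U+0410-U+042F and from the Windows-1251 byte ranges 0xE0-0xFF / 0xC0-0xDF stored as raw char values. The actual tables in RussianCharsets may hold different contents. The generic letter test used here is .NET's char.IsLetter.

```csharp
using System;

// Illustrative sketch only; this is not the Lucene.Net source.
public static class CharsetSketch
{
    // Hypothetical Unicode table: lowercase at 0-31, uppercase at 32-63.
    public static char[] BuildUnicodeTable()
    {
        return BuildTable('\u0430', '\u0410');
    }

    // Hypothetical CP1251 table holding raw byte values as chars:
    // 0xE0-0xFF are the lowercase letters, 0xC0-0xDF the uppercase ones.
    public static char[] BuildCp1251Table()
    {
        return BuildTable('\u00E0', '\u00C0');
    }

    private static char[] BuildTable(char firstLower, char firstUpper)
    {
        var table = new char[64];
        for (int i = 0; i < 32; i++)
        {
            table[i] = (char)(firstLower + i);
            table[i + 32] = (char)(firstUpper + i);
        }
        return table;
    }

    // Table-driven lowercasing: the uppercase letter at position i
    // (32-63) maps to the lowercase letter at position i - 32.
    public static char ToLowerCase(char letter, char[] charset)
    {
        for (int i = 32; i < 64; i++)
        {
            if (charset[i] == letter) return charset[i - 32];
        }
        return letter; // already lowercase, or not part of the charset
    }

    // Table-driven letter test: a character is a letter if the charset lists it.
    public static bool IsLetter(char c, char[] charset)
    {
        return Array.IndexOf(charset, c) >= 0;
    }

    public static void Main()
    {
        char[] unicode = BuildUnicodeTable();
        char[] cp1251 = BuildCp1251Table();

        // U+0427 (uppercase) lowercases to U+0447.
        Console.WriteLine(ToLowerCase('\u0427', unicode) == '\u0447'); // True

        // As Unicode code points, 0xD7 and 0xF7 are the multiplication and
        // division signs, so the generic test rejects them.
        Console.WriteLine(char.IsLetter('\u00D7')); // False
        Console.WriteLine(char.IsLetter('\u00F7')); // False

        // Looked up in the CP1251 table, the same values are letters.
        Console.WriteLine(IsLetter('\u00D7', cp1251));                // True
        Console.WriteLine(ToLowerCase('\u00D7', cp1251) == '\u00F7'); // True
    }
}
```

In Windows-1251 the bytes 0xD7 and 0xF7 encode the uppercase and lowercase forms of the same Russian letter, which is why a lookup in an explicit charset table succeeds where the generic test on the raw values fails.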