Namespace Lucene.Net.Analysis.Cz
Analyzer for Czech.
Classes
CzechAnalyzer
Lucene.
Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.
You must specify the required Lucene.
- As of 3.1, words are stemmed with CzechStemFilter
- As of 2.9, StopFilter preserves position increments
- As of 2.4, Tokens incorrectly identified as acronyms are corrected (see LUCENE-1068)
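To illustrate what "preserves position increments" means when stopwords are dropped, here is a minimal toy model in Python. It is not the Lucene.NET API: the function name `remove_stopwords` and the three-word stopword set are hypothetical and chosen only for the example, and the set is not the analyzer's default list.

```python
from typing import Iterable, List, Set, Tuple


def remove_stopwords(tokens: Iterable[str], stopwords: Set[str]) -> List[Tuple[str, int]]:
    """Drop stopwords and return (term, position_increment) pairs.

    Each removed token adds 1 to the increment of the next kept token,
    so the gap left by the stopword remains visible to phrase queries.
    """
    result: List[Tuple[str, int]] = []
    skipped = 0  # number of stopwords dropped since the last kept token
    for token in tokens:
        if token in stopwords:
            skipped += 1
            continue
        result.append((token, 1 + skipped))
        skipped = 0
    return result


# Hypothetical stopword set, for illustration only.
STOPWORDS = {"a", "v", "na"}

pairs = remove_stopwords(["pes", "a", "kočka", "na", "stole"], STOPWORDS)
print(pairs)  # [('pes', 1), ('kočka', 2), ('stole', 2)]
```

An increment of 2 on a kept term records that one token was removed directly before it, which is why an exact phrase query spanning the removed word can still line up with the indexed positions.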
CzechStemFilter
A Lucene.
To prevent terms from being stemmed use an instance of
Set
NOTE: Input is expected to be in lowercase, but with diacritical marks preserved.
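The two points above (protected terms are skipped by the stemmer, and input arrives lowercased with diacritics intact) can be sketched with a toy model in Python. Everything here is an assumption made for illustration: `stem_unless_protected` and `toy_stem` are hypothetical names, and `toy_stem` is a placeholder that strips one trailing vowel. It does not implement the rules used by CzechStemmer.

```python
from typing import Callable, Iterable, List, Set


def toy_stem(term: str) -> str:
    """Placeholder stemmer: strip one trailing vowel from terms longer than 3 characters.

    This is NOT the Czech light stemming algorithm, only a stand-in.
    """
    if len(term) > 3 and term[-1] in "aeiouy":
        return term[:-1]
    return term


def stem_unless_protected(
    terms: Iterable[str],
    protected: Set[str],
    stem: Callable[[str], str],
) -> List[str]:
    """Apply `stem` to every term except those marked as protected."""
    return [term if term in protected else stem(term) for term in terms]


# Lowercasing happens before stemming; diacritical marks are kept as they are.
terms = [t.lower() for t in ["Praha", "Hrady", "Růže"]]

stems = stem_unless_protected(terms, protected={"praha"}, stem=toy_stem)
print(stems)  # ['praha', 'hrad', 'růž']
```

In this sketch "praha" passes through unchanged because it is in the protected set, while the other terms are stemmed with their diacritics untouched.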
CzechStemFilterFactory
Factory for Czech
<fieldType name="text_czstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CzechStemFilterFactory"/>
  </analyzer>
</fieldType>
CzechStemmer
Light Stemmer for Czech.
Implements the algorithm described in: Indexing and stemming approaches for the Czech language
http://portal.acm.org/citation.cfm?id=1598600