Namespace Lucene.Net.Analysis.In
Analysis components for Indian languages.
Classes
IndicNormalizationFilter
A Lucene.Net.Analysis.TokenFilter that applies IndicNormalizer to normalize text in Indian languages.
IndicNormalizationFilterFactory
Factory for IndicNormalizationFilter.
<fieldType name="text_innormal" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.IndicNormalizationFilterFactory"/>
  </analyzer>
</fieldType>
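The same tokenizer-plus-filter chain can also be built directly in code. The following is a minimal sketch, not a definitive implementation. It assumes Lucene.NET 4.8 with the Lucene.Net.Analysis.Common package referenced. The class name IndicNormalizationExample, the field name "body", and the sample text are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.In;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical example class; not part of Lucene.NET.
public static class IndicNormalizationExample
{
    public static void Main()
    {
        // Assumption: the 4.8 compatibility version is the one in use.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Mirrors the Solr field type above: StandardTokenizer followed by
        // IndicNormalizationFilter.
        using (Analyzer analyzer = Analyzer.NewAnonymous(createComponents: (fieldName, reader) =>
        {
            Tokenizer tokenizer = new StandardTokenizer(version, reader);
            TokenStream filter = new IndicNormalizationFilter(tokenizer);
            return new TokenStreamComponents(tokenizer, filter);
        }))
        // "body" is an arbitrary field name; the text is sample Hindi input.
        using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader("हिन्दी भाषा")))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                // Print each token after normalization.
                Console.WriteLine(term.ToString());
            }
            stream.End();
        }
    }
}
```

In this sketch the filter is applied directly after tokenization, matching the order of the tokenizer and filter elements in the Solr configuration.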
IndicNormalizer
Normalizes the Unicode representation of text in Indian languages.
Follows the guidelines in Unicode 5.2, chapter 6, "South Asian Scripts I", and the graphical decompositions from http://ldc.upenn.edu/myl/IndianScriptsUnicode.html
IndicTokenizer
Simple Tokenizer for text in Indian languages.