Namespace Lucene.Net.Analysis.Hi
Classes
HindiAnalyzer
Analyzer for Hindi.
You must specify the required LuceneVersion compatibility when creating HindiAnalyzer:
- As of 3.6, StandardTokenizer is used for tokenization.
HindiNormalizationFilter
A TokenFilter that applies HindiNormalizer to normalize the orthography.
In some cases the normalization may cause unrelated terms to conflate. To prevent specific terms from being normalized, place an instance of SetKeywordMarkerFilter, or a custom TokenFilter that sets the KeywordAttribute, before this TokenStream.
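The keyword-protection idea can be illustrated with a small standalone model. This is only a conceptual sketch in Python, not the Lucene.Net API: the `Token` class, `mark_keywords`, `normalization_filter`, and the single-rule `toy_normalize` are hypothetical stand-ins for KeywordAttribute, SetKeywordMarkerFilter, HindiNormalizationFilter, and HindiNormalizer respectively.

```python
from dataclasses import dataclass

CHANDRABINDU = "\u0901"
ANUSVARA = "\u0902"


@dataclass
class Token:
    term: str
    is_keyword: bool = False  # hypothetical stand-in for KeywordAttribute


def toy_normalize(term: str) -> str:
    # One illustrative rule only: chandrabindu -> anusvara.
    return term.replace(CHANDRABINDU, ANUSVARA)


def mark_keywords(tokens, protected):
    # Plays the role of a keyword-marking filter placed earlier in the chain.
    for tok in tokens:
        yield Token(tok.term, tok.is_keyword or tok.term in protected)


def normalization_filter(tokens):
    # Tokens flagged as keywords pass through unchanged.
    for tok in tokens:
        if tok.is_keyword:
            yield tok
        else:
            yield Token(toy_normalize(tok.term), tok.is_keyword)


words = ["\u092e\u093e\u0901", "\u0939\u093e\u0901"]  # two terms ending in chandrabindu
stream = (Token(w) for w in words)
result = list(normalization_filter(mark_keywords(stream, {words[0]})))
```

Here the first term is in the protected set, so it keeps its chandrabindu, while the second is normalized. The ordering matters: the marker has to run before the normalization step, which is why the keyword filter must precede this TokenStream.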
HindiNormalizationFilterFactory
Factory for HindiNormalizationFilter.
<fieldType name="text_hinormal" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.HindiNormalizationFilterFactory"/>
  </analyzer>
</fieldType>
HindiNormalizer
Normalizer for Hindi.
Normalizes text to remove some spelling variations.
Implements the Hindi-language specific algorithm specified in:
Word normalization in Indian languages
Prasad Pingali and Vasudeva Varma.
http://web2py.iiit.ac.in/publications/default/download/inproceedings.pdf.3fe5b38c-02ee-41ce-9a8f-3e745670be32.pdf
with the following additions from Hindi CLIR in Thirty Days (Leah S. Larkey, Margaret E. Connell, and Nasreen AbdulJaleel, http://maroo.cs.umass.edu/pub/web/getpdf.php?id=454):
- Internal zero-width joiners and zero-width non-joiners are removed.
- In addition to chandrabindu, NA+halant is normalized to anusvara.
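The two additions listed above can be sketched in a few lines. This is a minimal standalone illustration in Python, not the HindiNormalizer implementation: the function name `normalize_additions` is hypothetical, it covers only these rules and none of the other normalizations from the cited papers, and it removes every zero-width joiner or non-joiner it encounters rather than checking that the character is internal.

```python
ZWJ = "\u200d"           # zero-width joiner
ZWNJ = "\u200c"          # zero-width non-joiner
CHANDRABINDU = "\u0901"
ANUSVARA = "\u0902"
NA = "\u0928"
HALANT = "\u094d"


def normalize_additions(term: str) -> str:
    out = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch in (ZWJ, ZWNJ):
            # Drop zero-width (non-)joiners.
            i += 1
            continue
        if ch == CHANDRABINDU:
            out.append(ANUSVARA)
        elif ch == NA and i + 1 < len(term) and term[i + 1] == HALANT:
            # NA + halant collapses to a single anusvara.
            out.append(ANUSVARA)
            i += 1  # skip the halant as well
        else:
            out.append(ch)
        i += 1
    return "".join(out)


# U+0939 U+093F U+0928 U+094D U+0926 U+0940 -> U+0939 U+093F U+0902 U+0926 U+0940
normalized = normalize_additions("\u0939\u093f\u0928\u094d\u0926\u0940")
```

After this step the two common spellings of a word with a nasal conjunct map to the same term, which is the kind of spelling variation the normalizer is meant to remove.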
HindiStemFilter
A TokenFilter that applies HindiStemmer to stem Hindi words.
HindiStemFilterFactory
Factory for HindiStemFilter.
<fieldType name="text_histem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.HindiStemFilterFactory"/>
  </analyzer>
</fieldType>
HindiStemmer
Light Stemmer for Hindi.
Implements the algorithm specified in:
A Lightweight Stemmer for Hindi
Ananthakrishnan Ramanathan and Durgesh D Rao.
http://computing.open.ac.uk/Sites/EACLSouthAsia/Papers/p6-Ramanathan.pdf
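A light stemmer of this kind generally works by stripping the longest matching suffix from a fixed list. The sketch below illustrates that idea in standalone Python. It is not the HindiStemmer implementation: the suffix list is a tiny subset chosen for the example, and the `MIN_STEM_LEN` threshold and the name `light_stem` are assumptions made for illustration.

```python
# Illustrative subset only; the algorithm in the cited paper uses a much longer list.
SUFFIXES = [
    "\u093f\u092f\u094b\u0902",  # U+093F U+092F U+094B U+0902
    "\u094b\u0902",              # U+094B U+0902
    "\u0947\u0902",              # U+0947 U+0902
    "\u0940",                    # U+0940
]

# Assumed guard so that very short words are left alone.
MIN_STEM_LEN = 2


def light_stem(word: str) -> str:
    # Try longer suffixes first so the longest match wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


# Two inflected forms of the same noun reduce to the same stem.
stem_a = light_stem("\u0915\u093f\u0924\u093e\u092c\u094b\u0902")
stem_b = light_stem("\u0915\u093f\u0924\u093e\u092c\u0947\u0902")
```

Lengths here are counted in Unicode code points, so a vowel sign or anusvara counts as one character. No dictionary lookup is involved, which is what keeps the approach lightweight.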