
    Namespace Lucene.Net.Analysis.In

    Analysis components for Indian languages.

    Classes

    IndicNormalizationFilter

    A Lucene.Net.Analysis.TokenFilter that applies IndicNormalizer to each token to normalize text in Indian languages.
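    As a minimal sketch of how the filter might be wired into an analysis chain from C#, the analyzer below tokenizes with StandardTokenizer and then wraps the result in IndicNormalizationFilter. The class name IndicNormalizingAnalyzer is hypothetical, and the choice of LuceneVersion.LUCENE_48 and of StandardTokenizer as the tokenizer are assumptions made for illustration; adapt both to your own setup.

    ```csharp
    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.In;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Util;

    // Hypothetical analyzer for illustration; not part of Lucene.Net itself.
    public sealed class IndicNormalizingAnalyzer : Analyzer
    {
        // Assumption: the 4.8 compatibility version is appropriate for the index.
        private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;

        protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
        {
            // Split the input into tokens first.
            Tokenizer source = new StandardTokenizer(MatchVersion, reader);

            // Then normalize the Unicode representation of each token.
            TokenStream result = new IndicNormalizationFilter(source);

            return new TokenStreamComponents(source, result);
        }
    }
    ```

    This mirrors the tokenizer-then-filter order used in the Solr configuration for IndicNormalizationFilterFactory.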

    IndicNormalizationFilterFactory

    Factory for IndicNormalizationFilter. Example Solr field type configuration:

    <fieldType name="text_innormal" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.IndicNormalizationFilterFactory"/>
      </analyzer>
    </fieldType>

    IndicNormalizer

    Normalizes the Unicode representation of text in Indian languages.

    Follows the guidelines in Unicode 5.2, chapter 6, "South Asian Scripts I", and the graphical decompositions from http://ldc.upenn.edu/myl/IndianScriptsUnicode.html.
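    The normalizer can also be called directly on a character buffer, outside of a token stream. The sketch below assumes the Normalize(char[], int) method, which rewrites the buffer in place and returns the new length; the wrapper class, method name, and sample word are illustrative only.

    ```csharp
    using Lucene.Net.Analysis.In;

    // Hypothetical helper for illustration; not part of Lucene.Net itself.
    public static class IndicNormalizerExample
    {
        public static string NormalizeText(string input)
        {
            var normalizer = new IndicNormalizer();

            // Normalization happens in place, so work on a mutable copy.
            char[] buffer = input.ToCharArray();

            // The returned length is never greater than the length passed in.
            int newLength = normalizer.Normalize(buffer, buffer.Length);

            // Only the first newLength characters are valid after the call.
            return new string(buffer, 0, newLength);
        }
    }
    ```

    For example, NormalizeText("हिन्दी") returns the normalized form of the word. Inside an analysis chain, IndicNormalizationFilter performs this step for each token.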

    IndicTokenizer

    Simple tokenizer for text in Indian languages.

    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.