    Class ArabicNormalizer

    Normalizer for Arabic.

    Normalization is done in place for efficiency, operating directly on a term buffer.

    Normalization is defined as:

    • Normalization of hamza with alef seat to a bare alef.
    • Normalization of teh marbuta to heh.
    • Normalization of dotless yeh (alef maksura) to yeh.
    • Removal of Arabic diacritics (the harakat).
    • Removal of tatweel (stretching character).
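
The rules listed above can be sketched as a single pass over a character buffer. This is a minimal illustration, not the Lucene.Net source: the class name `ArabicNormalizationSketch` and its private constants are hypothetical, the code points correspond to the fields declared on this class, and folding ALEF_MADDA to a bare alef is an assumption based on that field being declared alongside the other alef forms.

```csharp
using System;

// Illustrative sketch only; names here are hypothetical, not library members.
public static class ArabicNormalizationSketch
{
    // Code points of the characters declared as fields on ArabicNormalizer.
    private const char AlefMadda = '\u0622';      // ALEF_MADDA
    private const char AlefHamzaAbove = '\u0623'; // ALEF_HAMZA_ABOVE
    private const char AlefHamzaBelow = '\u0625'; // ALEF_HAMZA_BELOW
    private const char Alef = '\u0627';           // ALEF
    private const char TehMarbuta = '\u0629';     // TEH_MARBUTA
    private const char Tatweel = '\u0640';        // TATWEEL
    private const char Heh = '\u0647';            // HEH
    private const char DotlessYeh = '\u0649';     // DOTLESS_YEH
    private const char Yeh = '\u064A';            // YEH
    private const char FirstHaraka = '\u064B';    // FATHATAN
    private const char LastHaraka = '\u0652';     // SUKUN

    // Normalizes s[0..len) in place and returns the new logical length.
    public static int Normalize(char[] s, int len)
    {
        int write = 0;
        for (int read = 0; read < len; read++)
        {
            char c = s[read];

            // Drop tatweel and the eight harakat (U+064B through U+0652).
            if (c == Tatweel || (c >= FirstHaraka && c <= LastHaraka))
            {
                continue;
            }

            switch (c)
            {
                case AlefMadda:      // assumption: folded like the hamza forms
                case AlefHamzaAbove:
                case AlefHamzaBelow:
                    c = Alef;
                    break;
                case TehMarbuta:
                    c = Heh;
                    break;
                case DotlessYeh:
                    c = Yeh;
                    break;
            }

            // Compact kept characters toward the front of the buffer.
            s[write++] = c;
        }
        return write;
    }
}
```

For example, given the five characters U+0623 U+064E U+0640 U+0629 U+0649, this sketch leaves U+0627 U+0647 U+064A in the first three slots and returns 3. Because characters are only replaced or dropped, the result never grows, which is why an in-place pass over the buffer is sufficient.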

    Inheritance
    System.Object
    ArabicNormalizer
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Analysis.Ar
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public class ArabicNormalizer

    Fields


    ALEF

    Declaration
    public const char ALEF = 'ا'
    Field Value
    Type Description
    System.Char

    ALEF_HAMZA_ABOVE

    Declaration
    public const char ALEF_HAMZA_ABOVE = 'أ'
    Field Value
    Type Description
    System.Char

    ALEF_HAMZA_BELOW

    Declaration
    public const char ALEF_HAMZA_BELOW = 'إ'
    Field Value
    Type Description
    System.Char

    ALEF_MADDA

    Declaration
    public const char ALEF_MADDA = 'آ'
    Field Value
    Type Description
    System.Char

    DAMMA

    Declaration
    public const char DAMMA = 'ُ'
    Field Value
    Type Description
    System.Char

    DAMMATAN

    Declaration
    public const char DAMMATAN = 'ٌ'
    Field Value
    Type Description
    System.Char

    DOTLESS_YEH

    Declaration
    public const char DOTLESS_YEH = 'ى'
    Field Value
    Type Description
    System.Char

    FATHA

    Declaration
    public const char FATHA = 'َ'
    Field Value
    Type Description
    System.Char

    FATHATAN

    Declaration
    public const char FATHATAN = 'ً'
    Field Value
    Type Description
    System.Char

    HEH

    Declaration
    public const char HEH = 'ه'
    Field Value
    Type Description
    System.Char

    KASRA

    Declaration
    public const char KASRA = 'ِ'
    Field Value
    Type Description
    System.Char

    KASRATAN

    Declaration
    public const char KASRATAN = 'ٍ'
    Field Value
    Type Description
    System.Char

    SHADDA

    Declaration
    public const char SHADDA = 'ّ'
    Field Value
    Type Description
    System.Char

    SUKUN

    Declaration
    public const char SUKUN = 'ْ'
    Field Value
    Type Description
    System.Char

    TATWEEL

    Declaration
    public const char TATWEEL = 'ـ'
    Field Value
    Type Description
    System.Char

    TEH_MARBUTA

    Declaration
    public const char TEH_MARBUTA = 'ة'
    Field Value
    Type Description
    System.Char

    YEH

    Declaration
    public const char YEH = 'ي'
    Field Value
    Type Description
    System.Char

    Methods


    Normalize(Char[], Int32)

    Normalizes an input buffer of Arabic text in place.

    Declaration
    public virtual int Normalize(char[] s, int len)
    Parameters
    Type Name Description
    System.Char[] s Input buffer.
    System.Int32 len Length of the input buffer.

    Returns
    Type Description
    System.Int32 Length of the input buffer after normalization.
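
A short usage sketch may help show how the returned length is meant to be used. It assumes the Lucene.Net.Analysis.Common assembly is referenced and that the class is created with its default constructor; the wrapper class `ArabicNormalizerUsage` and its `NormalizeText` helper are hypothetical names introduced only for illustration.

```csharp
using System;
using Lucene.Net.Analysis.Ar;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class ArabicNormalizerUsage
{
    public static string NormalizeText(string input)
    {
        var normalizer = new ArabicNormalizer();

        // Normalize works on a mutable buffer, so copy the string first.
        char[] buffer = input.ToCharArray();

        // The buffer is rewritten in place; the return value is the new length.
        int newLength = normalizer.Normalize(buffer, buffer.Length);

        // Only the first newLength characters are meaningful after the call.
        return new string(buffer, 0, newLength);
    }
}
```

Since removing diacritics and tatweel shortens the text, any characters in the buffer beyond the returned length should be treated as stale and ignored.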

    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.