Class ArabicNormalizer
Normalizer for Arabic.
Normalization is done in place for efficiency, operating on a term buffer.
Normalization is defined as:
- Normalization of hamza with alef seat to a bare alef.
- Normalization of teh marbuta to heh.
- Normalization of dotless yeh (alef maksura) to yeh.
- Removal of Arabic diacritics (the harakat).
- Removal of tatweel (the stretching character).
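The rules above can be sketched as a single pass over a character buffer. This is a minimal illustration, not the Lucene.Net source: `SketchNormalizer` is a hypothetical stand-in class, and it assumes that alef with madda is folded to bare alef along with the two hamza forms (the `ALEF_MADDA` field suggests so). A write index trails the read index, so removed characters are simply overwritten and no second buffer is needed.

```csharp
using System;

// Hypothetical stand-in for illustration only; not the library's implementation.
public static class SketchNormalizer
{
    // Rewrites the first `len` chars of `s` in place and returns the new length.
    public static int Normalize(char[] s, int len)
    {
        int write = 0; // next slot for a character that is kept
        for (int read = 0; read < len; read++)
        {
            char c = s[read];
            switch (c)
            {
                case '\u0622': // alef with madda above (assumed to fold)
                case '\u0623': // alef with hamza above
                case '\u0625': // alef with hamza below
                    s[write++] = '\u0627'; // bare alef
                    break;
                case '\u0649': // dotless yeh (alef maksura)
                    s[write++] = '\u064A'; // yeh
                    break;
                case '\u0629': // teh marbuta
                    s[write++] = '\u0647'; // heh
                    break;
                case '\u0640': // tatweel
                case '\u064B': // fathatan
                case '\u064C': // dammatan
                case '\u064D': // kasratan
                case '\u064E': // fatha
                case '\u064F': // damma
                case '\u0650': // kasra
                case '\u0651': // shadda
                case '\u0652': // sukun
                    break; // removed: the write index does not advance
                default:
                    s[write++] = c; // everything else is copied unchanged
                    break;
            }
        }
        return write;
    }
}
```

Because removals shorten the text, the buffer may still hold stale characters past the returned length; only the first `write` characters are meaningful.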
Inheritance
System.Object
ArabicNormalizer
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public class ArabicNormalizer : object
Fields
ALEF
Declaration
public const char ALEF = '\u0627'
Field Value
Type | Description
System.Char | ARABIC LETTER ALEF (U+0627)
ALEF_HAMZA_ABOVE
Declaration
public const char ALEF_HAMZA_ABOVE = '\u0623'
Field Value
Type | Description
System.Char | ARABIC LETTER ALEF WITH HAMZA ABOVE (U+0623)
ALEF_HAMZA_BELOW
Declaration
public const char ALEF_HAMZA_BELOW = '\u0625'
Field Value
Type | Description
System.Char | ARABIC LETTER ALEF WITH HAMZA BELOW (U+0625)
ALEF_MADDA
Declaration
public const char ALEF_MADDA = '\u0622'
Field Value
Type | Description
System.Char | ARABIC LETTER ALEF WITH MADDA ABOVE (U+0622)
DAMMA
Declaration
public const char DAMMA = '\u064F'
Field Value
Type | Description
System.Char | ARABIC DAMMA (U+064F)
DAMMATAN
Declaration
public const char DAMMATAN = '\u064C'
Field Value
Type | Description
System.Char | ARABIC DAMMATAN (U+064C)
DOTLESS_YEH
Declaration
public const char DOTLESS_YEH = '\u0649'
Field Value
Type | Description
System.Char | ARABIC LETTER ALEF MAKSURA (U+0649)
FATHA
Declaration
public const char FATHA = '\u064E'
Field Value
Type | Description
System.Char | ARABIC FATHA (U+064E)
FATHATAN
Declaration
public const char FATHATAN = '\u064B'
Field Value
Type | Description
System.Char | ARABIC FATHATAN (U+064B)
HEH
Declaration
public const char HEH = '\u0647'
Field Value
Type | Description
System.Char | ARABIC LETTER HEH (U+0647)
KASRA
Declaration
public const char KASRA = '\u0650'
Field Value
Type | Description
System.Char | ARABIC KASRA (U+0650)
KASRATAN
Declaration
public const char KASRATAN = '\u064D'
Field Value
Type | Description
System.Char | ARABIC KASRATAN (U+064D)
SHADDA
Declaration
public const char SHADDA = '\u0651'
Field Value
Type | Description
System.Char | ARABIC SHADDA (U+0651)
SUKUN
Declaration
public const char SUKUN = '\u0652'
Field Value
Type | Description
System.Char | ARABIC SUKUN (U+0652)
TATWEEL
Declaration
public const char TATWEEL = '\u0640'
Field Value
Type | Description
System.Char | ARABIC TATWEEL (U+0640)
TEH_MARBUTA
Declaration
public const char TEH_MARBUTA = '\u0629'
Field Value
Type | Description
System.Char | ARABIC LETTER TEH MARBUTA (U+0629)
YEH
Declaration
public const char YEH = '\u064A'
Field Value
Type | Description
System.Char | ARABIC LETTER YEH (U+064A)
Methods
Normalize(Char[], Int32)
Normalizes an input buffer of Arabic text in place.
Declaration
public virtual int Normalize(char[] s, int len)
Parameters
Type | Name | Description
System.Char[] | s | input buffer
System.Int32 | len | length of input buffer
Returns
Type | Description
System.Int32 | length of input buffer after normalization
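Since the method rewrites the buffer and reports a new length, callers need to read only the first returned-length characters afterwards. The sketch below shows that calling pattern with a hypothetical helper, `NormalizeCallDemo.Apply`, which accepts any function with the same `(char[] s, int len) -> int` shape. It is an illustration of the contract described above, not part of the library.

```csharp
using System;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class NormalizeCallDemo
{
    // Runs a buffer normalizer over a copy of `text` and returns the
    // normalized prefix as a new string.
    public static string Apply(Func<char[], int, int> normalize, string text)
    {
        char[] buffer = text.ToCharArray();            // scratch copy; the input string is untouched
        int newLen = normalize(buffer, buffer.Length); // buffer is rewritten in place
        // Characters at index >= newLen are stale leftovers and must be ignored.
        return new string(buffer, 0, newLen);
    }
}
```

Assuming the Lucene.Net.Analysis.Common assembly is referenced and the class lives in the `Lucene.Net.Analysis.Ar` namespace, a call would look like `NormalizeCallDemo.Apply(new ArabicNormalizer().Normalize, text)`.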