Class ArabicNormalizer
Normalizer for Arabic.
Normalization is done in place for efficiency, operating on a term buffer. Normalization is defined as:
 - Normalization of hamza with alef seat to a bare alef.
 - Normalization of teh marbuta to heh.
 - Normalization of dotless yeh (alef maksura) to yeh.
 - Removal of Arabic diacritics (the harakat).
 - Removal of tatweel (stretching character).
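The rules above can be sketched as a single pass that rewrites or drops characters in the same buffer. This is an illustrative stand-in, not the Lucene.NET source: the class name `ArabicNormalizerSketch` is hypothetical, and treating alef with madda, hamza above, and hamza below as the alef variants is an assumption based on the fields this class declares.

```csharp
using System;

// Hypothetical illustration of the documented rules; not the library's implementation.
public static class ArabicNormalizerSketch
{
    // Normalizes the first len chars of s in place and returns the new length.
    public static int Normalize(char[] s, int len)
    {
        int write = 0; // next position for a kept character
        for (int read = 0; read < len; read++)
        {
            char c = s[read];
            switch (c)
            {
                case '\u0622': // alef with madda above (assumed alef variant)
                case '\u0623': // alef with hamza above
                case '\u0625': // alef with hamza below
                    s[write++] = '\u0627'; // bare alef
                    break;
                case '\u0629': // teh marbuta
                    s[write++] = '\u0647'; // heh
                    break;
                case '\u0649': // dotless yeh (alef maksura)
                    s[write++] = '\u064A'; // yeh
                    break;
                case '\u0640': // tatweel
                case '\u064B': // fathatan
                case '\u064C': // dammatan
                case '\u064D': // kasratan
                case '\u064E': // fatha
                case '\u064F': // damma
                case '\u0650': // kasra
                case '\u0651': // shadda
                case '\u0652': // sukun
                    break; // dropped: nothing is written
                default:
                    s[write++] = c; // everything else is kept unchanged
                    break;
            }
        }
        return write;
    }

    public static void Main()
    {
        // alef with hamza above, hah, meem, fatha, dal
        char[] buffer = "\u0623\u062D\u0645\u064E\u062F".ToCharArray();
        int newLength = Normalize(buffer, buffer.Length);
        Console.WriteLine(newLength); // 4: the fatha was removed
    }
}
```

Because the buffer is reused, only the first `newLength` characters are meaningful after the call; anything beyond that index is leftover input.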
 
Namespace: Lucene.Net.Analysis.Ar
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public class ArabicNormalizer
  Fields
ALEF
Arabic letter alef (U+0627). The bare alef that hamza-on-alef forms are normalized to.
Declaration
public const char ALEF = 'ا'
  Field Value
| Type | Description | 
|---|---|
| char | 
ALEF_HAMZA_ABOVE
Arabic letter alef with hamza above (U+0623). Normalized to ALEF.
Declaration
public const char ALEF_HAMZA_ABOVE = 'أ'
  Field Value
| Type | Description | 
|---|---|
| char | 
ALEF_HAMZA_BELOW
Arabic letter alef with hamza below (U+0625). Normalized to ALEF.
Declaration
public const char ALEF_HAMZA_BELOW = 'إ'
  Field Value
| Type | Description | 
|---|---|
| char | 
ALEF_MADDA
Arabic letter alef with madda above (U+0622). Normalized to ALEF.
Declaration
public const char ALEF_MADDA = 'آ'
  Field Value
| Type | Description | 
|---|---|
| char | 
DAMMA
Arabic diacritic damma (U+064F). Removed during normalization.
Declaration
public const char DAMMA = 'ُ'
  Field Value
| Type | Description | 
|---|---|
| char | 
DAMMATAN
Arabic diacritic dammatan (U+064C). Removed during normalization.
Declaration
public const char DAMMATAN = 'ٌ'
  Field Value
| Type | Description | 
|---|---|
| char | 
DOTLESS_YEH
Arabic letter alef maksura, the dotless yeh (U+0649). Normalized to YEH.
Declaration
public const char DOTLESS_YEH = 'ى'
  Field Value
| Type | Description | 
|---|---|
| char | 
FATHA
Arabic diacritic fatha (U+064E). Removed during normalization.
Declaration
public const char FATHA = 'َ'
  Field Value
| Type | Description | 
|---|---|
| char | 
FATHATAN
Arabic diacritic fathatan (U+064B). Removed during normalization.
Declaration
public const char FATHATAN = 'ً'
  Field Value
| Type | Description | 
|---|---|
| char | 
HEH
Arabic letter heh (U+0647). The character that TEH_MARBUTA is normalized to.
Declaration
public const char HEH = 'ه'
  Field Value
| Type | Description | 
|---|---|
| char | 
KASRA
Arabic diacritic kasra (U+0650). Removed during normalization.
Declaration
public const char KASRA = 'ِ'
  Field Value
| Type | Description | 
|---|---|
| char | 
KASRATAN
Arabic diacritic kasratan (U+064D). Removed during normalization.
Declaration
public const char KASRATAN = 'ٍ'
  Field Value
| Type | Description | 
|---|---|
| char | 
SHADDA
Arabic diacritic shadda (U+0651). Removed during normalization.
Declaration
public const char SHADDA = 'ّ'
  Field Value
| Type | Description | 
|---|---|
| char | 
SUKUN
Arabic diacritic sukun (U+0652). Removed during normalization.
Declaration
public const char SUKUN = 'ْ'
  Field Value
| Type | Description | 
|---|---|
| char | 
TATWEEL
Arabic tatweel, the stretching character (U+0640). Removed during normalization.
Declaration
public const char TATWEEL = 'ـ'
  Field Value
| Type | Description | 
|---|---|
| char | 
TEH_MARBUTA
Arabic letter teh marbuta (U+0629). Normalized to HEH.
Declaration
public const char TEH_MARBUTA = 'ة'
  Field Value
| Type | Description | 
|---|---|
| char | 
YEH
Arabic letter yeh (U+064A). The character that DOTLESS_YEH is normalized to.
Declaration
public const char YEH = 'ي'
  Field Value
| Type | Description | 
|---|---|
| char | 
Methods
Normalize(char[], int)
Normalizes an input buffer of Arabic text in place.
Declaration
public virtual int Normalize(char[] s, int len)
  Parameters
| Type | Name | Description |
|---|---|---|
| char[] | s | Input buffer. |
| int | len | Length of the input buffer. |
Returns
| Type | Description | 
|---|---|
| int | length of input buffer after normalization  |
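
A minimal usage sketch for `Normalize(char[], int)`, assuming the project references the Lucene.Net.Analysis.Common assembly. The wrapper class `NormalizeUsage` and its method `NormalizeText` are hypothetical names introduced only for this example. It copies a string into a buffer, normalizes it in place, and rebuilds a string from the returned length.

```csharp
using System;
using Lucene.Net.Analysis.Ar;

// Hypothetical helper for illustration; only ArabicNormalizer.Normalize comes from the library.
public static class NormalizeUsage
{
    public static string NormalizeText(string text)
    {
        var normalizer = new ArabicNormalizer();

        // Normalize works on a char buffer, so copy the string first.
        char[] buffer = text.ToCharArray();

        // The return value is the length of the valid content after normalization.
        int newLength = normalizer.Normalize(buffer, buffer.Length);

        // Ignore anything past newLength; it is leftover from the original input.
        return new string(buffer, 0, newLength);
    }
}
```

Passing the returned length to the `string` constructor matters because removing diacritics or tatweel shortens the content without resizing the array.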