Class ColognePhonetic
Encodes a string into a Cologne Phonetic value.
Inheritance
System.Object
ColognePhonetic
Implements
IStringEncoder
Namespace: Lucene.Net.Analysis.Phonetic.Language
Assembly: Lucene.Net.Analysis.Phonetic.dll
Syntax
public class ColognePhonetic : object, IStringEncoder
Remarks
Implements the Kölner Phonetik (Cologne Phonetic) algorithm published by Hans Joachim Postel in 1969.
The Kölner Phonetik is a phonetic algorithm optimized for the German language. It is related to the well-known Soundex algorithm.
Algorithm
- Step 1:
After preprocessing (conversion to upper case, transcription of German umlauts, removal of non-alphabetic characters), the
letters of the supplied text are replaced by their phonetic codes according to the following table.
Letter | Context | Code |
---|---|---|
A, E, I, J, O, U, Y | | 0 |
H | | - |
B | | 1 |
P | not before H | 1 |
D, T | not before C, S, Z | 2 |
F, V, W | | 3 |
P | before H | 3 |
G, K, Q | | 4 |
C | at onset before A, H, K, L, O, Q, R, U, X OR before A, H, K, O, Q, U, X except after S, Z | 4 |
X | not after C, K, Q | 48 |
L | | 5 |
M, N | | 6 |
R | | 7 |
S, Z | | 8 |
C | after S, Z OR at onset except before A, H, K, L, O, Q, R, U, X OR not before A, H, K, O, Q, U, X | 8 |
D, T | before C, S, Z | 8 |
X | after C, K, Q | 8 |

(Source: Wikipedia (de): Kölner Phonetik -- Buchstabencodes)
Example:
"Müller-Lüdenscheidt" => "MULLERLUDENSCHEIDT" => "60550750206880022"
- Step 2:
Collapse each run of identical consecutive code digits into a single digit.
Example:
"60550750206880022" => "6050750206802"
- Step 3:
Removal of all "0" codes except a leading one. Because this happens after Step 2, identical digits that were separated
only by a "0" can end up next to each other in the result, and they are kept.
Example:
"6050750206802" => "65752682"
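Steps 2 and 3 can be illustrated in isolation. The following is a minimal sketch, not the library's code: the class and method names (`ColognePostProcessingSketch`, `CollapseRuns`, `DropInnerZeros`) are hypothetical, and the input is the raw digit string obtained by applying the Step 1 table to "MULLERLUDENSCHEIDT" by hand.

```csharp
using System;
using System.Text;

// Hypothetical helper class for illustration only; not part of Lucene.Net.
public static class ColognePostProcessingSketch
{
    // Step 2: collapse each run of identical consecutive digits into one digit.
    public static string CollapseRuns(string codes)
    {
        var sb = new StringBuilder(codes.Length);
        foreach (char c in codes)
        {
            if (sb.Length == 0 || sb[sb.Length - 1] != c)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Step 3: drop every '0' except one at the very beginning.
    public static string DropInnerZeros(string codes)
    {
        var sb = new StringBuilder(codes.Length);
        for (int i = 0; i < codes.Length; i++)
        {
            if (codes[i] != '0' || i == 0)
            {
                sb.Append(codes[i]);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Raw Step 1 codes for "MULLERLUDENSCHEIDT" per the table (H yields no digit).
        string raw = "60550750206880022";
        string collapsed = CollapseRuns(raw);
        string result = DropInnerZeros(collapsed);
        Console.WriteLine(collapsed); // 6050750206802
        Console.WriteLine(result);    // 65752682
    }
}
```

Running the two steps in this order matters: collapsing first means that digits separated only by a "0" survive as a repeated pair after the zeros are removed.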
This class is thread-safe.
See: Wikipedia (de): Kölner Phonetik (in German)
since 1.5
Methods
Encode(String)
Declaration
public virtual string Encode(string text)
Parameters
Type | Name | Description |
---|---|---|
System.String | text |
Returns
Type | Description |
---|---|
System.String |
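A minimal usage sketch for `Encode`, assuming the project references the Lucene.Net.Analysis.Phonetic assembly and that the class exposes a public parameterless constructor (the constructor is not listed on this page, so treat it as an assumption). The example class name `EncodeExample` is hypothetical.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

public static class EncodeExample
{
    public static void Main()
    {
        // Assumption: a public parameterless constructor is available.
        var encoder = new ColognePhonetic();

        // Per the worked example in the remarks, the expected result is "65752682".
        string code = encoder.Encode("Müller-Lüdenscheidt");
        Console.WriteLine(code);
    }
}
```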
GetColognePhonetic(String)
Implements the Kölner Phonetik algorithm.
In contrast to the initial description of the algorithm, this implementation does the encoding in one pass.
Declaration
public virtual string GetColognePhonetic(string text)
Parameters
Type | Name | Description |
---|---|---|
System.String | text |
Returns
Type | Description |
---|---|
System.String | The corresponding encoding according to the Kölner Phonetik algorithm |
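How the library fuses the steps internally is not shown on this page. As an illustration of the general idea only, the sketch below merges Steps 2 and 3 into a single loop over an already-coded digit string. The names `OnePassSketch` and `CollapseAndStrip` are hypothetical, and the Step 1 letter-to-code mapping is deliberately left out.

```csharp
using System;
using System.Text;

// Hypothetical illustration; not the library's implementation.
public static class OnePassSketch
{
    // Applies Step 2 (collapse runs) and Step 3 (drop non-leading zeros) in one loop.
    public static string CollapseAndStrip(string rawCodes)
    {
        var output = new StringBuilder(rawCodes.Length);
        char lastRaw = '\0'; // last raw digit seen, including zeros that were not emitted

        foreach (char d in rawCodes)
        {
            if (d == lastRaw)
            {
                continue; // Step 2: same digit as its predecessor, skip it
            }
            lastRaw = d;

            // Step 3: a '0' is kept only if it is the very first code.
            if (d != '0' || output.Length == 0)
            {
                output.Append(d);
            }
        }
        return output.ToString();
    }

    public static void Main()
    {
        // Raw Step 1 codes for "MULLERLUDENSCHEIDT" per the table in the remarks.
        Console.WriteLine(CollapseAndStrip("60550750206880022")); // 65752682
    }
}
```

Tracking the last raw digit separately from the last emitted digit is what keeps the result identical to running the two steps one after the other: a skipped "0" still breaks a run.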
IsEncodeEqual(String, String)
Declaration
public virtual bool IsEncodeEqual(string text1, string text2)
Parameters
Type | Name | Description |
---|---|---|
System.String | text1 |
System.String | text2 |
Returns
Type | Description |
---|---|
System.Boolean |
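This page gives no description for `IsEncodeEqual`. Judging by its name and signature, it appears to report whether two strings produce the same Cologne Phonetic code. A minimal usage sketch under that assumption, again assuming a public parameterless constructor and using a hypothetical example class name:

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

public static class IsEncodeEqualExample
{
    public static void Main()
    {
        // Assumption: a public parameterless constructor is available.
        var encoder = new ColognePhonetic();

        // Under the table in the remarks, "Meier" and "Mayer" both reduce to "67"
        // (M => 6, vowels and Y => 0 and later removed, R => 7),
        // so this is expected to print True.
        Console.WriteLine(encoder.IsEncodeEqual("Meier", "Mayer"));
    }
}
```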