
    Class DaitchMokotoffSoundex

    Encodes a string into a Daitch-Mokotoff Soundex value.

    Inheritance
    System.Object
    DaitchMokotoffSoundex
    Implements
    IStringEncoder
    Namespace: Lucene.Net.Analysis.Phonetic.Language
    Assembly: Lucene.Net.Analysis.Phonetic.dll
    Syntax
    public class DaitchMokotoffSoundex : object, IStringEncoder
    Remarks

    The Daitch-Mokotoff Soundex algorithm is a refinement of the Russell and American Soundex algorithms. It yields greater accuracy when matching surnames, especially Slavic and Yiddish ones, that are pronounced alike but spelled differently.

    The main differences compared to the other soundex variants are:

    • coded names are 6 digits long
    • the initial character of the name is coded
    • rules to encode multi-character n-grams
    • multiple possible encodings for the same name (branching)

    This implementation supports branching, depending on the method used:

    • Encode(String): branching disabled, only the first code is returned
    • GetSoundex(String): branching enabled, all codes are returned, separated by '|'
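The difference between the two methods can be sketched as follows. This is a minimal example, assuming the Lucene.Net.Analysis.Phonetic assembly is referenced; the class name BranchingDemo is only illustrative. The value noted for GetSoundex(String) is the "AUERBACH" result given in that method's documentation.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Illustrative class name; not part of the library.
public static class BranchingDemo
{
    public static void Main()
    {
        var encoder = new DaitchMokotoffSoundex();

        // Branching disabled: only the first code is returned.
        string single = encoder.Encode("AUERBACH");

        // Branching enabled: all codes are returned, separated by '|'.
        // The documentation gives "097400|097500" for this name.
        string all = encoder.GetSoundex("AUERBACH");

        Console.WriteLine(single);
        Console.WriteLine(all);
    }
}
```

Encode(String) is the simpler choice when a single key per name is needed, for example as an index term; GetSoundex(String) is the better fit when every plausible pronunciation should be considered.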

    Note: this implementation has additional branching rules compared to the original description of the algorithm. The rules can be customized by overriding the default rules contained in the resource file Lucene.Net.Analysis.Phonetic.Language.dmrules.txt.

    This class is thread-safe.

    See: Wikipedia - Daitch-Mokotoff Soundex

    See: Avotaynu - Soundexing and Genealogy

    since 1.10

    Constructors


    DaitchMokotoffSoundex()

    Creates a new instance with ASCII-folding enabled.

    Declaration
    public DaitchMokotoffSoundex()

    DaitchMokotoffSoundex(Boolean)

    Creates a new instance.

    With ASCII-folding enabled, certain accented characters will be transformed to equivalent ASCII characters, e.g. è -> e.

    Declaration
    public DaitchMokotoffSoundex(bool folding)
    Parameters
    Type            Name     Description
    System.Boolean  folding  Whether ASCII-folding is performed before encoding.
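The effect of the folding parameter can be explored with a short comparison. This is a sketch only: the class name FoldingDemo and the sample names are hypothetical, and no particular output is claimed. The expectation that the two spellings match when folding is enabled rests on the documented è -> e transformation.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Illustrative class name; not part of the library.
public static class FoldingDemo
{
    public static void Main()
    {
        var withFolding = new DaitchMokotoffSoundex();         // ASCII-folding enabled
        var withoutFolding = new DaitchMokotoffSoundex(false); // ASCII-folding disabled

        // Hypothetical sample names that differ only by an accent.
        string accented = "Lefèvre";
        string plain = "Lefevre";

        // With folding, è is transformed to e before encoding,
        // so both spellings are expected to yield the same code.
        Console.WriteLine(withFolding.Encode(accented) == withFolding.Encode(plain));

        // Without folding, the accented character is encoded as-is,
        // so the comparison may differ.
        Console.WriteLine(withoutFolding.Encode(accented) == withoutFolding.Encode(plain));
    }
}
```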

    Methods


    Encode(String)

    Encodes a string using the Daitch-Mokotoff soundex algorithm without branching.

    Declaration
    public virtual string Encode(string source)
    Parameters
    Type           Name    Description
    System.String  source  A string to encode.

    Returns
    Type           Description
    System.String  A DM Soundex code corresponding to the string supplied.

    See Also
    GetSoundex(String)

    GetSoundex(String)

    Encodes a string using the Daitch-Mokotoff soundex algorithm with branching.

    In case a string is encoded into multiple codes (see branching rules), the result will contain all codes, separated by '|'.

    Example: the name "AUERBACH" is encoded as both

    • 097400
    • 097500

    Thus the result will be "097400|097500".

    Declaration
    public virtual string GetSoundex(string source)
    Parameters
    Type           Name    Description
    System.String  source  A string to encode.

    Returns
    Type           Description
    System.String  A string containing a set of DM Soundex codes corresponding to the string supplied.
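Because the result is a '|'-separated list, a caller will typically split it before comparing names. The following is a minimal sketch of one way to do that; the SoundsAlike helper and the MatchDemo class are hypothetical and not part of the library, and the inputs are assumed to be non-null and non-empty.

```csharp
using System;
using System.Linq;
using Lucene.Net.Analysis.Phonetic.Language;

// Illustrative class name; not part of the library.
public static class MatchDemo
{
    // Hypothetical helper: true if the two names share at least one code.
    // Assumes both names are non-null and non-empty.
    public static bool SoundsAlike(DaitchMokotoffSoundex encoder, string first, string second)
    {
        string[] firstCodes = encoder.GetSoundex(first).Split('|');
        string[] secondCodes = encoder.GetSoundex(second).Split('|');
        return firstCodes.Intersect(secondCodes).Any();
    }

    public static void Main()
    {
        var encoder = new DaitchMokotoffSoundex();

        // Sample names chosen for illustration only.
        Console.WriteLine(SoundsAlike(encoder, "AUERBACH", "OHRBACH"));
    }
}
```

Treating two names as a match when any of their codes coincide is one possible policy; a stricter application might require all codes to agree.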

    Implements

    IStringEncoder

    See Also

    Soundex
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)