    Class Soundex

    Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but it can also be used as a general-purpose scheme to find words with similar phonemes.

    This class is thread-safe. It is not strictly immutable, but the mutable Lucene.Net.Analysis.Phonetic.Language.Soundex.maxLength field is not actually used.

    Inheritance
    System.Object
    Soundex
    Implements
    IStringEncoder
    Namespace: Lucene.Net.Analysis.Phonetic.Language
    Assembly: Lucene.Net.Analysis.Phonetic.dll
    Syntax
    public class Soundex : object, IStringEncoder

    Constructors

    Soundex()

    Creates an instance using Lucene.Net.Analysis.Phonetic.Language.Soundex.US_ENGLISH_MAPPING.

    Declaration
    public Soundex()
    See Also
    Soundex(Char[])
    Lucene.Net.Analysis.Phonetic.Language.Soundex.US_ENGLISH_MAPPING

    Soundex(Char[])

    Creates a Soundex instance using the given mapping. This constructor can be used to provide an internationalized mapping for a non-Western character set.

    Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation contains a default map for US_ENGLISH.

    If the mapping contains an instance of SILENT_MARKER then H and W are not given special treatment.

    Declaration
    public Soundex(char[] mapping)
    Parameters
    Type | Name | Description
    System.Char[] | mapping | Mapping array to use when finding the corresponding code for a given character.

    Soundex(String)

    Creates a Soundex instance using a custom mapping. This constructor can be used to customize the mapping and/or to provide an internationalized mapping for a non-Western character set.

    If the mapping contains an instance of SILENT_MARKER then H and W are not given special treatment.

    since 1.4

    Declaration
    public Soundex(string mapping)
    Parameters
    Type | Name | Description
    System.String | mapping | Mapping string to use when finding the corresponding code for a given character.

    Soundex(String, Boolean)

    Creates a Soundex instance using a custom mapping and an explicit choice of how H and W are handled. This constructor can be used to customize the mapping and/or to provide an internationalized mapping for a non-Western character set.

    since 1.11

    Declaration
    public Soundex(string mapping, bool specialCaseHW)
    Parameters
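    Type | Name | Description
    System.String | mapping | Mapping string to use when finding the corresponding code for a given character.
    System.Boolean | specialCaseHW | If true, then H and W are treated as silent: they are ignored after the first letter and do not act as separators between duplicate codes.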
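
    To illustrate the specialCaseHW flag, the sketch below builds two encoders from the same mapping string. It assumes the Lucene.Net.Analysis.Phonetic package is referenced and that the encoder follows the H/W rules described on this page; the class name HwFlagExample and the sample name are made up for the example.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Hypothetical demo class; not part of the library.
public static class HwFlagExample
{
    public static void Main()
    {
        // Same letter codes in both encoders; only the H/W handling differs.
        var silentHW = new Soundex(Soundex.US_ENGLISH_MAPPING_STRING, true);
        var separatorHW = new Soundex(Soundex.US_ENGLISH_MAPPING_STRING, false);

        // In "Ashcraft", S and C share a code and are separated only by H.
        Console.WriteLine(silentHW.Encode("Ashcraft"));    // expected: A261 (H ignored, so S and C collapse)
        Console.WriteLine(separatorHW.Encode("Ashcraft")); // expected: A226 (H separates S and C)
    }
}
```

    Passing false should give the same behavior as the predefined US_ENGLISH_SIMPLIFIED instance, so the flag mainly matters when it is combined with a custom mapping.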

    Fields

    SILENT_MARKER

    The marker character used to indicate a silent (ignored) character. These are ignored except when they appear as the first character.

    Note: the US_ENGLISH_MAPPING_STRING does not use this mechanism because changing it might break existing code. Mappings that don't contain a silent marker code are treated as though H and W are silent.

    To override this, use the Soundex(String, Boolean) constructor.

    since 1.11

    Declaration
    public static readonly char SILENT_MARKER
    Field Value
    Type Description
    System.Char

    US_ENGLISH

    An instance of Soundex using the US_ENGLISH_MAPPING mapping. This treats H and W as silent letters. Apart from when they appear as the first letter, they are ignored. They don't act as separators between duplicate codes.

    Declaration
    public static readonly Soundex US_ENGLISH
    Field Value
    Type Description
    Soundex
    See Also
    Lucene.Net.Analysis.Phonetic.Language.Soundex.US_ENGLISH_MAPPING
    US_ENGLISH_MAPPING_STRING

    US_ENGLISH_GENEALOGY

    An instance of Soundex using the mapping as per the Genealogy site: http://www.genealogy.com/articles/research/00000060.html

    This treats vowels (AEIOUY), H and W as silent letters. Such letters are ignored (after the first) and do not act as separators when dropping duplicate codes.

    The codes for consonants are otherwise the same as for US_ENGLISH_MAPPING_STRING and US_ENGLISH_SIMPLIFIED.

    since 1.11

    Declaration
    public static readonly Soundex US_ENGLISH_GENEALOGY
    Field Value
    Type Description
    Soundex

    US_ENGLISH_MAPPING_STRING

    This is a default mapping of the 26 letters used in US English. A value of 0 for a letter position means do not encode, but treat as a separator when it occurs between consonants with the same code.

    (This constant is provided as both an implementation convenience and to allow documentation to pick up the value for the constant values page.)

    Note that letters H and W are treated specially. They are ignored (after the first letter) and don't act as separators between consonants with the same code.
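
    The rules above can be sketched in a few lines of standalone code. This is a simplified illustration, not the library's implementation: the class name SoundexSketch is hypothetical, the mapping literal is the standard American Soundex letter codes (assumed to match this constant's value), and the sketch skips any character outside A-Z, which may differ from how the library treats such input.

```csharp
using System;
using System.Text;

// Hypothetical, simplified sketch of the US English rules; not the library code.
public static class SoundexSketch
{
    // Codes for A..Z; '0' means "not encoded, but separates equal codes".
    // Assumed to match the value of US_ENGLISH_MAPPING_STRING.
    private const string Mapping = "01230120022455012623010202";

    public static string Encode(string input)
    {
        if (input == null) return null;

        // Keep only the letters A-Z, upper-cased.
        var letters = new StringBuilder();
        foreach (char c in input.ToUpperInvariant())
        {
            if (c >= 'A' && c <= 'Z') letters.Append(c);
        }
        if (letters.Length == 0) return string.Empty;

        char[] code = { '0', '0', '0', '0' };   // result is always four characters
        int count = 0;
        code[count++] = letters[0];             // the first letter is kept as-is
        char last = Mapping[letters[0] - 'A'];  // code of the previous letter

        for (int i = 1; i < letters.Length && count < code.Length; i++)
        {
            char ch = letters[i];
            if (ch == 'H' || ch == 'W') continue;  // ignored; 'last' is left unchanged

            char digit = Mapping[ch - 'A'];
            if (digit != '0' && digit != last) code[count++] = digit;
            last = digit;  // a '0' here lets the same code be stored again later
        }
        return new string(code);
    }

    public static void Main()
    {
        Console.WriteLine(Encode("Robert"));   // R163
        Console.WriteLine(Encode("Ashcraft")); // A261
    }
}
```

    Because H and W never update the previous code, the S and C in "Ashcraft" collapse into a single 2, whereas a vowel between them would have reset the comparison.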

    Declaration
    public static readonly string US_ENGLISH_MAPPING_STRING
    Field Value
    Type Description
    System.String
    See Also
    Lucene.Net.Analysis.Phonetic.Language.Soundex.US_ENGLISH_MAPPING

    US_ENGLISH_SIMPLIFIED

    An instance of Soundex using the Simplified Soundex mapping, as described here: http://west-penwith.org.uk/misc/soundex.htm

    This treats H and W the same as vowels (AEIOUY). Such letters aren't encoded (after the first), but they do act as separators when dropping duplicate codes. The mapping is otherwise the same as for US_ENGLISH.

    since 1.11

    Declaration
    public static readonly Soundex US_ENGLISH_SIMPLIFIED
    Field Value
    Type Description
    Soundex
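
    The three predefined instances differ only in which letters count as silent and which act as separators. The sketch below, which assumes the Lucene.Net.Analysis.Phonetic package is referenced, encodes a name whose two D's are separated only by a vowel; the class name VariantComparison is made up, and the expected values follow from the rules described for each field.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Hypothetical demo class; not part of the library.
public static class VariantComparison
{
    public static void Main()
    {
        // Vowels separate duplicate codes here, so the second D is encoded.
        Console.WriteLine(Soundex.US_ENGLISH.Encode("Dodds"));            // expected: D320
        Console.WriteLine(Soundex.US_ENGLISH_SIMPLIFIED.Encode("Dodds")); // expected: D320

        // Vowels are silent here, so all three D's collapse into the first letter.
        Console.WriteLine(Soundex.US_ENGLISH_GENEALOGY.Encode("Dodds"));  // expected: D200
    }
}
```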

    Properties

    MaxLength

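    Gets or sets the maximum length of the encoded value. Standard Soundex codes are four characters long, and, as noted in the class description, this value is not actually used.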

    Declaration
    public virtual int MaxLength { get; set; }
    Property Value
    Type Description
    System.Int32

    Methods

    Difference(String, String)

    Encodes the strings and returns the number of characters in the two encoded strings that are the same. This return value ranges from 0 through 4: 0 indicates little or no similarity, and 4 indicates strong similarity or identical values.

    See: MS T-SQL DIFFERENCE

    since 1.3

    Declaration
    public virtual int Difference(string s1, string s2)
    Parameters
    Type | Name | Description
    System.String | s1 | A string that will be encoded and compared.
    System.String | s2 | A string that will be encoded and compared.

    Returns
    Type | Description
    System.Int32 | The number of characters that are the same in the two encoded strings, from 0 to 4.

    See Also
    Lucene.Net.Analysis.Phonetic.Language.SoundexUtils.Difference(Lucene.Net.Analysis.Phonetic.Language.IStringEncoder,System.String,System.String)
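
    As a rough usage sketch (assuming the Lucene.Net.Analysis.Phonetic package is referenced; the class name DifferenceExample and the sample names are made up), the codes are compared position by position:

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Hypothetical demo class; not part of the library.
public static class DifferenceExample
{
    public static void Main()
    {
        var soundex = new Soundex();

        // Both names encode to S530, so all four positions match.
        Console.WriteLine(soundex.Difference("Smith", "Smythe")); // expected: 4

        // S530 and J520 agree only in the second and fourth positions.
        Console.WriteLine(soundex.Difference("Smith", "Jones"));  // expected: 2
    }
}
```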

    Encode(String)

    Encodes a string using the Soundex algorithm.

    Declaration
    public virtual string Encode(string str)
    Parameters
    Type | Name | Description
    System.String | str | A string to encode.

    Returns
    Type | Description
    System.String | A Soundex code corresponding to the string supplied.
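
    A minimal usage sketch, assuming the Lucene.Net.Analysis.Phonetic package is referenced (the class name EncodeExample and the sample names are made up for illustration):

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

// Hypothetical demo class; not part of the library.
public static class EncodeExample
{
    public static void Main()
    {
        // The parameterless constructor uses the default US English mapping.
        var soundex = new Soundex();

        // Similar-sounding names are expected to share a code.
        Console.WriteLine(soundex.Encode("Robert"));  // expected: R163
        Console.WriteLine(soundex.Encode("Rupert"));  // expected: R163
        Console.WriteLine(soundex.Encode("Tymczak")); // expected: T522
    }
}
```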

    GetSoundex(String)

    Retrieves the Soundex code for a given string.

    Declaration
    public virtual string GetSoundex(string str)
    Parameters
    Type | Name | Description
    System.String | str | String to encode using the Soundex algorithm.

    Returns
    Type | Description
    System.String | A Soundex code for the string supplied.

    Implements

    IStringEncoder
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)