Namespace Lucene.Net.Analysis.Phonetic
Analysis for indexing phonetic signatures (for sounds-alike search)
For an introduction to Lucene's analysis API, see the Lucene.Net.Analysis namespace documentation.
This module provides analysis components (using encoders ported to .NET from Apache Commons Codec) that index and search phonetic signatures.
Classes
BeiderMorseFilter
TokenFilter for Beider-Morse phonetic encoding.
Note
This API is experimental and might change in incompatible ways in the next release.
BeiderMorseFilterFactory
Factory for BeiderMorseFilter.
<fieldType name="text_bm" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.BeiderMorseFilterFactory"
nameType="GENERIC" ruleType="APPROX"
concat="true" languageSet="auto">
</filter>
</analyzer>
</fieldType>
DoubleMetaphoneFilter
Filter for DoubleMetaphone (supporting secondary codes)
DoubleMetaphoneFilterFactory
Factory for DoubleMetaphoneFilter.
<fieldType name="text_dblmtphn" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
</analyzer>
</fieldType>
PhoneticFilter
Create tokens for phonetic matches. See the Language namespace.
PhoneticFilterFactory
Factory for PhoneticFilter. It accepts the following parameters:
- encoder (required): one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex", "Caverphone" (v2.0), or "ColognePhonetic" (case insensitive). Any other value is resolved as a class name: it is used as-is if it already contains a '.', and otherwise it is looked up in the same package as the encoders listed above.
- inject (default=true): add the phonetic tokens to the stream alongside the original tokens, at the same position (offset=0). When false, the phonetic tokens replace the original tokens.
- maxCodeLength: the maximum length of the phonetic codes, as defined by the encoder. Specifying it for an encoder that does not support a maximum length is an error.
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
</analyzer>
</fieldType>
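As a variation on the configuration above, the following sketch combines all three parameters. It is an illustration, not a recommended setup. The field type name text_metaphone is made up for this example, and it assumes the "Metaphone" encoder accepts maxCodeLength. With inject="false", only the phonetic codes are indexed.

```xml
<!-- Hypothetical field type name; assumes the Metaphone encoder supports maxCodeLength. -->
<fieldType name="text_metaphone" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- inject="false": emit only the phonetic code, capped at 4 characters. -->
    <filter class="solr.PhoneticFilterFactory" encoder="Metaphone" inject="false" maxCodeLength="4"/>
  </analyzer>
</fieldType>
```

A field configured this way matches on sound only. Exact-spelling matches would need the original tokens, so keep inject="true" or index them in a separate field.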