Class CapitalizationFilterFactory
Factory for CapitalizationFilter.
The factory takes parameters:
"onlyFirstWord" - if true, capitalize only the first word; if false, capitalize every word.
"keep" - a keep word list. List each word that should be kept, separated by whitespace.
"keepIgnoreCase" - true or false. If true, the keep list is treated as case-insensitive.
"forceFirstLetter" - force the first letter to be capitalized even if the word is in the keep list.
"okPrefix" - do not change a word's capitalization if it begins with something in this list. For example, if "McK" is on the okPrefix list, the word "McKinley" is not changed to "Mckinley".
"minWordLength" - how long a word needs to be for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
"maxWordCount" - if the token contains more than maxWordCount words, the capitalization is assumed to be correct.
"culture" - the culture used to apply the capitalization rules. If not supplied, or if the string "invariant" is supplied, the invariant culture is used.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true"
keep="java solr lucene" keepIgnoreCase="false"
okPrefix="McK McD McA"/>
</analyzer>
</fieldType>
@since solr 1.3
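The same configuration can be expressed in code by passing the parameters as a string dictionary. The following is a minimal sketch, not a definitive usage pattern: the class name CapitalizationDemo and the sample input are hypothetical, and the "luceneMatchVersion" entry with the value "LUCENE_48" is an assumption about what the base factory expects in a Lucene.NET 4.8 setup.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical demo class; not part of Lucene.NET.
public static class CapitalizationDemo
{
    // Runs whitespace tokenization followed by the capitalization filter
    // and returns the resulting terms.
    public static List<string> Analyze(string text)
    {
        // Keys mirror the XML attributes above, using the factory's constants.
        var args = new Dictionary<string, string>
        {
            // Assumption: the match version is supplied under this key.
            { "luceneMatchVersion", "LUCENE_48" },
            { CapitalizationFilterFactory.ONLY_FIRST_WORD, "true" },
            { CapitalizationFilterFactory.KEEP, "java solr lucene" },
            { CapitalizationFilterFactory.KEEP_IGNORE_CASE, "false" },
            { CapitalizationFilterFactory.OK_PREFIX, "McK McD McA" }
        };
        var factory = new CapitalizationFilterFactory(args);

        var tokenizer = new WhitespaceTokenizer(
            LuceneVersion.LUCENE_48, new StringReader(text));

        var terms = new List<string>();
        // Create wraps the tokenizer in the configured filter.
        using (TokenStream stream = factory.Create(tokenizer))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());
            }
            stream.End();
        }
        return terms;
    }

    public static void Main()
    {
        // Sample input chosen for illustration only.
        foreach (string term in Analyze("mcKinley visits java land"))
        {
            Console.WriteLine(term);
        }
    }
}
```

Because the tokenizer splits on whitespace, each token here is a single word, so "maxWordCount" and "onlyFirstWord" matter mostly when a token contains several words (for example with a keyword-style tokenizer).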
Namespace: Lucene.Net.Analysis.Miscellaneous
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public class CapitalizationFilterFactory : TokenFilterFactory
Constructors
CapitalizationFilterFactory(IDictionary<string, string>)
Creates a new CapitalizationFilterFactory
Declaration
public CapitalizationFilterFactory(IDictionary<string, string> args)
Parameters
Type | Name | Description |
---|---|---|
IDictionary<string, string> | args |
Fields
CULTURE
Declaration
public const string CULTURE = "culture"
Field Value
Type | Description |
---|---|
string |
FORCE_FIRST_LETTER
Declaration
public const string FORCE_FIRST_LETTER = "forceFirstLetter"
Field Value
Type | Description |
---|---|
string |
KEEP
Declaration
public const string KEEP = "keep"
Field Value
Type | Description |
---|---|
string |
KEEP_IGNORE_CASE
Declaration
public const string KEEP_IGNORE_CASE = "keepIgnoreCase"
Field Value
Type | Description |
---|---|
string |
MAX_TOKEN_LENGTH
Declaration
public const string MAX_TOKEN_LENGTH = "maxTokenLength"
Field Value
Type | Description |
---|---|
string |
MAX_WORD_COUNT
Declaration
public const string MAX_WORD_COUNT = "maxWordCount"
Field Value
Type | Description |
---|---|
string |
MIN_WORD_LENGTH
Declaration
public const string MIN_WORD_LENGTH = "minWordLength"
Field Value
Type | Description |
---|---|
string |
OK_PREFIX
Declaration
public const string OK_PREFIX = "okPrefix"
Field Value
Type | Description |
---|---|
string |
ONLY_FIRST_WORD
Declaration
public const string ONLY_FIRST_WORD = "onlyFirstWord"
Field Value
Type | Description |
---|---|
string |
Methods
Create(TokenStream)
Transforms the specified input Lucene.Net.Analysis.TokenStream.
Declaration
public override TokenStream Create(TokenStream input)
Parameters
Type | Name | Description |
---|---|---|
TokenStream | input |
Returns
Type | Description |
---|---|
TokenStream |