Namespace Lucene.Net.Analysis.Payloads
Provides various convenience classes for creating payloads on Tokens.
Classes
AbstractEncoder
Base class for payload encoders.
DelimitedPayloadTokenFilter
Characters before the delimiter are the "token"; those after are the payload.
For example, if the delimiter is '|', then for the string "foo|bar", "foo" is the token and "bar" is the payload.
Note that you can also supply an IPayloadEncoder to convert the payload in an appropriate way (from characters to bytes).
Note: make sure your Tokenizer doesn't split on the delimiter, or this won't work.
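The splitting rule described above can be illustrated with a small stand-alone sketch. It is written in plain Java with no Lucene dependency, and the class and method names (`DelimitedPayloadSketch`, `split`, `encode`) are hypothetical, not part of the Lucene.NET API. It assumes the split happens at the first occurrence of the delimiter and that a token without a delimiter carries no payload.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; these names are not part of Lucene.NET.
final class DelimitedPayloadSketch {

    // Splits raw at the first delimiter (assumed behavior): element 0 is the
    // term, element 1 is the payload text, or null when no delimiter occurs.
    static String[] split(String raw, char delimiter) {
        int i = raw.indexOf(delimiter);
        if (i < 0) {
            return new String[] { raw, null };
        }
        return new String[] { raw.substring(0, i), raw.substring(i + 1) };
    }

    // Stand-in for a payload encoder: converts payload characters to UTF-8 bytes.
    static byte[] encode(String payloadText) {
        return payloadText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] parts = split("foo|bar", '|');
        System.out.println(parts[0] + " -> " + encode(parts[1]).length + " payload bytes");
    }
}
```

In the real filter the encoding step is delegated to whichever IPayloadEncoder is supplied, so the UTF-8 conversion here is only one possible choice.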
DelimitedPayloadTokenFilterFactory
Factory for DelimitedPayloadTokenFilter.
<fieldType name="text_dlmtd" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float" delimiter="|"/>
  </analyzer>
</fieldType>
IdentityEncoder
Does nothing other than convert the char array to a byte array using the specified encoding.
IntegerEncoder
Encodes a character array representing an integer as a BytesRef.
NumericPayloadTokenFilter
Assigns a payload to a token based on the token's Type.
NumericPayloadTokenFilterFactory
Factory for NumericPayloadTokenFilter.
<fieldType name="text_numpayload" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NumericPayloadTokenFilterFactory" payload="24" typeMatch="word"/>
  </analyzer>
</fieldType>
PayloadHelper
Utility methods for encoding payloads.
SingleEncoder
Encodes a character array representing a float (System.Single) as a BytesRef.
NOTE: This was FloatEncoder in Lucene.
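As a rough illustration of what the numeric encoders produce, the sketch below parses text and writes the value as 4 bytes. It is plain Java using only the standard library; `NumericPayloadSketch` and its methods are hypothetical names, and the big-endian layout (with a float stored as its IEEE 754 bit pattern) is an assumption based on the Java Lucene PayloadHelper, not something this page states.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-alone sketch; these names are not part of Lucene.NET.
final class NumericPayloadSketch {

    // Parses the text as an int and writes it as 4 big-endian bytes (assumed layout).
    static byte[] encodeInt(String text) {
        return ByteBuffer.allocate(4).putInt(Integer.parseInt(text)).array();
    }

    // Parses the text as a float and writes its IEEE 754 bits as 4 big-endian bytes.
    static byte[] encodeFloat(String text) {
        return ByteBuffer.allocate(4).putFloat(Float.parseFloat(text)).array();
    }

    // Reads the 4 bytes back into a float.
    static float decodeFloat(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getFloat();
    }

    public static void main(String[] args) {
        byte[] payload = encodeFloat("1.0");
        System.out.println(Arrays.toString(payload) + " decodes to " + decodeFloat(payload));
    }
}
```

A fixed 4-byte width keeps every numeric payload the same size, which makes it straightforward to decode at search time without a length prefix.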
TokenOffsetPayloadTokenFilter
Adds the token's StartOffset and EndOffset as its payload. The first 4 bytes are the start offset and the last 4 bytes are the end offset.
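The offset payload layout can be sketched as an 8-byte buffer with the start offset in the first 4 bytes and the end offset in the following 4. The code is plain Java with no Lucene dependency; `TokenOffsetPayloadSketch` and its methods are hypothetical names, and big-endian byte order is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-alone sketch; these names are not part of Lucene.NET.
final class TokenOffsetPayloadSketch {

    // Packs both offsets into 8 bytes: start in bytes 0-3, end in bytes 4-7.
    static byte[] encode(int startOffset, int endOffset) {
        return ByteBuffer.allocate(8).putInt(startOffset).putInt(endOffset).array();
    }

    // Unpacks the payload into { startOffset, endOffset }.
    static int[] decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int start = buf.getInt();
        int end = buf.getInt();
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        int[] offsets = decode(encode(4, 7));
        System.out.println("start=" + offsets[0] + " end=" + offsets[1]);
    }
}
```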
TokenOffsetPayloadTokenFilterFactory
Factory for TokenOffsetPayloadTokenFilter.
<fieldType name="text_tokenoffset" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.TokenOffsetPayloadTokenFilterFactory"/>
  </analyzer>
</fieldType>
TypeAsPayloadTokenFilter
Makes the Type a payload.
Encodes the type using System.Text.Encoding.UTF8.GetBytes(string).
TypeAsPayloadTokenFilterFactory
Factory for TypeAsPayloadTokenFilter.
<fieldType name="text_typeaspayload" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.TypeAsPayloadTokenFilterFactory"/>
  </analyzer>
</fieldType>
Interfaces
IPayloadEncoder
Mainly for use with the DelimitedPayloadTokenFilter; converts char buffers to BytesRef.
NOTE: This interface is subject to change.