
    Namespace Lucene.Net.Analysis.Payloads

    Provides various convenience classes for creating payloads on Tokens.

    Classes

    AbstractEncoder

    Base class for payload encoders.

    DelimitedPayloadTokenFilter

    Characters before the delimiter are the "token", those after are the payload.

    For example, if the delimiter is '|', then for the string "foo|bar", "foo" is the token and "bar" is the payload.

    Note: you can also supply an IPayloadEncoder to convert the payload in an appropriate way (from characters to bytes).

    Note: make sure your Lucene.Net.Analysis.Tokenizer does not split on the delimiter, or this will not work.
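    The token/payload split described above can be sketched in a few lines. This is illustrative Python, not the Lucene.Net API: the function name split_delimited is hypothetical, and the sketch assumes the split happens at the first occurrence of the delimiter and that a term without a delimiter carries no payload.

```python
from typing import Optional, Tuple


def split_delimited(term: str, delimiter: str = "|") -> Tuple[str, Optional[str]]:
    """Split a raw term into (token, payload_text).

    Assumption: the split is made at the first delimiter; a term with no
    delimiter is returned unchanged with no payload.
    """
    head, sep, tail = term.partition(delimiter)
    if not sep:
        return term, None
    return head, tail


# A whitespace split keeps "foo|bar" in one piece, so the delimiter survives
# until the split step. A tokenizer that broke terms on '|' would defeat this.
terms = "foo|bar baz".split()
pairs = [split_delimited(t) for t in terms]
```

    The payload text returned here is still a string; turning it into bytes is the job of an encoder.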

    DelimitedPayloadTokenFilterFactory

    Factory for DelimitedPayloadTokenFilter.

    <fieldType name="text_dlmtd" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float" delimiter="|"/>
      </analyzer>
    </fieldType>

    IdentityEncoder

    Does nothing other than convert the char array to a byte array using the specified encoding.

    IntegerEncoder

    Encodes a character array that holds the text of a System.Int32 as a Lucene.Net.Util.BytesRef.

    See PayloadHelper.EncodeInt32(Int32, Byte[], Int32).
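    As a rough illustration of what an integer payload looks like on the byte level, the sketch below parses the payload text and packs it into 4 bytes. It is plain Python rather than Lucene.Net code; the names encode_int32 and integer_encoder are hypothetical, and big-endian byte order is an assumption of this sketch.

```python
import struct


def encode_int32(value: int) -> bytes:
    """Pack a signed 32-bit integer into 4 bytes (big-endian assumed)."""
    return struct.pack(">i", value)


def integer_encoder(chars: str) -> bytes:
    """Parse the payload text as an integer, then encode it as 4 bytes."""
    return encode_int32(int(chars))


payload = integer_encoder("24")
```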

    NumericPayloadTokenFilter

    Assigns a payload to a token based on the token's Type.
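    The type-based assignment can be pictured as follows. This is a minimal Python sketch under stated assumptions, not the filter's implementation: tokens are modelled as (text, type) pairs, the function name assign_numeric_payload is hypothetical, the payload is assumed to be a 32-bit float in big-endian order, and tokens whose type does not match are given no payload here.

```python
import struct
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (text, type); a stand-in for a real token


def assign_numeric_payload(
    tokens: List[Token], payload: float, type_match: str
) -> List[Tuple[str, str, Optional[bytes]]]:
    """Attach the encoded payload to every token whose type equals type_match."""
    encoded = struct.pack(">f", payload)  # assumption: 4-byte big-endian float
    return [
        (text, typ, encoded if typ == type_match else None)
        for text, typ in tokens
    ]


result = assign_numeric_payload([("fox", "word"), ("42", "<NUM>")], 24.0, "word")
```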

    NumericPayloadTokenFilterFactory

    Factory for NumericPayloadTokenFilter.

    <fieldType name="text_numpayload" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.NumericPayloadTokenFilterFactory" payload="24" typeMatch="word"/>
      </analyzer>
    </fieldType>

    PayloadHelper

    Utility methods for encoding payloads.

    SingleEncoder

    Encodes a character array that holds the text of a System.Single as a Lucene.Net.Util.BytesRef.

    NOTE: This was FloatEncoder in Lucene.
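    For a sense of what a single-precision payload looks like in bytes, here is a short Python sketch. It is not Lucene.Net code: the name single_encoder is hypothetical, and the sketch assumes the float is stored as its IEEE 754 bit pattern in big-endian order.

```python
import struct


def single_encoder(chars: str) -> bytes:
    """Parse the payload text as a float and pack it as a 4-byte IEEE 754 single.

    Assumption: big-endian byte order.
    """
    return struct.pack(">f", float(chars))


payload = single_encoder("1.5")
# Decoding reverses the step; 1.5 is exactly representable, so it round-trips.
decoded = struct.unpack(">f", payload)[0]
```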

    TokenOffsetPayloadTokenFilter

    Adds the StartOffset and EndOffset of the token as its payload. The first 4 bytes are the start offset; the next 4 bytes are the end offset.
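    The resulting payload layout can be sketched like this. The Python below is only an illustration: the name offset_payload is hypothetical, and it assumes each offset is a 32-bit integer in big-endian order, with the end offset occupying the 4 bytes that follow the start offset.

```python
import struct


def offset_payload(start_offset: int, end_offset: int) -> bytes:
    """Build an 8-byte payload: 4 bytes of start offset, then 4 bytes of end offset."""
    return struct.pack(">ii", start_offset, end_offset)


# The token "foo" at the very start of a text spans offsets 0 to 3.
payload = offset_payload(0, 3)
start, end = struct.unpack(">ii", payload)
```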

    TokenOffsetPayloadTokenFilterFactory

    Factory for TokenOffsetPayloadTokenFilter.

    <fieldType name="text_tokenoffset" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.TokenOffsetPayloadTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    TypeAsPayloadTokenFilter

    Makes the Type a payload.

    Encodes the type using System.Text.Encoding.UTF8.GetBytes(string).
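    In other words, the payload is simply the UTF-8 bytes of the type string. A minimal Python equivalent, with the hypothetical name type_as_payload, might look like this:

```python
def type_as_payload(token_type: str) -> bytes:
    """Return the UTF-8 bytes of the token type, to be used as the payload."""
    return token_type.encode("utf-8")


payload = type_as_payload("word")
```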

    TypeAsPayloadTokenFilterFactory

    Factory for TypeAsPayloadTokenFilter.

    <fieldType name="text_typeaspayload" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.TypeAsPayloadTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    Interfaces

    IPayloadEncoder

    Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Lucene.Net.Util.BytesRef.

    NOTE: This interface is subject to change.

    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.