The Token type exposes the following members.

Constructors

  Name  Description
Public method  Token()
Constructs a Token with null text.
Public method  Token(Int32, Int32)
Constructs a Token with null text and the given start and end offsets.
Public method  Token(Int32, Int32, Int32)
Constructs a Token with null text, start and end offsets, and flags. NOTE: flags is EXPERIMENTAL.
Public method  Token(Int32, Int32, String)
Constructs a Token with null text, start and end offsets, and the Token type.
Public method  Token(String, Int32, Int32)
Constructs a Token with the given term text and start and end offsets. The type defaults to "word". NOTE: for better indexing speed, use the char[] termBuffer methods to set the term text instead.
Public method  Token(String, Int32, Int32, Int32)
Constructs a Token with the given text, start and end offsets, and flags. NOTE: for better indexing speed, use the char[] termBuffer methods to set the term text instead.
Public method  Token(String, Int32, Int32, String)
Constructs a Token with the given text, start and end offsets, and type. NOTE: for better indexing speed, use the char[] termBuffer methods to set the term text instead.
Public method  Token(Char[], Int32, Int32, Int32, Int32)
Constructs a Token with the given term buffer (offset and length) and start and end offsets.
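
The NOTE on the String constructors recommends the char[] termBuffer methods. The following is a minimal, language-neutral sketch in Python of the buffer-plus-length contract those methods imply: one backing array is reused across terms, and a separate length records how many characters are valid. It is not Lucene.Net code; TermBufferModel and all of its member names are hypothetical and exist only for illustration.

```python
class TermBufferModel:
    """Hypothetical model of a reusable term buffer (not the Lucene.Net Token class)."""

    def __init__(self, capacity=8):
        self.buffer = ["\0"] * capacity  # backing array; may be larger than the term
        self.length = 0                  # number of valid characters in the buffer

    def resize(self, new_size):
        # Grow to at least new_size, preserving the existing content.
        if new_size > len(self.buffer):
            self.buffer.extend(["\0"] * (new_size - len(self.buffer)))

    def set_term(self, source, offset, length):
        # Copy `length` characters of `source`, starting at `offset`, into the buffer.
        if offset < 0 or length < 0 or offset + length > len(source):
            raise ValueError("offset/length outside the source text")
        self.resize(length)
        self.buffer[:length] = list(source[offset:offset + length])
        self.length = length

    def set_length(self, length):
        # Truncate the term, or record characters written directly into the buffer.
        if length < 0 or length > len(self.buffer):
            raise ValueError("length exceeds buffer capacity; resize first")
        self.length = length

    def term(self):
        # Building a string is the convenience path; it copies the valid characters.
        return "".join(self.buffer[:self.length])


text = "quick brown fox"
tok = TermBufferModel(capacity=4)
tok.set_term(text, 0, 5)   # buffer grows from 4 to 5 slots
first = tok.term()
tok.set_term(text, 6, 5)   # same backing array is reused, no growth needed
second = tok.term()
tok.set_length(3)          # truncate the current term to its first 3 characters
third = tok.term()
```

In this model the backing array only grows when a longer term arrives, which is the property the constructor notes appeal to when they suggest avoiding String-based term text.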

Methods

  Name  Description
Public method  Clear
Resets the term text, payload, flags, positionIncrement, startOffset, endOffset, and token type to their defaults.
(Overrides AttributeImpl.Clear().)
Public method  Clone()
(Overrides AttributeImpl.Clone().)
Public method  Clone(Char[], Int32, Int32, Int32, Int32)
Makes a clone, but replaces the term buffer and start/end offsets in the process. This is more efficient than doing a full clone and then calling SetTermBuffer, because it avoids a wasted copy of the old termBuffer.
Public method  CopyTo
(Overrides AttributeImpl.CopyTo(AttributeImpl).)
Public method  EndOffset
Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text. The length of the token in the source text is (endOffset - startOffset).
Public method  Equals
(Overrides AttributeImpl.Equals(Object).)
Protected method  Finalize
Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection.
(Inherited from Object.)
Public method  GetFlags
EXPERIMENTAL: While we think this is here to stay, we may want to change it to be a long.

Gets the bitset of any flags that have been set. This is completely distinct from Type(), although the two serve similar purposes. The flags can be used to encode information about the token for use by other Lucene.Net.Analysis.TokenFilter instances.

Public method  GetHashCode
(Overrides AttributeImpl.GetHashCode().)
Public method  GetPayload
Returns this Token's payload.
Public method  GetPositionIncrement
Returns the position increment of this Token.
Public method  GetType
Gets the Type of the current instance.
(Inherited from Object.)
Protected method  MemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Public method  Reinit(Token)
Copies the prototype token's fields into this one. Note: Payloads are shared.
Public method  Reinit(Token, String)
Copies the prototype token's fields into this one, with a different term. Note: Payloads are shared.
Public method  Reinit(String, Int32, Int32)
Shorthand for calling Clear(), SetTermBuffer(String), SetStartOffset, SetEndOffset, and SetType with Token.DEFAULT_TYPE.
Public method  Reinit(String, Int32, Int32, String)
Shorthand for calling Clear(), SetTermBuffer(String), SetStartOffset, SetEndOffset, and SetType.
Public method  Reinit(Token, Char[], Int32, Int32)
Copies the prototype token's fields into this one, with a different term. Note: Payloads are shared.
Public method  Reinit(Char[], Int32, Int32, Int32, Int32)
Shorthand for calling Clear(), SetTermBuffer(char[], int, int), SetStartOffset, SetEndOffset, and SetType with Token.DEFAULT_TYPE.
Public method  Reinit(String, Int32, Int32, Int32, Int32)
Shorthand for calling Clear(), SetTermBuffer(String, int, int), SetStartOffset, SetEndOffset, and SetType with Token.DEFAULT_TYPE.
Public method  Reinit(Char[], Int32, Int32, Int32, Int32, String)
Shorthand for calling Clear(), SetTermBuffer(char[], int, int), SetStartOffset, SetEndOffset, and SetType.
Public method  Reinit(String, Int32, Int32, Int32, Int32, String)
Shorthand for calling Clear(), SetTermBuffer(String, int, int), SetStartOffset, SetEndOffset, and SetType.
Public method  ResizeTermBuffer
Grows the termBuffer to at least size newSize, preserving the existing content. Note: If the next operation is to change the contents of the term buffer, use SetTermBuffer(char[], int, int), SetTermBuffer(String), or SetTermBuffer(String, int, int) to optimally combine the resize with the setting of the termBuffer.
Public method  SetEndOffset
Sets the ending offset.
Public method  SetFlags
Public method  SetOffset
Sets the starting and ending offsets. See StartOffset() and EndOffset().
Public method  SetPayload
Sets this Token's payload.
Public method  SetPositionIncrement
Sets the position increment. This determines the position of this token relative to the previous Token in a TokenStream, and is used in phrase searching.

The default value is one.

Some common uses for this are:

  • Set it to zero to put multiple terms in the same position. This is useful if, e.g., a word has multiple stems. Searches for phrases including either stem will match. In this case, all but the first stem's increment should be set to zero: the increment of the first instance should be one. Repeating a token with an increment of zero can also be used to boost the scores of matches on that token.
  • Set it to values greater than one to inhibit exact phrase matches. If, for example, one does not want phrases to match across removed stop words, then one could build a stop word filter that removes stop words and also sets the increment to the number of stop words removed before each non-stop word. Then exact phrase queries will only match when the terms occur with no intervening stop words.
Public method  SetStartOffset
Sets the starting offset.
Public method  SetTermBuffer(String)
Copies the contents of buffer into the termBuffer array.
Public method  SetTermBuffer(Char[], Int32, Int32)
Copies the contents of buffer, starting at offset and continuing for length characters, into the termBuffer array.
Public method  SetTermBuffer(String, Int32, Int32)
Copies the contents of buffer, starting at offset and continuing for length characters, into the termBuffer array.
Public method  SetTermLength
Sets the number of valid characters (the length of the term) in the termBuffer array. Use this to truncate the termBuffer or to synchronize with external manipulation of the termBuffer. Note: to grow the size of the array, call ResizeTermBuffer(int) first.
Public method  SetTermText  Obsolete.
Sets the Token's term text. NOTE: for better indexing speed, use the char[] termBuffer methods to set the term text instead.
Public method  SetType
Sets the lexical type.
Public method  StartOffset
Returns this Token's starting offset, the position of the first character corresponding to this token in the source text. Note that the difference between EndOffset() and StartOffset() may not equal the length of the term text, as the term text may have been altered by a stemmer or some other filter.
Public method  Term
Returns the Token's term text. This method has a performance penalty because the text is stored internally in a char[]. If possible, use TermBuffer() and TermLength() directly instead. If you really need a String, use this method, which is nothing more than a convenience call to new String(token.TermBuffer(), 0, token.TermLength()).
Public method  TermBuffer
Returns the internal termBuffer character array, which you can then alter directly. If the array is too small for your token, use ResizeTermBuffer(int) to increase it. After altering the buffer, be sure to call SetTermLength to record the number of valid characters placed into the termBuffer.
Public method  TermLength
Returns the number of valid characters (the length of the term) in the termBuffer array.
Public method  TermText
Returns the Token's term text.
Public method  ToString
(Overrides AttributeImpl.ToString().)
Public method  Type
Returns this Token's lexical type. Defaults to "word".
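
The two uses of the position increment described under SetPositionIncrement can be illustrated with a small, self-contained Python sketch. It is a simplified model, not Lucene.Net code: assign_positions and phrase_matches are hypothetical helpers, and the sketch assumes positions start at -1 so that a first token with the default increment of one lands at position 0.

```python
def assign_positions(tokens):
    """tokens: list of (term, position_increment) pairs.
    Returns a dict mapping each term to the set of positions it occupies."""
    positions = {}
    pos = -1  # assumption: the first token with increment 1 lands at position 0
    for term, increment in tokens:
        pos += increment
        positions.setdefault(term, set()).add(pos)
    return positions


def phrase_matches(positions, phrase):
    """True if the phrase terms occur at consecutive positions (exact phrase)."""
    starts = positions.get(phrase[0], set())
    return any(
        all(start + i in positions.get(term, set()) for i, term in enumerate(phrase))
        for start in starts
    )


# Increment of zero: a second stem shares the position of the first.
stems = assign_positions([("the", 1), ("running", 1), ("run", 0), ("dog", 1)])
either_stem = (
    phrase_matches(stems, ["running", "dog"]) and phrase_matches(stems, ["run", "dog"])
)

# Increment greater than one: a gap is left where stop words were removed,
# so the exact phrase "cat hat" no longer matches.
gapped = assign_positions([("cat", 1), ("hat", 2)])
adjacent = assign_positions([("cat", 1), ("hat", 1)])
gap_blocks_phrase = not phrase_matches(gapped, ["cat", "hat"])
adjacent_matches = phrase_matches(adjacent, ["cat", "hat"])
```

In this model a zero increment makes both stems answer the same phrase query, while any increment above one separates the surrounding terms so they are no longer treated as adjacent.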

Fields

  Name  Description
Public field  Static member  DEFAULT_TYPE

See Also