Lucene.Net: core/Analysis/TokenStream.cs File Reference

Lucene.Net 3.0.3

Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users.



Classes
class	Lucene.Net.Analysis.TokenStream
	A `TokenStream` enumerates the sequence of tokens, either from Fields of a Document or from query text.

	This is an abstract class. Concrete subclasses are:
	- Tokenizer, a `TokenStream` whose input is a Reader; and
	- TokenFilter, a `TokenStream` whose input is another `TokenStream`.

	A new `TokenStream` API was introduced with Lucene 2.9. This API has moved from being Token based to IAttribute based. While Token still exists in 2.9 as a convenience class, the preferred way to store the information of a Token is to use Util.Attributes.

	`TokenStream` now extends AttributeSource, which provides access to all of the token IAttributes for the `TokenStream`. Note that only one instance per Util.Attribute is created, and it is reused for every token. This approach reduces object creation and allows local caching of references to the Util.Attributes. See IncrementToken() for further details.

	The workflow of the new `TokenStream` API is as follows:
	1. Instantiation of `TokenStream`/TokenFilters, which add/get attributes to/from the AttributeSource.
	2. The consumer calls TokenStream.Reset().
	3. The consumer retrieves attributes from the stream and stores local references to all attributes it wants to access.
	4. The consumer calls IncrementToken() until it returns false, and consumes the attributes after each call.
	5. The consumer calls End() so that any end-of-stream operations can be performed.
	6. The consumer calls Close() to release any resources when finished using the `TokenStream`.

	To make sure that filters and consumers know which attributes are available, the attributes must be added during instantiation. Filters and consumers are not required to check for availability of attributes in IncrementToken().

	You can find some example code for the new API in the analysis package level Javadoc.

	Sometimes it is desirable to capture the current state of a `TokenStream`, e.g. for buffering purposes (see CachingTokenFilter, TeeSinkTokenFilter). For this use case AttributeSource.CaptureState and AttributeSource.RestoreState can be used.
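
The consumer workflow described for the new `TokenStream` API can be sketched roughly as follows. This is a minimal illustration rather than code taken from the library: `TokenStreamDemo` and `Tokenize` are hypothetical names, and the choice of `WhitespaceTokenizer` and `ITermAttribute` (from `Lucene.Net.Analysis.Tokenattributes`) is an assumption about a typical Lucene.Net 3.0.3 setup.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes;

public static class TokenStreamDemo
{
    // Hypothetical helper: returns the term text of every token in `text`.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();

        // Step 1: instantiate the stream. The tokenizer registers its
        // attributes with the AttributeSource in its constructor.
        TokenStream stream = new WhitespaceTokenizer(new StringReader(text));
        try
        {
            // Step 2: reset the stream before consuming it.
            stream.Reset();

            // Step 3: fetch the attribute once and keep a local reference.
            // The same instance is reused for every token.
            ITermAttribute termAtt = stream.AddAttribute<ITermAttribute>();

            // Step 4: advance until IncrementToken() returns false,
            // reading the attribute after each successful call.
            while (stream.IncrementToken())
            {
                terms.Add(termAtt.Term);
            }

            // Step 5: let the stream perform end-of-stream operations.
            stream.End();
        }
        finally
        {
            // Step 6: release any resources held by the stream.
            stream.Close();
        }
        return terms;
    }

    public static void Main()
    {
        foreach (string term in Tokenize("a quick example"))
        {
            Console.WriteLine(term);
        }
    }
}
```

Because the attribute instance is shared across tokens, the sketch copies `termAtt.Term` into the list inside the loop instead of holding on to the attribute itself. A consumer that needs to buffer complete token state would use AttributeSource.CaptureState and AttributeSource.RestoreState.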

Namespaces
package	Lucene.Net.Analysis

Typedefs
using	Document = Lucene.Net.Documents.Document

using	Field = Lucene.Net.Documents.Field

using	IndexWriter = Lucene.Net.Index.IndexWriter

using	AttributeSource = Lucene.Net.Util.AttributeSource

Typedef Documentation

using AttributeSource = Lucene.Net.Util.AttributeSource

Definition at line 23 of file TokenStream.cs.

using Document = Lucene.Net.Documents.Document

Definition at line 20 of file TokenStream.cs.

using Field = Lucene.Net.Documents.Field

Definition at line 21 of file TokenStream.cs.

using IndexWriter = Lucene.Net.Index.IndexWriter

Definition at line 22 of file TokenStream.cs.