    Namespace Lucene.Net.Analysis.Util

    Classes

    CharArrayIterator

    A CharacterIterator used internally with ICU4N.Text.BreakIterator.

    This is a Lucene.NET INTERNAL API; use at your own risk.

    SegmentingTokenizerBase

    Breaks text into sentences with an ICU4N.Text.BreakIterator and allows subclasses to decompose these sentences into words.

    This can be used by subclasses that need sentence context for tokenization purposes, such as CJK segmenters.

    Additionally, it can be used by subclasses that want to mark sentence boundaries (with a custom attribute, extra token, position increment, etc.) for downstream processing.

    This is a Lucene.NET EXPERIMENTAL API; use at your own risk.
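
    The two-level approach described for SegmentingTokenizerBase (find sentence boundaries first, then break each sentence into words) can be illustrated with a minimal, self-contained sketch. It does not use the Lucene.NET or ICU4N API: it is written in Java against java.text.BreakIterator as a stand-in for ICU4N.Text.BreakIterator, and the class name SentenceWordSketch and its methods segment and words are hypothetical names chosen for illustration only.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not part of Lucene.NET or ICU4N.
public class SentenceWordSketch {

    /** Splits text into sentences, then splits each sentence into word tokens. */
    public static List<List<String>> segment(String text) {
        List<List<String>> sentences = new ArrayList<>();
        BreakIterator sentenceIt = BreakIterator.getSentenceInstance(Locale.ROOT);
        sentenceIt.setText(text);

        // Walk consecutive sentence boundaries: [start, end) is one sentence.
        int start = sentenceIt.first();
        for (int end = sentenceIt.next(); end != BreakIterator.DONE;
                start = end, end = sentenceIt.next()) {
            sentences.add(words(text.substring(start, end)));
        }
        return sentences;
    }

    /** Word-level pass over a single sentence; this is where sentence context is available. */
    private static List<String> words(String sentence) {
        List<String> tokens = new ArrayList<>();
        BreakIterator wordIt = BreakIterator.getWordInstance(Locale.ROOT);
        wordIt.setText(sentence);

        int start = wordIt.first();
        for (int end = wordIt.next(); end != BreakIterator.DONE;
                start = end, end = wordIt.next()) {
            String candidate = sentence.substring(start, end);
            // Keep only spans containing a letter or digit; skip spaces and punctuation.
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(candidate);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints one list of word tokens per detected sentence.
        for (List<String> sentence : segment("Hello world. This is a second sentence.")) {
            System.out.println(sentence);
        }
    }
}
```

    Because the word pass runs once per sentence, the outer loop is also a natural place to record a sentence boundary, for example by emitting an extra token or adjusting a position increment, as the description above suggests.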

    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)