    Namespace Lucene.Net.Analysis.Util

    Classes

    CharArrayIterator

    A CharacterIterator used internally with ICU4N.Text.BreakIterator.

    Note

    This API is for internal purposes only and might change in incompatible ways in the next release.

    SegmentingTokenizerBase

    Breaks text into sentences with an ICU4N.Text.BreakIterator and allows subclasses to decompose these sentences into words.

    This can be used by subclasses that need sentence context for tokenization purposes, such as CJK segmenters.

    Additionally, it can be used by subclasses that want to mark sentence boundaries (with a custom attribute, extra token, position increment, etc.) for downstream processing.

    Note

    This API is experimental and might change in incompatible ways in the next release.
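
    The two-stage idea behind SegmentingTokenizerBase (find sentence boundaries first, then split each sentence into words) can be illustrated with a minimal sketch. It is not the Lucene.Net implementation: it assumes Java's java.text.BreakIterator as a stand-in for ICU4N.Text.BreakIterator, and the class name SentenceWordSketch and the methods segment and words are hypothetical names chosen for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Lucene.Net or ICU4N.
final class SentenceWordSketch {

    // Stage 1: split the text into sentences, then hand each sentence to words().
    static List<List<String>> segment(String text) {
        List<List<String>> sentences = new ArrayList<>();
        BreakIterator sentenceBreaks = BreakIterator.getSentenceInstance(Locale.ROOT);
        sentenceBreaks.setText(text);
        int start = sentenceBreaks.first();
        for (int end = sentenceBreaks.next(); end != BreakIterator.DONE;
                start = end, end = sentenceBreaks.next()) {
            sentences.add(words(text.substring(start, end)));
        }
        return sentences;
    }

    // Stage 2: decompose one sentence into words. A real subclass could use
    // any strategy here (for example, a dictionary-based CJK segmenter).
    static List<String> words(String sentence) {
        List<String> words = new ArrayList<>();
        BreakIterator wordBreaks = BreakIterator.getWordInstance(Locale.ROOT);
        wordBreaks.setText(sentence);
        int start = wordBreaks.first();
        for (int end = wordBreaks.next(); end != BreakIterator.DONE;
                start = end, end = wordBreaks.next()) {
            String candidate = sentence.substring(start, end);
            // Keep only spans containing a letter or digit; this drops
            // whitespace and punctuation spans.
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                words.add(candidate);
            }
        }
        return words;
    }
}
```

    Because the word stage always receives one whole sentence, it has the sentence context available when deciding where words begin and end, and the boundary between two inner lists is the point where a sentence-boundary marker could be emitted.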

    Copyright © 2022 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.