
    Class CharacterUtils

    CharacterUtils provides a unified interface to character-related operations. The concrete implementation is selected by a LuceneVersion instance, so that character handling stays backwards compatible with earlier versions.

    This is a Lucene.NET INTERNAL API; use at your own risk.
    Inheritance
    System.Object
    CharacterUtils
    Namespace: Lucene.Net.Analysis.Util
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public abstract class CharacterUtils : object

    Methods


    CodePointAt(ICharSequence, Int32)

    Declaration
    public abstract int CodePointAt(ICharSequence seq, int offset)
    Parameters
    Type Name Description
    ICharSequence seq
    System.Int32 offset
    Returns
    Type Description
    System.Int32

    CodePointAt(Char[], Int32, Int32)

    Returns the code point at the given index of the char array, where only elements with an index less than the limit are used. Depending on the LuceneVersion passed to GetInstance(LuceneVersion), this method mimics the behavior of Character.CodePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

    Declaration
    public abstract int CodePointAt(char[] chars, int offset, int limit)
    Parameters
    Type Name Description
    System.Char[] chars

    a character array

    System.Int32 offset

    the offset to the char values in the chars array to be converted

    System.Int32 limit

    the index after the last element that should be used to calculate the code point.

    Returns
    Type Description
    System.Int32

    the Unicode code point at the given index
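
Since the description above defines this method in terms of Java's Character.codePointAt, the effect of the limit parameter can be illustrated with a minimal Java sketch of that reference behavior. The class name CodePointAtSketch is hypothetical; this is not the Lucene.NET implementation, and it only reflects the behavior on a Java 5 or later JVM.

```java
class CodePointAtSketch {
    // Reads the code point starting at `offset`, never looking at `limit` or beyond.
    static int codePointAt(char[] chars, int offset, int limit) {
        return Character.codePointAt(chars, offset, limit);
    }

    public static void main(String[] args) {
        // U+1F600 is encoded in UTF-16 as the surrogate pair D83D DE00.
        char[] chars = {'a', '\uD83D', '\uDE00'};
        // With limit 3 the low surrogate is visible, so the full code point is returned.
        System.out.println(Integer.toHexString(codePointAt(chars, 1, 3)));
        // With limit 2 the low surrogate is cut off, so only the high surrogate is returned.
        System.out.println(Integer.toHexString(codePointAt(chars, 1, 2)));
    }
}
```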


    CodePointAt(String, Int32)

    Returns the code point at the given index of the string. Depending on the LuceneVersion passed to GetInstance(LuceneVersion), this method mimics the behavior of Character.CodePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

    Declaration
    public abstract int CodePointAt(string seq, int offset)
    Parameters
    Type Name Description
    System.String seq

    a character sequence

    System.Int32 offset

    the offset to the char values in the character sequence to be converted

    Returns
    Type Description
    System.Int32

    the Unicode code point at the given index


    CodePointCount(String)

    Returns the number of characters (Unicode code points) in seq.

    Declaration
    public abstract int CodePointCount(string seq)
    Parameters
    Type Name Description
    System.String seq
    Returns
    Type Description
    System.Int32
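
The difference between the char length of a sequence and its code point count can be shown with a short Java sketch. It uses String.codePointCount from the Java standard library as a stand-in; the class name CodePointCountSketch is hypothetical and this is not the Lucene.NET implementation.

```java
class CodePointCountSketch {
    // Counts code points, so a surrogate pair contributes 1 rather than 2.
    static int codePointCount(String seq) {
        return seq.codePointCount(0, seq.length());
    }

    public static void main(String[] args) {
        String seq = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        System.out.println("chars: " + seq.length());
        System.out.println("code points: " + codePointCount(seq));
    }
}
```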

    Fill(CharacterUtils.CharacterBuffer, TextReader)

    Convenience method which calls Fill(buffer, reader, buffer.Buffer.Length).

    Declaration
    public virtual bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader)
    Parameters
    Type Name Description
    CharacterUtils.CharacterBuffer buffer
    TextReader reader
    Returns
    Type Description
    System.Boolean

    Fill(CharacterUtils.CharacterBuffer, TextReader, Int32)

    Fills the CharacterUtils.CharacterBuffer with characters read from the given reader. This method tries to read numChars characters into the CharacterUtils.CharacterBuffer; each call to Fill starts filling the buffer from offset 0 up to numChars. Because a code point can span two .NET characters (a surrogate pair), this method may fill only numChars - 1 characters in order not to split a surrogate pair, even if there are remaining characters in the reader.

    Depending on the LuceneVersion passed to GetInstance(LuceneVersion), this method implements supplementary character awareness when filling the given buffer. For all LuceneVersion > 3.0, Fill(CharacterUtils.CharacterBuffer, TextReader, Int32) guarantees that the given CharacterUtils.CharacterBuffer will never contain a high surrogate character as the last element in the buffer unless it is the last available character in the reader. In other words, high and low surrogate pairs will always be preserved across buffer borders.

    A return value of false means that this method call exhausted the reader, but some characters may still have been read; this can be verified by checking whether buffer.Length > 0.

    Declaration
    public abstract bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader, int numChars)
    Parameters
    Type Name Description
    CharacterUtils.CharacterBuffer buffer

    the buffer to fill.

    TextReader reader

    the reader to read characters from.

    System.Int32 numChars

    the number of chars to read

    Returns
    Type Description
    System.Boolean

    false if and only if reader.read returned -1 while trying to fill the buffer
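
The surrogate-preserving behavior described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the Lucene.NET implementation: the Buf class is a hypothetical stand-in for CharacterUtils.CharacterBuffer, and holding a trailing high surrogate back for the next call is one plausible way to meet the stated guarantee.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SurrogateAwareFill {
    /** Hypothetical stand-in for CharacterUtils.CharacterBuffer. */
    static final class Buf {
        final char[] buffer;
        int length;          // number of valid chars after the last fill
        char heldHigh;       // high surrogate held back by the last fill
        boolean hasHeldHigh;

        Buf(int size) { buffer = new char[size]; }
    }

    /** Returns false only if the reader hit end of input before numChars chars were read. */
    static boolean fill(Buf buf, Reader reader, int numChars) {
        if (numChars < 2 || numChars > buf.buffer.length) {
            throw new IllegalArgumentException("numChars must be >= 2 and <= the buffer size");
        }
        int filled = 0;
        if (buf.hasHeldHigh) { // re-install the surrogate held back by the previous call
            buf.buffer[filled++] = buf.heldHigh;
            buf.hasHeldHigh = false;
        }
        boolean exhausted = false;
        try {
            while (filled < numChars) {
                int read = reader.read(buf.buffer, filled, numChars - filled);
                if (read == -1) { exhausted = true; break; }
                filled += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // A full buffer must not end on a high surrogate: hold it back for the next call.
        if (!exhausted && Character.isHighSurrogate(buf.buffer[filled - 1])) {
            buf.heldHigh = buf.buffer[--filled];
            buf.hasHeldHigh = true;
        }
        buf.length = filled;
        return !exhausted;
    }

    public static void main(String[] args) {
        Buf buf = new Buf(3);
        Reader reader = new StringReader("ab\uD83D\uDE00c");
        while (true) {
            boolean more = fill(buf, reader, 3);
            System.out.println(buf.length + " chars, more=" + more);
            if (!more) break;
        }
    }
}
```

In this sketch the first call yields only two chars even though three were requested, because the third char read is a high surrogate whose partner has not arrived yet.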

    GetInstance(LuceneVersion)

    Returns a CharacterUtils implementation according to the given LuceneVersion instance.

    Declaration
    public static CharacterUtils GetInstance(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    LuceneVersion matchVersion

    a version instance

    Returns
    Type Description
    CharacterUtils

    a CharacterUtils implementation according to the given LuceneVersion instance.


    GetJava4Instance(LuceneVersion)

    Returns a CharacterUtils instance compatible with Java 1.4.

    Declaration
    public static CharacterUtils GetJava4Instance(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    LuceneVersion matchVersion
    Returns
    Type Description
    CharacterUtils

    NewCharacterBuffer(Int32)

    Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize.

    Declaration
    public static CharacterUtils.CharacterBuffer NewCharacterBuffer(int bufferSize)
    Parameters
    Type Name Description
    System.Int32 bufferSize

    the internal char buffer size, must be >= 2

    Returns
    Type Description
    CharacterUtils.CharacterBuffer

    a new CharacterUtils.CharacterBuffer instance.


    OffsetByCodePoints(Char[], Int32, Int32, Int32, Int32)

    Returns the index within buf[start:start+count] that is offset from index by offset code points.

    Declaration
    public abstract int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
    Parameters
    Type Name Description
    System.Char[] buf
    System.Int32 start
    System.Int32 count
    System.Int32 index
    System.Int32 offset
    Returns
    Type Description
    System.Int32
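
The five parameters map directly onto Java's Character.offsetByCodePoints(char[], int, int, int, int), so a short Java sketch of that reference behavior may help. The class name OffsetByCodePointsSketch is hypothetical; this is not the Lucene.NET implementation.

```java
class OffsetByCodePointsSketch {
    // Moves `offset` code points away from `index`, staying inside buf[start, start + count).
    static int offsetByCodePoints(char[] buf, int start, int count, int index, int offset) {
        return Character.offsetByCodePoints(buf, start, count, index, offset);
    }

    public static void main(String[] args) {
        char[] buf = {'a', '\uD83D', '\uDE00', 'b'}; // 'a', U+1F600, 'b'
        // Two code points forward from index 0: skips 'a' (1 char) and the pair (2 chars).
        System.out.println(offsetByCodePoints(buf, 0, buf.length, 0, 2));
        // One code point backward from index 3: steps over the whole surrogate pair.
        System.out.println(offsetByCodePoints(buf, 0, buf.length, 3, -1));
    }
}
```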

    ToChars(Int32[], Int32, Int32, Char[], Int32)

    Converts a sequence of Unicode code points to a sequence of .NET characters.

    Declaration
    public int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
    Parameters
    Type Name Description
    System.Int32[] src
    System.Int32 srcOff
    System.Int32 srcLen
    System.Char[] dest
    System.Int32 destOff
    Returns
    Type Description
    System.Int32

    the number of chars written to the destination buffer
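
A minimal Java sketch of this kind of conversion is shown below. It assumes that srcLen counts code points and that each supplementary code point expands to two chars; the class name ToCharsSketch is hypothetical and this is not the Lucene.NET implementation.

```java
class ToCharsSketch {
    // Encodes srcLen code points from src[srcOff..] as UTF-16 into dest[destOff..].
    static int toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff) {
        int written = 0;
        for (int i = 0; i < srcLen; i++) {
            // Character.toChars writes 1 or 2 chars and returns how many it wrote.
            written += Character.toChars(src[srcOff + i], dest, destOff + written);
        }
        return written;
    }

    public static void main(String[] args) {
        int[] codePoints = {'a', 0x1F600};
        char[] dest = new char[codePoints.length * 2]; // worst case: 2 chars per code point
        int written = toChars(codePoints, 0, codePoints.length, dest, 0);
        System.out.println(written + " chars written");
    }
}
```

Sizing dest at twice the number of code points is a simple way to guarantee enough room.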


    ToCodePoints(Char[], Int32, Int32, Int32[], Int32)

    Converts a sequence of .NET characters to a sequence of Unicode code points.

    Declaration
    public int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
    Parameters
    Type Name Description
    System.Char[] src
    System.Int32 srcOff
    System.Int32 srcLen
    System.Int32[] dest
    System.Int32 destOff
    Returns
    Type Description
    System.Int32

    the number of code points written to the destination buffer
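
The reverse direction can be sketched in Java in the same way. It assumes that srcLen counts chars and that a surrogate pair inside the range collapses into a single code point; the class name ToCodePointsSketch is hypothetical and this is not the Lucene.NET implementation.

```java
class ToCodePointsSketch {
    // Decodes srcLen UTF-16 chars from src[srcOff..] into code points in dest[destOff..].
    static int toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff) {
        int count = 0;
        for (int i = 0; i < srcLen; ) {
            int cp = Character.codePointAt(src, srcOff + i, srcOff + srcLen);
            i += Character.charCount(cp); // advance by 1 or 2 chars
            dest[destOff + count++] = cp;
        }
        return count;
    }

    public static void main(String[] args) {
        char[] src = {'a', '\uD83D', '\uDE00'};
        int[] dest = new int[src.length]; // never more code points than chars
        int count = toCodePoints(src, 0, src.length, dest, 0);
        System.out.println(count + " code points written");
    }
}
```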


    ToLower(Char[], Int32, Int32)

    Converts each Unicode code point to lowercase, starting at the given offset.

    Declaration
    public virtual void ToLower(char[] buffer, int offset, int limit)
    Parameters
    Type Name Description
    System.Char[] buffer

    the char buffer to lowercase

    System.Int32 offset

    the offset to start at

    System.Int32 limit

    the max char in the buffer to lower case
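
Case conversion per code point, rather than per char, matters for supplementary characters that have case mappings. The Java sketch below illustrates the idea under the assumption that limit is exclusive; the class name ToLowerSketch is hypothetical and this is not the Lucene.NET implementation.

```java
class ToLowerSketch {
    // Lowercases buffer[offset, limit) in place, one code point at a time.
    static void toLower(char[] buffer, int offset, int limit) {
        for (int i = offset; i < limit; ) {
            int lower = Character.toLowerCase(Character.codePointAt(buffer, i, limit));
            // Write the result back and advance by the number of chars it occupies.
            i += Character.toChars(lower, buffer, i);
        }
    }

    public static void main(String[] args) {
        // U+10400 (DESERET CAPITAL LETTER LONG I) is the surrogate pair D801 DC00.
        char[] buffer = {'\uD801', '\uDC00', 'B'};
        toLower(buffer, 0, buffer.length);
        System.out.println(Integer.toHexString(Character.codePointAt(buffer, 0)));
    }
}
```

A char-by-char loop would leave the Deseret letter unchanged, because neither surrogate has a lowercase mapping on its own.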


    ToUpper(Char[], Int32, Int32)

    Converts each Unicode code point to uppercase, starting at the given offset.

    Declaration
    public virtual void ToUpper(char[] buffer, int offset, int limit)
    Parameters
    Type Name Description
    System.Char[] buffer

    the char buffer to uppercase

    System.Int32 offset

    the offset to start at

    System.Int32 limit

    the max char in the buffer to upper case

    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)