    Class CharacterUtils

    CharacterUtils provides a unified interface to character-related operations, so that callers get backwards-compatible character handling based on a Lucene.Net.Util.LuceneVersion instance.

    This is a Lucene.NET INTERNAL API; use at your own risk.
    Inheritance
    System.Object
    CharacterUtils
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Analysis.Util
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public abstract class CharacterUtils

    Methods

    CodePointAt(ICharSequence, Int32)

    Returns the code point at the given index of the J2N.Text.ICharSequence. Depending on the Lucene.Net.Util.LuceneVersion passed to GetInstance(LuceneVersion), this method mimics the behavior of Java's Character.codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

    Declaration
    public abstract int CodePointAt(ICharSequence seq, int offset)
    Parameters
    Type Name Description
    J2N.Text.ICharSequence seq

    a character sequence

    System.Int32 offset

    the offset of the char value in the sequence to be converted

    Returns
    Type Description
    System.Int32

    the Unicode code point at the given index

    Exceptions
    Type Condition
    System.NullReferenceException
    • if the sequence is null.
    System.ArgumentOutOfRangeException
    • if the value offset is negative or not less than the length of the character sequence.

    CodePointAt(Char[], Int32, Int32)

    Returns the code point at the given index of the char array, where only elements with an index less than the limit are used. Depending on the Lucene.Net.Util.LuceneVersion passed to GetInstance(LuceneVersion), this method mimics the behavior of Java's Character.codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

    Declaration
    public abstract int CodePointAt(char[] chars, int offset, int limit)
    Parameters
    Type Name Description
    System.Char[] chars

    a character array

    System.Int32 offset

    the offset to the char values in the chars array to be converted

    System.Int32 limit

    the index after the last element that should be used to calculate the code point.

    Returns
    Type Description
    System.Int32

    the Unicode code point at the given index

    Exceptions
    Type Condition
    System.NullReferenceException
    • if the array is null.
    System.ArgumentOutOfRangeException
    • if the value offset is negative or not less than the length of the char array.
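
    The effect of the limit parameter, and the difference between the two behaviors described above, can be sketched with plain System.Char helpers. This is an illustrative sketch only, not the Lucene.NET implementation: the class CodePointAtSketch and both method names are hypothetical, and treating offset >= limit as out of range is an assumption.

```csharp
using System;

public static class CodePointAtSketch
{
    // Surrogate-aware lookup: a high/low surrogate pair is combined into a
    // single code point, but only if the low surrogate lies before 'limit'.
    public static int CodePointAt(char[] chars, int offset, int limit)
    {
        // Assumption: an offset at or beyond the limit is rejected.
        if (offset < 0 || offset >= limit || limit > chars.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        char high = chars[offset];
        if (char.IsHighSurrogate(high) && offset + 1 < limit)
        {
            char low = chars[offset + 1];
            if (char.IsLowSurrogate(low))
                return char.ConvertToUtf32(high, low);
        }
        return high; // BMP character or unpaired surrogate
    }

    // Java 1.4-style lookup: every UTF-16 code unit is its own "code point".
    public static int CodePointAtLegacy(char[] chars, int offset) => chars[offset];
}
```

    For the array { 'a', '\uD83D', '\uDE00' }, the surrogate-aware lookup at offset 1 with limit 3 yields U+1F600, while a limit of 2 hides the low surrogate and yields the lone high surrogate U+D83D, the same value the legacy lookup returns.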

    CodePointAt(String, Int32)

    Returns the code point at the given index of the System.String. Depending on the Lucene.Net.Util.LuceneVersion passed to GetInstance(LuceneVersion), this method mimics the behavior of Java's Character.codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.

    Declaration
    public abstract int CodePointAt(string seq, int offset)
    Parameters
    Type Name Description
    System.String seq

    a character sequence

    System.Int32 offset

    the offset of the char value in the string to be converted

    Returns
    Type Description
    System.Int32

    the Unicode code point at the given index

    Exceptions
    Type Condition
    System.NullReferenceException
    • if the sequence is null.
    System.ArgumentOutOfRangeException
    • if the value offset is negative or not less than the length of the character sequence.

    CodePointCount(ICharSequence)

    Returns the number of Unicode code points in seq.

    Declaration
    public abstract int CodePointCount(ICharSequence seq)
    Parameters
    Type Name Description
    J2N.Text.ICharSequence seq
    Returns
    Type Description
    System.Int32

    CodePointCount(Char[])

    Returns the number of Unicode code points in seq.

    Declaration
    public abstract int CodePointCount(char[] seq)
    Parameters
    Type Name Description
    System.Char[] seq
    Returns
    Type Description
    System.Int32

    CodePointCount(String)

    Returns the number of Unicode code points in seq.

    Declaration
    public abstract int CodePointCount(string seq)
    Parameters
    Type Name Description
    System.String seq
    Returns
    Type Description
    System.Int32

    CodePointCount(StringBuilder)

    Returns the number of Unicode code points in seq.

    Declaration
    public abstract int CodePointCount(StringBuilder seq)
    Parameters
    Type Name Description
    System.Text.StringBuilder seq
    Returns
    Type Description
    System.Int32
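
    Counting code points differs from counting chars whenever the input contains surrogate pairs. The following is a minimal sketch of surrogate-aware counting for the string overload, using only System.Char helpers; the class CodePointCountSketch is hypothetical and is not the Lucene.NET implementation.

```csharp
public static class CodePointCountSketch
{
    // Counts code points: a valid high/low surrogate pair counts once,
    // every other char (including an unpaired surrogate) counts once.
    public static int CodePointCount(string seq)
    {
        int count = 0;
        for (int i = 0; i < seq.Length; i++)
        {
            if (char.IsHighSurrogate(seq[i])
                && i + 1 < seq.Length
                && char.IsLowSurrogate(seq[i + 1]))
            {
                i++; // skip the low surrogate of the pair
            }
            count++;
        }
        return count;
    }
}
```

    With this sketch, "a\uD83D\uDE00b" has four chars but three code points.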

    Fill(CharacterUtils.CharacterBuffer, TextReader)

    Convenience method which calls Fill(buffer, reader, buffer.Buffer.Length).

    Declaration
    public virtual bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader)
    Parameters
    Type Name Description
    CharacterUtils.CharacterBuffer buffer
    System.IO.TextReader reader
    Returns
    Type Description
    System.Boolean

    Fill(CharacterUtils.CharacterBuffer, TextReader, Int32)

    Fills the CharacterUtils.CharacterBuffer with characters read from the given System.IO.TextReader. This method tries to read numChars characters into the CharacterUtils.CharacterBuffer; each call to Fill starts filling the buffer at offset 0 and continues up to numChars. When code points can span two chars (a surrogate pair), this method may fill only numChars - 1 characters so that it does not split a surrogate pair, even if characters remain in the System.IO.TextReader.

    Depending on the Lucene.Net.Util.LuceneVersion passed to GetInstance(LuceneVersion), this method implements supplementary character awareness when filling the given buffer. For all Lucene.Net.Util.LuceneVersion > 3.0, Fill(CharacterUtils.CharacterBuffer, TextReader, Int32) guarantees that the given CharacterUtils.CharacterBuffer will never contain a high surrogate character as the last element in the buffer unless it is the last available character in the reader. In other words, high and low surrogate pairs will always be preserved across buffer borders.

    A return value of false means that this method call exhausted the reader. Some characters may still have been read, which can be verified by checking whether buffer.Length > 0.

    Declaration
    public abstract bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader, int numChars)
    Parameters
    Type Name Description
    CharacterUtils.CharacterBuffer buffer

    the buffer to fill.

    System.IO.TextReader reader

    the reader to read characters from.

    System.Int32 numChars

    the number of chars to read

    Returns
    Type Description
    System.Boolean
    false if and only if the reader was exhausted while trying to fill the buffer
    Exceptions
    Type Condition
    System.IO.IOException

    if the reader throws a System.IO.IOException.
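
    The surrogate-preserving guarantee described above can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: SketchBuffer is a hypothetical stand-in for CharacterUtils.CharacterBuffer, and carrying a trailing high surrogate over to the next call is one possible way to honor the guarantee, not necessarily how Lucene.NET does it.

```csharp
using System;
using System.IO;

public static class FillSketch
{
    // Hypothetical stand-in for CharacterUtils.CharacterBuffer.
    public sealed class SketchBuffer
    {
        public char[] Buffer;
        public int Length;
        public char PendingHighSurrogate; // '\0' when nothing is carried over

        public SketchBuffer(int size) { Buffer = new char[size]; }
    }

    // Returns false once the reader is exhausted; buffer.Length may still be > 0.
    public static bool Fill(SketchBuffer buffer, TextReader reader, int numChars)
    {
        if (numChars < 2 || numChars > buffer.Buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(numChars));

        int offset = 0;
        if (buffer.PendingHighSurrogate != '\0')
        {
            // Re-insert the high surrogate held back by the previous call.
            buffer.Buffer[offset++] = buffer.PendingHighSurrogate;
            buffer.PendingHighSurrogate = '\0';
        }

        bool exhausted = false;
        while (offset < numChars)
        {
            // TextReader.Read(char[], int, int) returns 0 at the end of input.
            int read = reader.Read(buffer.Buffer, offset, numChars - offset);
            if (read <= 0) { exhausted = true; break; }
            offset += read;
        }

        buffer.Length = offset;
        if (!exhausted && char.IsHighSurrogate(buffer.Buffer[offset - 1]))
        {
            // Hold the high surrogate back so the pair is not split.
            buffer.PendingHighSurrogate = buffer.Buffer[offset - 1];
            buffer.Length = offset - 1;
        }
        return !exhausted;
    }
}
```

    Filling a three-char buffer from "ab\uD83D\uDE00c" with this sketch yields "ab" on the first call (numChars - 1 characters), then the complete surrogate pair followed by "c" on the second call.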

    GetInstance(LuceneVersion)

    Returns a CharacterUtils implementation according to the given Lucene.Net.Util.LuceneVersion instance.

    Declaration
    public static CharacterUtils GetInstance(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion matchVersion

    a version instance

    Returns
    Type Description
    CharacterUtils

    a CharacterUtils implementation according to the given Lucene.Net.Util.LuceneVersion instance.

    GetJava4Instance(LuceneVersion)

    Returns a CharacterUtils instance compatible with Java 1.4.

    Declaration
    public static CharacterUtils GetJava4Instance(LuceneVersion matchVersion)
    Parameters
    Type Name Description
    Lucene.Net.Util.LuceneVersion matchVersion
    Returns
    Type Description
    CharacterUtils

    NewCharacterBuffer(Int32)

    Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize.

    Declaration
    public static CharacterUtils.CharacterBuffer NewCharacterBuffer(int bufferSize)
    Parameters
    Type Name Description
    System.Int32 bufferSize

    the internal char buffer size, must be >= 2

    Returns
    Type Description
    CharacterUtils.CharacterBuffer

    a new CharacterUtils.CharacterBuffer instance.

    OffsetByCodePoints(Char[], Int32, Int32, Int32, Int32)

    Returns the index within buf[start:start+count] that is offset code points away from index.

    Declaration
    public abstract int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
    Parameters
    Type Name Description
    System.Char[] buf
    System.Int32 start
    System.Int32 count
    System.Int32 index
    System.Int32 offset
    Returns
    Type Description
    System.Int32
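
    To make the index arithmetic concrete, here is a minimal surrogate-aware sketch that walks forward or backward by whole code points within buf[start:start+count]. The class OffsetSketch is hypothetical, and the bounds handling (throwing when the walk would leave the window) is an assumption modeled on Java's Character.offsetByCodePoints, not a statement about the Lucene.NET implementation.

```csharp
using System;

public static class OffsetSketch
{
    public static int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
    {
        int end = start + count;
        if (index < start || index > end)
            throw new ArgumentOutOfRangeException(nameof(index));

        int i = index;
        if (offset >= 0)
        {
            // Walk forward, stepping over a surrogate pair as one unit.
            for (int n = 0; n < offset; n++)
            {
                if (i >= end) throw new ArgumentOutOfRangeException(nameof(offset));
                if (char.IsHighSurrogate(buf[i]) && i + 1 < end && char.IsLowSurrogate(buf[i + 1]))
                    i++;
                i++;
            }
        }
        else
        {
            // Walk backward, stepping over a surrogate pair as one unit.
            for (int n = offset; n < 0; n++)
            {
                if (i <= start) throw new ArgumentOutOfRangeException(nameof(offset));
                i--;
                if (char.IsLowSurrogate(buf[i]) && i > start && char.IsHighSurrogate(buf[i - 1]))
                    i--;
            }
        }
        return i;
    }
}
```

    For the chars of "a\uD83D\uDE00b", moving two code points forward from index 0 lands on index 3, and moving two code points backward from index 4 lands on index 1.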

    ToChars(Int32[], Int32, Int32, Char[], Int32)

    Converts a sequence of Unicode code points to a sequence of .NET characters.

    Declaration
    public int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
    Parameters
    Type Name Description
    System.Int32[] src
    System.Int32 srcOff
    System.Int32 srcLen
    System.Char[] dest
    System.Int32 destOff
    Returns
    Type Description
    System.Int32

    the number of chars written to the destination buffer

    ToCodePoints(Char[], Int32, Int32, Int32[], Int32)

    Converts a sequence of .NET characters to a sequence of Unicode code points.

    Declaration
    public int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
    Parameters
    Type Name Description
    System.Char[] src
    System.Int32 srcOff
    System.Int32 srcLen
    System.Int32[] dest
    System.Int32 destOff
    Returns
    Type Description
    System.Int32

    The number of code points written to the destination buffer.
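
    The two conversions are inverses of each other for well-formed input. The sketch below shows the underlying UTF-16 arithmetic with the same parameter layout as the declarations above. The class ConversionSketch is hypothetical, and passing unpaired surrogates through unchanged is an assumption rather than documented Lucene.NET behavior.

```csharp
public static class ConversionSketch
{
    // chars -> code points; returns the number of code points written.
    public static int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
    {
        int written = 0;
        int end = srcOff + srcLen;
        for (int i = srcOff; i < end; )
        {
            int cp;
            if (char.IsHighSurrogate(src[i]) && i + 1 < end && char.IsLowSurrogate(src[i + 1]))
            {
                cp = char.ConvertToUtf32(src[i], src[i + 1]);
                i += 2;
            }
            else
            {
                cp = src[i]; // BMP char or unpaired surrogate, copied as-is
                i++;
            }
            dest[destOff + written++] = cp;
        }
        return written;
    }

    // code points -> chars; returns the number of chars written.
    public static int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
    {
        int written = 0;
        for (int i = 0; i < srcLen; i++)
        {
            int cp = src[srcOff + i];
            if (cp >= 0x10000)
            {
                // Supplementary code point: split into a surrogate pair.
                int v = cp - 0x10000;
                dest[destOff + written++] = (char)(0xD800 + (v >> 10));
                dest[destOff + written++] = (char)(0xDC00 + (v & 0x3FF));
            }
            else
            {
                dest[destOff + written++] = (char)cp;
            }
        }
        return written;
    }
}
```

    In both directions the destination must be large enough: at most srcLen entries for ToCodePoints and at most 2 * srcLen chars for ToChars.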

    ToLower(Char[], Int32, Int32)

    Converts each Unicode code point to lowercase via System.Globalization.TextInfo.ToLower(System.String) in the invariant culture, starting at the given offset.

    Declaration
    public virtual void ToLower(char[] buffer, int offset, int length)
    Parameters
    Type Name Description
    System.Char[] buffer

    the char buffer to lowercase

    System.Int32 offset

    the offset to start at

    System.Int32 length

    the number of characters in the buffer to lower case

    ToUpper(Char[], Int32, Int32)

    Converts each Unicode code point to uppercase via System.Globalization.TextInfo.ToUpper(System.String) in the invariant culture, starting at the given offset.

    Declaration
    public virtual void ToUpper(char[] buffer, int offset, int length)
    Parameters
    Type Name Description
    System.Char[] buffer

    the char buffer to uppercase

    System.Int32 offset

    the offset to start at

    System.Int32 length

    the number of characters in the buffer to upper case
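
    In-place case conversion of a buffer slice with the invariant culture can be sketched as follows. The class CaseSketch is hypothetical. The sketch converts the whole slice through a temporary string instead of going code point by code point, and it assumes the invariant-culture mapping preserves the slice length, checking that assumption at run time.

```csharp
using System;
using System.Globalization;

public static class CaseSketch
{
    // Lowercases buffer[offset .. offset + length) in place.
    public static void ToLower(char[] buffer, int offset, int length)
    {
        string lowered = CultureInfo.InvariantCulture.TextInfo
            .ToLower(new string(buffer, offset, length));
        CopyBack(lowered, buffer, offset, length);
    }

    // Uppercases buffer[offset .. offset + length) in place.
    public static void ToUpper(char[] buffer, int offset, int length)
    {
        string raised = CultureInfo.InvariantCulture.TextInfo
            .ToUpper(new string(buffer, offset, length));
        CopyBack(raised, buffer, offset, length);
    }

    private static void CopyBack(string converted, char[] buffer, int offset, int length)
    {
        // Assumption: the conversion does not change the number of chars.
        if (converted.Length != length)
            throw new InvalidOperationException("Case conversion changed the length.");
        converted.CopyTo(0, buffer, offset, length);
    }
}
```

    Characters outside the slice are left untouched, so lowercasing the first five chars of "HeLLo World" produces "hello World".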

    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.