Class CharacterUtils
CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations based on a Lucene.Net.Util.LuceneVersion instance.
Namespace: Lucene.Net.Analysis.Util
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
public abstract class CharacterUtils
Methods
CodePointAt(ICharSequence, Int32)
Returns the code point at the given index of the J2N.Text.ICharSequence.
Depending on the Lucene.Net.Util.LuceneVersion passed to
GetInstance(LuceneVersion) this method mimics the behavior
of Character.CodePointAt(char[], int) as it would have been
available on a Java 1.4 JVM or on a later virtual machine version.
Declaration
public abstract int CodePointAt(ICharSequence seq, int offset)
Parameters
| Type | Name | Description | 
|---|---|---|
| J2N.Text.ICharSequence | seq | a character sequence | 
| System.Int32 | offset | the offset of the char value in the sequence to be converted |
Returns
| Type | Description | 
|---|---|
| System.Int32 | the Unicode code point at the given index | 
Exceptions
| Type | Condition | 
|---|---|
| System.NullReferenceException | |
| System.ArgumentOutOfRangeException | |
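The difference between the two behaviors described above can be sketched with BCL calls only. This is a minimal illustration, not the Lucene.NET implementation: `CodePointAtSketch` and `CodePointAtJava4Sketch` are hypothetical names, and the Java 1.4 variant assumes that a pre-supplementary-character runtime simply treats every char as its own code point.

```csharp
using System;

// Hypothetical helper (not part of Lucene.NET): returns the code point starting
// at `offset`, combining a surrogate pair only if both halves lie below `limit`.
static int CodePointAtSketch(char[] chars, int offset, int limit)
{
    if (limit < 0 || limit > chars.Length) throw new ArgumentOutOfRangeException(nameof(limit));
    if (offset < 0 || offset >= limit) throw new ArgumentOutOfRangeException(nameof(offset));

    char first = chars[offset];
    if (char.IsHighSurrogate(first) && offset + 1 < limit && char.IsLowSurrogate(chars[offset + 1]))
    {
        return char.ConvertToUtf32(first, chars[offset + 1]);
    }
    return first; // BMP char or unpaired surrogate
}

// Hypothetical Java 1.4-style variant (assumption): no surrogate awareness at all.
static int CodePointAtJava4Sketch(char[] chars, int offset, int limit)
{
    if (offset < 0 || offset >= limit) throw new ArgumentOutOfRangeException(nameof(offset));
    return chars[offset];
}

char[] text = "a\U0001D11Eb".ToCharArray(); // 'a', U+1D11E (two chars), 'b'
Console.WriteLine(CodePointAtSketch(text, 1, text.Length).ToString("X"));      // 1D11E
Console.WriteLine(CodePointAtJava4Sketch(text, 1, text.Length).ToString("X")); // D834
```

Passing a `limit` that cuts the pair in half makes the surrogate-aware variant fall back to the lone high surrogate, which is why the `limit` parameter matters for the char[] overload.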
CodePointAt(Char[], Int32, Int32)
Returns the code point at the given index of the char array where only elements
with index less than the limit are used.
Depending on the Lucene.Net.Util.LuceneVersion passed to
GetInstance(LuceneVersion) this method mimics the behavior
of Character.CodePointAt(char[], int) as it would have been
available on a Java 1.4 JVM or on a later virtual machine version.
Declaration
public abstract int CodePointAt(char[] chars, int offset, int limit)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.Char[] | chars | a character array | 
| System.Int32 | offset | the offset to the char values in the chars array to be converted | 
| System.Int32 | limit | the index after the last element that should be used to calculate the code point |
Returns
| Type | Description | 
|---|---|
| System.Int32 | the Unicode code point at the given index | 
Exceptions
| Type | Condition | 
|---|---|
| System.NullReferenceException | |
| System.ArgumentOutOfRangeException | |
CodePointAt(String, Int32)
Returns the code point at the given index of the System.String.
Depending on the Lucene.Net.Util.LuceneVersion passed to
GetInstance(LuceneVersion) this method mimics the behavior
of Character.CodePointAt(char[], int) as it would have been
available on a Java 1.4 JVM or on a later virtual machine version.
Declaration
public abstract int CodePointAt(string seq, int offset)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.String | seq | a character sequence | 
| System.Int32 | offset | the offset of the char value in the string to be converted |
Returns
| Type | Description | 
|---|---|
| System.Int32 | the Unicode code point at the given index | 
Exceptions
| Type | Condition | 
|---|---|
| System.NullReferenceException | |
| System.ArgumentOutOfRangeException | |
CodePointCount(ICharSequence)
Returns the number of characters (code points) in seq.
Declaration
public abstract int CodePointCount(ICharSequence seq)
Parameters
| Type | Name | Description |
|---|---|---|
| J2N.Text.ICharSequence | seq | |
Returns
| Type | Description |
|---|---|
| System.Int32 | |
CodePointCount(Char[])
Returns the number of characters (code points) in seq.
Declaration
public abstract int CodePointCount(char[] seq)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Char[] | seq | |
Returns
| Type | Description |
|---|---|
| System.Int32 | |
CodePointCount(String)
Returns the number of characters (code points) in seq.
Declaration
public abstract int CodePointCount(string seq)
Parameters
| Type | Name | Description |
|---|---|---|
| System.String | seq | |
Returns
| Type | Description |
|---|---|
| System.Int32 | |
CodePointCount(StringBuilder)
Returns the number of characters (code points) in seq.
Declaration
public abstract int CodePointCount(StringBuilder seq)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Text.StringBuilder | seq | |
Returns
| Type | Description |
|---|---|
| System.Int32 | |
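To illustrate why a code point count can differ from a plain length, the following sketch counts code points in a string using only BCL calls. `CodePointCountSketch` is a hypothetical name and the loop is an assumption about how surrogate-aware counting can be done, not the library's own code.

```csharp
using System;

// Hypothetical helper (not part of Lucene.NET): counts code points in a string.
static int CodePointCountSketch(string seq)
{
    int count = 0;
    for (int i = 0; i < seq.Length; i++)
    {
        // A well-formed surrogate pair counts once: skip its low half.
        if (char.IsHighSurrogate(seq[i]) && i + 1 < seq.Length && char.IsLowSurrogate(seq[i + 1]))
        {
            i++;
        }
        count++;
    }
    return count;
}

string sample = "a\U0001D11Eb"; // 'a', U+1D11E (two chars), 'b'
Console.WriteLine($"{sample.Length} chars, {CodePointCountSketch(sample)} code points");
// prints "4 chars, 3 code points"
```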
Fill(CharacterUtils.CharacterBuffer, TextReader)
Convenience method which calls Fill(buffer, reader, buffer.Buffer.Length). 
Declaration
public virtual bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader)
Parameters
| Type | Name | Description |
|---|---|---|
| CharacterUtils.CharacterBuffer | buffer | |
| System.IO.TextReader | reader | |
Returns
| Type | Description |
|---|---|
| System.Boolean | |
Fill(CharacterUtils.CharacterBuffer, TextReader, Int32)
Fills the CharacterUtils.CharacterBuffer with characters read from the given System.IO.TextReader. This method tries to read
numChars characters into the CharacterUtils.CharacterBuffer; each call to Fill starts filling the buffer from offset 0 up to numChars.
Because a code point can span two .NET chars, this method may
fill only numChars - 1 characters so that it does not split
a surrogate pair, even if there are remaining characters in
the System.IO.TextReader.
Depending on the Lucene.Net.Util.LuceneVersion passed to GetInstance(LuceneVersion), this method implements supplementary character awareness when filling the given buffer. For all Lucene.Net.Util.LuceneVersion > 3.0, Fill(CharacterUtils.CharacterBuffer, TextReader, Int32) guarantees that the given CharacterUtils.CharacterBuffer will never contain a high surrogate character as the last element in the buffer unless it is the last available character in the reader. In other words, high and low surrogate pairs will always be preserved across buffer borders.
A return value of false means that this method call exhausted
the reader, but some characters may still have been read, which can be
verified by checking whether buffer.Length > 0.
Declaration
public abstract bool Fill(CharacterUtils.CharacterBuffer buffer, TextReader reader, int numChars)
Parameters
| Type | Name | Description | 
|---|---|---|
| CharacterUtils.CharacterBuffer | buffer | the buffer to fill. | 
| System.IO.TextReader | reader | the reader to read characters from. | 
| System.Int32 | numChars | the number of chars to read | 
Returns
| Type | Description | 
|---|---|
| System.Boolean | false if and only if reader.read returned -1 while trying to fill the buffer |
Exceptions
| Type | Condition | 
|---|---|
| System.IO.IOException | if the reader throws a System.IO.IOException. |
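The surrogate-preserving guarantee can be sketched as follows. This is a simplified stand-in under stated assumptions, not the Lucene.NET implementation: `FillSketch` and its `carry` parameter are hypothetical, the sketch returns the number of valid chars instead of a bool, and it works on a plain char[] rather than a CharacterUtils.CharacterBuffer.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Lucene.NET). Reads up to numChars chars into
// buffer starting at index 0 and returns how many of them are valid. A trailing
// high surrogate in a full buffer is held back in `carry` for the next call.
static int FillSketch(char[] buffer, TextReader reader, int numChars, ref char? carry)
{
    if (numChars < 2 || numChars > buffer.Length) throw new ArgumentOutOfRangeException(nameof(numChars));

    int length = 0;
    if (carry.HasValue)
    {
        buffer[length++] = carry.Value; // high surrogate held back by the previous call
        carry = null;
    }
    while (length < numChars)
    {
        int read = reader.Read(buffer, length, numChars - length);
        if (read <= 0) break; // reader exhausted
        length += read;
    }
    // Full buffer ending in a high surrogate: keep the pair together in the next chunk.
    if (length == numChars && char.IsHighSurrogate(buffer[length - 1]))
    {
        carry = buffer[--length];
    }
    return length;
}

var reader = new StringReader("ab\U0001D11Ecd"); // 6 chars; the pair sits at indexes 2 and 3
char[] chunk = new char[3];
char? carry = null;
int n;
while ((n = FillSketch(chunk, reader, chunk.Length, ref carry)) > 0)
{
    Console.WriteLine(n); // prints 2, then 3, then 1
}
```

The first chunk holds only numChars - 1 chars because the third char read was a high surrogate; it reappears at index 0 of the second chunk next to its low surrogate.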
GetInstance(LuceneVersion)
Returns a CharacterUtils implementation according to the given Lucene.Net.Util.LuceneVersion instance.
Declaration
public static CharacterUtils GetInstance(LuceneVersion matchVersion)
Parameters
| Type | Name | Description | 
|---|---|---|
| Lucene.Net.Util.LuceneVersion | matchVersion | a version instance | 
Returns
| Type | Description | 
|---|---|
| CharacterUtils | a CharacterUtils implementation according to the given Lucene.Net.Util.LuceneVersion instance. | 
GetJava4Instance(LuceneVersion)
Returns a CharacterUtils instance compatible with Java 1.4.
Declaration
public static CharacterUtils GetJava4Instance(LuceneVersion matchVersion)
Parameters
| Type | Name | Description |
|---|---|---|
| Lucene.Net.Util.LuceneVersion | matchVersion | |
Returns
| Type | Description |
|---|---|
| CharacterUtils | |
NewCharacterBuffer(Int32)
Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize.
Declaration
public static CharacterUtils.CharacterBuffer NewCharacterBuffer(int bufferSize)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.Int32 | bufferSize | the internal char buffer size, must be  | 
Returns
| Type | Description | 
|---|---|
| CharacterUtils.CharacterBuffer | a new CharacterUtils.CharacterBuffer instance. | 
OffsetByCodePoints(Char[], Int32, Int32, Int32, Int32)
Returns the index within buf[start:start+count] that is offset from index
by offset code points.
Declaration
public abstract int OffsetByCodePoints(char[] buf, int start, int count, int index, int offset)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Char[] | buf | |
| System.Int32 | start | |
| System.Int32 | count | |
| System.Int32 | index | |
| System.Int32 | offset | |
Returns
| Type | Description |
|---|---|
| System.Int32 | |
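Moving an index by a number of code points, rather than chars, can be sketched like this. `OffsetByCodePointsSketch` is a hypothetical BCL-only helper; its bounds handling is an assumption and may differ from what the library does.

```csharp
using System;

// Hypothetical helper (not part of Lucene.NET): moves `index` by `offset` code
// points inside the window buf[start .. start + count), forward or backward.
static int OffsetByCodePointsSketch(char[] buf, int start, int count, int index, int offset)
{
    int end = start + count;
    if (index < start || index > end) throw new ArgumentOutOfRangeException(nameof(index));

    int i = index;
    for (int n = offset; n > 0; n--)
    {
        if (i >= end) throw new ArgumentOutOfRangeException(nameof(offset));
        // Step over a whole surrogate pair when both halves are inside the window.
        i += (char.IsHighSurrogate(buf[i]) && i + 1 < end && char.IsLowSurrogate(buf[i + 1])) ? 2 : 1;
    }
    for (int n = offset; n < 0; n++)
    {
        if (i <= start) throw new ArgumentOutOfRangeException(nameof(offset));
        i -= (char.IsLowSurrogate(buf[i - 1]) && i - 2 >= start && char.IsHighSurrogate(buf[i - 2])) ? 2 : 1;
    }
    return i;
}

char[] data = "a\U0001D11Eb".ToCharArray(); // 'a', U+1D11E (two chars), 'b'
Console.WriteLine(OffsetByCodePointsSketch(data, 0, data.Length, 0, 2));            // 3
Console.WriteLine(OffsetByCodePointsSketch(data, 0, data.Length, data.Length, -2)); // 1
```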
ToChars(Int32[], Int32, Int32, Char[], Int32)
Converts a sequence of Unicode code points to a sequence of .NET characters.
Declaration
public int ToChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32[] | src | |
| System.Int32 | srcOff | |
| System.Int32 | srcLen | |
| System.Char[] | dest | |
| System.Int32 | destOff | |
Returns
| Type | Description | 
|---|---|
| System.Int32 | the number of chars written to the destination buffer | 
ToCodePoints(Char[], Int32, Int32, Int32[], Int32)
Converts a sequence of .NET characters to a sequence of Unicode code points.
Declaration
public int ToCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Char[] | src | |
| System.Int32 | srcOff | |
| System.Int32 | srcLen | |
| System.Int32[] | dest | |
| System.Int32 | destOff | |
Returns
| Type | Description | 
|---|---|
| System.Int32 | The number of code points written to the destination buffer. | 
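The two conversions are inverses of each other on well-formed input, which a round trip makes visible. The sketch below uses hypothetical BCL-only helpers (`ToCodePointsSketch`, `ToCharsSketch`) that mirror the parameter lists above; it is an illustration under those assumptions, not the library code.

```csharp
using System;

// Hypothetical helper (not part of Lucene.NET): chars to code points.
static int ToCodePointsSketch(char[] src, int srcOff, int srcLen, int[] dest, int destOff)
{
    int written = 0;
    int end = srcOff + srcLen;
    for (int i = srcOff; i < end; )
    {
        int cp;
        if (char.IsHighSurrogate(src[i]) && i + 1 < end && char.IsLowSurrogate(src[i + 1]))
        {
            cp = char.ConvertToUtf32(src[i], src[i + 1]);
            i += 2;
        }
        else
        {
            cp = src[i]; // BMP char or unpaired surrogate
            i += 1;
        }
        dest[destOff + written++] = cp;
    }
    return written;
}

// Hypothetical helper (not part of Lucene.NET): code points to chars.
static int ToCharsSketch(int[] src, int srcOff, int srcLen, char[] dest, int destOff)
{
    int written = 0;
    for (int i = 0; i < srcLen; i++)
    {
        int cp = src[srcOff + i];
        if (cp >= 0x10000)
        {
            // Supplementary code point: split into high and low surrogates.
            cp -= 0x10000;
            dest[destOff + written++] = (char)(0xD800 + (cp >> 10));
            dest[destOff + written++] = (char)(0xDC00 + (cp & 0x3FF));
        }
        else
        {
            dest[destOff + written++] = (char)cp;
        }
    }
    return written;
}

char[] chars = "a\U0001D11Eb".ToCharArray();
int[] codePoints = new int[chars.Length];   // never more code points than chars
int cpCount = ToCodePointsSketch(chars, 0, chars.Length, codePoints, 0);
char[] roundTrip = new char[chars.Length];
int charCount = ToCharsSketch(codePoints, 0, cpCount, roundTrip, 0);
Console.WriteLine($"{cpCount} code points, {charCount} chars"); // prints "3 code points, 4 chars"
```

Sizing the int[] destination to the source char count is always sufficient, since each code point consumes at least one char.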
ToLower(Char[], Int32, Int32)
Converts each Unicode code point to lowercase via System.Globalization.TextInfo.ToLower(System.String) in the invariant culture, starting at the given offset.
Declaration
public virtual void ToLower(char[] buffer, int offset, int length)
Parameters
| Type | Name | Description | 
|---|---|---|
| System.Char[] | buffer | the char buffer to lowercase | 
| System.Int32 | offset | the offset to start at | 
| System.Int32 | length | the number of characters in the buffer to lower case | 
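In-place lowercasing of a buffer slice through the invariant culture can be sketched as below. `ToLowerSketch` is a hypothetical helper, and it assumes that the invariant-culture mapping preserves the char count of the slice, so the result can be copied back over the original range.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of Lucene.NET): lowercases buffer[offset .. offset + length)
// in place using the invariant culture.
static void ToLowerSketch(char[] buffer, int offset, int length)
{
    // Lowercasing the slice as a string lets the runtime see surrogate pairs as units.
    string lowered = CultureInfo.InvariantCulture.TextInfo.ToLower(new string(buffer, offset, length));
    // Assumption: the mapping is length-preserving; Math.Min guards the copy regardless.
    lowered.CopyTo(0, buffer, offset, Math.Min(lowered.Length, length));
}

char[] term = "HeLLo WORLD".ToCharArray();
ToLowerSketch(term, 6, 5); // only the last word is lowercased
Console.WriteLine(new string(term)); // prints "HeLLo world"
```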
ToUpper(Char[], Int32, Int32)
Converts each Unicode code point to uppercase via System.Globalization.TextInfo.ToUpper(System.String) in the invariant culture, starting at the given offset.
Declaration
public virtual void ToUpper(char[] buffer, int offset, int length)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Char[] | buffer | the char buffer to uppercase |
| System.Int32 | offset | the offset to start at |
| System.Int32 | length | the number of characters in the buffer to uppercase |