Class StringTokenizer
The StringTokenizer class allows an application to break a string into tokens by performing code point comparison. The StringTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
Namespace: Lucene.Net.Support
Assembly: Lucene.Net.dll
Syntax
public class StringTokenizer
Remarks
The set of delimiters (the code points that separate tokens) may be specified either at creation time or on a per-token basis.
An instance of StringTokenizer behaves in one of two ways, depending on whether it was created with the returnDelimiters flag set to true or false:
- If returnDelimiters is false, delimiter code points serve to separate tokens. A token is a maximal sequence of consecutive code points that are not delimiters.
- If returnDelimiters is true, delimiter code points are themselves considered to be tokens, and one token is returned for each delimiter code point. A token is thus either one delimiter code point, or a maximal sequence of consecutive code points that are not delimiters.
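The two modes can be compared side by side. The following is a minimal sketch, assuming a project that references Lucene.Net.dll and exposes the constructors and methods documented on this page; the class name ReturnDelimitersDemo and the helper Tokenize are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Support; // assumption: Lucene.Net.dll is referenced by the project

public static class ReturnDelimitersDemo
{
    // Hypothetical helper: collects every token produced for the given flag value.
    public static List<string> Tokenize(string text, string delimiters, bool returnDelimiters)
    {
        var tokens = new List<string>();
        var st = new StringTokenizer(text, delimiters, returnDelimiters);
        while (st.HasMoreTokens())
        {
            tokens.Add(st.NextToken());
        }
        return tokens;
    }

    public static void Main()
    {
        // returnDelimiters = false: ',' and ';' only separate tokens.
        Console.WriteLine(string.Join("|", Tokenize("a,b;c", ",;", false))); // a|b|c

        // returnDelimiters = true: each delimiter code point is its own token.
        Console.WriteLine(string.Join("|", Tokenize("a,b;c", ",;", true)));  // a|,|b|;|c
    }
}
```

Joining the tokens with a character that is not a delimiter ('|' here) makes it easy to see where each token starts and ends.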
A StringTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the code point processed.
A token is returned by taking a substring of the string that was used to create the StringTokenizer object.
Here's an example that uses a StringTokenizer with the default delimiters:
StringTokenizer st = new StringTokenizer("this is a test");
// Print each whitespace-separated token on its own line.
while (st.HasMoreTokens()) {
Console.WriteLine(st.NextToken());
}
This prints the following output:
this
is
a
test
Here's an example that uses a StringTokenizer with user-specified delimiters (a space and the supplementary character \ud800\udc00):
StringTokenizer st = new StringTokenizer(
"this is a test with supplementary characters \ud800\ud800\udc00\udc00",
" \ud800\udc00");
while (st.HasMoreTokens()) {
Console.WriteLine(st.NextToken());
}
This prints the following output:
this
is
a
test
with
supplementary
characters
\ud800
\udc00
Constructors
StringTokenizer(String)
Constructs a new StringTokenizer for the parameter string using whitespace as the delimiter. The Lucene.Net.Support.StringTokenizer.returnDelimiters flag is set to false.
Declaration
public StringTokenizer(string str)
Parameters
Type | Name | Description |
---|---|---|
System.String | str | The string to be tokenized. |
StringTokenizer(String, String)
Constructs a new StringTokenizer for the parameter string using the specified delimiters. The Lucene.Net.Support.StringTokenizer.returnDelimiters flag is set to false. If delimiters is null, this constructor doesn't throw a System.Exception, but later calls to some methods might throw a System.ArgumentNullException or System.InvalidOperationException.
Declaration
public StringTokenizer(string str, string delimiters)
Parameters
Type | Name | Description |
---|---|---|
System.String | str | The string to be tokenized. |
System.String | delimiters | The delimiters to use. |
StringTokenizer(String, String, Boolean)
Constructs a new StringTokenizer for the parameter string using the specified delimiters, returning the delimiters as tokens if the parameter returnDelimiters is true. If delimiters is null, this constructor doesn't throw a System.Exception, but later calls to some methods might throw a System.ArgumentNullException or System.InvalidOperationException.
Declaration
public StringTokenizer(string str, string delimiters, bool returnDelimiters)
Parameters
Type | Name | Description |
---|---|---|
System.String | str | The string to be tokenized. |
System.String | delimiters | The delimiters to use. |
System.Boolean | returnDelimiters | true to return each delimiter as a token; false to treat delimiters only as separators. |
Methods
CountTokens()
Returns the number of unprocessed tokens remaining in the string.
Declaration
public virtual int CountTokens()
Returns
Type | Description |
---|---|
System.Int32 | The number of tokens that can be retrieved before a call to NextToken() throws an exception. |
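One possible use of CountTokens() is to size an array before draining the tokenizer. The sketch below assumes a reference to Lucene.Net.dll and assumes that calling CountTokens() does not itself consume tokens, which is what the return description suggests; CountTokensDemo and ToArray are hypothetical names used only for illustration.

```csharp
using System;
using Lucene.Net.Support; // assumption: Lucene.Net.dll is referenced by the project

public static class CountTokensDemo
{
    // Hypothetical helper: copies the remaining tokens into an array
    // whose length comes from CountTokens().
    public static string[] ToArray(StringTokenizer st)
    {
        var result = new string[st.CountTokens()];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = st.NextToken();
        }
        return result;
    }

    public static void Main()
    {
        var st = new StringTokenizer("this is a test");
        string[] words = ToArray(st);

        Console.WriteLine(words.Length);        // 4
        Console.WriteLine(st.HasMoreTokens());  // False: every token was consumed
    }
}
```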
HasMoreTokens()
Returns true if unprocessed tokens remain.
Declaration
public bool HasMoreTokens()
Returns
Type | Description |
---|---|
System.Boolean | true if unprocessed tokens remain; otherwise, false. |
NextToken()
Returns the next token in the string as a System.String.
Declaration
public string NextToken()
Returns
Type | Description |
---|---|
System.String | Next token in the string as a System.String. |
Exceptions
Type | Condition |
---|---|
System.InvalidOperationException | If no tokens remain. |
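Callers can either check HasMoreTokens() before each call or handle the exception listed above. The following sketch shows both styles, assuming a reference to Lucene.Net.dll; NextTokenGuardDemo, NextOrNull, and ThrowsWhenExhausted are hypothetical names introduced here for illustration.

```csharp
using System;
using Lucene.Net.Support; // assumption: Lucene.Net.dll is referenced by the project

public static class NextTokenGuardDemo
{
    // Hypothetical helper: returns the next token, or null when none remain,
    // so the caller never triggers the exception.
    public static string NextOrNull(StringTokenizer st)
    {
        return st.HasMoreTokens() ? st.NextToken() : null;
    }

    // Hypothetical helper: drains the tokenizer, then reports whether one
    // more NextToken() call raises InvalidOperationException as documented.
    public static bool ThrowsWhenExhausted(StringTokenizer st)
    {
        while (st.HasMoreTokens())
        {
            st.NextToken();
        }

        try
        {
            st.NextToken();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var st = new StringTokenizer("one two");
        Console.WriteLine(NextOrNull(st));          // one
        Console.WriteLine(ThrowsWhenExhausted(st)); // True, if the documented exception is thrown
    }
}
```

Checking HasMoreTokens() first is usually the simpler choice inside loops; catching the exception is shown only to illustrate the documented failure mode.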
NextToken(String)
Returns the next token in the string as a System.String. The delimiters used are changed to the specified delimiters.
Declaration
public string NextToken(string delims)
Parameters
Type | Name | Description |
---|---|---|
System.String | delims | The new delimiters to use. |
Returns
Type | Description |
---|---|
System.String | Next token in the string as a System.String. |
Exceptions
Type | Condition |
---|---|
System.InvalidOperationException | If no tokens remain. |
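Changing delimiters between calls lets one tokenizer handle input whose separators differ from field to field. The sketch below is an illustration under assumptions: it presumes a reference to Lucene.Net.dll, and the input format "SET key=value" together with the names PerTokenDelimitersDemo and ParseAssignment are hypothetical. The space is kept in the new delimiter set so the result does not depend on whether the whitespace that ended the first token has already been consumed.

```csharp
using System;
using Lucene.Net.Support; // assumption: Lucene.Net.dll is referenced by the project

public static class PerTokenDelimitersDemo
{
    // Hypothetical helper: splits a line such as "SET key=value"
    // into its command, key, and value parts.
    public static string[] ParseAssignment(string line)
    {
        // Start with the default whitespace delimiters.
        var st = new StringTokenizer(line);
        string command = st.NextToken();

        // Switch to a delimiter set containing both ' ' and '='.
        string key = st.NextToken(" =");
        string value = st.NextToken(" =");

        return new[] { command, key, value };
    }

    public static void Main()
    {
        string[] parts = ParseAssignment("SET key=value");
        Console.WriteLine(string.Join("|", parts)); // SET|key|value
    }
}
```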