
    Class StreamTokenizer

    Parses a stream into a set of defined tokens, one at a time. The different types of tokens that can be found are numbers, identifiers, quoted strings, and different comment styles. The class can be used for limited processing of source code of programming languages like Java, although it is nowhere near a full parser.
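
    As a rough illustration of the one-token-at-a-time model, the following minimal sketch reads words and numbers from an in-memory string. It relies only on the members documented on this page; the class name TokenizeDemo and the sample input are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Support.IO;

public static class TokenizeDemo
{
    public static void Main()
    {
        // StringReader is a TextReader, so it can feed the tokenizer directly.
        var tokenizer = new StreamTokenizer(new StringReader("apples 3 pears 4.5"));

        var words = new List<string>();
        double total = 0;

        // NextToken() returns the same value that TokenType exposes.
        while (tokenizer.NextToken() != StreamTokenizer.TT_EOF)
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_WORD)
                words.Add(tokenizer.StringValue);
            else if (tokenizer.TokenType == StreamTokenizer.TT_NUMBER)
                total += tokenizer.NumberValue;
        }

        Console.WriteLine(string.Join(", ", words));
        Console.WriteLine(total);
    }
}
```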

    Inheritance
    System.Object
    StreamTokenizer
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    Namespace: Lucene.Net.Support.IO
    Assembly: Lucene.Net.dll
    Syntax
    public class StreamTokenizer

    Constructors


    StreamTokenizer(Stream)

    Constructs a new StreamTokenizer with input as the source input stream. This constructor is deprecated; use the constructor that takes a System.IO.TextReader as an argument instead.

    Declaration
    [Obsolete("Use StreamTokenizer(TextReader)")]
    public StreamTokenizer(Stream input)
    Parameters
    Type Name Description
    System.IO.Stream input

    The source stream from which to parse tokens.

    Exceptions
    Type Condition
    System.ArgumentNullException

    If input is null.
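
    Code that still holds a System.IO.Stream can usually move off the obsolete constructor by wrapping the stream in a reader first. The sketch below shows one way to do that; the helper name ReaderMigration.Create is hypothetical, and UTF-8 is an assumed encoding rather than something this class prescribes.

```csharp
using System.IO;
using System.Text;
using Lucene.Net.Support.IO;

public static class ReaderMigration
{
    // Hypothetical helper: wraps a Stream in a StreamReader so the
    // non-obsolete TextReader constructor can be used.
    public static StreamTokenizer Create(Stream input)
    {
        // UTF-8 is an assumption here; pick the encoding the data actually uses.
        var reader = new StreamReader(input, Encoding.UTF8);
        return new StreamTokenizer(reader);
    }
}
```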


    StreamTokenizer(TextReader)

    Constructs a new StreamTokenizer with reader as the source reader. The tokenizer's initial state is as follows:

    • All byte values 'A' through 'Z', 'a' through 'z', and '\u00A0' through '\u00FF' are considered to be alphabetic.
    • All byte values '\u0000' through '\u0020' are considered to be white space.
    • '/' is a comment character.
    • Single quote ''' and double quote '"' are string quote characters.
    • Numbers are parsed.
    • End of lines are considered to be white space rather than separate tokens.
    • C-style and C++-style comments are not recognized.

    Declaration
    public StreamTokenizer(TextReader reader)
    Parameters
    Type Name Description
    System.IO.TextReader reader

    The source text reader from which to parse tokens.

    Fields


    TT_EOF

    The constant representing the end of the stream.

    Declaration
    public const int TT_EOF = -1
    Field Value
    Type Description
    System.Int32

    TT_EOL

    The constant representing the end of the line.

    Declaration
    public const int TT_EOL = 10
    Field Value
    Type Description
    System.Int32

    TT_NUMBER

    The constant representing a number token.

    Declaration
    public const int TT_NUMBER = -2
    Field Value
    Type Description
    System.Int32

    TT_WORD

    The constant representing a word token.

    Declaration
    public const int TT_WORD = -3
    Field Value
    Type Description
    System.Int32

    Properties


    IsEOLSignificant

    Specifies whether the end of a line is significant and should be returned as TT_EOL in TokenType by this tokenizer. true if EOL is significant, false otherwise.

    Declaration
    public virtual bool IsEOLSignificant { get; set; }
    Property Value
    Type Description
    System.Boolean
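
    When line boundaries matter, setting this property lets the caller see them as tokens. The following sketch counts tokens per line under that setting; the class name LineAwareDemo and the input text are illustrative only.

```csharp
using System;
using System.IO;
using Lucene.Net.Support.IO;

public static class LineAwareDemo
{
    public static void Main()
    {
        var tokenizer = new StreamTokenizer(new StringReader("a b\nc d e\n"));
        tokenizer.IsEOLSignificant = true; // report line ends as TT_EOL

        int tokensOnLine = 0;
        while (tokenizer.NextToken() != StreamTokenizer.TT_EOF)
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_EOL)
            {
                Console.WriteLine("line ended after " + tokensOnLine + " token(s)");
                tokensOnLine = 0;
            }
            else
            {
                tokensOnLine++;
            }
        }
    }
}
```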

    LineNumber

    Gets the current line number.

    Declaration
    public int LineNumber { get; }
    Property Value
    Type Description
    System.Int32

    LowerCaseMode

    Specifies whether word tokens should be converted to lower case when they are stored in StringValue. true if StringValue should be converted to lower case, false otherwise.

    Declaration
    public bool LowerCaseMode { get; set; }
    Property Value
    Type Description
    System.Boolean

    NumberValue

    Contains a number if the current token is a number (TokenType == TT_NUMBER).

    Declaration
    public double NumberValue { get; set; }
    Property Value
    Type Description
    System.Double

    SlashSlashComments

    Specifies whether "slash-slash" (C++-style) comments shall be recognized. This kind of comment ends at the end of the line. true if // should be recognized as the start of a comment, false otherwise.

    Declaration
    public bool SlashSlashComments { get; set; }
    Property Value
    Type Description
    System.Boolean

    SlashStarComments

    Specifies whether "slash-star" (C-style) comments shall be recognized. Slash-star comments cannot be nested and end when a star-slash combination is found. true if /* should be recognized as the start of a comment, false otherwise.

    Declaration
    public bool SlashStarComments { get; set; }
    Property Value
    Type Description
    System.Boolean
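
    The two comment properties are often enabled together when scanning C-like text. The sketch below is one possible setup: it first makes '/' an ordinary character, on the assumption that the caller does not want a lone slash to start a comment (its initial role per the TextReader constructor). The class name CommentDemo and the sample source are made up.

```csharp
using System;
using System.IO;
using Lucene.Net.Support.IO;

public static class CommentDemo
{
    public static void Main()
    {
        var source = "x = 1 // trailing note\n/* block note */ y = 2";
        var tokenizer = new StreamTokenizer(new StringReader(source));

        // '/' starts out as a comment character; make it ordinary so that only
        // the two-character forms enabled below are meant to open a comment.
        tokenizer.OrdinaryChar('/');
        tokenizer.SlashSlashComments = true;
        tokenizer.SlashStarComments = true;

        // Print only the word tokens that remain once comments are skipped.
        while (tokenizer.NextToken() != StreamTokenizer.TT_EOF)
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_WORD)
                Console.WriteLine(tokenizer.StringValue);
        }
    }
}
```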

    StringValue

    Contains a string if the current token is a word (TokenType == TT_WORD).

    Declaration
    public string StringValue { get; set; }
    Property Value
    Type Description
    System.String

    TokenType

    After calling NextToken(), TokenType contains the type of the token that has been read. When a single character is read, its value converted to an integer is stored in TokenType. For a quoted string, the value is the quote character. Otherwise, its value is one of the following:

    • TT_WORD - the token is a word.
    • TT_NUMBER - the token is a number.
    • TT_EOL - the end of a line has been reached. Returned only when IsEOLSignificant is true.
    • TT_EOF - the end of the stream has been reached.

    Declaration
    public int TokenType { get; }
    Property Value
    Type Description
    System.Int32
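
    The cases listed above map naturally onto a switch statement. The following sketch turns the current token into a short description; the class and method names (TokenDescriber.Describe) are hypothetical, and the two quote cases assume the default quote characters are still in effect.

```csharp
using Lucene.Net.Support.IO;

public static class TokenDescriber
{
    // Hypothetical helper: call after NextToken() to describe the token just read.
    public static string Describe(StreamTokenizer t)
    {
        switch (t.TokenType)
        {
            case StreamTokenizer.TT_EOF:
                return "end of stream";
            case StreamTokenizer.TT_EOL:
                return "end of line";
            case StreamTokenizer.TT_NUMBER:
                return "number " + t.NumberValue;
            case StreamTokenizer.TT_WORD:
                return "word " + t.StringValue;
            case '"':
            case '\'':
                // For a quoted string, TokenType holds the quote character.
                return "quoted string delimited by " + (char)t.TokenType;
            default:
                // Any other value is the code of a single ordinary character.
                return "character " + (char)t.TokenType;
        }
    }
}
```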

    Methods


    CommentChar(Int32)

    Specifies that the character ch shall be treated as a comment character.

    Declaration
    public virtual void CommentChar(int ch)
    Parameters
    Type Name Description
    System.Int32 ch

    The character to be considered a comment character.


    NextToken()

    Parses the next token from this tokenizer's source stream or reader. The type of the token is stored in the TokenType property; additional information may be stored in the NumberValue or StringValue properties.

    Declaration
    public int NextToken()
    Returns
    Type Description
    System.Int32

    The value of TokenType.

    Exceptions
    Type Condition
    System.IO.IOException

    If an I/O error occurs while parsing the next token.


    OrdinaryChar(Int32)

    Specifies that the character ch shall be treated as an ordinary character by this tokenizer. That is, it has no special meaning as a comment character, word component, white space, string delimiter or number.

    Declaration
    public void OrdinaryChar(int ch)
    Parameters
    Type Name Description
    System.Int32 ch

    The character to be considered an ordinary character.


    OrdinaryChars(Int32, Int32)

    Specifies that the characters in the range from low to hi shall be treated as ordinary characters by this tokenizer. That is, they have no special meaning as a comment character, word component, white space, string delimiter or number.

    Declaration
    public void OrdinaryChars(int low, int hi)
    Parameters
    Type Name Description
    System.Int32 low

    The first character in the range of ordinary characters.

    System.Int32 hi

    The last character in the range of ordinary characters.


    ParseNumbers()

    Specifies that this tokenizer shall parse numbers.

    Declaration
    public void ParseNumbers()

    PushBack()

    Indicates that the current token should be pushed back and returned again the next time NextToken() is called.

    Declaration
    public void PushBack()
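
    PushBack() is handy for one-token lookahead. The sketch below reads a word that may or may not be followed by a number, and returns the lookahead token to the tokenizer when it turns out not to be one. The EntryReader class, the ReadEntry method, and the "word then optional number" format are assumptions made for the example.

```csharp
using System;
using Lucene.Net.Support.IO;

public static class EntryReader
{
    // Hypothetical helper: reads a word optionally followed by a number.
    public static (string Name, double? Count) ReadEntry(StreamTokenizer t)
    {
        if (t.NextToken() != StreamTokenizer.TT_WORD)
            throw new FormatException("expected a word");
        string name = t.StringValue;

        double? count = null;
        if (t.NextToken() == StreamTokenizer.TT_NUMBER)
        {
            count = t.NumberValue;
        }
        else
        {
            // Not a number: hand the token back so the next
            // NextToken() call returns it again.
            t.PushBack();
        }
        return (name, count);
    }
}
```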

    QuoteChar(Int32)

    Specifies that the character ch shall be treated as a quote character.

    Declaration
    public void QuoteChar(int ch)
    Parameters
    Type Name Description
    System.Int32 ch

    The character to be considered a quote character.


    ResetSyntax()

    Specifies that all characters shall be treated as ordinary characters.

    Declaration
    public void ResetSyntax()

    ToString()

    Returns the state of this tokenizer in a readable format.

    Declaration
    public override string ToString()
    Returns
    Type Description
    System.String

    The current state of this tokenizer.

    Overrides
    System.Object.ToString()

    WhitespaceChars(Int32, Int32)

    Specifies that the characters in the range from low to hi shall be treated as whitespace characters by this tokenizer.

    Declaration
    public void WhitespaceChars(int low, int hi)
    Parameters
    Type Name Description
    System.Int32 low

    The first character in the range of whitespace characters.

    System.Int32 hi

    The last character in the range of whitespace characters.


    WordChars(Int32, Int32)

    Specifies that the characters in the range from low to hi shall be treated as word characters by this tokenizer. A word consists of a word character followed by zero or more word or number characters.

    Declaration
    public void WordChars(int low, int hi)
    Parameters
    Type Name Description
    System.Int32 low

    The first character in the range of word characters.

    System.Int32 hi

    The last character in the range of word characters.
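
    The syntax methods can be combined to build a character table from scratch. The following sketch starts from ResetSyntax() and then declares letters, digits and underscore as word characters so that identifiers such as item_42 come back as single word tokens. Treating digits as word characters instead of calling ParseNumbers() is a choice made for this example, and the class name IdentifierDemo is made up.

```csharp
using System;
using System.IO;
using Lucene.Net.Support.IO;

public static class IdentifierDemo
{
    public static void Main()
    {
        var tokenizer = new StreamTokenizer(new StringReader("item_42 = max_len"));

        tokenizer.ResetSyntax();            // every character is now ordinary
        tokenizer.WordChars('a', 'z');
        tokenizer.WordChars('A', 'Z');
        tokenizer.WordChars('0', '9');      // digits join words; ParseNumbers() is not called
        tokenizer.WordChars('_', '_');      // a single-character range
        tokenizer.WhitespaceChars(0, ' ');  // control characters and space

        while (tokenizer.NextToken() != StreamTokenizer.TT_EOF)
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_WORD)
                Console.WriteLine("identifier: " + tokenizer.StringValue);
            else
                Console.WriteLine("symbol: " + (char)tokenizer.TokenType);
        }
    }
}
```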

    Extension Methods

    Number.IsNumber(Object)
    Copyright © 2019 Licensed to the Apache Software Foundation (ASF)