
    Namespace TagSoup

    Classes

    Element

    The internal representation of an actual element (not an element type). An Element has an element type, attributes, and a successor Element for use in constructing stacks and queues of Elements.

    ElementType

    This class represents an element type in the schema. An element type has a name, a content model vector, a member-of vector, a flags vector, default attributes, and a schema to which it belongs.

    HTMLScanner

    This class implements a table-driven scanner for HTML that tolerates many defects in the input. It implements the IScanner interface, which accepts a reader to fetch characters from and an IScanHandler to report lexical events to.

    HTMLSchema

    This class provides a Schema that has been preinitialized with HTML elements, attributes, and character entity declarations. All the declarations normally provided with HTML 4.01 are given, plus some that are IE-specific and NS4-specific. Attribute declarations of type CDATA with no default value are not included.

    Parser

    The SAX parser class.

    PYXScanner

    An IScanner that accepts PYX format instead of HTML. Useful primarily for debugging.

    PYXWriter

    An IContentHandler that generates PYX format instead of XML. Useful primarily for debugging.

    Schema

    Abstract class representing a TSSL schema. Actual TSSL schemas are compiled into concrete subclasses of this class.

    XMLWriter

    Filter to write an XML document from a SAX event stream.
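
    PYXScanner and PYXWriter both work with PYX, a line-oriented notation in which the first character of each line names the event: "(" for a start tag, "A" for an attribute, "-" for character data, and ")" for an end tag. The sketch below does not use the TagSoup classes; PyxSketch and its methods are hypothetical names introduced only to show what the format looks like, and the escaping rules are an assumption based on common PYX usage.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of TagSoup: builds PYX lines by hand.
public static class PyxSketch
{
    // Assumed escaping: backslash, tab, and newline are written as
    // two-character escapes so every event stays on a single line.
    public static string Escape(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break;
                case '\t': sb.Append("\\t"); break;
                case '\n': sb.Append("\\n"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    public static string StartElement(string name) => "(" + name;

    public static string Attribute(string name, string value) =>
        "A" + name + " " + Escape(value);

    public static string Characters(string text) => "-" + Escape(text);

    public static string EndElement(string name) => ")" + name;

    public static void Main()
    {
        // PYX lines for: <p class="note">Hello</p>
        var lines = new List<string>
        {
            StartElement("p"),
            Attribute("class", "note"),
            Characters("Hello"),
            EndElement("p"),
        };
        Console.WriteLine(string.Join("\n", lines));
    }
}
```

    Because each event occupies exactly one line, PYX output can be compared with ordinary line-based tools, which is why it suits debugging.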

    Interfaces

    IAutoDetector

    Classes that accept an input stream and provide a reader that detects the stream's encoding and reads characters from it should conform to this interface.

    IScanHandler

    An interface that Scanners use to report events in the input stream.

    IScanner

    An interface allowing Parser to invoke scanners.
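
    The scanner and scan-handler interfaces above describe a callback arrangement: the scanner reads characters and reports each lexical event to a handler. The sketch below illustrates that arrangement only. IDemoScanHandler, RecordingHandler, and DemoScanner are hypothetical, deliberately simplified names; they are not the TagSoup interfaces and do not reflect their actual members.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical, simplified stand-in for a scan-handler interface.
public interface IDemoScanHandler
{
    void Tag(string name);
    void Text(string text);
}

// Records every reported event so the sequence can be inspected.
public sealed class RecordingHandler : IDemoScanHandler
{
    public List<string> Events { get; } = new List<string>();
    public void Tag(string name) => Events.Add("tag:" + name);
    public void Text(string text) => Events.Add("text:" + text);
}

// Hypothetical toy scanner: splits input into tags and text runs.
public static class DemoScanner
{
    public static void Scan(TextReader reader, IDemoScanHandler handler)
    {
        var buffer = new StringBuilder();
        bool inTag = false;
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            char c = (char)ch;
            if (c == '<' && !inTag)
            {
                // A tag is starting: flush any pending text first.
                if (buffer.Length > 0) handler.Text(buffer.ToString());
                buffer.Clear();
                inTag = true;
            }
            else if (c == '>' && inTag)
            {
                handler.Tag(buffer.ToString());
                buffer.Clear();
                inTag = false;
            }
            else
            {
                buffer.Append(c);
            }
        }
        // Leftover input, including an unclosed tag, is reported
        // as text rather than rejected.
        if (buffer.Length > 0) handler.Text(buffer.ToString());
    }
}

public static class Program
{
    public static void Main()
    {
        var handler = new RecordingHandler();
        DemoScanner.Scan(new StringReader("<b>hi</b>"), handler);
        Console.WriteLine(string.Join(", ", handler.Events));
        // prints: tag:b, text:hi, tag:/b
    }
}
```

    Keeping the handler behind an interface lets the same scanner feed different consumers, and lets a different scanner (for example, one that reads PYX) feed the same consumer.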

    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)