Class HTMLScanner
This class implements a table-driven scanner for HTML that tolerates many kinds of malformed markup. It implements the IScanner interface, which accepts a TextReader to fetch characters from and an IScanHandler to report lexical events to.
Inheritance
System.Object
HTMLScanner
Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()
Namespace: TagSoup
Assembly: Lucene.Net.Benchmark.dll
Syntax
public class HTMLScanner : IScanner, ILocator
Constructors
HTMLScanner()
Declaration
public HTMLScanner()
Properties
ColumnNumber
Declaration
public virtual int ColumnNumber { get; }
Property Value
Type | Description |
---|---|
System.Int32 |
LineNumber
Declaration
public virtual int LineNumber { get; }
Property Value
Type | Description |
---|---|
System.Int32 |
PublicId
Declaration
public virtual string PublicId { get; }
Property Value
Type | Description |
---|---|
System.String |
SystemId
Declaration
public virtual string SystemId { get; }
Property Value
Type | Description |
---|---|
System.String |
Methods
ResetDocumentLocator(String, String)
Resets the document locator, supplying the public id and system id.
Declaration
public virtual void ResetDocumentLocator(string publicid, string systemid)
Parameters
Type | Name | Description |
---|---|---|
System.String | publicid | Public id |
System.String | systemid | System id |
Scan(TextReader, IScanHandler)
Scan HTML source, reporting lexical events.
Declaration
public virtual void Scan(TextReader r, IScanHandler h)
Parameters
Type | Name | Description |
---|---|---|
System.IO.TextReader | r | Reader that provides characters |
IScanHandler | h | ScanHandler that accepts lexical events. |
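The members documented on this page can be combined roughly as follows. This is a minimal sketch rather than a definitive usage pattern: the helper class `HtmlScannerExample`, the method `ScanString`, and the identifier strings are hypothetical, the IScanHandler implementation is assumed to be supplied by the caller, and calling ResetDocumentLocator before Scan is an assumption about the intended order.

```csharp
using System;
using System.IO;
using TagSoup;

public static class HtmlScannerExample
{
    // Hypothetical helper: scans an HTML string and reports lexical events
    // to a caller-supplied IScanHandler implementation (not shown here).
    public static void ScanString(string html, IScanHandler handler)
    {
        var scanner = new HTMLScanner();

        // Argument order follows the declaration: publicid first, then systemid.
        // Both values below are placeholders chosen for illustration.
        scanner.ResetDocumentLocator("-//EXAMPLE//DTD HTML//EN", "http://example.com/page.html");

        using (TextReader reader = new StringReader(html))
        {
            // Reads characters from the reader and reports lexical events to the handler.
            scanner.Scan(reader, handler);
        }

        // The locator properties can be read back after (or during) the scan.
        Console.WriteLine(
            "SystemId={0}, line {1}, column {2}",
            scanner.SystemId, scanner.LineNumber, scanner.ColumnNumber);
    }
}
```

Taking the handler as a parameter keeps the sketch independent of any particular IScanHandler implementation, whose members are not described on this page.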
StartCDATA()
A callback for the IScanHandler that allows it to force the lexer state to CDATA content (no markup is recognized except the end of the element).
Declaration
public virtual void StartCDATA()