Interface IHTMLParser
HTML parsing interface for test purposes.
Namespace: Lucene.Net.Benchmarks.ByTask.Feeds
Assembly: Lucene.Net.Benchmark.dll
Syntax
public interface IHTMLParser
Methods
Parse(DocData, string, DateTime?, TextReader, TrecContentSource)
Parses the input TextReader and returns a DocData. The provided name, title, and date are used for the result unless they are null, in which case an attempt is made to set them from the parsed data.
Declaration
DocData Parse(DocData docData, string name, DateTime? date, TextReader reader, TrecContentSource trecSrc)
Parameters
Type | Name | Description |
---|---|---|
DocData | docData | DocData instance that is reused to hold the result. |
string | name | Name of the result doc data. |
DateTime? | date | Date of the result doc data. If null, an attempt is made to set it from the parsed data. |
TextReader | reader | Reader of the HTML text to parse. |
TrecContentSource | trecSrc | The TrecContentSource used to parse dates. |
Returns
Type | Description |
---|---|
DocData | Parsed doc data. |
Exceptions
Type | Condition |
---|---|
IOException | If there is a low-level I/O error. |
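Example
The following is a minimal sketch of how the interface might be implemented and called, not the parser that ships with Lucene.NET. `NaiveHtmlParser`, its two helper methods, and `Demo` are hypothetical names introduced for illustration. The sketch assumes that the Lucene.Net.Benchmark package is referenced and that DocData exposes `Clear()`, `Name`, `Title`, `Body`, and `SetDate(DateTime?)`. It strips tags with regular expressions, which is not a real HTML parser, and it ignores trecSrc, so a null date is left unset rather than recovered from the parsed data.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using Lucene.Net.Benchmarks.ByTask.Feeds;

// Hypothetical implementation for illustration; not part of Lucene.NET.
public sealed class NaiveHtmlParser : IHTMLParser
{
    private static readonly Regex TitleRegex = new Regex(
        @"<title[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);
    private static readonly Regex TagRegex = new Regex(@"<[^>]+>");
    private static readonly Regex WhitespaceRegex = new Regex(@"\s+");

    public DocData Parse(DocData docData, string name, DateTime? date,
                         TextReader reader, TrecContentSource trecSrc)
    {
        // ReadToEnd can throw IOException, which propagates to the caller.
        string html = reader.ReadToEnd();

        // Assumed DocData members: Clear(), Name, Title, Body, SetDate(DateTime?).
        docData.Clear();
        docData.Name = name;
        docData.Title = ExtractTitle(html);
        docData.Body = StripTags(html);
        // The caller's date is used as-is; trecSrc is not consulted in this sketch.
        docData.SetDate(date);
        return docData;
    }

    // Returns the trimmed text of the first <title> element, or null if none is found.
    public static string ExtractTitle(string html)
    {
        Match m = TitleRegex.Match(html);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    // Replaces every tag with a space, then collapses runs of whitespace.
    public static string StripTags(string html)
    {
        string text = TagRegex.Replace(html, " ");
        return WhitespaceRegex.Replace(text, " ").Trim();
    }
}

public static class Demo
{
    public static void Main()
    {
        IHTMLParser parser = new NaiveHtmlParser();
        var docData = new DocData(); // reused across calls in a real feed
        using (var reader = new StringReader(
            "<html><head><title>Hello</title></head><body><p>Some text.</p></body></html>"))
        {
            DocData result = parser.Parse(docData, "doc-1", null, reader, null);
            Console.WriteLine(result.Title);
            Console.WriteLine(result.Body);
        }
    }
}
```

Because the tag stripping runs over the whole document, the title text also appears in the body in this sketch. A real implementation would typically exclude the head section and decode character entities.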