
    Interface IHTMLParser

    HTML parsing interface for test purposes.

    Namespace: Lucene.Net.Benchmarks.ByTask.Feeds
    Assembly: Lucene.Net.Benchmark.dll
    Syntax
    public interface IHTMLParser

    Methods

    Parse(DocData, string, DateTime?, TextReader, TrecContentSource)

    Parses the input TextReader and returns a DocData. The provided name and date are used for the result unless they are null, in which case an attempt is made to set them from the parsed data.

    Declaration
    DocData Parse(DocData docData, string name, DateTime? date, TextReader reader, TrecContentSource trecSrc)
    Parameters
    Type Name Description
    DocData docData

    DocData instance to reuse for the result.

    string name

    Name of the result doc data.

    DateTime? date

    Date of the result doc data. If null, an attempt is made to set it from the parsed data.

    TextReader reader

    Reader of the HTML text to parse.

    TrecContentSource trecSrc

    The TrecContentSource used to parse dates.

    Returns
    Type Description
    DocData

    Parsed doc data.

    Exceptions
    Type Condition
    IOException

    If there is a low-level I/O error.
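
    Example

    The contract described above can be illustrated with a minimal sketch of an implementer. NaiveTagStrippingParser and Program are hypothetical names, not part of Lucene.Net, and the regex-based tag stripping is a stand-in technique, not how the library's own parsers work. The sketch assumes that DocData exposes Clear(), settable Name, Title and Body members, and a SetDate(DateTime?) overload; check these against the DocData reference before relying on them. It does not try to recover a date from the markup, so trecSrc goes unused.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using Lucene.Net.Benchmarks.ByTask.Feeds;

// Hypothetical implementation for illustration only; not part of Lucene.Net.
public sealed class NaiveTagStrippingParser : IHTMLParser
{
    // Naive patterns: adequate for a sketch, not for real-world HTML.
    private static readonly Regex TitleRegex = new Regex(
        @"<title[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);
    private static readonly Regex TagRegex = new Regex(@"<[^>]+>");

    public DocData Parse(DocData docData, string name, DateTime? date,
                         TextReader reader, TrecContentSource trecSrc)
    {
        // Reading may throw IOException, which propagates to the caller.
        string html = reader.ReadToEnd();
        Match title = TitleRegex.Match(html);

        docData.Clear();                 // assumed: resets the reused instance
        docData.Name = name;             // assumed: settable member
        docData.Title = title.Success ? title.Groups[1].Value.Trim() : null;
        docData.Body = TagRegex.Replace(html, " ").Trim();

        // Assumed overload SetDate(DateTime?). A fuller implementation could
        // use trecSrc to parse a date found in the document when date is null.
        if (date.HasValue)
        {
            docData.SetDate(date);
        }
        return docData;
    }
}

public static class Program
{
    public static void Main()
    {
        IHTMLParser parser = new NaiveTagStrippingParser();
        using (var reader = new StringReader(
            "<html><title>Hello</title><body><p>Some text.</p></body></html>"))
        {
            // trecSrc is passed as null only because this sketch never uses it.
            DocData doc = parser.Parse(new DocData(), "doc-1", DateTime.UtcNow, reader, null);
            Console.WriteLine(doc.Title);
            Console.WriteLine(doc.Body);
        }
    }
}
```

    Reusing the DocData instance passed in, rather than allocating a new one per call, follows the "result reused" contract of the docData parameter.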

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.