    Class ExtractReuters

    Splits the Reuters SGML documents into simple text files, each containing the title, date, dateline, and body of a document.

    Inheritance
    System.Object
    ExtractReuters
    Inherited Members
    System.Object.Equals(System.Object)
    System.Object.Equals(System.Object, System.Object)
    System.Object.GetHashCode()
    System.Object.GetType()
    System.Object.MemberwiseClone()
    System.Object.ReferenceEquals(System.Object, System.Object)
    System.Object.ToString()
    Namespace: Lucene.Net.Benchmarks.Utils
    Assembly: Lucene.Net.Benchmark.dll
    Syntax
    public class ExtractReuters

    Constructors


    ExtractReuters(DirectoryInfo, DirectoryInfo)

    Declaration
    public ExtractReuters(DirectoryInfo reutersDir, DirectoryInfo outputDir)
    Parameters
    Type Name Description
    System.IO.DirectoryInfo reutersDir
    System.IO.DirectoryInfo outputDir

    Methods


    Extract()

    Declaration
    public virtual void Extract()
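A minimal usage sketch, assuming the Lucene.Net.Benchmark assembly is referenced. It builds the extractor with the constructor declared above and calls Extract(). The wrapper class, its method, and the path parameters are hypothetical names chosen for illustration. How the extractor treats a missing or non-empty output directory is not described on this page, so check that before pointing it at real data.

```csharp
using System.IO;
using Lucene.Net.Benchmarks.Utils;

// Hypothetical helper, not part of Lucene.Net.
public static class ExtractReutersExample
{
    public static void Run(string sgmPath, string outputPath)
    {
        // Both paths are supplied by the caller (illustrative parameters).
        var reutersDir = new DirectoryInfo(sgmPath);   // directory holding the Reuters SGML files
        var outputDir = new DirectoryInfo(outputPath); // directory that receives the text files

        var extractor = new ExtractReuters(reutersDir, outputDir);
        extractor.Extract();
    }
}
```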

    ExtractFile(FileInfo)

    Override this method to change what is extracted.

    Declaration
    protected virtual void ExtractFile(FileInfo sgmFile)
    Parameters
    Type Name Description
    System.IO.FileInfo sgmFile
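Because ExtractFile(FileInfo) is protected and virtual, a subclass can wrap or replace the per-file work. The sketch below is one possible shape: the subclass name and the console logging are illustrative assumptions, and it presumes (this page does not confirm it) that Extract() calls ExtractFile for each SGML file it processes.

```csharp
using System;
using System.IO;
using Lucene.Net.Benchmarks.Utils;

// Hypothetical subclass; the name and the logging are illustrative only.
public class LoggingExtractReuters : ExtractReuters
{
    public LoggingExtractReuters(DirectoryInfo reutersDir, DirectoryInfo outputDir)
        : base(reutersDir, outputDir)
    {
    }

    protected override void ExtractFile(FileInfo sgmFile)
    {
        // Report the file, then fall back to the default extraction.
        Console.WriteLine("Extracting " + sgmFile.FullName);
        base.ExtractFile(sgmFile);
    }
}
```

To change the extracted fields rather than only observe the call, the override would omit the base.ExtractFile call and write its own output instead.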

    Main(String[])

    Declaration
    public static void Main(string[] args)
    Parameters
    Type Name Description
    System.String[] args
    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.