
    extract-reuters

    Name

    benchmark-extract-reuters - Splits Reuters SGML documents into simple text files containing: Title, Date, Dateline, Body.

    Synopsis

    lucene benchmark extract-reuters <INPUT_DIRECTORY> <OUTPUT_DIRECTORY> [?|-h|--help]
    

    Arguments

    INPUT_DIRECTORY

    Path to Reuters SGML files.

    OUTPUT_DIRECTORY

    Path to a directory where the output files will be written.

    Options

    ?|-h|--help

    Prints a short help message for the command.

    Example

    Extracts the Reuters SGML files in the z:\input directory and writes the extracted text files to the z:\output directory.

    lucene benchmark extract-reuters z:\input z:\output
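
To make the extraction step more concrete, the following Python sketch illustrates the general idea of pulling the Title, Date, Dateline, and Body fields out of Reuters-style SGML. It is not the tool's implementation: the tag names, the regular expressions, the helper names (`extract_fields`, `split_documents`), and the sample input are all assumptions made for illustration, and the actual output file layout produced by the command is not shown here.

```python
import re
from typing import Dict, List

# Assumed tag names, mirroring the fields listed in the command description.
FIELDS = ("TITLE", "DATE", "DATELINE", "BODY")

# Assumption: each article is wrapped in a <REUTERS ...> ... </REUTERS> element.
DOC_RE = re.compile(r"<REUTERS\b.*?</REUTERS>", re.DOTALL)


def extract_fields(doc: str) -> Dict[str, str]:
    """Return the text of each field in one article, or '' if the tag is absent."""
    result = {}
    for tag in FIELDS:
        # "<DATE>" cannot match "<DATELINE>" because the closing ">" must follow the name.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", doc, re.DOTALL)
        result[tag] = match.group(1).strip() if match else ""
    return result


def split_documents(sgml: str) -> List[Dict[str, str]]:
    """Split an SGML string into articles and extract the fields of each one."""
    return [extract_fields(doc) for doc in DOC_RE.findall(sgml)]


# Invented sample input, used only to exercise the sketch.
SAMPLE = """<REUTERS NEWID="1">
<DATE>26-FEB-1987 15:01:01.79</DATE>
<TEXT>
<TITLE>EXAMPLE TITLE</TITLE>
<DATELINE>    LONDON, Feb 26 - </DATELINE><BODY>First paragraph.
Second line.
</BODY></TEXT>
</REUTERS>"""

docs = split_documents(SAMPLE)
for doc in docs:
    for tag in FIELDS:
        print(f"{tag}: {doc[tag]}")
```

Regular expressions are used here only because the sketch handles a small, well-formed sample; real SGML input may contain entities and irregularities that this approach does not handle.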
    
    Copyright © 2021 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.