Class ExtractReuters
Splits the Reuters SGML documents into simple text files, each containing the title, date, dateline, and body.
Inheritance
System.Object
ExtractReuters
Namespace: Lucene.Net.Benchmarks.Utils
Assembly: Lucene.Net.Benchmark.dll
Syntax
public class ExtractReuters : object
Constructors
ExtractReuters(DirectoryInfo, DirectoryInfo)
Declaration
public ExtractReuters(DirectoryInfo reutersDir, DirectoryInfo outputDir)
Parameters
Type | Name | Description |
---|---|---|
DirectoryInfo | reutersDir | |
DirectoryInfo | outputDir | |
Methods
Extract()
Declaration
public virtual void Extract()
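A minimal usage sketch of the constructor and Extract() together. The helper class `ReutersExtractionExample` and its `Run` method are hypothetical names introduced here, not part of Lucene.NET. The sketch creates both directories up front, on the assumption (not stated on this page) that ExtractReuters expects them to exist, and uses a dedicated output directory on the further assumption that the tool may overwrite its contents.

```csharp
using System;
using System.IO;
using Lucene.Net.Benchmarks.Utils;

// Hypothetical helper for illustration; not part of Lucene.NET.
public static class ReutersExtractionExample
{
    public static DirectoryInfo Run(string reutersPath, string outputPath)
    {
        // Assumption: both directories should exist before ExtractReuters is constructed.
        // Directory.CreateDirectory is a no-op for directories that already exist.
        DirectoryInfo reutersDir = Directory.CreateDirectory(reutersPath);

        // Assumption: use a dedicated output directory, since its contents may be overwritten.
        DirectoryInfo outputDir = Directory.CreateDirectory(outputPath);

        var extractor = new ExtractReuters(reutersDir, outputDir);
        extractor.Extract();

        // Refresh the cached state before handing the directory back to the caller.
        outputDir.Refresh();
        return outputDir;
    }
}
```

A caller would pass the folder holding the Reuters SGML files and a separate folder for the generated text files, for example `ReutersExtractionExample.Run("reuters-sgm", "reuters-out")`.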
ExtractFile(FileInfo)
Override if you wish to change what is extracted
Declaration
protected virtual void ExtractFile(FileInfo sgmFile)
Parameters
Type | Name | Description |
---|---|---|
FileInfo | sgmFile | |
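Because ExtractFile(FileInfo) is protected and virtual, a subclass can replace the per-file behavior. The sketch below is a dry-run variant that only records which files would be processed. The class name `ListingExtractReuters` and the `SeenFiles` property are hypothetical, and the sketch assumes that Extract() calls ExtractFile once per input file, which this page implies but does not state.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Benchmarks.Utils;

// Hypothetical subclass for illustration; not part of Lucene.NET.
public class ListingExtractReuters : ExtractReuters
{
    // Names of the input files handed to ExtractFile, in the order received.
    public List<string> SeenFiles { get; } = new List<string>();

    public ListingExtractReuters(DirectoryInfo reutersDir, DirectoryInfo outputDir)
        : base(reutersDir, outputDir)
    {
    }

    // Replaces the default extraction with a simple record of the file name.
    // Call base.ExtractFile(sgmFile) here as well to keep the default output.
    protected override void ExtractFile(FileInfo sgmFile)
    {
        SeenFiles.Add(sgmFile.Name);
    }
}
```

The subclass is used exactly like the base class: construct it with the two directories, call Extract(), then inspect `SeenFiles`.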
Main(String[])
Declaration
public static void Main(string[] args)
Parameters
Type | Name | Description |
---|---|---|
System.String[] | args | |