Class WriteLineDocTask
A task which writes documents, one line per document. Each line is in the following format: title <TAB> date <TAB> body. The output of this task can be consumed by LineDocSource and is intended to save the IO overhead of opening a file per document to be indexed.
Namespace: Lucene.Net.Benchmarks.ByTask.Tasks
Assembly: Lucene.Net.Benchmark.dll
Syntax
public class WriteLineDocTask : PerfTask, IDisposable
Remarks
The output format is determined by the output file extension. Compression is recommended when the output file is expected to be large. See FileType for information on file extensions.
Supports the following parameters:
- line.file.out: the name of the file to write the output to. This parameter is mandatory. NOTE: the file is re-created.
- line.fields: which fields should be written in each line (optional, default: DEFAULT_FIELDS).
- sufficient.fields: a comma-separated list of field names; a document is skipped only if all of them are missing. For example, to require that at least one of f1, f2 is not empty, specify "f1,f2". To specify that no field is required, i.e. that even empty docs should be emitted, specify "," (optional, default: DEFAULT_SUFFICIENT_FIELDS).
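As a rough illustration, the three parameters might be set in a benchmark algorithm file as shown here. This is a sketch that assumes the usual name=value property syntax; the output path and the field names f1 and f2 are placeholders chosen for the example, not defaults of the task.

```properties
# Hypothetical values, for illustration only.
# Mandatory: the file is re-created on each run.
line.file.out=work/docs.lines.txt
# Optional: which fields to write in each line.
line.fields=f1,f2
# Optional: skip a document only when both f1 and f2 are missing.
sufficient.fields=f1,f2
```

Setting sufficient.fields to a single comma would, per the description above, emit even empty documents.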
Constructors
WriteLineDocTask(PerfRunData)
Declaration
public WriteLineDocTask(PerfRunData runData)
Parameters
Type | Name | Description |
---|---|---|
PerfRunData | runData |
WriteLineDocTask(PerfRunData, bool)
Declaration
public WriteLineDocTask(PerfRunData runData, bool performWriteHeader)
Parameters
Type | Name | Description |
---|---|---|
PerfRunData | runData | |
bool | performWriteHeader |
Fields
DEFAULT_FIELDS
Fields to be written by default
Declaration
public static readonly string[] DEFAULT_FIELDS
Field Value
Type | Description |
---|---|
string[] |
DEFAULT_SUFFICIENT_FIELDS
Default sufficient fields: at least one of them must be present for the document not to be skipped.
Declaration
public static readonly string DEFAULT_SUFFICIENT_FIELDS
Field Value
Type | Description |
---|---|
string |
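The skip rule controlled by sufficient.fields can be sketched as follows. This is an illustrative helper, not the task's actual implementation: the class and method names are hypothetical, and a document is modeled as a plain dictionary of field name to value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Lucene.Net) illustrating the documented
// sufficient.fields rule.
public static class SufficientFieldsSketch
{
    // Parses a sufficient.fields value such as "f1,f2".
    // A value of "," yields an empty list, meaning no field is required.
    public static IReadOnlyList<string> ParseSufficientFields(string value)
    {
        return value
            .Split(',')
            .Select(name => name.Trim())
            .Where(name => name.Length > 0)
            .ToList();
    }

    // Returns true when the document should be skipped, i.e. when fields
    // are required and every one of them is missing or empty.
    public static bool ShouldSkip(
        IReadOnlyList<string> sufficientFields,
        IReadOnlyDictionary<string, string> docFields)
    {
        if (sufficientFields.Count == 0)
        {
            return false; // nothing is required, so even empty docs are emitted
        }

        foreach (var name in sufficientFields)
        {
            if (docFields.TryGetValue(name, out var fieldValue)
                && !string.IsNullOrEmpty(fieldValue))
            {
                return false; // one sufficient field is present
            }
        }

        return true;
    }
}
```

With "f1,f2" as the setting, a document that has only f2 is kept, while a document with neither field is skipped.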
FIELDS_HEADER_INDICATOR
Declaration
public const string FIELDS_HEADER_INDICATOR = "FIELDS_HEADER_INDICATOR###"
Field Value
Type | Description |
---|---|
string |
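A header line built from this constant might look like the sketch below. The layout is an assumption made for illustration (the indicator followed by tab-separated field names); the page itself only states that a header is written to say how the file should be read later. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical sketch (not the Lucene.Net implementation) of a header line.
public static class HeaderLineSketch
{
    // Values copied from the declarations on this page.
    public const string FieldsHeaderIndicator = "FIELDS_HEADER_INDICATOR###";
    public const char Sep = '\t';

    // Assumed layout: indicator, then each field name preceded by a tab.
    public static string BuildHeader(string[] fields)
    {
        return FieldsHeaderIndicator + Sep + string.Join(Sep.ToString(), fields);
    }

    // Reads the field names back if the first line of a file is a header.
    public static bool TryParseHeader(string firstLine, out string[] fields)
    {
        if (firstLine.StartsWith(FieldsHeaderIndicator, StringComparison.Ordinal))
        {
            // The first element is the indicator itself; the rest are field names.
            fields = firstLine.Split(Sep).Skip(1).ToArray();
            return true;
        }

        fields = Array.Empty<string>();
        return false;
    }
}
```

A reader can use such a check to decide whether the first line describes the fields or is already a document.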
SEP
Declaration
public const char SEP = '\t'
Field Value
Type | Description |
---|---|
char |
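To show how the separator shapes a line, here is a minimal sketch that joins title, date and body with a tab and splits them again. It is not the task's own code: the helper names are hypothetical, and replacing embedded tabs and line breaks with spaces is an assumption made so that each document stays on one line.

```csharp
using System;

// Hypothetical sketch (not the Lucene.Net implementation) of the
// title <TAB> date <TAB> body line format.
public static class LineFormatSketch
{
    // Same value as the SEP constant declared on this page.
    public const char Sep = '\t';

    // Assumption: tabs and line breaks inside a value are replaced by spaces
    // so they cannot be mistaken for field or document boundaries.
    private static string Clean(string value)
    {
        return (value ?? string.Empty)
            .Replace('\t', ' ')
            .Replace('\r', ' ')
            .Replace('\n', ' ');
    }

    // Builds one output line for a document.
    public static string FormatLine(string title, string date, string body)
    {
        return string.Join(Sep.ToString(), Clean(title), Clean(date), Clean(body));
    }

    // Splits a line back into its fields.
    public static string[] ParseLine(string line)
    {
        return line.Split(Sep);
    }
}
```

Because the values are cleaned before joining, splitting a formatted line on the separator always yields exactly three fields.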
m_fname
Declaration
protected readonly string m_fname
Field Value
Type | Description |
---|---|
string |
m_lineFileOut
Declaration
protected readonly TextWriter m_lineFileOut
Field Value
Type | Description |
---|---|
TextWriter |
Properties
SupportsParams
Subclasses that support parameters must override this property to return true if this task supports command line params.
Declaration
public override bool SupportsParams { get; }
Property Value
Type | Description |
---|---|
bool |
Overrides
Methods
Dispose(bool)
Declaration
protected override void Dispose(bool disposing)
Parameters
Type | Name | Description |
---|---|---|
bool | disposing |
Overrides
DoLogic()
Perform the task once (ignoring repetitions specification). Return number of work items done by this task. For indexing that can be number of docs added. For warming that can be number of scanned items, etc.
Declaration
public override int DoLogic()
Returns
Type | Description |
---|---|
int | Number of work items done by this task. |
Overrides
GetLogMessage(int)
Declaration
protected override string GetLogMessage(int recsCount)
Parameters
Type | Name | Description |
---|---|---|
int | recsCount |
Returns
Type | Description |
---|---|
string |
Overrides
LineFileOut(Document)
Selects the output line file for the document being written. Default: the original output line file.
Declaration
protected virtual TextWriter LineFileOut(Document doc)
Parameters
Type | Name | Description |
---|---|---|
Document | doc |
Returns
Type | Description |
---|---|
TextWriter |
SetParams(string)
Set the params (docSize only)
Declaration
public override void SetParams(string @params)
Parameters
Type | Name | Description |
---|---|---|
string | params | docSize, or 0 for no limit. |
Overrides
WriteHeader(TextWriter)
Writes a header to the line file, indicating how to read the file later.
Declaration
protected virtual void WriteHeader(TextWriter @out)
Parameters
Type | Name | Description |
---|---|---|
TextWriter | out |