Namespace Lucene.Net.Codecs.SimpleText

SimpleText codec: writes human-readable postings.

Classes

SimpleTextCodec

Plain text index format.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextDocValuesFormat

Plain text doc values format.

FOR RECREATIONAL USE ONLY

The .dat file contains the data. For numbers this is a "fixed-width" file, for example a single byte range:

 field myField
   type NUMERIC
   minvalue 0
   pattern 000
 005
 T
 234
 T
 123
 T
 ...

So a document's value (delta encoded from minvalue) can be retrieved by seeking to startOffset + (1+pattern.length()+2)*docid. The extra 1 is the newline. The extra 2 is another newline and 'T' or 'F': true if the value is real, false if missing.
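The seek arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the buffer is built by hand to match the layout described here (it is not output captured from the real writer), and the names numeric_record_offset and read_numeric are hypothetical helpers, not part of Lucene.NET.

```python
def numeric_record_offset(start_offset, pattern, doc_id):
    """Offset of doc_id's record: pattern digits, newline, T/F flag, newline."""
    return start_offset + (1 + len(pattern) + 2) * doc_id


def read_numeric(data, start_offset, pattern, min_value, doc_id):
    """Return the decoded value for doc_id, or None if the record is flagged 'F'."""
    pos = numeric_record_offset(start_offset, pattern, doc_id)
    digits = data[pos:pos + len(pattern)]
    # The flag sits right after the digits and their newline.
    flag = data[pos + len(pattern) + 1:pos + len(pattern) + 2]
    if flag != b"T":
        return None
    # Stored digits are a delta from minvalue.
    return min_value + int(digits.decode("ascii"))


# Hand-built buffer (assumption: mirrors the sample layout, third doc missing).
header = b"field myField\n  type NUMERIC\n  minvalue 0\n  pattern 000\n"
records = b"005\nT\n234\nT\n000\nF\n"
data = header + records
start_offset = len(header)

print(read_numeric(data, start_offset, "000", 0, 1))  # 234
print(read_numeric(data, start_offset, "000", 0, 2))  # None
```

Because every record has the same width, the lookup is a single multiplication followed by a slice; no scanning of earlier documents is needed.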

For bytes this is also a "fixed-width" file, for example:

 field myField
   type BINARY
   maxlength 8
   pattern 0
 length 6
 foobar[space][space]
 T
 length 3
 baz[space][space][space][space][space]
 T
 ...

So a doc's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength+2)*doc. The extra 9 is 2 newlines, plus "length " itself. The extra 2 is another newline and 'T' or 'F': true if the value is real, false if missing.
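As a rough sketch of the binary lookup, the following Python reads a record from a hand-built buffer. It assumes a pattern of one digit and a padded value width of 8 bytes (the width of the padded sample values); the helper names binary_record_offset and read_binary are hypothetical and the buffer is not real writer output.

```python
LENGTH_PREFIX = b"length "  # 7 bytes; with 2 newlines this is the "extra 9"


def binary_record_offset(start_offset, pattern, max_length, doc_id):
    """Offset of doc_id's record in the fixed-width BINARY layout."""
    return start_offset + (9 + len(pattern) + max_length + 2) * doc_id


def read_binary(data, start_offset, pattern, max_length, doc_id):
    """Return the unpadded bytes for doc_id, or None if flagged 'F'."""
    pos = binary_record_offset(start_offset, pattern, max_length, doc_id)
    pos += len(LENGTH_PREFIX)
    length = int(data[pos:pos + len(pattern)].decode("ascii"))
    pos += len(pattern) + 1              # skip the length digits and newline
    value = data[pos:pos + length]       # drop the space padding
    flag_pos = pos + max_length + 1      # skip padded value and newline
    if data[flag_pos:flag_pos + 1] != b"T":
        return None
    return value


# Hand-built buffer (assumption): two real values and one missing value.
header = b"field myField\n  type BINARY\n  maxlength 8\n  pattern 0\n"
records = (
    b"length 6\nfoobar  \nT\n"
    b"length 3\nbaz     \nT\n"
    b"length 0\n        \nF\n"
)
data = header + records
start_offset = len(header)

print(read_binary(data, start_offset, "0", 8, 1))  # b'baz'
```

The declared length, not the padding, decides where the value ends, so values that legitimately end in spaces survive the round trip.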

For sorted bytes this is a fixed-width file, for example:

 field myField
   type SORTED
   numvalues 10
   maxlength 8
   pattern 0
   ordpattern 00
 length 6
 foobar[space][space]
 length 3
 baz[space][space][space][space][space]
 ...
 03
 06
 01
 10
 ...

So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
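The two lookups for the SORTED layout (a document's stored ord, and the value at a given ord) can be sketched as follows. The buffer is hand-built and the function names are hypothetical. Whether the stored number is the ord itself or is shifted (for example, to reserve a value for missing documents) is not stated in this description, so the sketch returns the stored number unchanged and leaves any adjustment to the caller.

```python
def value_record_len(pattern, max_length):
    """Width of one value record: 'length ', digits, newline, padded value, newline."""
    return 9 + len(pattern) + max_length


def ord_section_offset(start_offset, pattern, max_length, num_values):
    """Offset where the per-document ord entries begin."""
    return start_offset + value_record_len(pattern, max_length) * num_values


def read_stored_ord(data, ord_section, ord_pattern, doc_id):
    """Return the number stored for doc_id in the ord section, unadjusted."""
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    return int(data[pos:pos + len(ord_pattern)].decode("ascii"))


def read_value(data, start_offset, pattern, max_length, ord_index):
    """Return the unpadded bytes of the value record at position ord_index."""
    pos = start_offset + value_record_len(pattern, max_length) * ord_index
    pos += len(b"length ")
    length = int(data[pos:pos + len(pattern)].decode("ascii"))
    pos += len(pattern) + 1
    return data[pos:pos + length]


# Hand-built buffer (assumption): two sorted values, then three ord entries.
values = b"length 3\nbaz     \nlength 6\nfoobar  \n"
ords = b"01\n00\n01\n"
data = values + ords
ord_section = ord_section_offset(0, "0", 8, 2)

print(ord_section)                                 # 36
print(read_stored_ord(data, ord_section, "00", 1))  # 0
print(read_value(data, 0, "0", 8, 1))              # b'foobar'
```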

For sorted set this is a fixed-width file very similar to the SORTED case, for example:

 field myField
   type SORTED_SET
   numvalues 10
   maxlength 8
   pattern 0
   ordpattern XXXXX
 length 6
 foobar[space][space]
 length 3
 baz[space][space][space][space][space]
 ...
 0,3,5   
 1,2

 10
 ...

So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord list can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. This is a comma-separated list, and it's padded with spaces to be fixed width, so trim() and split() it, and beware the empty string! An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
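A minimal sketch of the trim-and-split step, including the empty-string case, might look like this in Python. The buffer is hand-built with an ordpattern five characters wide, and parse_ord_list and read_ord_list are hypothetical helper names. As with the SORTED case, the numbers are returned exactly as stored.

```python
def parse_ord_list(field):
    """Parse one fixed-width, space-padded, comma-separated ord field."""
    text = field.decode("ascii").strip()
    if not text:
        # A document with no values is all padding; splitting "" would
        # yield [""] and int("") would fail, so handle it explicitly.
        return []
    return [int(token) for token in text.split(",")]


def read_ord_list(data, ord_section, ord_pattern, doc_id):
    """Return the list of stored ords for doc_id."""
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    return parse_ord_list(data[pos:pos + len(ord_pattern)])


# Hand-built buffer (assumption): two values, then four ord-list entries.
values = b"length 3\nbaz     \nlength 6\nfoobar  \n"
ords = b"0,1  \n1    \n     \n0    \n"
data = values + ords
ord_section = (9 + len("0") + 8) * 2  # startOffset 0, numValues 2

print(read_ord_list(data, ord_section, "XXXXX", 0))  # [0, 1]
print(read_ord_list(data, ord_section, "XXXXX", 2))  # []
```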

The reader can just scan this file when it opens, skipping over the data blocks and saving the offset/etc for each field.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextFieldInfosReader

Reads plain text field infos files.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextFieldInfosWriter

Writes plain text field infos files.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextLiveDocsFormat

Reads/writes plain text live docs.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextNormsFormat

Plain-text norms format.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextNormsFormat.SimpleTextNormsConsumer

Writes plain-text norms.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextNormsFormat.SimpleTextNormsProducer

Reads plain-text norms.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextPostingsFormat

For debugging, curiosity, transparency only!! Do not use this codec in production.

This codec stores all postings data in a single human-readable text file (_N.pst). You can view this in any text editor, and even edit it to alter your index.

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextSegmentInfoFormat

Plain text segments file format.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextSegmentInfoReader

Reads plain text segments files.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextSegmentInfoWriter

Writes plain text segments files.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextStoredFieldsFormat

Plain text stored fields format.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextStoredFieldsReader

Reads plain text stored fields.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextStoredFieldsWriter

Writes plain-text stored fields.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextTermVectorsFormat

Plain text term vectors format.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextTermVectorsReader

Reads plain-text term vectors.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk

SimpleTextTermVectorsWriter

Writes plain-text term vectors.

FOR RECREATIONAL USE ONLY

This is a Lucene.NET EXPERIMENTAL API, use at your own risk
