Namespace Lucene.Net.Codecs.SimpleText
SimpleText codec: writes human-readable postings.
Classes
SimpleTextCodec
Plain text index format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextDocValuesFormat
Plain text doc values format.
FOR RECREATIONAL USE ONLY
The .dat file contains the data. For numbers this is a "fixed-width" file; for example, with values in a single-byte range:
field myField
type NUMERIC
minvalue 0
pattern 000
005
T
234
T
123
T
...
So a document's value (delta encoded from minvalue) can be retrieved by
seeking to startOffset + (1+pattern.length()+2)*docid. The extra 1 is the newline.
The extra 2 is another newline and 'T' or 'F': true if the value is real, false if missing.
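The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`numeric_record_offset`, `read_numeric`) are hypothetical and not part of Lucene.Net, and the exact header bytes of the sample are made up so that `startOffset` has something to point past.

```python
# Hypothetical sample mirroring the NUMERIC example; the header bytes are an assumption.
HEADER = b"field myField\n  type NUMERIC\n  minvalue 0\n  pattern 000\n"
RECORDS = b"005\nT\n234\nT\n123\nT\n"
DATA = HEADER + RECORDS
START_OFFSET = len(HEADER)  # position of the first record


def numeric_record_offset(start_offset, pattern, doc_id):
    # pattern digits + newline (the extra 1) + flag and its newline (the extra 2)
    record_width = 1 + len(pattern) + 2
    return start_offset + record_width * doc_id


def read_numeric(data, start_offset, pattern, min_value, doc_id):
    pos = numeric_record_offset(start_offset, pattern, doc_id)
    delta = int(data[pos:pos + len(pattern)].decode("ascii"))
    flag_pos = pos + len(pattern) + 1  # skip the digits and one newline
    has_value = data[flag_pos:flag_pos + 1] == b"T"
    # Values are stored as deltas from minvalue; missing documents yield None.
    return min_value + delta if has_value else None


print(read_numeric(DATA, START_OFFSET, "000", 0, 1))  # value of the second document
```

Because every record has the same width, the lookup is a single multiplication and a slice; no scanning of earlier records is needed.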
For bytes this is also a "fixed-width" file, for example:
field myField
type BINARY
maxlength 8
pattern 0
length 6
foobar[space][space]
T
length 3
baz[space][space][space][space][space]
T
...
So a doc's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength+2)*doc. The extra 9 is the two newlines plus "length " itself. The extra 2 is another newline and 'T' or 'F': true if the value is real, false if missing.
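As a rough sketch of the BINARY lookup, the Python below applies the same formula to two sample records. The helper names are hypothetical, the sample starts at offset 0 with no header for brevity, and the sketch uses 8 for maxlength because the sample value lines are padded to 8 bytes.

```python
# Two hypothetical BINARY records; each value is space-padded to MAX_LENGTH bytes.
PATTERN = "0"
MAX_LENGTH = 8  # assumption: matches the 8-byte padded sample lines
DATA = b"length 6\nfoobar  \nT\n" + b"length 3\nbaz     \nT\n"
START_OFFSET = 0  # assumption: no header in this sample

LENGTH_PREFIX = len("length ")  # 7 bytes; with two newlines this is the "extra 9"


def binary_record_offset(start_offset, pattern, max_length, doc_id):
    record_width = 9 + len(pattern) + max_length + 2
    return start_offset + record_width * doc_id


def read_binary(data, start_offset, pattern, max_length, doc_id):
    pos = binary_record_offset(start_offset, pattern, max_length, doc_id)
    length_pos = pos + LENGTH_PREFIX
    length = int(data[length_pos:length_pos + len(pattern)].decode("ascii"))
    value_pos = length_pos + len(pattern) + 1  # skip the newline after the length
    flag_pos = value_pos + max_length + 1      # skip the padded value and its newline
    if data[flag_pos:flag_pos + 1] != b"T":
        return None  # 'F' marks a missing value
    return data[value_pos:value_pos + length]  # drop the padding


print(read_binary(DATA, START_OFFSET, PATTERN, MAX_LENGTH, 1))
```

The stored length is what separates real trailing spaces from padding, so the sketch slices by `length` rather than stripping whitespace.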
For sorted bytes this is a fixed-width file, for example:
field myField
type SORTED
numvalues 10
maxlength 8
pattern 0
ordpattern 00
length 6
foobar[space][space]
length 3
baz[space][space][space][space][space]
...
03
06
01
10
...
So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
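The three SORTED offsets can be sketched as follows. The helper names and the tiny two-value sample are hypothetical. The sketch returns the number exactly as stored in the ord section and, for the demonstration only, treats it as a zero-based index into the value section; whether the real format applies any offset to stored ords is not stated here.

```python
# Hypothetical SORTED sample: two values followed by ords for three documents.
PATTERN = "0"
ORD_PATTERN = "00"
MAX_LENGTH = 8
NUM_VALUES = 2
VALUES = b"length 6\nfoobar  \n" + b"length 3\nbaz     \n"
ORDS = b"01\n00\n01\n"
DATA = VALUES + ORDS
START_OFFSET = 0  # assumption: no header in this sample


def value_width(pattern, max_length):
    # "length " plus two newlines (9), the length digits, and the padded value
    return 9 + len(pattern) + max_length


def ord_section_offset(start_offset, pattern, max_length, num_values):
    return start_offset + value_width(pattern, max_length) * num_values


def read_doc_ord(data, ord_section, ord_pattern, doc_id):
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    return int(data[pos:pos + len(ord_pattern)].decode("ascii"))  # raw stored number


def read_ord_value(data, start_offset, pattern, max_length, ord_index):
    pos = start_offset + value_width(pattern, max_length) * ord_index
    length_pos = pos + len("length ")
    length = int(data[length_pos:length_pos + len(pattern)].decode("ascii"))
    value_pos = length_pos + len(pattern) + 1  # skip the newline after the length
    return data[value_pos:value_pos + length]


ord_section = ord_section_offset(START_OFFSET, PATTERN, MAX_LENGTH, NUM_VALUES)
stored = read_doc_ord(DATA, ord_section, ORD_PATTERN, 2)
print(stored, read_ord_value(DATA, START_OFFSET, PATTERN, MAX_LENGTH, stored))
```

Note that value records in this layout carry no 'T'/'F' line, which is why the record width drops the trailing +2 used for BINARY.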
For sorted set this is a fixed-width file very similar to the SORTED case, for example:
field myField
type SORTED_SET
numvalues 10
maxlength 8
pattern 0
ordpattern XXXXX
length 6
foobar[space][space]
length 3
baz[space][space][space][space][space]
...
0,3,5
1,2
10
...
So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord list can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. This is a comma-separated list, and it's padded with spaces to be fixed width, so trim() and split() it, and beware the empty string! An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
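The trim-and-split step, including the empty-string case, might look like this in Python. The helper names and the four-document sample are hypothetical; the sample contains only an ord section, so it starts at offset 0.

```python
# Hypothetical SORTED_SET ord section: one fixed-width cell per document.
ORD_PATTERN = "XXXXX"  # five characters wide
ORDS = b"0,3,5\n1,2  \n     \n10   \n"
ORD_SECTION = 0  # assumption: the sample contains only the ord section


def parse_ord_list(cell):
    trimmed = cell.strip()
    if not trimmed:
        # A document with no values is all padding. Splitting "" would give [""],
        # and int("") raises ValueError, so handle it before splitting.
        return []
    return [int(part) for part in trimmed.split(",")]


def read_doc_ords(data, ord_section, ord_pattern, doc_id):
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    cell = data[pos:pos + len(ord_pattern)].decode("ascii")
    return parse_ord_list(cell)


for doc_id in range(4):
    print(doc_id, read_doc_ords(ORDS, ORD_SECTION, ORD_PATTERN, doc_id))
```

Returning an empty list for an all-space cell keeps callers from having to special-case documents that have no values.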
The reader can just scan this file when it opens, skipping over the data blocks and saving the offset, etc. for each field.
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextDocValuesReader
SimpleText codec: writes human-readable postings.
SimpleTextDocValuesWriter
SimpleText codec: writes human-readable postings.
SimpleTextFieldInfosFormat
Plain text field infos format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextFieldInfosReader
Reads plain text field infos files.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextFieldInfosWriter
Writes plain text field infos files.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextLiveDocsFormat
Reads/writes plain text live docs.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextNormsFormat
Plain-text norms format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextNormsFormat.SimpleTextNormsConsumer
Writes plain-text norms.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextNormsFormat.SimpleTextNormsProducer
Reads plain-text norms.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextPostingsFormat
For debugging, curiosity, transparency only!! Do not use this codec in production.
This codec stores all postings data in a single human-readable text file (_N.pst). You can view this in any text editor, and even edit it to alter your index.
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextSegmentInfoFormat
Plain text segments file format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextSegmentInfoReader
Reads plain text segments files.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextSegmentInfoWriter
Writes plain text segments files.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextStoredFieldsFormat
Plain text stored fields format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextStoredFieldsReader
Reads plain text stored fields.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextStoredFieldsWriter
Writes plain-text stored fields.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextTermVectorsFormat
Plain text term vectors format.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextTermVectorsReader
Reads plain-text term vectors.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.
SimpleTextTermVectorsWriter
Writes plain-text term vectors.
FOR RECREATIONAL USE ONLY
Note
This API is experimental and might change in incompatible ways in the next release.