
    Namespace Lucene.Net.Codecs.SimpleText

    SimpleText codec: writes human-readable postings.

    Classes

    SimpleTextCodec

    Plain text index format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextDocValuesFormat

    Plain text doc values format.

    FOR RECREATIONAL USE ONLY

    The .dat file contains the data. For numbers this is a "fixed-width" file. For example, for values in a single-byte range:

     field myField
       type NUMERIC
       minvalue 0
       pattern 000
     005
     T
     234
     T
     123
     T
     ...

    So a document's value (delta-encoded from minvalue) can be retrieved by seeking to startOffset + (1+pattern.length()+2)*docid. The extra 1 is the newline after the value. The extra 2 is another newline plus 'T' or 'F': 'T' if the value is real, 'F' if it is missing.
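
    The seek arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the Lucene.NET reader: the function names are hypothetical, and it assumes ASCII data, single-byte '\n' newlines, and a data block held in memory with the field header already skipped.

```python
def numeric_record_offset(start_offset, pattern, doc_id):
    """Byte offset of the record for doc_id in a NUMERIC data block."""
    # Each record is: <delta digits> '\n' <'T' or 'F'> '\n'
    record_width = 1 + len(pattern) + 2
    return start_offset + record_width * doc_id


def read_numeric(data, start_offset, pattern, min_value, doc_id):
    """Return the document's value, or None if the record is marked 'F'."""
    pos = numeric_record_offset(start_offset, pattern, doc_id)
    delta = int(data[pos:pos + len(pattern)].decode("ascii"))
    flag_pos = pos + len(pattern) + 1  # skip the digits and one newline
    flag = data[flag_pos:flag_pos + 1]
    return min_value + delta if flag == b"T" else None


# The three sample records shown above (hypothetical start_offset of 0).
block = b"005\nT\n234\nT\n123\nT\n"
print(read_numeric(block, 0, "000", 0, 1))  # prints 234
```

    With pattern 000 every record is 6 bytes wide, so document 1 starts at byte 6.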

    For bytes this is also a "fixed-width" file, for example:

     field myField
       type BINARY
       maxlength 8
       pattern 0
     length 6
     foobar[space][space]
     T
     length 3
     baz[space][space][space][space][space]
     T
     ...

    So a doc's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength+2)*doc. The extra 9 is 2 newlines plus "length " itself. The extra 2 is another newline plus 'T' or 'F': 'T' if the value is real, 'F' if it is missing.
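
    As a rough sketch of the BINARY lookup described above (hypothetical function names, not the library's API), the following Python assumes ASCII data, single-byte '\n' newlines, and values padded to 8 bytes, which is the width of the padded sample rows.

```python
LENGTH_PREFIX = b"length "  # 7 bytes; with 2 newlines this is "the extra 9"


def binary_record_offset(start_offset, pattern, max_length, doc_id):
    """Byte offset of the record for doc_id in a BINARY data block."""
    record_width = 9 + len(pattern) + max_length + 2
    return start_offset + record_width * doc_id


def read_binary(data, start_offset, pattern, max_length, doc_id):
    """Return the unpadded value, or None if the record is marked 'F'."""
    pos = binary_record_offset(start_offset, pattern, max_length, doc_id)
    pos += len(LENGTH_PREFIX)
    length = int(data[pos:pos + len(pattern)].decode("ascii"))
    pos += len(pattern) + 1           # skip the length digits and a newline
    value = data[pos:pos + length]    # drop the trailing space padding
    flag_pos = pos + max_length + 1   # skip the padded value and a newline
    flag = data[flag_pos:flag_pos + 1]
    return value if flag == b"T" else None


# Two sample records, each 9 + 1 + 8 + 2 = 20 bytes wide.
block = b"length 6\nfoobar  \nT\nlength 3\nbaz     \nT\n"
print(read_binary(block, 0, "0", 8, 1))  # prints b'baz'
```

    The stored length is what allows the padding to be stripped without guessing whether trailing spaces belong to the value.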

    For sorted bytes this is a fixed-width file, for example:

     field myField
       type SORTED
       numvalues 10
       maxlength 8
       pattern 0
       ordpattern 00
     length 6
     foobar[space][space]
     length 3
     baz[space][space][space][space][space]
     ...
     03
     06
     01
     10
     ...

    So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
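
    The two-level lookup for SORTED fields can be illustrated as follows. This is a minimal sketch with hypothetical names, assuming ASCII data and single-byte '\n' newlines. The text does not say whether the stored ord numbers are zero-based, so the sketch returns them exactly as stored.

```python
def value_record_width(pattern, max_length):
    # "length " (7 bytes) + length digits + newline + padded value + newline
    return 9 + len(pattern) + max_length


def ord_section_offset(start_offset, pattern, max_length, num_values):
    """Offset of the first per-document ord record."""
    return start_offset + value_record_width(pattern, max_length) * num_values


def read_stored_ord(data, ord_section, ord_pattern, doc_id):
    """Return the number stored for doc_id, without reinterpreting it."""
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    return int(data[pos:pos + len(ord_pattern)].decode("ascii"))


def read_value(data, start_offset, pattern, max_length, value_index):
    """Return the unpadded value stored at the given position in the value list."""
    pos = start_offset + value_record_width(pattern, max_length) * value_index
    pos += len(b"length ")
    length = int(data[pos:pos + len(pattern)].decode("ascii"))
    pos += len(pattern) + 1  # skip the length digits and a newline
    return data[pos:pos + length]


# A made-up block: two 18-byte value records, then ord records for three docs.
block = b"length 6\nfoobar  \nlength 3\nbaz     \n" + b"01\n00\n01\n"
ords = ord_section_offset(0, "0", 8, 2)
print(ords)                                  # prints 36
print(read_stored_ord(block, ords, "00", 2)) # prints 1
print(read_value(block, 0, "0", 8, 1))       # prints b'baz'
```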

    For sorted set this is a fixed-width file very similar to the SORTED case, for example:

     field myField
       type SORTED_SET
       numvalues 10
       maxlength 8
       pattern 0
       ordpattern XXXXX
     length 6
     foobar[space][space]
     length 3
     baz[space][space][space][space][space]
     ...
     0,3,5   
     1,2
    
     10
     ...

    So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord list can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. This is a comma-separated list, and it's padded with spaces to be fixed width, so trim() and split() it, and beware the empty string! An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
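
    The "trim, split, and beware the empty string" advice can be sketched like this in Python, where strip() plays the role of trim(). The function names are hypothetical, and the sketch assumes ASCII data and single-byte '\n' newlines.

```python
def parse_ord_list(field):
    """Parse one fixed-width, space-padded, comma-separated ord list."""
    text = field.strip()
    if not text:
        # A document with no values is stored as padding only.
        # Note that "".split(",") returns [""], which int() would reject.
        return []
    return [int(part) for part in text.split(",")]


def read_ord_list(data, ord_section, ord_pattern, doc_id):
    """Seek to the ord list for doc_id and parse it."""
    pos = ord_section + (1 + len(ord_pattern)) * doc_id
    field = data[pos:pos + len(ord_pattern)].decode("ascii")
    return parse_ord_list(field)


# The four sample ord rows above, padded to the 5-character ordpattern.
block = b"0,3,5\n1,2  \n     \n10   \n"
print(read_ord_list(block, 0, "XXXXX", 0))  # prints [0, 3, 5]
print(read_ord_list(block, 0, "XXXXX", 2))  # prints []
```

    Checking for the empty string before splitting is what keeps a document without values from raising an error.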

    The reader can just scan this file when it opens, skipping over the data blocks and saving the offset/etc for each field.

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextDocValuesReader

    SimpleTextDocValuesWriter

    SimpleTextFieldInfosFormat

    Plain text field infos format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextFieldInfosReader

    Reads plain text field infos files.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextFieldInfosWriter

    Writes plain text field infos files.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextLiveDocsFormat

    Reads/writes plain text live docs.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextNormsFormat

    Plain-text norms format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextNormsFormat.SimpleTextNormsConsumer

    Writes plain-text norms.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextNormsFormat.SimpleTextNormsProducer

    Reads plain-text norms.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextPostingsFormat

    For debugging, curiosity, transparency only!! Do not use this codec in production.

    This codec stores all postings data in a single human-readable text file (_N.pst). You can view this in any text editor, and even edit it to alter your index.

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextSegmentInfoFormat

    Plain text segments file format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextSegmentInfoReader

    Reads plain text segments files.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextSegmentInfoWriter

    Writes plain text segments files.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextStoredFieldsFormat

    Plain text stored fields format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextStoredFieldsReader

    Reads plain text stored fields.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextStoredFieldsWriter

    Writes plain-text stored fields.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextTermVectorsFormat

    Plain text term vectors format.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextTermVectorsReader

    Reads plain-text term vectors.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk

    SimpleTextTermVectorsWriter

    Writes plain-text term vectors.

    FOR RECREATIONAL USE ONLY

    This is a Lucene.NET EXPERIMENTAL API, use at your own risk
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)