    Namespace Lucene.Net.Codecs.SimpleText

    Simpletext Codec: writes human readable postings.

    Classes

    SimpleTextCodec

    Plain text index format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextDocValuesFormat

    Plain text doc values format.

    FOR RECREATIONAL USE ONLY

    The .dat file contains the data. For numbers this is a "fixed-width" file, for example a single byte range:

    field myField
      type NUMERIC
      minvalue 0
      pattern 000
    005
    T
    234
    T
    123
    T
    ...
    So a document's value (delta-encoded from minvalue) can be retrieved by seeking to startOffset + (1+pattern.length()+2)*docid. The extra 1 is the newline after the value. The extra 2 is another newline and 'T' or 'F': 'T' if the value is real, 'F' if it is missing.
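
The seek arithmetic above can be sketched in Python. This is an illustration only, not the Lucene.Net reader: the function names are hypothetical, the data is assumed to be ASCII (one byte per character), and `start_offset` is assumed to point at the first entry after the field header.

```python
from typing import Optional


def numeric_entry_offset(start_offset: int, pattern: str, doc_id: int) -> int:
    # Each entry is: value digits, newline (1), then 'T'/'F' and newline (2).
    return start_offset + (1 + len(pattern) + 2) * doc_id


def read_numeric(data: bytes, start_offset: int, pattern: str,
                 min_value: int, doc_id: int) -> Optional[int]:
    # Hypothetical helper: returns None when the entry is flagged 'F' (missing).
    pos = numeric_entry_offset(start_offset, pattern, doc_id)
    digits = data[pos:pos + len(pattern)]
    flag_pos = pos + len(pattern) + 1  # skip the digits and their newline
    if data[flag_pos:flag_pos + 1] != b"T":
        return None
    # Stored values are deltas from minvalue.
    return min_value + int(digits)


# Made-up sample matching the layout shown above (pattern "000", minvalue 0).
sample = b"005\nT\n234\nT\n123\nT\n"
value_for_doc_1 = read_numeric(sample, 0, "000", 0, 1)
```

Because every entry has the same width, the lookup is a single multiplication and never scans earlier documents.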

    For bytes this is also a "fixed-width" file, for example:

    field myField
      type BINARY
      maxlength 6
      pattern 0
    length 6
    foobar[space][space]
    T
    length 3
    baz[space][space][space][space][space]
    T
    ...

    So a document's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength+2)*docid. The extra 9 is 2 newlines plus "length " itself. The extra 2 is another newline and 'T' or 'F': 'T' if the value is real, 'F' if it is missing.
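
A minimal Python sketch of the binary lookup, under the same assumptions as before: ASCII data, hypothetical function names, and a `start_offset` that points at the first "length " line. The sample bytes are made up for the illustration and use maxlength 6.

```python
from typing import Optional

LENGTH_PREFIX = b"length "  # 7 bytes; with the 2 newlines this gives the "extra 9"


def binary_entry_offset(start_offset: int, pattern: str,
                        max_length: int, doc_id: int) -> int:
    return start_offset + (9 + len(pattern) + max_length + 2) * doc_id


def read_binary(data: bytes, start_offset: int, pattern: str,
                max_length: int, doc_id: int) -> Optional[bytes]:
    # Hypothetical helper: returns None when the entry is flagged 'F' (missing).
    pos = binary_entry_offset(start_offset, pattern, max_length, doc_id)
    pos += len(LENGTH_PREFIX)
    length = int(data[pos:pos + len(pattern)])
    pos += len(pattern) + 1            # skip the length digits and newline
    value = data[pos:pos + length]     # drop the space padding
    pos += max_length + 1              # skip the padded value and newline
    if data[pos:pos + 1] != b"T":
        return None
    return value


# Two made-up entries: "foobar" (length 6) and "baz" padded to 6 bytes.
sample = b"length 6\nfoobar\nT\nlength 3\nbaz   \nT\n"
second_value = read_binary(sample, 0, "0", 6, 1)
```

The stored length is what lets the reader strip the padding, so values that legitimately end in spaces survive the round trip.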

    For sorted bytes this is a fixed-width file, for example:

    field myField
      type SORTED
      numvalues 10
      maxLength 8
      pattern 0
      ordpattern 00
    length 6
    foobar[space][space]
    length 3
    baz[space][space][space][space][space]
    ...
    03
    06
    01
    10
    ...

    So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
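
The two-level lookup for SORTED fields can be sketched as follows. The function names are hypothetical and the data is assumed to be ASCII. The text above does not say whether the number stored per document is the ord itself or a shifted form of it, so this sketch returns the stored number unmodified and leaves that interpretation to the caller.

```python
def ord_section_offset(start_offset: int, pattern: str,
                       max_length: int, num_values: int) -> int:
    # Value entries have no 'T'/'F' line, so each is 9 + pattern + maxlength bytes.
    return start_offset + (9 + len(pattern) + max_length) * num_values


def read_doc_ord(data: bytes, ord_section: int, ordpattern: str, doc_id: int) -> int:
    # Each ord entry is the ord digits plus one newline.
    pos = ord_section + (1 + len(ordpattern)) * doc_id
    return int(data[pos:pos + len(ordpattern)])


def read_value_by_ord(data: bytes, start_offset: int, pattern: str,
                      max_length: int, ord_index: int) -> bytes:
    pos = start_offset + (9 + len(pattern) + max_length) * ord_index
    pos += len(b"length ")
    length = int(data[pos:pos + len(pattern)])
    pos += len(pattern) + 1  # skip the length digits and newline
    return data[pos:pos + length]


# Made-up sample: two values (maxlength 6) followed by three per-document entries.
sample = b"length 6\nfoobar\nlength 3\nbaz   \n" + b"01\n00\n01\n"
ords_start = ord_section_offset(0, "0", 6, 2)
stored_for_doc_2 = read_doc_ord(sample, ords_start, "00", 2)
```

Storing each distinct value once and pointing documents at it by ord keeps the per-document section small and fixed-width.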

    For sorted set this is a fixed-width file very similar to the SORTED case, for example:

    field myField
      type SORTED_SET
      numvalues 10
      maxLength 8
      pattern 0
      ordpattern XXXXX
    length 6
    foobar[space][space]
    length 3
    baz[space][space][space][space][space]
    ...
    0,3,5   
    1,2
    
    10
    ...

    So the "ord section" begins at startOffset + (9+pattern.length+maxlength)*numValues. A document's ord list can be retrieved by seeking to "ord section" + (1+ordpattern.length())*docid. This is a comma-separated list, and it's padded with spaces to be fixed width, so trim() and split() it, and beware the empty string! An ord's value can be retrieved by seeking to startOffset + (9+pattern.length+maxlength)*ord.
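
The "trim, split, and beware the empty string" advice can be illustrated with a short Python sketch. The function names are hypothetical and the data is assumed to be ASCII. In Python, splitting an empty string on a comma yields a list containing one empty string, which is exactly the trap the text warns about, so the empty case is handled before splitting.

```python
from typing import List


def parse_ord_list(raw: str) -> List[int]:
    trimmed = raw.strip()
    if not trimmed:
        # A document with no values is stored as an all-space line.
        return []
    return [int(part) for part in trimmed.split(",")]


def read_doc_ords(data: bytes, ord_section: int,
                  ordpattern: str, doc_id: int) -> List[int]:
    # Each entry is ordpattern.length characters plus one newline.
    pos = ord_section + (1 + len(ordpattern)) * doc_id
    raw = data[pos:pos + len(ordpattern)].decode("ascii")
    return parse_ord_list(raw)


# Made-up ord section for four documents with ordpattern "XXXXX" (width 5).
ords = b"0,3,5\n1,2  \n     \n10   \n"
doc_2_ords = read_doc_ords(ords, 0, "XXXXX", 2)
```

Here `read_doc_ords(ords, 0, "XXXXX", 0)` yields three ords, while the third document, whose line is all spaces, yields an empty list rather than a parse error.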

    The reader can just scan this file when it opens, skipping over the data blocks and saving the offset and other header metadata for each field.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextDocValuesReader

    Simpletext Codec: writes human readable postings.

    SimpleTextDocValuesWriter

    Simpletext Codec: writes human readable postings.

    SimpleTextFieldInfosFormat

    Plain text field infos format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextFieldInfosReader

    Reads plain text field infos files.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextFieldInfosWriter

    Writes plain text field infos files.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextLiveDocsFormat

    Reads/writes plain text live docs.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextNormsFormat

    Plain-text norms format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextNormsFormat.SimpleTextNormsConsumer

    Writes plain-text norms.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextNormsFormat.SimpleTextNormsProducer

    Reads plain-text norms.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextPostingsFormat

    For debugging, curiosity, transparency only!! Do not use this codec in production.

    This codec stores all postings data in a single human-readable text file (_N.pst). You can view this in any text editor, and even edit it to alter your index.

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextSegmentInfoFormat

    Plain text segments file format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextSegmentInfoReader

    Reads plaintext segments files.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextSegmentInfoWriter

    Writes plain text segments files.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextStoredFieldsFormat

    Plain text stored fields format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextStoredFieldsReader

    Reads plain text stored fields.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextStoredFieldsWriter

    Writes plain-text stored fields.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextTermVectorsFormat

    Plain text term vectors format.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextTermVectorsReader

    Reads plain-text term vectors.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    SimpleTextTermVectorsWriter

    Writes plain-text term vectors.

    FOR RECREATIONAL USE ONLY

    Note

    This API is experimental and might change in incompatible ways in the next release.

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.