
    Enum WordDelimiterFlags

    Configuration options for the WordDelimiterFilter.

    LUCENENET specific: in Lucene (Java), these options were passed as int constant flags; here they are exposed as a [Flags] enum.

    Namespace: Lucene.Net.Analysis.Miscellaneous
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    [Flags]
    public enum WordDelimiterFlags
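
Because the enum is marked [Flags], individual options can be combined with the bitwise OR operator and tested with Enum.HasFlag. The following is a minimal sketch, assuming the Lucene.Net.Analysis.Common package is referenced; the class and method names (FlagCombinationSketch, BuildFlags) are hypothetical and exist only for illustration.

```csharp
using System;
using Lucene.Net.Analysis.Miscellaneous;

// Hypothetical helper class, not part of Lucene.NET.
public static class FlagCombinationSketch
{
    // Combine several options into one value with bitwise OR.
    public static WordDelimiterFlags BuildFlags()
    {
        return WordDelimiterFlags.GENERATE_WORD_PARTS
             | WordDelimiterFlags.GENERATE_NUMBER_PARTS
             | WordDelimiterFlags.SPLIT_ON_CASE_CHANGE;
    }

    public static void Main()
    {
        WordDelimiterFlags flags = BuildFlags();

        // HasFlag reports whether a given option is part of the combination.
        Console.WriteLine(flags.HasFlag(WordDelimiterFlags.GENERATE_WORD_PARTS));
        Console.WriteLine(flags.HasFlag(WordDelimiterFlags.CATENATE_ALL));

        // Clear a single option by AND-ing with its complement.
        flags &= ~WordDelimiterFlags.SPLIT_ON_CASE_CHANGE;
        Console.WriteLine(flags.HasFlag(WordDelimiterFlags.SPLIT_ON_CASE_CHANGE));
    }
}
```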

    Fields

    Name Description
    CATENATE_ALL

    Causes all subword parts to be catenated:

    "wi-fi-4000" => "wifi4000"

    CATENATE_NUMBERS

    Causes maximum runs of number parts to be catenated:

    "500-42" => "50042"

    CATENATE_WORDS

    Causes maximum runs of word parts to be catenated:

    "wi-fi" => "wifi"

    GENERATE_NUMBER_PARTS

    Causes number subwords to be generated:

    "500-42" => "500" "42"

    GENERATE_WORD_PARTS

    Causes parts of words to be generated:

    "PowerShot" => "Power" "Shot"

    PRESERVE_ORIGINAL

    Causes original words to be preserved and added to the subword list (defaults to false):

    "500-42" => "500" "42" "500-42"

    SPLIT_ON_CASE_CHANGE

    If not set, causes case changes to be ignored (subwords will only be generated given SUBWORD_DELIM tokens).

    SPLIT_ON_NUMERICS

    If not set, causes numeric changes to be ignored (subwords will only be generated given SUBWORD_DELIM tokens).

    STEM_ENGLISH_POSSESSIVE

    Causes trailing "'s" to be removed for each subword:

    "O'Neil's" => "O" "Neil"
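
To see how a given combination of flags affects tokenization, the options can be passed to a WordDelimiterFilter and the resulting terms collected. The sketch below assumes Lucene.NET 4.8 (LuceneVersion.LUCENE_48) and the four-argument WordDelimiterFilter constructor with no protected words; the WordDelimiterDemo class and its Tokenize helper are hypothetical names used only for this example. No expected output is shown, since the exact set and order of emitted tokens depend on the flags chosen.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical demo class, not part of Lucene.NET.
public static class WordDelimiterDemo
{
    // Splits the text on whitespace, runs each token through a
    // WordDelimiterFilter configured with the given flags, and
    // returns the emitted terms in order.
    public static List<string> Tokenize(string text, WordDelimiterFlags flags)
    {
        var terms = new List<string>();
        Tokenizer source = new WhitespaceTokenizer(LuceneVersion.LUCENE_48, new StringReader(text));

        // Passing null for the protected-words set means no token is exempt from splitting.
        using (TokenStream stream = new WordDelimiterFilter(LuceneVersion.LUCENE_48, source, flags, null))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());
            }
            stream.End();
        }
        return terms;
    }

    public static void Main()
    {
        WordDelimiterFlags flags = WordDelimiterFlags.GENERATE_WORD_PARTS
                                 | WordDelimiterFlags.GENERATE_NUMBER_PARTS
                                 | WordDelimiterFlags.SPLIT_ON_CASE_CHANGE
                                 | WordDelimiterFlags.PRESERVE_ORIGINAL;

        foreach (string term in Tokenize("PowerShot 500-42", flags))
        {
            Console.WriteLine(term);
        }
    }
}
```

Swapping PRESERVE_ORIGINAL for CATENATE_ALL, or removing SPLIT_ON_CASE_CHANGE, and rerunning the sketch is a quick way to compare the behaviors described in the table above.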

    Copyright © 2020 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.