    Enum WordDelimiterFlags

    Configuration options for the WordDelimiterFilter.

    LUCENENET specific - these options were passed as int constant flags in Lucene.

    Namespace: Lucene.Net.Analysis.Miscellaneous
    Assembly: Lucene.Net.Analysis.Common.dll
    Syntax
    public enum WordDelimiterFlags : int
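
    Because these options are bit flags, several of them can be combined into a single value with the bitwise OR operator and tested with a bitwise AND. The following is a minimal sketch, assuming the Lucene.Net.Analysis.Common assembly is referenced; the class name FlagCombinationSketch and its methods BuildFlags and IsSet are hypothetical helpers used only for illustration.

```csharp
using Lucene.Net.Analysis.Miscellaneous;

public static class FlagCombinationSketch
{
    // Hypothetical helper: combines three options into one flags value.
    public static WordDelimiterFlags BuildFlags()
    {
        return WordDelimiterFlags.GENERATE_WORD_PARTS
             | WordDelimiterFlags.GENERATE_NUMBER_PARTS
             | WordDelimiterFlags.CATENATE_WORDS;
    }

    // Hypothetical helper: true when every bit of 'flag' is present in 'flags'.
    public static bool IsSet(WordDelimiterFlags flags, WordDelimiterFlags flag)
    {
        return (flags & flag) == flag;
    }
}
```

    With this sketch, IsSet(BuildFlags(), WordDelimiterFlags.CATENATE_WORDS) is true, while the same check for SPLIT_ON_CASE_CHANGE is false because that option was not included.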

    Fields

    Name Description
    CATENATE_ALL

    Causes all subword parts to be catenated:

    "wi-fi-4000" => "wifi4000"

    CATENATE_NUMBERS

    Causes maximum runs of number parts to be catenated:

    "500-42" => "50042"

    CATENATE_WORDS

    Causes maximum runs of word parts to be catenated:

    "wi-fi" => "wifi"

    GENERATE_NUMBER_PARTS

    Causes number subwords to be generated:

    "500-42" => "500" "42"

    GENERATE_WORD_PARTS

    Causes parts of words to be generated:

    "PowerShot" => "Power" "Shot"

    PRESERVE_ORIGINAL

    Causes original words to be preserved and added to the subword list (defaults to false):

    "500-42" => "500" "42" "500-42"

    SPLIT_ON_CASE_CHANGE

    If not set, case changes are ignored (subwords are generated only at SUBWORD_DELIM tokens).

    SPLIT_ON_NUMERICS

    If not set, transitions between letters and digits are ignored (subwords are generated only at SUBWORD_DELIM tokens).

    STEM_ENGLISH_POSSESSIVE

    Causes trailing "'s" to be removed for each subword:

    "O'Neil's" => "O" "Neil"

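    To see how a given combination of the fields above affects the emitted tokens, the filter can be run over a short input and its terms collected. The sketch below is illustrative only. It assumes Lucene.NET 4.8 and the WordDelimiterFilter constructor overload that takes a LuceneVersion, a TokenStream, a WordDelimiterFlags value, and a CharArraySet of protected words (null here, meaning none). The class WordDelimiterDemo and the method Analyze are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class WordDelimiterDemo
{
    // Hypothetical helper: splits 'text' on whitespace, applies the filter
    // with the given flags, and returns the resulting terms in order.
    public static List<string> Analyze(string text, WordDelimiterFlags flags)
    {
        var terms = new List<string>();
        using (var reader = new StringReader(text))
        {
            Tokenizer tokenizer = new WhitespaceTokenizer(LuceneVersion.LUCENE_48, reader);

            // Assumed overload: no custom character type table, no protected words.
            using (TokenStream stream = new WordDelimiterFilter(
                LuceneVersion.LUCENE_48, tokenizer, flags, null))
            {
                ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
                stream.Reset();
                while (stream.IncrementToken())
                {
                    terms.Add(term.ToString());
                }
                stream.End();
            }
        }
        return terms;
    }
}
```

    For example, calling Analyze("PowerShot", WordDelimiterFlags.GENERATE_WORD_PARTS | WordDelimiterFlags.SPLIT_ON_CASE_CHANGE) should produce the word parts described under GENERATE_WORD_PARTS. Dropping SPLIT_ON_CASE_CHANGE from the flags is a quick way to check how that option changes the result.
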
    Copyright © 2020 Licensed to the Apache Software Foundation (ASF)