Class FreeTextSuggester
Builds an ngram model from the text sent to Build(IInputIterator, Double)
and predicts based on the last grams-1 tokens in
the request sent to DoLookup(String, IEnumerable<BytesRef>, Boolean, Int32). This tries to
handle the "long tail" of suggestions for when the
incoming query is a never-before-seen query string.
Likely this suggester would only be used as a
fallback, when the primary suggester fails to find
any suggestions.
Note that the weight for each suggestion is unused,
and the suggestions are the analyzed forms (so your
analysis process should normally be very "light").
This uses the stupid backoff language model to smooth
scores across ngram models; see
"Large language models in machine translation" for details.
From DoLookup(String, IEnumerable<BytesRef>, Boolean, Int32), the key of each result is the
ngram token; the value is System.Int64.MaxValue * score (fixed
point, cast to long). Divide by System.Int64.MaxValue to get
the score back, which ranges from 0.0 to 1.0.
onlyMorePopular is unused.
@lucene.experimental
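The fixed-point encoding described above can be undone with a single division. The following is a minimal sketch rather than part of Lucene.NET: the FreeTextScoreSketch class and its ToScore method are hypothetical names, and value stands for the value carried by one lookup result.

```csharp
using System;

public static class FreeTextScoreSketch
{
    // Converts the fixed-point value of a lookup result back to a score.
    // Assumption: value was produced as long.MaxValue * score, as the
    // class summary describes, so the result lies between 0.0 and 1.0.
    public static double ToScore(long value)
    {
        return value / (double)long.MaxValue;
    }
}
```

The division is done in double precision, so the recovered score is an approximation of the original and not an exact round trip.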
Inheritance
System.Object
Lookup
FreeTextSuggester
Inherited Members
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.ToString()
Assembly: Lucene.Net.Suggest.dll
Syntax
public class FreeTextSuggester : Lookup
Constructors
FreeTextSuggester(Analyzer)
Instantiate, using the provided analyzer for both
indexing and lookup, using bigram model by default.
Declaration
public FreeTextSuggester(Analyzer analyzer)
Parameters
FreeTextSuggester(Analyzer, Analyzer)
Instantiate, using the provided indexing and lookup
analyzers, using bigram model by default.
Declaration
public FreeTextSuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer)
Parameters
FreeTextSuggester(Analyzer, Analyzer, Int32)
Instantiate, using the provided indexing and lookup
analyzers, with the specified model (2
= bigram, 3 = trigram, etc.).
Declaration
public FreeTextSuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer, int grams)
Parameters
Type | Name | Description |
Analyzer | indexAnalyzer | |
Analyzer | queryAnalyzer | |
System.Int32 | grams | |
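The class summary says that prediction is based on the last grams-1 tokens of the request. A minimal sketch of that selection step follows, assuming the request has already been split into tokens; NgramContextSketch and LastContext are hypothetical names and do not appear in Lucene.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NgramContextSketch
{
    // Returns the trailing tokens that an ngram model of the given order
    // would condition on: at most grams - 1 tokens, fewer if the request
    // is shorter than that.
    public static List<string> LastContext(IReadOnlyList<string> tokens, int grams)
    {
        if (grams < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(grams), "grams must be at least 1");
        }
        int take = Math.Min(grams - 1, tokens.Count);
        return tokens.Skip(tokens.Count - take).ToList();
    }
}
```

With grams set to 3 (a trigram model), only the final two tokens of the request influence the prediction under this sketch.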
FreeTextSuggester(Analyzer, Analyzer, Int32, Byte)
Instantiate, using the provided indexing and lookup
analyzers, and specified model (2 = bigram, 3 =
trigram, etc.). The separator is passed to
SetTokenSeparator(String) to join multiple
tokens into a single ngram token; it must be an ASCII
(7-bit-clean) byte. No input tokens should have this
byte, otherwise System.ArgumentException is
thrown.
Declaration
public FreeTextSuggester(Analyzer indexAnalyzer, Analyzer queryAnalyzer, int grams, byte separator)
Parameters
Type | Name | Description |
Analyzer | indexAnalyzer | |
Analyzer | queryAnalyzer | |
System.Int32 | grams | |
System.Byte | separator | |
Fields
ALPHA
The constant used for backoff smoothing; during
lookup, this means that if a given trigram did not
occur, and we back off to the bigram, the overall score
will be 0.4 times what the bigram model would have
assigned.
Declaration
public const double ALPHA = 0.4
Field Value
Type | Description |
System.Double | |
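The effect of ALPHA can be illustrated with a small stand-alone scorer. This is a sketch of the stupid backoff scheme in general, not Lucene.NET's internal code: StupidBackoffSketch, Score, the space-joined ngram keys, and the counts dictionary are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StupidBackoffSketch
{
    // Same value as FreeTextSuggester.ALPHA.
    public const double Alpha = 0.4;

    // counts maps an ngram (tokens joined by one space) to its frequency;
    // totalTokens is the number of token occurrences in the training text.
    public static double Score(Dictionary<string, long> counts, long totalTokens,
                               IReadOnlyList<string> context, string word)
    {
        double penalty = 1.0;
        // Try the longest context first, then drop its oldest token each round.
        for (int start = 0; start <= context.Count; start++)
        {
            List<string> history = context.Skip(start).ToList();
            string ngram = string.Join(" ", history.Append(word));
            if (counts.TryGetValue(ngram, out long ngramCount) && ngramCount > 0)
            {
                // Relative frequency of the ngram given its history;
                // for a unigram the denominator is the corpus size.
                long historyCount = totalTokens;
                if (history.Count > 0 &&
                    !counts.TryGetValue(string.Join(" ", history), out historyCount))
                {
                    historyCount = 0;
                }
                if (historyCount > 0)
                {
                    return penalty * ngramCount / historyCount;
                }
            }
            // The ngram was not seen: back off and multiply in Alpha once more.
            penalty *= Alpha;
        }
        return 0.0;
    }
}
```

In this sketch each backoff step multiplies the score by Alpha, so an unseen trigram that is found as a bigram scores 0.4 times the bigram's relative frequency.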
CODEC_NAME
Codec name used in the header for the saved model.
Declaration
public const string CODEC_NAME = "freetextsuggest"
Field Value
Type | Description |
System.String | |
DEFAULT_GRAMS
By default we use a bigram model.
Declaration
public const int DEFAULT_GRAMS = 2
Field Value
Type | Description |
System.Int32 | |
DEFAULT_SEPARATOR
The default character used to join multiple tokens
into a single ngram token. The input tokens produced
by the analyzer must not contain this character.
Declaration
public const byte DEFAULT_SEPARATOR = 30
Field Value
Type | Description |
System.Byte | |
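To make the separator constraint concrete, the sketch below joins tokens into a single ngram token and rejects any token that already contains the separator byte. It is an illustration only: NgramTokenSketch and Join are hypothetical names, UTF-8 encoding of tokens is an assumption, and the real joining is done inside the analysis chain.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class NgramTokenSketch
{
    // Same value as DEFAULT_SEPARATOR (the ASCII record separator, 0x1E).
    public const byte Separator = 30;

    // Joins tokens into one ngram token, with the separator byte between them.
    public static byte[] Join(IReadOnlyList<string> tokens, byte separator = Separator)
    {
        if (separator > 0x7F)
        {
            throw new ArgumentException("separator must be a 7-bit ASCII byte", nameof(separator));
        }
        var bytes = new List<byte>();
        for (int i = 0; i < tokens.Count; i++)
        {
            byte[] tokenBytes = Encoding.UTF8.GetBytes(tokens[i]);
            // A token containing the separator would make the joined form ambiguous.
            if (Array.IndexOf(tokenBytes, separator) >= 0)
            {
                throw new ArgumentException("token contains the separator byte: " + tokens[i]);
            }
            if (i > 0)
            {
                bytes.Add(separator);
            }
            bytes.AddRange(tokenBytes);
        }
        return bytes.ToArray();
    }
}
```

Because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, a 7-bit separator can only collide with a literal occurrence of that same ASCII character in a token.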
VERSION_CURRENT
Current version of the saved model file format.
Declaration
public const int VERSION_CURRENT = 0
Field Value
Type | Description |
System.Int32 | |
VERSION_START
Initial version of the saved model file format.
Declaration
public const int VERSION_START = 0
Field Value
Type | Description |
System.Int32 | |
Properties
Count
Declaration
public override long Count { get; }
Property Value
Type | Description |
System.Int64 | |
Overrides
Methods
Build(IInputIterator)
Declaration
public override void Build(IInputIterator iterator)
Parameters
Overrides
Build(IInputIterator, Double)
Build the suggest index, using up to the specified
amount of temporary RAM while building. Note that
the weights for the suggestions are ignored.
Declaration
public virtual void Build(IInputIterator iterator, double ramBufferSizeMB)
Parameters
Type | Name | Description |
IInputIterator | iterator | |
System.Double | ramBufferSizeMB | |
DoLookup(String, Boolean, Int32)
Declaration
public override IList<Lookup.LookupResult> DoLookup(string key, bool onlyMorePopular, int num)
Parameters
Type | Name | Description |
System.String | key | |
System.Boolean | onlyMorePopular | |
System.Int32 | num | |
Returns
Overrides
DoLookup(String, IEnumerable<BytesRef>, Boolean, Int32)
Declaration
public override IList<Lookup.LookupResult> DoLookup(string key, IEnumerable<BytesRef> contexts, bool onlyMorePopular, int num)
Parameters
Type | Name | Description |
System.String | key | |
System.Collections.Generic.IEnumerable<BytesRef> | contexts | |
System.Boolean | onlyMorePopular | |
System.Int32 | num | |
Returns
Overrides
DoLookup(String, IEnumerable<BytesRef>, Int32)
Declaration
public virtual IList<Lookup.LookupResult> DoLookup(string key, IEnumerable<BytesRef> contexts, int num)
Parameters
Type | Name | Description |
System.String | key | |
System.Collections.Generic.IEnumerable<BytesRef> | contexts | |
System.Int32 | num | |
Returns
DoLookup(String, Int32)
Lookup, without any context.
Declaration
public virtual IList<Lookup.LookupResult> DoLookup(string key, int num)
Parameters
Type | Name | Description |
System.String | key | |
System.Int32 | num | |
Returns
Get(String)
Returns the weight associated with an input string,
or null if it does not exist.
Declaration
public virtual object Get(string key)
Parameters
Type | Name | Description |
System.String | key | |
Returns
Type | Description |
System.Object | |
GetSizeInBytes()
Returns byte size of the underlying FST.
Declaration
public override long GetSizeInBytes()
Returns
Type | Description |
System.Int64 | |
Overrides
Load(DataInput)
Declaration
public override bool Load(DataInput input)
Parameters
Returns
Type | Description |
System.Boolean | |
Overrides
Store(DataOutput)
Declaration
public override bool Store(DataOutput output)
Parameters
Returns
Type | Description |
System.Boolean | |
Overrides