Class Builder<T>
Builds a minimal FST (maps an Int32s
NOTE: The algorithm is described at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.3698
The parameterized type
FSTs larger than 2.1GB are now possible (as of Lucene 4.2). FSTs containing more than 2.1B nodes are also now possible; however, they cannot be packed.
Namespace: Lucene.Net.Util.Fst
Assembly: Lucene.Net.dll
Syntax
public class Builder<T> : Builder
Type Parameters
Name | Description |
---|---|
T |
Constructors
Builder(FST.INPUT_TYPE, Outputs<T>)
Instantiates an FST/FSA builder without any pruning. A shortcut to Builder(FST.INPUT_TYPE, Int32, Int32, Boolean, Boolean, Int32, Outputs<T>, Builder.FreezeTail<T>, Boolean, Single, Boolean, Int32) with pruning options turned off.
Declaration
public Builder(FST.INPUT_TYPE inputType, Outputs<T> outputs)
Parameters
Type | Name | Description |
---|---|---|
FST.INPUT_TYPE | inputType | |
Outputs<T> | outputs | |
Builder(FST.INPUT_TYPE, Int32, Int32, Boolean, Boolean, Int32, Outputs<T>, Builder.FreezeTail<T>, Boolean, Single, Boolean, Int32)
Instantiates an FST/FSA builder with all the possible tuning and construction tweaks. Read parameter documentation carefully.
Declaration
public Builder(FST.INPUT_TYPE inputType, int minSuffixCount1, int minSuffixCount2, bool doShareSuffix, bool doShareNonSingletonNodes, int shareMaxTailLength, Outputs<T> outputs, Builder.FreezeTail<T> freezeTail, bool doPackFST, float acceptableOverheadRatio, bool allowArrayArcs, int bytesPageBits)
Parameters
Type | Name | Description |
---|---|---|
FST.INPUT_TYPE | inputType | The input type (transition labels). Can be anything from FST. |
System.Int32 | minSuffixCount1 | If pruning the input graph during construction, this threshold is used for telling if a node is kept or pruned. If transition_count(node) >= minSuffixCount1, the node is kept. |
System.Int32 | minSuffixCount2 | (Note: only Mike McCandless knows what this one is really doing...) |
System.Boolean | doShareSuffix | If |
System.Boolean | doShareNonSingletonNodes | Only used if |
System.Int32 | shareMaxTailLength | Only used if |
Outputs<T> | outputs | The output type for each input sequence. Applies only if building an FST. For FSA, use Singleton and No |
Builder.FreezeTail<T> | freezeTail | |
System.Boolean | doPackFST | Pass |
System.Single | acceptableOverheadRatio | How to trade speed for space when building the FST. This option is only relevant when doPackFST is true. Get |
System.Boolean | allowArrayArcs | Pass false to disable the array arc optimization while building the FST; this will make the resulting FST smaller but slower to traverse. |
System.Int32 | bytesPageBits | How many bits wide to make each byte[] block in the Lucene. |
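As a rough illustration of the full constructor, the sketch below passes every argument by name, using the parameter names from the declaration above. The specific values are assumptions: they are intended to approximate the "pruning off" configuration of the two-argument constructor, but the defaults in your Lucene.Net version should be checked against its source. `FullConstructorSketch` and `CreateUnprunedBuilder` are hypothetical names, and `NoOutputs` is assumed to derive from `Outputs<object>`.

```csharp
using Lucene.Net.Util.Fst;

public static class FullConstructorSketch
{
    // Hypothetical helper: builds an FSA builder with no pruning.
    public static Builder<object> CreateUnprunedBuilder()
    {
        // Assumption: NoOutputs.Singleton is an Outputs<object> suitable for an FSA.
        NoOutputs outputs = NoOutputs.Singleton;

        return new Builder<object>(
            inputType: FST.INPUT_TYPE.BYTE1,   // byte-sized transition labels
            minSuffixCount1: 0,                // assumed: 0 disables pruning
            minSuffixCount2: 0,                // assumed: 0 disables pruning
            doShareSuffix: true,
            doShareNonSingletonNodes: true,
            shareMaxTailLength: int.MaxValue,
            outputs: outputs,
            freezeTail: null,                  // assumed: null selects the built-in behavior
            doPackFST: false,
            acceptableOverheadRatio: 0f,       // only relevant when doPackFST is true
            allowArrayArcs: true,
            bytesPageBits: 15);                // assumed default page size (2^15 bytes)
    }
}
```

Named arguments make it harder to transpose the many `int` and `bool` parameters, which is the main hazard of a twelve-argument constructor.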
Properties
MappedStateCount
Declaration
public virtual long MappedStateCount { get; }
Property Value
Type | Description |
---|---|
System.Int64 |
TermCount
Declaration
public virtual long TermCount { get; }
Property Value
Type | Description |
---|---|
System.Int64 |
TotStateCount
Declaration
public virtual long TotStateCount { get; }
Property Value
Type | Description |
---|---|
System.Int64 |
Methods
Add(Int32sRef, T)
It's OK to add the same input twice in a row with different outputs, as long as outputs implements the merge method. Note that input is fully consumed after this method returns (so the caller is free to reuse it), but output is not. So if your outputs are changeable (e.g.
Byte
Declaration
public virtual void Add(Int32sRef input, T output)
Parameters
Type | Name | Description |
---|---|---|
Int32sRef | input | |
T | output | |
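Because `input` is fully consumed by `Add`, a single scratch `Int32sRef` can be reused across calls. The sketch below shows that pattern for an FSA. It assumes that `NoOutputs` derives from `Outputs<object>`, that `Util.ToInt32sRef(BytesRef, Int32sRef)` is available in `Lucene.Net.Util.Fst`, and that the terms are distinct ASCII strings; `AddSketch` and `BuildFromSortedTerms` are hypothetical names. The builder expects inputs in sorted order, so the terms are sorted first.

```csharp
using System;
using Lucene.Net.Util;
using Lucene.Net.Util.Fst;
// Alias so the helper class is not confused with the Lucene.Net.Util namespace.
using FstUtil = Lucene.Net.Util.Fst.Util;

public static class AddSketch
{
    // Hypothetical helper: feeds distinct ASCII terms to a builder in sorted order.
    public static Builder<object> BuildFromSortedTerms(string[] terms)
    {
        // Assumption: NoOutputs turns the FST into an FSA with one shared "no output" value.
        NoOutputs outputs = NoOutputs.Singleton;
        object noOutput = outputs.NoOutput;

        var builder = new Builder<object>(FST.INPUT_TYPE.BYTE1, outputs);

        // Ordinal order matches UTF-8 byte order for ASCII terms.
        string[] sorted = (string[])terms.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        // One scratch buffer is reused: Add consumes the input before returning.
        var scratch = new Int32sRef();
        foreach (string term in sorted)
        {
            builder.Add(FstUtil.ToInt32sRef(new BytesRef(term), scratch), noOutput);
        }
        return builder;
    }
}
```

The output argument is a different matter: as the description above notes, it is not consumed, so a mutable output object should not be reused in the same way.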
Finish()
Returns the final FST. NOTE: this will return null if nothing is accepted by the FST.
Declaration
public virtual FST<T> Finish()
Returns
Type | Description |
---|---|
FST<T> |
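Since `Finish()` can return null, callers may want to guard the result before using it. A minimal sketch of one way to do that follows; `FinishSketch` and `TryFinish` are hypothetical names and not part of Lucene.Net.

```csharp
using Lucene.Net.Util.Fst;

public static class FinishSketch
{
    // Hypothetical helper: wraps Finish() in the familiar Try pattern.
    // Returns false (and a null fst) when the builder accepted no input.
    public static bool TryFinish<T>(Builder<T> builder, out FST<T> fst)
    {
        fst = builder.Finish();
        return fst != null;
    }
}
```

A caller could then write `if (FinishSketch.TryFinish(builder, out var fst)) { /* use fst */ }` and handle the empty case in the `else` branch.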
GetFstSizeInBytes()
Declaration
public virtual long GetFstSizeInBytes()
Returns
Type | Description |
---|---|
System.Int64 |