Namespace Lucene.Net.Misc
Misc Tools
The misc package provides various tools for splitting and merging indices, changing norms, finding high-frequency terms, and more.
Classes
GetTermInfo
Utility to get document frequency and total number of occurrences (sum of the tf for each doc) of a term.
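To make the two statistics concrete, the following is a minimal, library-independent sketch in Python. It does not use the Lucene.NET API; the toy corpus and the helper name `term_info` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def term_info(docs: List[List[str]], term: str) -> Tuple[int, int]:
    """Return (doc_freq, total_term_freq) for `term` over tokenized documents.

    Hypothetical helper: it mirrors the definitions in the text above,
    not the actual GetTermInfo implementation.
    """
    doc_freq = 0
    total_term_freq = 0
    for tokens in docs:
        tf = tokens.count(term)      # occurrences of the term in this document
        if tf > 0:
            doc_freq += 1            # document contains the term at least once
            total_term_freq += tf    # sum of tf over all documents
    return doc_freq, total_term_freq


docs = [
    ["lucene", "index", "lucene"],
    ["search", "index"],
    ["lucene"],
]
print(term_info(docs, "lucene"))  # (2, 3)
```

Here "lucene" appears in two documents (document frequency 2) and three times overall (total occurrences 3).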
LUCENENET specific: In the Java implementation, this class' Main method was intended to be called from the command line. However, in .NET a method within a DLL can't be directly called from the command line, so we provide a .NET tool, lucene-cli, with a command that maps to that method: index list-term-info
HighFreqTerms
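HighFreqTerms extracts the top n most frequent terms (by document frequency) from an existing Lucene index and reports their document frequency.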
HighFreqTerms.DocFreqComparer
Compares terms by DocFreq
HighFreqTerms.TotalTermFreqComparer
Compares terms by TotalTermFreq
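The two orderings can give different rankings for the same terms. The sketch below illustrates this in plain Python; it is not the Lucene.NET API, and `ToyTermStats`, `collect_stats`, and `top_terms` are hypothetical names introduced only for this example.

```python
import heapq
from collections import Counter
from typing import List, NamedTuple


class ToyTermStats(NamedTuple):
    """Hypothetical stand-in for a term plus its two statistics."""
    term: str
    doc_freq: int
    total_term_freq: int


def collect_stats(docs: List[List[str]]) -> List[ToyTermStats]:
    doc_freq: Counter = Counter()
    total: Counter = Counter()
    for tokens in docs:
        for term, tf in Counter(tokens).items():
            doc_freq[term] += 1   # one more document contains the term
            total[term] += tf     # accumulate occurrences across documents
    return [ToyTermStats(t, doc_freq[t], total[t]) for t in doc_freq]


def top_terms(stats: List[ToyTermStats], n: int, by_total: bool = False) -> List[ToyTermStats]:
    # Choose the ordering key: document frequency or total term frequency.
    if by_total:
        key = lambda s: s.total_term_freq
    else:
        key = lambda s: s.doc_freq
    return heapq.nlargest(n, stats, key=key)


docs = [["a", "a", "a", "a", "b"], ["b", "c"], ["b", "c"]]
stats = collect_stats(docs)
print([s.term for s in top_terms(stats, 2)])                 # ['b', 'c']
print([s.term for s in top_terms(stats, 2, by_total=True)])  # ['a', 'b']
```

Term "a" occurs often but in only one document, so it leads the ranking by total occurrences and drops out of the top two by document frequency.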
IndexMergeTool
Merges indices specified on the command line into the index specified as the first command line argument.
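As a purely conceptual illustration of merging several indices into a target, the following Python sketch uses a toy in-memory inverted index. Real Lucene merging operates on on-disk segments and is far more involved; `ToyIndex` and `merge_into` are hypothetical and are not part of any Lucene API.

```python
from typing import Dict, Iterable, List


class ToyIndex:
    """Hypothetical toy inverted index: term -> ascending list of doc IDs."""

    def __init__(self) -> None:
        self.doc_count = 0
        self.postings: Dict[str, List[int]] = {}

    def add_document(self, tokens: Iterable[str]) -> int:
        doc_id = self.doc_count
        self.doc_count += 1
        for term in set(tokens):
            self.postings.setdefault(term, []).append(doc_id)
        return doc_id


def merge_into(target: ToyIndex, *sources: ToyIndex) -> None:
    """Append every source index to `target`, renumbering doc IDs."""
    for src in sources:
        offset = target.doc_count  # source doc 0 becomes target doc `offset`
        for term, doc_ids in src.postings.items():
            target.postings.setdefault(term, []).extend(d + offset for d in doc_ids)
        target.doc_count += src.doc_count


target = ToyIndex()
target.add_document(["a", "b"])

first = ToyIndex()
first.add_document(["a"])
first.add_document(["c"])

second = ToyIndex()
second.add_document(["b"])

merge_into(target, first, second)
print(target.doc_count)      # 4
print(target.postings["b"])  # [0, 3]
```

The target plays the role of the first command line argument, and the remaining indices are folded into it in order.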
LUCENENET specific: In the Java implementation, this class' Main method was intended to be called from the command line. However, in .NET a method within a DLL can't be directly called from the command line, so we provide a .NET tool, lucene-cli, with a command that maps to that method: index merge
SweetSpotSimilarity
A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.
For lengthNorm, a min/max can be specified to define the plateau of lengths that should all have a norm of 1.0. Below the min and above the max, the lengthNorm drops off following a sqrt function.
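The plateau behaviour can be sketched in a few lines of Python. The exact formula and the `steepness` parameter (with its 0.5 default) are assumptions modeled on the Java Lucene implementation, and the function name is hypothetical; verify against the library source before relying on the numbers.

```python
import math


def sweet_spot_length_norm(num_terms: int, min_len: int, max_len: int,
                           steepness: float = 0.5) -> float:
    """Length norm with a plateau of 1.0 for min_len <= num_terms <= max_len.

    Assumed formula: 1 / sqrt(steepness * overshoot + 1).
    """
    # Zero on the plateau, otherwise twice the distance to the nearest edge.
    overshoot = (abs(num_terms - min_len) + abs(num_terms - max_len)
                 - (max_len - min_len))
    return 1.0 / math.sqrt(steepness * overshoot + 1.0)


for length in (0, 3, 5, 10, 13):
    print(length, sweet_spot_length_norm(length, min_len=3, max_len=10))
```

With a plateau of 3 to 10, every length in that range scores 1.0, while lengths 0 and 13 (three terms outside the plateau on either side) both score 0.5 under these assumed settings.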
For tf, baselineTf and hyperbolicTf functions are provided, which subclasses can choose between.
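For orientation, here is a hedged Python sketch of what the two tf shapes might look like. The formulas, parameter names, and default values are assumptions modeled on the Java Lucene implementation and are not confirmed by this page; treat them as illustrative only.

```python
import math


def baseline_tf(freq: float, base: float = 0.0, min_freq: float = 0.0) -> float:
    """Assumed shape: constant `base` up to `min_freq`, then sqrt growth."""
    if freq == 0:
        return 0.0
    if freq <= min_freq:
        return base
    return math.sqrt(freq + base * base - min_freq)


def hyperbolic_tf(freq: float, lo: float = 0.0, hi: float = 2.0,
                  base: float = 1.3, x_offset: float = 10.0) -> float:
    """Assumed shape: S-curve between `lo` and `hi`, centered on `x_offset`.

    (b^x - b^-x) / (b^x + b^-x) equals tanh(x * ln b), which is used here
    because it stays numerically stable for large frequencies.
    """
    if freq == 0:
        return 0.0
    x = freq - x_offset
    return lo + (hi - lo) / 2.0 * (math.tanh(x * math.log(base)) + 1.0)


print(baseline_tf(4))     # 2.0
print(hyperbolic_tf(10))  # 1.0
```

Under these assumptions, the baseline variant keeps growing with frequency, whereas the hyperbolic variant saturates at an upper bound, so extremely frequent terms stop gaining score.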
TermStats
Holder for a term along with its statistics (DocFreq and TotalTermFreq).