Lucene.Net  3.0.3
Lucene.Net is a .NET port of the Java Lucene Indexing Library
Lucene.Net.Analysis.WordlistLoader Class Reference

Loader for text files that represent a list of stopwords.

Static Public Member Functions

static ISet< string > GetWordSet (System.IO.FileInfo wordfile)
 Loads a text file and adds every line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the file should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).
 
static ISet< string > GetWordSet (System.IO.FileInfo wordfile, System.String comment)
 Loads a text file and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the file should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).
 
static ISet< string > GetWordSet (System.IO.TextReader reader)
 Reads lines from a Reader and adds every line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the Reader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).
 
static ISet< string > GetWordSet (System.IO.TextReader reader, System.String comment)
 Reads lines from a Reader and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the Reader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).
 
static Dictionary< string, string > GetStemDict (System.IO.FileInfo wordstemfile)
 Reads a stem dictionary. Each line contains: word<TAB>stem (i.e. two tab-separated words)
 

Detailed Description

Loader for text files that represent a list of stopwords.

Definition at line 24 of file WordlistLoader.cs.

Member Function Documentation

static Dictionary<string, string> Lucene.Net.Analysis.WordlistLoader.GetStemDict (System.IO.FileInfo wordstemfile)

Reads a stem dictionary. Each line contains: word<TAB>stem (i.e. two tab-separated words)

Returns
stem dictionary that overrules the stemming algorithm

Exceptions
IOException

Definition at line 117 of file WordlistLoader.cs.
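
As a minimal sketch of the expected file layout, the following writes a two-line, tab-separated file to a temporary path and loads it. The class name StemDictExample, the method LoadSample, and the sample words are hypothetical. The sketch also assumes that the first column becomes the dictionary key and the second its value, which the description implies but does not state.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;

public static class StemDictExample
{
    // Hypothetical helper: writes a sample stem file, loads it, and
    // removes the temporary file again.
    public static Dictionary<string, string> LoadSample()
    {
        string path = Path.GetTempFileName();
        try
        {
            // One "word<TAB>stem" pair per line, no blank lines.
            File.WriteAllLines(path, new[] { "running\trun", "mice\tmouse" });
            return WordlistLoader.GetStemDict(new FileInfo(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Under the assumption above, looking up "running" in the returned dictionary should yield "run".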

static ISet<string> Lucene.Net.Analysis.WordlistLoader.GetWordSet (System.IO.FileInfo wordfile)

Loads a text file and adds every line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the file should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).

Parameters
wordfile  File containing the wordlist
Returns
A HashSet with the file's words

Definition at line 34 of file WordlistLoader.cs.
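
The lowercase requirement suggests that the loader does not change the case of the entries itself. When the casing of a word list is not known in advance, one option is to normalize the entries after loading. The sketch below is one possible approach, not part of the library: the class StopwordFileExample and the method LoadLowercased are hypothetical, and lowercasing with the invariant culture is an assumption that may not suit every language.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Lucene.Net.Analysis;

public static class StopwordFileExample
{
    // Hypothetical helper: loads a word list and lowercases each entry so
    // that it can match tokens from an analyzer that uses LowerCaseFilter.
    public static HashSet<string> LoadLowercased(FileInfo wordfile)
    {
        // GetWordSet already strips leading and trailing whitespace.
        var words = WordlistLoader.GetWordSet(wordfile);
        return new HashSet<string>(words.Select(w => w.ToLowerInvariant()));
    }
}
```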

static ISet<string> Lucene.Net.Analysis.WordlistLoader.GetWordSet (System.IO.FileInfo wordfile, System.String comment)

Loads a text file and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the file should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).

Parameters
wordfile  File containing the wordlist
comment   The comment string to ignore
Returns
A HashSet with the file's words

Definition at line 50 of file WordlistLoader.cs.

static ISet<string> Lucene.Net.Analysis.WordlistLoader.GetWordSet (System.IO.TextReader reader)

Reads lines from a Reader and adds every line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the Reader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).

Parameters
reader  Reader containing the wordlist
Returns
A HashSet with the reader's words

Definition at line 66 of file WordlistLoader.cs.

static ISet<string> Lucene.Net.Analysis.WordlistLoader.GetWordSet (System.IO.TextReader reader, System.String comment)

Reads lines from a Reader and adds every non-comment line as an entry to a HashSet (omitting leading and trailing whitespace). Every line of the Reader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer).

Parameters
reader   Reader containing the wordlist
comment  The string representing a comment.
Returns
A HashSet with the reader's words

Definition at line 91 of file WordlistLoader.cs.
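
To illustrate the reader-based overloads without touching the file system, the sketch below feeds an in-memory list through a StringReader. The class StopwordReaderExample, the method LoadSample, the sample words, and the choice of "#" as the comment string are all illustrative. It assumes that a line counts as a comment when it begins with the comment string.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;

public static class StopwordReaderExample
{
    // Hypothetical helper: loads a small in-memory stopword list.
    public static ICollection<string> LoadSample()
    {
        // One word per line; the first line is a comment, and the
        // second word carries whitespace that the loader should strip.
        string text = "# English stopwords\n" +
                      "the\n" +
                      "  and  \n" +
                      "of\n";
        using (var reader = new StringReader(text))
        {
            return WordlistLoader.GetWordSet(reader, "#");
        }
    }
}
```

The returned collection should then hold the three words "the", "and", and "of", with the comment line left out.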


The documentation for this class was generated from the following file: