Namespace Lucene.Net.Benchmarks.Quality
Search Quality Benchmarking.
This package allows you to benchmark the search quality of a Lucene application.
In order to use this package you should provide:
- An IndexSearcher.
- Quality queries.
- A judging object.
- A reporting object.
For benchmarking TREC collections with TREC QRels, take a look at the trec package.
Here is sample code that runs the TREC 2006 queries 701-850 on the .Gov2 collection:
File topicsFile = new File("topics-701-850.txt");
File qrelsFile = new File("qrels-701-850.txt");
IndexReader ir = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(ir);
int maxResults = 1000;
String docNameField = "docname";
PrintWriter logger = new PrintWriter(System.out,true);
// use trec utilities to read trec topics into quality queries
TrecTopicsReader qReader = new TrecTopicsReader();
QualityQuery qqs[] = qReader.readQueries(new BufferedReader(new FileReader(topicsFile)));
// prepare judge, with trec utilities that read from a QRels file
Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));
// validate topics & judgments match each other
judge.validateData(qqs, logger);
// set the parsing of quality queries into Lucene queries.
QualityQueryParser qqParser = new SimpleQQParser("title", "body");
// run the benchmark
QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
SubmissionReport submitLog = null;
QualityStats stats[] = qrun.execute(maxResults, judge, submitLog, logger);
// print an average summary of the results
QualityStats avg = QualityStats.average(stats);
avg.log("SUMMARY",2,logger, " ");
Some immediate ways to modify this program to your needs are:
To run on different formats of queries and judgements, provide your own Judge and Quality queries.
Create sophisticated Lucene queries by supplying a different Quality query parser.
Classes
QualityBenchmark
Main entry point for running a quality benchmark.
There are two main configurations for running a quality benchmark:
- Against existing judgements.
- For submission (e.g. for a contest).
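For the submission configuration, results are usually written one line per retrieved document. The sketch below formats such a line in the conventional TREC run layout (query ID, the literal Q0, document name, rank, score, run tag). The class name RunLineSketch, the helper runLine, and the sample values are hypothetical, and whether the library's own report writer uses exactly this layout and number formatting is an assumption.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the library.
class RunLineSketch {

    // Build one run line: "queryID Q0 docName rank score runTag".
    // Locale.ROOT keeps the decimal separator a '.' on every machine.
    static String runLine(String queryId, String docName, int rank,
                          double score, String runTag) {
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                queryId, docName, rank, score, runTag);
    }

    public static void main(String[] args) {
        // Sample values are made up for the example.
        System.out.println(runLine("701", "DOC-A", 1, 12.5, "myrun"));
    }
}
```

When benchmarking against existing judgements no such file is needed, which is why the sample program above can pass null as the submission report.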
QualityQuery
A QualityQuery has an ID and some name-value pairs.
The ID allows the quality query to be mapped to its judgements. The name-value pairs are used by an IQualityQueryParser to create a Lucene.Net.Search.Query. It is very likely that the name-value pairs will be mapped into fields in a Lucene query, but it is up to the IQualityQueryParser how to map them (e.g. all values in a single field, or each pair as its own field), and this of course must match the way the searched index was constructed.
QualityStats
Results of a quality benchmark run for a single query or for a set of queries.
QualityStats.RecallPoint
A rank at which a relevant document was found.
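To make the idea of a recall point concrete, here is a minimal sketch that records the rank of every relevant hit in a ranked result list and derives average precision from those points. RecallPointSketch and its members are hypothetical stand-ins written for this example; they are not the library's QualityStats, and the library may compute its statistics differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the library's QualityStats.
class RecallPointSketch {

    // One recall point: the 1-based rank at which a relevant document
    // appeared, and how many relevant documents had been seen by then.
    static final class Point {
        final int rank;
        final int recall;

        Point(int rank, int recall) {
            this.rank = rank;
            this.recall = recall;
        }
    }

    // Walk the ranked results and record a point at every relevant hit.
    static List<Point> recallPoints(boolean[] relevantAtRank) {
        List<Point> points = new ArrayList<>();
        int found = 0;
        for (int i = 0; i < relevantAtRank.length; i++) {
            if (relevantAtRank[i]) {
                found++;
                points.add(new Point(i + 1, found));
            }
        }
        return points;
    }

    // Average precision: precision at each recall point, summed and
    // divided by the total number of relevant documents for the query.
    static double averagePrecision(List<Point> points, int totalRelevant) {
        if (totalRelevant == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (Point p : points) {
            sum += (double) p.recall / p.rank;
        }
        return sum / totalRelevant;
    }

    public static void main(String[] args) {
        // Relevant documents at ranks 1, 3 and 4; 4 relevant documents exist.
        boolean[] hits = {true, false, true, true, false};
        List<Point> points = recallPoints(hits);
        System.out.println(averagePrecision(points, 4));
    }
}
```

Dividing by the total number of relevant documents, rather than by the number found, penalizes a run that misses relevant documents entirely.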
Interfaces
IJudge
Judge if a document is relevant for a quality query.
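A judge can be as simple as a lookup table from query ID to the names of the documents judged relevant. The sketch below builds such a table from lines assumed to follow the conventional qrels layout of four whitespace-separated columns (query ID, iteration, document name, relevance). QrelsJudgeSketch and its method names are hypothetical and do not implement the library's interface; they only illustrate the kind of logic a custom judge might contain.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not the library's judge classes.
class QrelsJudgeSketch {

    // query ID -> names of documents judged relevant for that query
    private final Map<String, Set<String>> relevant = new HashMap<>();

    // Each line is assumed to be "queryID iteration docName relevance".
    QrelsJudgeSketch(List<String> qrelsLines) {
        for (String line : qrelsLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 4) {
                continue; // skip blank or malformed lines
            }
            if (Integer.parseInt(parts[3]) > 0) {
                relevant.computeIfAbsent(parts[0], k -> new HashSet<>())
                        .add(parts[2]);
            }
        }
    }

    // True if the document was judged relevant for the given query.
    boolean isRelevant(String docName, String queryId) {
        Set<String> docs = relevant.get(queryId);
        return docs != null && docs.contains(docName);
    }

    // Number of relevant documents known for the query.
    int maxRecall(String queryId) {
        Set<String> docs = relevant.get(queryId);
        return docs == null ? 0 : docs.size();
    }

    public static void main(String[] args) {
        // Sample judgements are made up for the example.
        QrelsJudgeSketch judge = new QrelsJudgeSketch(Arrays.asList(
                "701 0 DOC-A 1",
                "701 0 DOC-B 0",
                "702 0 DOC-C 2"));
        System.out.println(judge.isRelevant("DOC-A", "701"));
        System.out.println(judge.maxRecall("701"));
    }
}
```

Treating any relevance value above zero as relevant is a simplification; graded judgements would need a different rule.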
IQualityQueryParser
Parse a QualityQuery into a Lucene query.
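The two mapping strategies mentioned for quality queries (all values in a single field, or each pair as its own field) can be sketched as plain string construction. QueryTextSketch and its methods are hypothetical and produce query text in the field:(terms) form only; a real parser implementation would hand such text, properly escaped, to a Lucene query parser, and the field names used must exist in the searched index.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for illustration; not the library's parser classes.
class QueryTextSketch {

    // Strategy 1: concatenate the values of the chosen names and search
    // them all in one index field.
    static String singleField(Map<String, String> pairs, String[] names,
                              String indexField) {
        StringJoiner text = new StringJoiner(" ");
        for (String name : names) {
            String value = pairs.get(name);
            if (value != null && !value.isEmpty()) {
                text.add(value);
            }
        }
        return indexField + ":(" + text + ")";
    }

    // Strategy 2: search each value in an index field named after its pair.
    static String fieldPerPair(Map<String, String> pairs) {
        StringJoiner clauses = new StringJoiner(" ");
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            clauses.add(e.getKey() + ":(" + e.getValue() + ")");
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // Sample name-value pairs are made up for the example.
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("title", "solar panels");
        pairs.put("description", "cost of home solar");
        System.out.println(singleField(pairs, new String[] {"title"}, "body"));
        System.out.println(fieldPerPair(pairs));
    }
}
```

The first strategy corresponds to the sample program above, where the title of each topic is searched in the body field.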