
    Namespace Lucene.Net.Benchmarks.Quality

    Search Quality Benchmarking.

    This package allows you to benchmark the search quality of a Lucene application.

    To use this package, you should provide:

    • An IndexSearcher.

    • Quality queries.

    • A judging object.

    • A reporting object.

    For benchmarking TREC collections with TREC QRels, take a look at the trec package.

    Here is sample code that runs the TREC 2006 queries 701-850 on the .Gov2 collection:

        // open the index to search (the path to the .Gov2 index is assumed)
        Directory directory = FSDirectory.open(new File("gov2-index"));
        File topicsFile = new File("topics-701-850.txt");
        File qrelsFile = new File("qrels-701-850.txt");
        IndexReader ir = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(ir);
    
        int maxResults = 1000;
        String docNameField = "docname";

        PrintWriter logger = new PrintWriter(System.out, true);
    
        // use TREC utilities to read TREC topics into quality queries
        TrecTopicsReader qReader = new TrecTopicsReader();
        QualityQuery[] qqs = qReader.readQueries(new BufferedReader(new FileReader(topicsFile)));
    
        // prepare a judge, with TREC utilities that read from a QRels file
        Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));
    
        // validate topics & judgments match each other
        judge.validateData(qqs, logger);
    
        // define how quality queries are parsed into Lucene queries:
        // here the "title" topic field is searched against the index's "body" field
        QualityQueryParser qqParser = new SimpleQQParser("title", "body");
    
        // run the benchmark
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
        qrun.setMaxResults(maxResults);
        SubmissionReport submitLog = null; // null: judged run, no submission file
        QualityStats[] stats = qrun.execute(judge, submitLog, logger);
    
        // compute and print an average of the results over all queries
        QualityStats avg = QualityStats.average(stats);
        avg.log("SUMMARY", 2, logger, "  ");
    

    Some immediate ways to adapt this program to your needs:

    • To run on different formats of queries and judgments, provide your own judge and quality queries (see the sketch after this list).

    • To create more sophisticated Lucene queries, supply your own quality query parser (an example appears under IQualityQueryParser below).
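
    For instance, a custom judge only needs to implement the three methods of the Judge interface (IJudge, with PascalCase members, in the Lucene.NET port). Below is a minimal Java sketch matching the sample above; it assumes a hypothetical relevance file of tab-separated queryID and docName pairs. The standard QRels format is already handled by TrecJudge.

        import java.io.BufferedReader;
        import java.io.IOException;
        import java.io.PrintWriter;
        import java.util.HashMap;
        import java.util.HashSet;
        import java.util.Map;
        import java.util.Set;
        import org.apache.lucene.benchmark.quality.Judge;
        import org.apache.lucene.benchmark.quality.QualityQuery;

        // Sketch: a judge backed by "queryID<TAB>docName" lines (a made-up format).
        public class TabSeparatedJudge implements Judge {

          private final Map<String, Set<String>> relevant = new HashMap<>();

          public TabSeparatedJudge(BufferedReader reader) throws IOException {
            String line;
            while ((line = reader.readLine()) != null) {
              String[] parts = line.split("\t");
              Set<String> docs = relevant.get(parts[0]);
              if (docs == null) {
                docs = new HashSet<>();
                relevant.put(parts[0], docs);
              }
              docs.add(parts[1]);
            }
          }

          @Override
          public boolean isRelevant(String docName, QualityQuery query) {
            Set<String> docs = relevant.get(query.getQueryID());
            return docs != null && docs.contains(docName);
          }

          @Override
          public boolean validateData(QualityQuery[] qq, PrintWriter logger) {
            // report quality queries that have no judgments at all
            boolean ok = true;
            for (QualityQuery q : qq) {
              if (!relevant.containsKey(q.getQueryID())) {
                ok = false;
                if (logger != null) {
                  logger.println("no judgments for query " + q.getQueryID());
                }
              }
            }
            return ok;
          }

          @Override
          public int maxRecall(QualityQuery query) {
            Set<String> docs = relevant.get(query.getQueryID());
            return docs == null ? 0 : docs.size();
          }
        }

    An instance of such a judge can be passed to QualityBenchmark.execute in place of the TrecJudge used above.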

    Classes

    QualityBenchmark

    Main entry point for running a quality benchmark.

    There are two main configurations for running a quality benchmark:

    • Against existing judgments.

    • For submission (e.g. for a contest).

    The first configuration requires a non-null IJudge. The second configuration requires a non-null SubmissionReport.
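
    For example, a submission-style run might look like the following sketch, which continues the sample above; the run name "myRun" and the output path submission.txt are assumptions:

        // Sketch: write a TREC-style submission file instead of scoring against judgments.
        PrintWriter submitWriter = new PrintWriter(new FileWriter("submission.txt"));
        SubmissionReport submitLog = new SubmissionReport(submitWriter, "myRun");
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
        qrun.execute(null, submitLog, logger); // null judge: no judgments needed
        submitWriter.close();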

    QualityQuery

    A QualityQuery has an ID and some name-value pairs.

    The ID allows mapping the quality query to its judgments.

    The name-value pairs are used by an IQualityQueryParser to create a Lucene Query.

    Most likely the name-value pairs would be mapped into fields of a Lucene query, but it is up to the IQualityQueryParser how to map them - e.g. all values into a single field, or each pair into its own field - and this, of course, must match the way the searched index was constructed.
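
    For instance, a quality query can be built by hand (a sketch continuing the sample above; the "title" and "description" names mirror TREC topic fields, and the sample text is made up):

        // Sketch: a hand-built quality query with TREC-style name-value pairs.
        Map<String, String> fields = new HashMap<>();
        fields.put("title", "gold silver mining");
        fields.put("description", "Find documents about mining for gold or silver.");
        QualityQuery qq = new QualityQuery("701", fields);

        // SimpleQQParser("title", "body") would turn only the "title" value
        // into a Lucene query against the index's "body" field.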

    QualityStats

    Results of a quality benchmark run for a single query or for a set of queries.

    QualityStats.RecallPoint

    A rank at which a relevant doc was found.
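
    For example, continuing the sample above, per-query measures can be read from the returned QualityStats (a sketch using a few of its accessors):

        // Sketch: inspect the measures of the first query's run.
        QualityStats s = stats[0];
        logger.println("average precision: " + s.getAvp());
        logger.println("precision at 10:   " + s.getPrecisionAt(10));
        logger.println("MRR:               " + s.getMRR());
        for (QualityStats.RecallPoint rp : s.getRecallPoints()) {
          logger.println("relevant doc found at rank " + rp.getRank());
        }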

    Interfaces

    IJudge

    Judge if a document is relevant for a quality query.

    IQualityQueryParser

    Parse a QualityQuery into a Lucene query.
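
    A custom parser is a single method. Below is a Java-flavored sketch matching the sample above (in the Lucene.NET port the interface is IQualityQueryParser); it searches both the "title" and "description" topic values against an assumed "body" index field:

        import org.apache.lucene.analysis.standard.StandardAnalyzer;
        import org.apache.lucene.benchmark.quality.QualityQuery;
        import org.apache.lucene.benchmark.quality.QualityQueryParser;
        import org.apache.lucene.queryparser.classic.ParseException;
        import org.apache.lucene.queryparser.classic.QueryParser;
        import org.apache.lucene.search.BooleanClause;
        import org.apache.lucene.search.BooleanQuery;
        import org.apache.lucene.search.Query;
        import org.apache.lucene.util.Version;

        // Sketch: parse two topic fields into one boolean query.
        public class TitleDescQQParser implements QualityQueryParser {
          @Override
          public Query parse(QualityQuery qq) throws ParseException {
            QueryParser parser = new QueryParser(Version.LUCENE_48, "body",
                new StandardAnalyzer(Version.LUCENE_48));
            BooleanQuery bq = new BooleanQuery();
            bq.add(parser.parse(QueryParser.escape(qq.getValue("title"))),
                   BooleanClause.Occur.SHOULD);
            bq.add(parser.parse(QueryParser.escape(qq.getValue("description"))),
                   BooleanClause.Occur.SHOULD);
            return bq;
          }
        }

    Passing an instance of such a parser to QualityBenchmark in place of SimpleQQParser changes how topics become Lucene queries without touching the rest of the benchmark.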
