
    Namespace Lucene.Net.Benchmarks.Quality

    Search Quality Benchmarking.

This package allows you to benchmark the search quality of a Lucene application.

    In order to use this package you should provide:

    • An IndexSearcher.

    • Quality queries.

    • A judging object.

    • A reporting object.

    For benchmarking TREC collections with TREC QRels, take a look at the trec package.

    Here is sample code that runs the TREC 2006 queries 701-850 on the .Gov2 collection:

        File topicsFile = new File("topics-701-850.txt");
        File qrelsFile = new File("qrels-701-850.txt");
        IndexReader ir = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(ir);

        int maxResults = 1000;
        String docNameField = "docname";

        PrintWriter logger = new PrintWriter(System.out, true);

        // use trec utilities to read trec topics into quality queries
        TrecTopicsReader qReader = new TrecTopicsReader();
        QualityQuery[] qqs = qReader.readQueries(new BufferedReader(new FileReader(topicsFile)));

        // prepare judge, with trec utilities that read from a QRels file
        Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));

        // validate topics & judgments match each other
        judge.validateData(qqs, logger);

        // set the parsing of quality queries into Lucene queries
        QualityQueryParser qqParser = new SimpleQQParser("title", "body");

        // run the benchmark
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
        SubmissionReport submitLog = null;
        QualityStats[] stats = qrun.execute(maxResults, judge, submitLog, logger);

        // print an average of the results
        QualityStats avg = QualityStats.average(stats);
        avg.log("SUMMARY", 2, logger, "  ");
    

    Some immediate ways to adapt this program to your needs are:

    • To run on different formats of queries and judgments, provide your own Judge and quality queries.

    • Create sophisticated Lucene queries by supplying a different Quality query parser.
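At its core, a judge is a lookup from a query's ID to the set of document names judged relevant for it. As a rough sketch of what a custom judge has to keep track of (plain Java; the class and method names are hypothetical and this is not the actual Lucene.Net IJudge interface), an in-memory judge might look like:

```java
import java.util.*;

// Hypothetical in-memory judge: maps a query ID to its set of relevant
// doc names. An illustrative sketch, not the Lucene.Net TrecJudge.
public class InMemoryJudge {
    private final Map<String, Set<String>> relevant = new HashMap<>();

    // Record that docName is relevant for the query with the given ID.
    public void addJudgment(String queryId, String docName) {
        relevant.computeIfAbsent(queryId, k -> new HashSet<>()).add(docName);
    }

    // Mirrors the spirit of a judge's relevance check.
    public boolean isRelevant(String docName, String queryId) {
        return relevant.getOrDefault(queryId, Collections.emptySet()).contains(docName);
    }

    // Number of known relevant docs for this query (the best recall achievable).
    public int maxRecall(String queryId) {
        return relevant.getOrDefault(queryId, Collections.emptySet()).size();
    }

    public static void main(String[] args) {
        InMemoryJudge judge = new InMemoryJudge();
        judge.addJudgment("701", "GX001-02-1234");
        judge.addJudgment("701", "GX002-11-5678");
        System.out.println(judge.isRelevant("GX001-02-1234", "701")); // true
        System.out.println(judge.maxRecall("701")); // 2
    }
}
```

In the sample above, TrecJudge plays this role, populating its judgments by parsing a TREC QRels file instead of taking them one by one.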

    Classes

    QualityBenchmark

    Main entry point for running a quality benchmark.

    There are two main configurations for running a quality benchmark:
    • Against existing judgments.
    • For submission (e.g. for a contest).
    The first configuration requires a non-null IJudge; the second requires a non-null SubmissionReport.

    QualityQuery

    A QualityQuery has an ID and some name-value pairs.

    The ID maps the quality query to its judgments.

    The name-value pairs are used by an IQualityQueryParser to create a Lucene.Net.Search.Query.

    It is very likely that the name-value pairs would be mapped into fields of a Lucene query, but it is up to the QualityQueryParser how to map them - e.g. all values into a single field, or each pair into its own field - and this must of course match the way the searched index was constructed.
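As a concrete illustration of that mapping choice, a parser that flattens every value into one field could be sketched as follows (plain Java; the class name and the "body" field are hypothetical, and for simplicity this builds a query string rather than a real Lucene.Net.Search.Query):

```java
import java.util.*;

// Illustrative sketch: flatten a quality query's name-value pairs into a
// single query string over one field. Hypothetical helper, not part of
// the Lucene.Net API.
public class SinglePairMapper {
    public static String toQueryString(Map<String, String> nameValuePairs, String field) {
        StringBuilder sb = new StringBuilder();
        for (String value : nameValuePairs.values()) {
            if (sb.length() > 0) sb.append(' ');
            // quote each value as a phrase against the chosen field
            sb.append(field).append(":\"").append(value).append('"');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("title", "international organized crime");
        pairs.put("description", "identify organizations");
        System.out.println(toQueryString(pairs, "body"));
    }
}
```

An alternative mapping would direct each pair to a field of the same name (title:"…", description:"…"); whichever mapping the parser chooses must agree with how the index was built.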

    QualityStats

    Results of quality benchmark run for a single query or for a set of queries.

    QualityStats.RecallPoint

    A certain rank in which a relevant doc was found.
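To make the recall-point idea concrete, here is a small sketch (plain Java; the class name is hypothetical and this is not the QualityStats implementation) that records precision at each recall point and averages it into the standard average-precision measure:

```java
import java.util.*;

// Illustrative sketch of the kind of statistic a quality benchmark gathers:
// walk a ranked result list, and at every rank where a relevant doc appears
// (a "recall point") take precision-at-that-rank; average precision is the
// mean of those precisions over the total number of relevant docs.
public class AveragePrecision {
    public static double averagePrecision(List<String> ranked, Set<String> relevant) {
        double sum = 0.0;
        int found = 0;
        for (int rank = 1; rank <= ranked.size(); rank++) {
            if (relevant.contains(ranked.get(rank - 1))) {
                found++;
                sum += (double) found / rank; // precision at this recall point
            }
        }
        return relevant.isEmpty() ? 0.0 : sum / relevant.size();
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("d1", "d2", "d3", "d4");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3"));
        // recall points at ranks 1 and 3: (1/1 + 2/3) / 2 = 5/6
        System.out.println(averagePrecision(ranked, relevant));
    }
}
```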

    Interfaces

    IJudge

    Judge if a document is relevant for a quality query.

    IQualityQueryParser

    Parse a QualityQuery into a Lucene query.

    Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0
    Apache Lucene.Net, Lucene.Net, Apache, the Apache feather logo, and the Apache Lucene.Net project logo are trademarks of The Apache Software Foundation.
    All other marks mentioned may be trademarks or registered trademarks of their respective owners.