Class ICUPostingsHighlighter
Simple highlighter that does not analyze fields nor use term vectors. Instead it requires DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.
Inheritance
System.Object
ICUPostingsHighlighter
Namespace: Lucene.Net.Search.PostingsHighlight
Assembly: Lucene.Net.ICU.dll
Syntax
public class ICUPostingsHighlighter : object
Remarks
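PostingsHighlighter treats the single original document as the whole corpus, and then scores individual
passages as if they were documents in this corpus. It uses a BreakIterator to find passages in the text, ranks each passage with a PassageScorer, and formats the top passages into highlighted snippets with a PassageFormatter.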
You can customize the behavior by subclassing this highlighter. Some important hooks:
- GetBreakIterator(String): Customize how the text is divided into passages.
- GetScorer(String): Customize how passages are ranked.
- GetFormatter(String): Customize how snippets are formatted.
- GetIndexAnalyzer(String): Enable highlighting of MultiTermQuerys such as WildcardQuery.
WARNING: The code is very new and probably still has some exciting bugs!
Example usage:
// configure field with offsets at index time
FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
offsetsType.IndexOptions = IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS;
Field body = new Field("body", "foobar", offsetsType);
// retrieve highlights at query time
ICUPostingsHighlighter highlighter = new ICUPostingsHighlighter();
Query query = new TermQuery(new Term("body", "highlighting"));
TopDocs topDocs = searcher.Search(query, n);
string[] highlights = highlighter.Highlight("body", query, searcher, topDocs);
This is thread-safe, and can be used across different readers.
Note that the .NET implementation differs from the PostingsHighlighter in Lucene in that it is backed by an ICU BreakIterator rather than the one in the JDK.
Constructors
ICUPostingsHighlighter()
Creates a new highlighter with DEFAULT_MAX_LENGTH.
Declaration
public ICUPostingsHighlighter()
ICUPostingsHighlighter(Int32)
Creates a new highlighter, specifying maximum content length.
Declaration
public ICUPostingsHighlighter(int maxLength)
Parameters
Type | Name | Description |
---|---|---|
System.Int32 | maxLength | Maximum content size to process. |
Fields
DEFAULT_MAX_LENGTH
Default maximum content size to process. Typically, snippets closer to the beginning of the document better summarize its content.
Declaration
public static readonly int DEFAULT_MAX_LENGTH
Field Value
Type | Description |
---|---|
System.Int32 | |
Methods
GetBreakIterator(String)
Returns the BreakIterator used to divide the text into passages.
Declaration
protected virtual BreakIterator GetBreakIterator(string field)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | |
Returns
Type | Description |
---|---|
BreakIterator | |
GetEmptyHighlight(String, BreakIterator, Int32)
Called to summarize a document when no hits were found. By default this just returns the first maxPassages sentences; subclasses can override to customize.
Declaration
protected virtual Passage[] GetEmptyHighlight(string fieldName, BreakIterator bi, int maxPassages)
Parameters
Type | Name | Description |
---|---|---|
System.String | fieldName | |
BreakIterator | bi | |
System.Int32 | maxPassages | |
Returns
Type | Description |
---|---|
Passage[] | |
GetFormatter(String)
Returns the PassageFormatter used to format passages into highlighted snippets.
Declaration
protected virtual PassageFormatter GetFormatter(string field)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | |
Returns
Type | Description |
---|---|
PassageFormatter | |
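As a minimal sketch of this hook, a subclass might return a formatter that wraps matches in a different tag. The class name MarkTagHighlighter is hypothetical, and the example assumes the DefaultPassageFormatter(preTag, postTag, ellipsis, escape) constructor from the Lucene.NET 4.8 PostingsHighlight package; check it against the version you reference.

```csharp
using Lucene.Net.Search.PostingsHighlight;

// Hypothetical subclass: wraps each highlighted term in <mark>...</mark>.
public class MarkTagHighlighter : ICUPostingsHighlighter
{
    protected override PassageFormatter GetFormatter(string field)
    {
        // Assumed constructor arguments: pre-tag, post-tag, ellipsis placed
        // between non-adjacent passages, and whether to HTML-escape the content.
        return new DefaultPassageFormatter("<mark>", "</mark>", "... ", true);
    }
}
```

Because the hook receives the field name, a subclass could also return a different formatter per field.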
GetIndexAnalyzer(String)
Returns the analyzer originally used to index the content for field.
This is used to highlight some MultiTermQuerys.
Declaration
protected virtual Analyzer GetIndexAnalyzer(string field)
Parameters
Type | Name | Description |
---|---|---|
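System.String | field | |
Returns
Type | Description |
---|---|
Analyzer | The analyzer, or null (the default, meaning no special multi-term processing). |
GetMultiValuedSeparator(String)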
Returns the logical separator between values for multi-valued fields.
The default value is a space character, which means passages can span across values,
but a subclass can override, for example with U+2029 PARAGRAPH SEPARATOR (PS)
if each value holds a discrete passage for highlighting.
Declaration
protected virtual char GetMultiValuedSeparator(string field)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | |
Returns
Type | Description |
---|---|
System.Char | |
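The override described above can be sketched as follows. The class name ParagraphSeparatedHighlighter is hypothetical; the method signature is the one shown in the declaration.

```csharp
using Lucene.Net.Search.PostingsHighlight;

// Hypothetical subclass: treats each value of a multi-valued field as a
// discrete passage by joining values with U+2029 PARAGRAPH SEPARATOR.
public class ParagraphSeparatedHighlighter : ICUPostingsHighlighter
{
    protected override char GetMultiValuedSeparator(string field)
    {
        return '\u2029';
    }
}
```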
GetScorer(String)
Returns the PassageScorer used to rank passages.
Declaration
protected virtual PassageScorer GetScorer(string field)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | |
Returns
Type | Description |
---|---|
PassageScorer | |
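A rough sketch of customizing passage ranking is shown below. The class name ShortPassageHighlighter is hypothetical, and the example assumes the PassageScorer(k1, b, pivot) constructor from the Lucene.NET 4.8 PostingsHighlight package; the parameter values are illustrative, not recommendations.

```csharp
using Lucene.Net.Search.PostingsHighlight;

// Hypothetical subclass: supplies a scorer with explicit BM25-style parameters.
public class ShortPassageHighlighter : ICUPostingsHighlighter
{
    protected override PassageScorer GetScorer(string field)
    {
        // Assumed constructor arguments: k1 (term frequency saturation),
        // b (length normalization), and pivot (expected passage length in
        // characters). The values here are illustrative only.
        return new PassageScorer(1.2f, 0.75f, 40f);
    }
}
```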
Highlight(String, Query, IndexSearcher, TopDocs)
Highlights the top passages from a single field.
Declaration
public virtual string[] Highlight(string field, Query query, IndexSearcher searcher, TopDocs topDocs)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | Field name to highlight. Must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
TopDocs | topDocs | TopDocs containing the summary result documents to highlight. |
Returns
Type | Description |
---|---|
System.String[] | Array of formatted snippets corresponding to the documents in topDocs. |
Highlight(String, Query, IndexSearcher, TopDocs, Int32)
Highlights the top-N passages from a single field.
Declaration
public virtual string[] Highlight(string field, Query query, IndexSearcher searcher, TopDocs topDocs, int maxPassages)
Parameters
Type | Name | Description |
---|---|---|
System.String | field | Field name to highlight. Must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
TopDocs | topDocs | TopDocs containing the summary result documents to highlight. |
System.Int32 | maxPassages | The maximum number of top-N ranked passages used to form the highlighted snippets. |
Returns
Type | Description |
---|---|
System.String[] | Array of formatted snippets corresponding to the documents in topDocs. |
HighlightFields(String[], Query, IndexSearcher, TopDocs)
Highlights the top passages from multiple fields.
Conceptually, this behaves as a more efficient form of:
IDictionary<string, string[]> m = new Dictionary<string, string[]>();
foreach (string field in fields)
{
m[field] = Highlight(field, query, searcher, topDocs);
}
return m;
Declaration
public virtual IDictionary<string, string[]> HighlightFields(string[] fields, Query query, IndexSearcher searcher, TopDocs topDocs)
Parameters
Type | Name | Description |
---|---|---|
System.String[] | fields | Field names to highlight. Each must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
TopDocs | topDocs | TopDocs containing the summary result documents to highlight. |
Returns
Type | Description |
---|---|
IDictionary<System.String, System.String[]> | Dictionary keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. |
HighlightFields(String[], Query, IndexSearcher, TopDocs, Int32[])
Highlights the top-N passages from multiple fields.
Conceptually, this behaves as a more efficient form of:
IDictionary<string, string[]> m = new Dictionary<string, string[]>();
foreach (string field in fields)
{
m[field] = Highlight(field, query, searcher, topDocs, maxPassages);
}
return m;
Declaration
public virtual IDictionary<string, string[]> HighlightFields(string[] fields, Query query, IndexSearcher searcher, TopDocs topDocs, int[] maxPassages)
Parameters
Type | Name | Description |
---|---|---|
System.String[] | fields | Field names to highlight. Each must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
TopDocs | topDocs | TopDocs containing the summary result documents to highlight. |
System.Int32[] | maxPassages | The maximum number of top-N ranked passages per field used to form the highlighted snippets. |
Returns
Type | Description |
---|---|
IDictionary<System.String, System.String[]> | Dictionary keyed on field name, containing the array of formatted snippets corresponding to the documents in topDocs. |
HighlightFields(String[], Query, IndexSearcher, Int32[], Int32[])
Highlights the top-N passages from multiple fields, for the provided int[] docids.
Declaration
public virtual IDictionary<string, string[]> HighlightFields(string[] fieldsIn, Query query, IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)
Parameters
Type | Name | Description |
---|---|---|
System.String[] | fieldsIn | Field names to highlight. Each must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
System.Int32[] | docidsIn | Array containing the document IDs to highlight. |
System.Int32[] | maxPassagesIn | The maximum number of top-N ranked passages per field used to form the highlighted snippets. |
Returns
Type | Description |
---|---|
IDictionary<System.String, System.String[]> | Dictionary keyed on field name, containing the array of formatted snippets corresponding to the documents in docidsIn. |
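The docid-based overload could be called roughly as follows. This is a sketch under assumptions: the helper class and method names are hypothetical, "title" and "body" are assumed field names that are stored and indexed with offsets, and the searcher, query, and topDocs are supplied by the caller.

```csharp
using System.Collections.Generic;
using System.Linq;
using Lucene.Net.Search;
using Lucene.Net.Search.PostingsHighlight;

public static class HighlightByDocIdExample
{
    // Hypothetical helper: highlights two assumed fields for the hits in topDocs.
    public static IDictionary<string, string[]> HighlightTitleAndBody(
        IndexSearcher searcher, Query query, TopDocs topDocs)
    {
        var highlighter = new ICUPostingsHighlighter();

        // Collect the document IDs of the hits.
        int[] docids = topDocs.ScoreDocs.Select(sd => sd.Doc).ToArray();

        string[] fields = { "title", "body" };
        int[] maxPassages = { 1, 3 }; // one entry per field, in the same order

        return highlighter.HighlightFields(fields, query, searcher, docids, maxPassages);
    }
}
```

Each array in the returned dictionary is expected to line up with docids, so the snippet at index i of a field's array belongs to the document at index i.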
HighlightFieldsAsObjects(String[], Query, IndexSearcher, Int32[], Int32[])
Expert: highlights the top-N passages from multiple fields, for the provided int[] docids, as custom objects returned by the PassageFormatter.
Declaration
protected virtual IDictionary<string, object[]> HighlightFieldsAsObjects(string[] fieldsIn, Query query, IndexSearcher searcher, int[] docidsIn, int[] maxPassagesIn)
Parameters
Type | Name | Description |
---|---|---|
System.String[] | fieldsIn | Field names to highlight. Each must have a stored string value and also be indexed with offsets. |
Query | query | Query to highlight. |
IndexSearcher | searcher | Searcher that was previously used to execute the query. |
System.Int32[] | docidsIn | Array containing the document IDs to highlight. |
System.Int32[] | maxPassagesIn | The maximum number of top-N ranked passages per field used to form the highlighted snippets. |
Returns
Type | Description |
---|---|
IDictionary<System.String, System.Object[]> | Dictionary keyed on field name, containing the array of formatted objects corresponding to the documents in docidsIn. |
LoadFieldValues(IndexSearcher, String[], Int32[], Int32)
Loads the string values for each field X docID to be highlighted. By default this loads from stored fields, but a subclass can change the source. This method should allocate the string[fields.length][docids.length] and fill all values. The returned strings must be identical to what was indexed.
Declaration
protected virtual IList<string[]> LoadFieldValues(IndexSearcher searcher, string[] fields, int[] docids, int maxLength)
Parameters
Type | Name | Description |
---|---|---|
IndexSearcher | searcher | |
System.String[] | fields | |
System.Int32[] | docids | |
System.Int32 | maxLength | |
Returns
Type | Description |
---|---|
IList<System.String[]> | |