The Rated Ranking Evaluator (RRE) is a search quality evaluation tool which, as the name suggests, evaluates the quality of results coming from a search infrastructure.
At the moment Apache Solr and Elasticsearch are supported (see the documentation for supported versions).
The following picture illustrates the RRE ecosystem:
As you can see, there are a lot of modules already in place, and more are planned (those with the dashed border).
The whole system has been built as a framework where metrics can be configured, activated, and even plugged in (of course, this last option requires some development). The metrics that are part of the current RRE release are:
On top of those "leaf" metrics, which are computed at query level, RRE provides a rich nested data model, where the same metric can be aggregated at several levels. For example, queries are grouped into Query Groups, and Query Groups are grouped into Topics. That means the same metrics listed above are also available at the upper levels, using the arithmetic mean as the aggregation criterion. As a consequence, RRE also provides the following metrics:
One of the most important things you can see in the screenshot above is that RRE is able to keep track of, and make comparisons between, several versions of the system under evaluation.
It encourages an incremental/iterative/immutable approach when developing and evolving a search system: assuming you're starting from version 1.0, when you apply some relevant change to your configuration, instead of changing that version, it is better to clone it and apply the changes to the new version (let's call it 1.1).
In this way, when the system build happens, RRE will compute everything explained above (i.e. the metrics) for each available version.
In addition, it will provide the delta/trend between subsequent versions, so you can immediately see the overall direction in which the system is going in terms of relevance improvements.
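
To make the aggregation and the version delta more concrete, here is a minimal sketch in Python. It is not RRE code: the data layout, the function names (`group_mean`, `topic_mean`, `version_deltas`) and the metric values are all hypothetical, and it assumes that each level simply averages the values of its direct children, which may differ from what RRE actually does internally.

```python
from statistics import mean

# Hypothetical values of one metric (e.g. precision), computed at query level.
# Layout assumed for illustration: topic -> query group -> query -> version -> value.
evaluation = {
    "Topic A": {
        "Group 1": {
            "query 1": {"1.0": 0.50, "1.1": 0.75},
            "query 2": {"1.0": 1.00, "1.1": 1.00},
        },
        "Group 2": {
            "query 3": {"1.0": 0.25, "1.1": 0.50},
        },
    },
}


def group_mean(queries, version):
    """Arithmetic mean of the query-level values in one query group."""
    return mean(values[version] for values in queries.values())


def topic_mean(groups, version):
    """Arithmetic mean of the query-group values in one topic (assumed rule)."""
    return mean(group_mean(queries, version) for queries in groups.values())


def version_deltas(value_by_version, versions):
    """Difference between each version and the one before it."""
    return {
        f"{previous} -> {current}": value_by_version[current] - value_by_version[previous]
        for previous, current in zip(versions, versions[1:])
    }


versions = ["1.0", "1.1"]
topic_values = {v: topic_mean(evaluation["Topic A"], v) for v in versions}
print(topic_values)
print(version_deltas(topic_values, versions))
```

A positive delta means the metric improved from one version to the next at that level; the same comparison can be made at query group or query level by passing the corresponding values.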