If you are not sure what these letters mean or how they are derived, take a look at the intro post, which is a primer on contingency tables. If you understand contingency tables but want to get on the same page regarding the quadrants and the letter codes used here, open the intermediate contingency table post.
RECALL is the percentage of responsive documents that the search found: the number of responsive documents in the search results divided by the number of responsive documents in the population. The population contains five responsive documents; the search found three of them.
3/5 = .6
PRECISION is the percentage of retrieved documents that are responsive: the number of responsive documents in the search results divided by the total number of documents in the search results. The search retrieved four documents; three of them are responsive.
3/4 = .75
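As a quick check on the arithmetic above, here is a minimal Python sketch that recomputes RECALL and PRECISION from the counts in the example. The variable names are illustrative labels chosen for this sketch, not the letter codes used in the contingency table posts.

```python
# Counts taken from the worked example; the names are illustrative only.
responsive_in_population = 5   # responsive documents in the whole population
retrieved = 4                  # documents the search returned
responsive_retrieved = 3       # retrieved documents that are responsive

recall = responsive_retrieved / responsive_in_population
precision = responsive_retrieved / retrieved

print(f"recall = {recall:.2f}")        # recall = 0.60
print(f"precision = {precision:.2f}")  # precision = 0.75
```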
ELUSION is the percentage of unretrieved documents which are responsive and should have been retrieved, or the proportion of predicted negatives that are incorrect. The search left six unretrieved documents, two of them are responsive.
"Instead of counting the responsive documents that we found, we count the ones left behind." (H. L. Roitblat, Measurement in eDiscovery)
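The ELUSION arithmetic for this example can be sketched in a few lines of Python. The variable names are assumptions made for illustration and are not the letter codes from the contingency table posts.

```python
# Counts from the example: six documents were left unretrieved,
# and two of those are responsive (they were missed).
unretrieved = 6
responsive_unretrieved = 2

elusion = responsive_unretrieved / unretrieved

print(f"elusion = {elusion:.3f}")  # elusion = 0.333
```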
FALLOUT is the percentage of nonresponsive documents that were retrieved. The population has five nonresponsive documents; the search incorrectly retrieved one of them.
FALLOUT measures how quickly PRECISION drops as RECALL increases.
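For the running example, the FALLOUT calculation might look like the following sketch. The variable names are hypothetical labels picked for readability, not the post's letter codes.

```python
# Counts from the example: five nonresponsive documents in the population,
# one of which the search retrieved by mistake.
nonresponsive_in_population = 5
nonresponsive_retrieved = 1

fallout = nonresponsive_retrieved / nonresponsive_in_population

print(f"fallout = {fallout:.2f}")  # fallout = 0.20
```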
NEGATIVE PREDICTIVE VALUE reflects the percentage of non-retrieved documents that are in fact not responsive. The search yielded six non-retrieved documents; of these, four were not responsive.
NPV is also 100% – ELUSION
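The sketch below works through NPV for the example and confirms that it is the complement of ELUSION. It uses Python's standard `fractions` module so the comparison is exact; the variable names are illustrative assumptions, not the post's letter codes.

```python
from fractions import Fraction

# Counts from the example: six non-retrieved documents,
# four nonresponsive (correctly left behind) and two responsive (missed).
unretrieved = 6
nonresponsive_unretrieved = 4
responsive_unretrieved = 2

npv = Fraction(nonresponsive_unretrieved, unretrieved)
elusion = Fraction(responsive_unretrieved, unretrieved)

print(f"NPV = {float(npv):.3f}")          # NPV = 0.667
print(f"elusion = {float(elusion):.3f}")  # elusion = 0.333
print(npv == 1 - elusion)                 # True
```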
PREVALENCE is the percentage of all documents that are truly responsive. The population has ten documents; five are responsive.
NOTICE: This metric does not care about search results.
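A small sketch makes the point concrete: the hypothetical helper below (the name `prevalence` and its parameters are chosen here for illustration) takes only population counts, so nothing about the search results can change its output.

```python
def prevalence(responsive_in_population: int, population_size: int) -> float:
    """Share of the whole population that is responsive.

    No search-result counts are passed in, so the value cannot
    depend on what the search did or did not retrieve.
    """
    return responsive_in_population / population_size


# Counts from the example: ten documents, five of them responsive.
example_prevalence = prevalence(5, 10)
print(f"prevalence = {example_prevalence:.2f}")  # prevalence = 0.50
```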
SPECIFICITY is the percentage of truly nonresponsive documents that are correctly identified as nonresponsive. The population has five nonresponsive documents; four were correctly identified.
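Using the counts from the example, SPECIFICITY can be computed as in this short sketch. The variable names are illustrative labels, not the letter codes from the contingency table posts.

```python
# Counts from the example: five nonresponsive documents in the population,
# four of which the search correctly left unretrieved.
nonresponsive_in_population = 5
nonresponsive_unretrieved = 4

specificity = nonresponsive_unretrieved / nonresponsive_in_population

print(f"specificity = {specificity:.2f}")  # specificity = 0.80
```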
FALSE NEGATIVE RATE is the percentage of responsive documents that are missed.
False Negative Rate plus Recall = 100% (remember that Recall is aka True Positive Rate)
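The relationship between False Negative Rate and Recall can be checked against the example with a short sketch. Exact fractions are used so the sum compares cleanly to 1; the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from the example: five responsive documents in the population,
# three found by the search and two missed.
responsive_in_population = 5
responsive_retrieved = 3
responsive_missed = 2

false_negative_rate = Fraction(responsive_missed, responsive_in_population)
recall = Fraction(responsive_retrieved, responsive_in_population)

print(f"false negative rate = {float(false_negative_rate):.2f}")  # 0.40
print(f"recall = {float(recall):.2f}")                            # 0.60
print(false_negative_rate + recall == 1)                          # True
```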
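ACCURACY is the percentage of documents that the search classified correctly: responsive documents that were retrieved plus nonresponsive documents that were left unretrieved, divided by all documents in the population.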
Accuracy is 100% – Error
In data sets with extremely high or extremely low prevalence (richness), Accuracy is a poor measure. Consider a set with 95 percent nonresponsive documents: 95 percent accuracy can be achieved by marking everything nonresponsive.
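The sketch below computes Accuracy for the ten-document example and then for the 95 percent nonresponsive scenario. The helper name `accuracy`, its parameter names, and the assumed population of 100 documents in the second case are illustrative choices, not taken from the post.

```python
def accuracy(responsive_retrieved: int, nonresponsive_retrieved: int,
             responsive_missed: int, nonresponsive_unretrieved: int) -> float:
    """Correctly classified documents divided by all documents."""
    correct = responsive_retrieved + nonresponsive_unretrieved
    total = (responsive_retrieved + nonresponsive_retrieved
             + responsive_missed + nonresponsive_unretrieved)
    return correct / total


# Ten-document example: 3 responsive retrieved, 1 nonresponsive retrieved,
# 2 responsive missed, 4 nonresponsive left unretrieved.
example_accuracy = accuracy(3, 1, 2, 4)
print(f"example accuracy = {example_accuracy:.2f}")  # example accuracy = 0.70

# Assumed population of 100 documents, 95 nonresponsive, where nothing is
# retrieved (everything is marked nonresponsive).
lazy_accuracy = accuracy(0, 0, 5, 95)
lazy_recall = 0 / 5  # none of the five responsive documents were found
print(f"lazy accuracy = {lazy_accuracy:.2f}")  # lazy accuracy = 0.95
print(f"lazy recall = {lazy_recall:.2f}")      # lazy recall = 0.00
```

The second case scores 95 percent on Accuracy while finding none of the responsive documents, which is why the metric is misleading at the extremes.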
ERROR is the percentage of documents that are incorrectly coded.
Error can also be calculated: 100% – Accuracy
The warning regarding extremes of prevalence or richness applies to Error as well. The utility of Error as a search metric goes down as richness gets extremely high or low.
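For the ten-document example, Error can be worked out as in the following sketch, which also confirms that it is the complement of Accuracy. Exact fractions keep the comparison free of rounding; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Counts from the example.
responsive_retrieved = 3        # correctly retrieved
nonresponsive_retrieved = 1     # incorrectly retrieved
responsive_missed = 2           # incorrectly left behind
nonresponsive_unretrieved = 4   # correctly left behind

total = (responsive_retrieved + nonresponsive_retrieved
         + responsive_missed + nonresponsive_unretrieved)

error = Fraction(nonresponsive_retrieved + responsive_missed, total)
accuracy = Fraction(responsive_retrieved + nonresponsive_unretrieved, total)

print(f"error = {float(error):.2f}")        # error = 0.30
print(f"accuracy = {float(accuracy):.2f}")  # accuracy = 0.70
print(error == 1 - accuracy)                # True
```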
FALSE ALARM RATE is the percentage of documents retrieved by the search that are nonresponsive.
This metric does not care about the null set.
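As defined in this post, False Alarm Rate uses only the retrieved documents, so the sketch below needs no counts from the unretrieved side. It also shows that, under this definition, the metric is the complement of PRECISION. The variable names are illustrative labels, not the post's letter codes.

```python
from fractions import Fraction

# Counts from the example: four retrieved documents,
# three responsive and one nonresponsive.
retrieved = 4
responsive_retrieved = 3
nonresponsive_retrieved = 1

false_alarm_rate = Fraction(nonresponsive_retrieved, retrieved)
precision = Fraction(responsive_retrieved, retrieved)

print(f"false alarm rate = {float(false_alarm_rate):.2f}")  # 0.25
print(f"precision = {float(precision):.2f}")                # 0.75
print(false_alarm_rate == 1 - precision)                    # True
```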