(sample) retrieval gt = [['test-1', 'test-2'], ['test-3']]
(sample) retrieval result = ['test-1', 'pred-1', 'test-2', 'pred-3']
As you can see from the gt, retrieval_result must contain either 'test-1' or 'test-2', and also 'test-3', for all answers to be accepted as correct.
So, if we label each item of the retrieval_result as '1' for a correct answer and '0' for an incorrect answer, we get [1, 0, 1, 0].
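The labeling step above can be sketched in a few lines of Python. This is not AutoRAG's actual implementation, just a minimal illustration of the rule described: an item is labeled 1 if it appears in any ground-truth group, otherwise 0.

```python
retrieval_gt = [['test-1', 'test-2'], ['test-3']]
retrieval_result = ['test-1', 'pred-1', 'test-2', 'pred-3']

# Flatten the gt groups into a set of all accepted ids.
gt_ids = {doc_id for group in retrieval_gt for doc_id in group}

# Label each retrieved id: 1 if it is a gt id, else 0.
labels = [1 if doc_id in gt_ids else 0 for doc_id in retrieval_result]
print(labels)  # [1, 0, 1, 0]
```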
For more information, please refer to the following Dataset Format Docs.
In the '✅ Basic Example' sections below, we'll use this sample gt and result, so keep them in mind.
Precision is the proportion of items the model classifies as true that are actually true.
❗So in AutoRAG, Precision is (number of correct answers in result) / (length of result).
The retrieval result was labeled [1, 0, 1, 0] above.
Precision = (number of correct answers in result) / (length of result).
So in this case, precision is 2/4 = 0.5.