Currently, in AutoRAG, the Retrieval token metric is used only by the Passage Compressor Node. It measures performance by comparing the compressed passage to Answer_gt.
The passage and Answer_gt are compared token by token, as the following example shows:
answer gt = ['Do you want to buy some?']
result = ['Do you want to buy some?', 'I want to buy some', 'I want to buy some water']
First, let's break up gt and result into tokens:
['do', 'you', 'want', 'to', 'buy', 'some']
['do', 'you', 'want', 'to', 'buy', 'some'], ['i', 'want', 'to', 'buy', 'some'], ['i', 'want', 'to', 'buy', 'some', 'water']
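The tokenization step can be sketched roughly as follows. This is only an illustration: the `tokenize` helper is hypothetical, and it assumes the normalization consists of lowercasing every token, dropping ASCII punctuation, and splitting on whitespace. AutoRAG's actual normalization may differ.

```python
import string


def tokenize(text: str) -> list[str]:
    """Hypothetical helper: lowercase, drop ASCII punctuation, split on whitespace."""
    # Assumed normalization; not taken from the AutoRAG source code.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


answer_gt = ["Do you want to buy some?"]
result = ["Do you want to buy some?", "I want to buy some", "I want to buy some water"]

gt_tokens = [tokenize(text) for text in answer_gt]
result_tokens = [tokenize(text) for text in result]

print(gt_tokens)      # one token list per ground-truth answer
print(result_tokens)  # one token list per compressed passage
```

Stripping punctuation before splitting keeps 'some?' and 'some' from being counted as different tokens.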
Next, let's count the tokens that gt and each result have in common, then divide that count by the number of tokens in the result:
Number of overlapping tokens / token length in result
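As a minimal sketch, the ratio above can be computed for the example like this. The `overlap_ratio` helper is hypothetical, the token lists are assumed to be already lowercased, and the overlap is assumed to be counted as a multiset intersection (a repeated token counts only as often as it appears on both sides). AutoRAG's own implementation may count overlaps differently.

```python
from collections import Counter


def overlap_ratio(gt_tokens: list[str], result_tokens: list[str]) -> float:
    """Hypothetical helper: overlapping tokens / number of tokens in result."""
    if not result_tokens:
        return 0.0  # assumption: an empty result scores 0
    # Multiset intersection keeps the minimum count of each shared token.
    common = Counter(gt_tokens) & Counter(result_tokens)
    return sum(common.values()) / len(result_tokens)


gt = ["do", "you", "want", "to", "buy", "some"]
results = [
    ["do", "you", "want", "to", "buy", "some"],
    ["i", "want", "to", "buy", "some"],
    ["i", "want", "to", "buy", "some", "water"],
]

for tokens in results:
    print(f"{overlap_ratio(gt, tokens):.2f}")
# prints 1.00, 0.80, 0.67
```

The first result shares all 6 of its tokens with gt (6/6), the second shares 4 of its 5 tokens (4/5), and the third shares 4 of its 6 tokens (4/6).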