-- May 12 In-Class Exercise Thread
Kevin, Sriramm, Mustafa
1) Set a threshold so that only 100 documents are returned from the whole corpus (for P@100). Since the N = 1000 documents have distinct scores, exactly X documents have score > N - X; taking X = 100 gives a threshold of 1000 - 100 = 900.
2) Next, look at each batch of 100 documents; within a batch, find the documents with score greater than 900. Let Y be that set, and let rel_Y be the subset of Y judged relevant by human assessors. The precision for the batch is |rel_Y| / |Y|.
3) Compute the average of these precision scores over all batches. That is the aggregate precision @ 100.
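The three steps above can be sketched in Python. This is a minimal illustration with made-up data, assuming a corpus of N = 1000 documents with distinct integer scores 1..N and synthetic relevance labels; batches that return no documents above the threshold contribute no precision score and are skipped.

```python
import random

N = 1000           # assumed corpus size; scores are the distinct integers 1..N
X = 100            # number of documents we want to return
threshold = N - X  # = 900: exactly X documents have score > N - X

# Hypothetical data: a random permutation of scores and random relevance labels.
random.seed(0)
scores = random.sample(range(1, N + 1), N)
relevant = {doc for doc in range(N) if random.random() < 0.3}

batch_size = 100
batch_precisions = []
for start in range(0, N, batch_size):
    batch = range(start, start + batch_size)
    # Y: documents in this batch with score over the threshold
    returned = [doc for doc in batch if scores[doc] > threshold]
    if returned:  # skip batches that return nothing (precision undefined)
        rel = [doc for doc in returned if doc in relevant]  # rel_Y
        batch_precisions.append(len(rel) / len(returned))

# Average the per-batch precisions: the aggregate precision @ 100.
aggregate_p_at_100 = sum(batch_precisions) / len(batch_precisions)
print(aggregate_p_at_100)
```

Across the whole corpus exactly 100 documents clear the threshold, so the per-batch sets Y together form the 100 returned documents.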