Focused Relevance Judgments Using Large Language Models and Their Application to Document Retrieval
Sun 25.01 12:30 - 13:00
- Graduate Student Seminar
Bloomfield 527
Abstract: The ad hoc document retrieval task is to rank documents by their presumed relevance to a query. Most TREC benchmarks provide document-level relevance judgments with no markup indicating which specific segments contain the relevant information. Focused relevance judgments, which highlight relevant text at the character level, are valuable but scarce. This work presents a study of approaches, including lexical and dense rankers as well as zero-shot prompted large language models (LLMs), for ranking passages in relevant documents by the presumed fraction of relevant text they contain. Our analysis shows that LLM-based rankings are highly effective, outperforming strong sparse and dense retrieval baselines. The progress we present on this task has downstream implications for passage-level evaluation, relevance feedback, and in-context learning with LLMs.
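
To make the core ranking idea concrete, here is a minimal sketch of zero-shot passage ranking with an LLM. The `call_llm` callable, the prompt wording, and the 0-100 scoring scale are all illustrative assumptions, not the talk's actual protocol:

```python
# A minimal sketch, assuming a generic chat-completion client. `call_llm`
# is a hypothetical stand-in (prompt in, text reply out); the prompt and
# the 0-100 scale are illustrative, not the authors' method.

from typing import Callable, List, Tuple


def rank_passages(
    query: str,
    passages: List[str],
    call_llm: Callable[[str], str],
) -> List[Tuple[float, str]]:
    """Rank passages of a relevant document by the LLM's estimate of the
    fraction of their text that is relevant to the query."""
    scored = []
    for passage in passages:
        prompt = (
            f"Query: {query}\n"
            f"Passage: {passage}\n"
            "What fraction of the passage's text is relevant to the query? "
            "Answer with a single number between 0 and 100."
        )
        reply = call_llm(prompt)
        try:
            score = float(reply.strip().rstrip("%")) / 100.0
        except ValueError:
            score = 0.0  # fall back when the reply is not a clean number
        scored.append((score, passage))
    # Highest presumed fraction of relevant text first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```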

