large language models

Analyzing AI Evaluation Benchmarks Through Information Retrieval and Network Science

Poster Presentation - The 48th European Conference on Information Retrieval (ECIR 2026). Delft, The Netherlands.

Large Language Models as Assessors: On the Impact of Relevance Scales

Poster Presentation - The 48th European Conference on Information Retrieval (ECIR 2026). Delft, The Netherlands.

Analyzing AI Evaluation Benchmarks Through Information Retrieval and Network Science

Many analyses have been performed on Information Retrieval (IR) evaluation benchmarks. Benchmarking also plays a central role in evaluating the capabilities of Large Language Models (LLMs). In this paper, we apply an IR approach to LLM evaluation. …

Large Language Models as Assessors: On the Impact of Relevance Scales

Traditionally, relevance judgments have relied on human annotators, but recent advances in Large Language Models (LLMs) have prompted growing interest in their use as a proxy for relevance judgments. In this setting, a key yet underexplored factor is …

Large Language Models for Combinatorial Optimization: A Systematic Review

This systematic review explores the application of Large Language Models (LLMs) in Combinatorial Optimization (CO). We report our findings using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We conduct a …

AIDME: A Scalable, Interpretable Framework for AI-Aided Scoping Reviews

Poster Presentation - The 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. Padua, Italy.

AIDME: A Scalable, Interpretable Framework for AI-Aided Scoping Reviews

Scientific publishing is expanding rapidly across disciplines, making it increasingly difficult for researchers to organize, filter, and synthesize the literature. Systematic reviews address this challenge through structured analysis, but the early …

PILs of Knowledge: A Synthetic Benchmark for Evaluating Question Answering Systems in Healthcare

Patient Information Leaflets (PILs) provide essential information about medication usage, side effects, precautions, and interactions, making them a valuable resource for Question Answering (QA) systems in healthcare. However, no dedicated benchmark …

Report on the 14th Italian Information Retrieval Workshop (IIR 2024)

IIR 2024, the 14th Italian Information Retrieval Workshop, served as the annual event for the IR and RS communities, both those based in Italy and those collaborating with Italian research institutions. This year's event spanned two days and featured studies on various …