Scientific Web Claims: A survey of definitions, tasks, datasets and methods
Salim Hafid, Sandra Bringay, Konstantin Todorov
Publications – Grey literature
Scientific web claims are understood as scientific claims observed on the Web, across social media, online news, and other platforms. The growing prevalence of scientific discussions on the Web has intensified the need to process and assess this specific type of claim. Unlike claims from scientific publications, scientific web claims are expressed in lay terms, are often decontextualized, and typically lack proper citations, which poses unique challenges for their identification, verification, and communication. Nevertheless, the correct processing of scientific web claims is crucial to keeping online science discussions accurate and informed, for instance through fact-checking. This survey provides the first systematic overview dedicated specifically to scientific web claims. We review and compare existing definitions, task formulations, datasets, and methodological approaches across three major perspectives: (1) scientific fact-checking on the Web, (2) scientific citations on the Web, and (3) science communication on the Web. Our interdisciplinary analysis integrates insights from natural language processing, information retrieval, artificial intelligence, social sciences, and science communication. We identify major methodological challenges, including the lack of unified definitions, domain-agnostic corpora, and foundational models tailored to science-related online discourse. We also discuss challenges related to the interplay between emotions and distortions of science online. By mapping current research efforts and highlighting open problems, this survey lays the groundwork for developing robust datasets, methods, and evaluation frameworks to advance the automated processing of scientific web claims, a necessary capability for strengthening the reliability of science-related online discourse at scale.