Tracking Inauthentic Amplification on Social Media: A Scalable Method to Detect Duplicated Messages
Manon Richard, Agnès Mustar, Lisa Giordani, Cristián Brokate, Pauline Bouchaud, Pedro Ramaciotti
Publications – Grey literature
Social networks are fertile ground for information manipulation campaigns. A common modus operandi is coordinated mass posting, which consists of publishing a large number of similar messages to give the impression that an opinion is widely shared. Although such content can be massive in volume, sophisticated translation and generative AI tools make it challenging to detect. To detect these artificially amplified texts, we propose the 3∆-space duplicate methodology. It considers three key aspects of messages: their semantic content, their grapheme structure, and their language. Computing pairwise distances along these three dimensions enables the detection of abnormally close messages. Our methodology is designed to detect textual content that has been amplified through copy-pasta, reformulation, and translation, which allows it to target a wide range of messages. Its application to a real-world dataset of tweets about COP28 validates its effectiveness and its capabilities for analysing coordinated actions. Our contribution also includes the release of the first dataset for this task, created using well-known amplification tactics. The code and the dataset are released as open source.
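
The three pairwise distances mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released code. The `Message` class, the function names, the thresholds `sem_max` and `graph_max`, and the decision rule in `classify_pair` are all hypothetical and chosen only for illustration. Embeddings and language codes are assumed to come from external tools, and character-level edit distance stands in for the grapheme dimension.

```python
from dataclasses import dataclass
import math


@dataclass
class Message:
    text: str
    lang: str         # language code, assumed to come from an external detector
    embedding: tuple  # semantic vector, assumed to come from an external encoder


def cosine_distance(u, v):
    """Semantic distance: 1 - cosine similarity (1.0 if a vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def grapheme_distance(a, b):
    """Edit distance normalised to [0, 1].

    Simplification: Python strings are compared code point by code point,
    not by true grapheme clusters.
    """
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def three_distances(m1, m2):
    """Return (semantic, grapheme, language) distances for a pair of messages."""
    language = 0.0 if m1.lang == m2.lang else 1.0
    return (cosine_distance(m1.embedding, m2.embedding),
            grapheme_distance(m1.text, m2.text),
            language)


def classify_pair(m1, m2, sem_max=0.2, graph_max=0.2):
    """Hypothetical decision rule; both thresholds are illustrative assumptions."""
    sem, graph, language = three_distances(m1, m2)
    if sem > sem_max:
        return None                 # not abnormally close in meaning
    if language == 1.0:
        return "translation"        # same meaning, different language
    if graph <= graph_max:
        return "copy-pasta"         # same meaning, near-identical characters
    return "reformulation"          # same meaning, different wording


# Toy inputs: the two-dimensional embeddings are made up for the example.
original = Message("Climate action now!", "en", (1.0, 0.0))
variant = Message("Climate action now!!", "en", (1.0, 0.0))
print(three_distances(original, variant), classify_pair(original, variant))
```

In this sketch the semantic distance acts as a gate, and the other two distances only separate the amplification tactics named in the abstract. The method described in the paper may combine the three dimensions differently.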