
How can we study audience participation in the #gettymuseumchallenge? Focus on the tools developed by the médialab

Find out how the tools developed at the médialab helped to study the massive public participation in this viral phenomenon.


In an article published in Hybrid magazine, Béatrice Mazoyer, research engineer at the médialab, and Martine Créac'h, professor of literature at Université Paris 8, examine the #gettymuseumchallenge phenomenon on Instagram. This challenge, which arose during the COVID-19 pandemic, involved Internet users disguising themselves as works of art using everyday objects. The scale of the phenomenon called for automatic data collection, processing and analysis tools, which are presented below.

Survey method using médialab tools

Béatrice Mazoyer used minet, a Python library and command-line tool designed to facilitate web mining. Thanks to minet, around 80,000 posts associated with the hashtags #tussenkunstenquarantaine and #gettymuseumchallenge could be downloaded.

After collection, the data had to be processed to organize this mass of information. Another médialab tool, xan, was used to sort posts by date, making it possible to track the challenge's progress over several months. Xan is particularly effective for managing and sorting large CSV files.
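The chronological sorting step that xan performs on CSV files can be sketched in plain Python. This is a minimal illustration, not the authors' code: the column name and date format are assumptions.

```python
from datetime import datetime

# Hypothetical sample of collected posts, each with a "date" column
rows = [
    {"date": "2020-04-12", "hashtag": "#gettymuseumchallenge"},
    {"date": "2020-03-25", "hashtag": "#tussenkunstenquarantaine"},
    {"date": "2020-05-01", "hashtag": "#gettymuseumchallenge"},
]

# Sort posts chronologically to follow the challenge's progress over time
rows.sort(key=lambda r: datetime.strptime(r["date"], "%Y-%m-%d"))

print([r["date"] for r in rows])
# → ['2020-03-25', '2020-04-12', '2020-05-01']
```

On a corpus of 80,000 posts, a dedicated CSV tool like xan performs this kind of operation far more efficiently than loading everything into memory.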

Finally, to group similar images together and identify recurring representations of certain paintings, Béatrice Mazoyer used PIMMI, visual-mining software designed at the médialab. PIMMI detects full or partial image copies in large corpora and groups images that share a common part.

Reproduction Sidney Nolan Ned Kelly, 1946 National Gallery of Australia, Canberra. Gift of Sunday Reed, 1977

Example of a group of images detected by PIMMI

Once the image groups were created, it became easy to count the number of images in each group, which made it possible to produce the illustration below.
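Counting the number of images per group is straightforward once each image has been assigned to a cluster. The sketch below assumes a hypothetical cluster assignment of the kind PIMMI produces; the file names and cluster ids are invented for illustration.

```python
from collections import Counter

# Hypothetical output of image clustering: each image mapped to a cluster id
image_clusters = {
    "img_001.jpg": 0,  # e.g. reproductions of the same painting
    "img_002.jpg": 0,
    "img_003.jpg": 1,
    "img_004.jpg": 0,
    "img_005.jpg": 1,
}

# Count how many images fall into each cluster, largest groups first
group_sizes = Counter(image_clusters.values())
print(group_sizes.most_common())
# → [(0, 3), (1, 2)]
```

Ranking the groups by size in this way is what makes it possible to chart the most frequently reproduced works.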

The two researchers also used Panoptic, a tool developed by the CERES laboratory at Sorbonne University to explore and annotate large image corpora. Created to facilitate the curation of large image datasets, Panoptic integrates similarity-based clustering algorithms.

Chart created in 2024 illustrating the popularity (in number of images) of the most popular works in the corpus. Authors: Martine Créac'h and Béatrice Mazoyer.  Source: Instagram, between 2020 and 2023.

Popularity of the most frequently reproduced works: one image in the graph represents around 70 images actually present on Instagram.

Other tools to complete the survey methodology

In addition to the médialab's resources, the survey also drew on other open-source solutions for automatically analyzing the text of Instagram posts. These include Stanza, a Python library developed by the Stanford NLP Group. Stanza provides several text-processing functions, such as tokenization, lemmatization, part-of-speech tagging, dependency parsing and named entity recognition. In the article, the named entity recognition feature was used to automatically extract the names of artists mentioned in Instagram posts.

To analyze the spread of the challenge over the months, the article relies on an analysis of the languages in which the posts are written. To identify the language of each post, the fastText library was used. fastText is a lightweight, open-source library for efficient text classification and learning of text representations.

The Python scripts using Stanza and fastText have been published in a GitHub repository. In the interest of scientific reproducibility, this repository also contains a list of the URLs of the Instagram posts collected.

Conclusion 

The use of digital resources developed by the médialab, along with other open-source solutions, was essential in analyzing the viral #gettymuseumchallenge phenomenon. Tools such as minet for large-scale data collection, xan for processing massive CSV datasets, and PIMMI for visual analysis enabled the researchers to gather data and conduct their investigation.

The combined use of these digital tools illustrates their central role in contemporary social science research, particularly in the processing and analysis of big data.

Beyond the methodological aspect, the study highlighted the deeply social nature of the meme: born in a context of widespread isolation, the #tussenkunstenquarantaine movement shows how confined audiences reappropriated works of art to maintain a sense of connection—between individuals, but also with cultural institutions. This amateur activity, made visible through digital tools, reveals a form of collective resilience and cultural engagement in times of crisis.