Data Curation to Reveal True Historical Uncertainty: From Navigocorpus to Portic, an Interdisciplinary Story of Dirty Data Cleaning
Christine Plumejeaud-Perreau, Silvia Marzagalli, Pierre Niccolò Sofia, Robin De Mourat
Interdisciplinary collaboration between computer scientists and historians has paved the way for innovative analyses, notably through the ability of computational tools to aggregate or disaggregate large volumes of data. While these tools appear to deliver “clean” data, they mask the fact that data are a construct and are “imperfect” (contradictory, incomplete, imprecise) by nature. This article explains the process of data curation and uncertainty qualification carried out during the “Portic” research project. Intense discussion and negotiation of our different disciplinary priorities and practices were necessary to produce data visualizations that show the degree of historical interpretation involved, so that historians can literally take the measure of the uncertainty of the future of the past.