Society2Vec — from categorical prediction to behavioral traces
As the history of statistics has shown, the deployment of large-scale measurement of society went hand in hand with a probabilistic idea: although social phenomena are not governed by deterministic laws, society can be interpreted on the basis of observable regularities (Gigerenzer et al., 1989). As Ian Hacking (2002) points out, the development of statistics is inseparable from the existence of democratic and liberal societies, where the recognition of the indeterminacy of individual actions is, in a certain way, recaptured through the objectivation of overall regularities. Individuals are autonomous, but their behaviours are decipherable. The probabilistic paradigm has thus replaced the causality of natural laws with a technique for reducing uncertainty which, at the end of a vast process of categorizing populations, has allowed behaviours to be distributed according to a normal distribution. The immense efforts to record, quantify and measure society, which emerged in the 19th century, have thus enabled social statistics to build trust in numbers (Porter, 1995). The measurement of populations by institutions and the social sciences has only been possible thanks to the development of a stable and shared categorization system (Foucault, 2004). The social identity of individuals (gender, age, diploma, profession, family status, etc.), the sets of behaviours identified by institutions (housing, travel, health, safety, etc.) and the recording of social practices (culture, crime, leisure, consumption, nutrition, etc.) have shaped a framework of categories that associates behaviours with the place and status of individuals in society (Desrosières, 2000). Both practically, through the investment in and maintenance of a codified system of record-keeping, and epistemologically, by distributing statistical events around average values, the "frequentist" method of social statistics has thus helped to solidify "constant causes".
Embedded in regular and reliable institutional and technical mechanisms, they have acquired a kind of exteriority that provides solid support for establishing correlations among many social phenomena and for giving them a causalist interpretation. The categories used to describe the social world have thus emerged as essential tools for interpreting social phenomena and have been internalized by individuals as collective representations for reading the social world (Boltanski & Thévenot, 1983). The hypothesis of this paper is that the new computational techniques used in machine learning provide a new way of representing society, no longer based on categories but on individual traces of behaviour. The new machine learning algorithms replace the regularity of constant causes with the "probability of causes". Another way of representing society and the uncertainties of action is therefore emerging. To defend this argument, this communication proposes two parallel investigations. The first, from the perspective of the history of science and technology, traces the emergence of the connectionist paradigm within artificial intelligence techniques. The second, based on the sociology of statistical categorization, focuses on how the calculation techniques used by major web services produce predictive recommendations. Since 2010, predictive techniques based on machine learning, and more specifically deep learning neural networks, have achieved spectacular performance in fields such as image recognition and machine translation, under the umbrella term of "Artificial Intelligence". But their filiation with this field of research is not straightforward. In the tumultuous history of AI, learning techniques using so-called "connectionist" neural networks have long been mocked and ostracized by the "symbolic" movement.
From the perspective of a social history of science and technology, we will explain how researchers, relying on the availability of massive data and the multiplication of computing power, have undertaken to reformulate the symbolic AI project by reviving the spirit of the adaptive and inductive machines dating back to the era of cybernetics. The connectionist approach (deep learning) proposes a series of fundamental transformations of numerical computation. First, inductive machines granularize the data that enter the calculator. Second, they probabilize the models applied to the data. Finally, they give an immediate and utilitarian objective to the production of the model: to identify objects in images, to guess the Internet user's next click or to group items with common properties. Using a series of examples of recommendation techniques developed by web services, we will show how this reconfiguration of predictive calculation is carried out in order to integrate, in an original way, unexpected events and sensitivity to contextual variations. The calculation developed by connectionist methods represents society in a way that no longer corresponds to the requirements of centrality, univocity and generality of statistical methods, which distribute individuals around the mean according to a normal distribution. Personalized prediction techniques do not require any idea of a totalizing representation. They do not create an unambiguous representation, but modify it according to the position of each person. They do not seek a generality common to all statistical individuals, but anticipate local truths (Mackenzie, 2013). By developing a narrative that weaves together the transformations of calculation and the representations of society, we will then hypothesize that the current deployment of personalized predictive techniques closely follows contemporary forms of individuation in society.
This trend is based in particular on the growing development, since the 1970s, of a critique of statistical categorization. Normative, normalizing and reductive, the representation of the position of individuals within the major categorical nomenclatures is increasingly discredited. It has been challenged by constructivist approaches in the social sciences (Boltanski, 2014). It is constantly challenged in ordinary discourse that claims the inalienable singularity of subjects. The categories of representation used by sociology, demography, marketing or economics then appear as a silly and castrating prison. The hypothesis that will be explored in the conclusion of this paper is that the successes of, and the expectations placed in, these new forms of calculation constitute a technological response to the transformations of our societies.
Bibliography
Boltanski (Luc), « Quelle statistique pour quelle critique ? », in Bruno (Isabelle), Didier (Emmanuel), Prévieux (Julien), dir., Stat-activisme. Comment lutter avec des nombres ?, Paris, Zones, 2014, pp. 33-509.
Boltanski (Luc), Thévenot (Laurent), "Finding one's way in social space: a study based on games", Social Science Information, vol. 22, n°4-5, 1983, pp. 631-680.
Cardon (Dominique), Cointet (Jean-Philippe), Mazières (Antoine), « La revanche des neurones. L'invention des machines inductives et la controverse de l'intelligence artificielle », Réseaux, n°211, 2018, pp. 173-220.
Desrosières (Alain), The Politics of Large Numbers. A History of Statistical Reasoning, Cambridge, Harvard University Press, 2000.
Foucault (Michel), Sécurité, territoire, population. Cours au Collège de France. 1977-1978, Paris, Gallimard/Seuil, 2004.
Gigerenzer (Gerd), Swijtink (Zeno), Porter (Theodore), Daston (Lorraine), Beatty (John), Krüger (Lorenz), The Empire of Chance. How Probability Changed Science and Everyday Life, Cambridge, Cambridge University Press, 1989.
Hacking (Ian), L'émergence de la probabilité, Paris, Seuil, 2002.
Mackenzie (Adrian), "Programming subjects in the regime of anticipation: software studies and subjectivity", Subjectivity, vol. 6, n°4, 2013, pp. 391-405.
Porter (Theodore M.), The Rise of Statistical Thinking 1820-1900, Princeton, Princeton University Press, 1986.
Porter (Theodore M.), Trust in Numbers. The Pursuit of Objectivity in Science and Public Life, Princeton, Princeton University Press, 1995.