Hiring Data Engineer
The médialab is hiring a Data Engineer on a 2-year contract as part of the project “Influence of web platform practices and algorithmic decisions on public access to climate information”.
Background on the project
This position is part of the project “Influence of web platform practices and algorithmic decisions on public access to climate information”, funded by the Make Our Planet Great Again grant.
The goal of this project is to investigate the landscape of climate information available to the public in the media, social media, video platforms and search engines, including the amount of misinformation recommended by their algorithms. We intend to document the sources of information that are most prominent on web platforms like YouTube, Google or Facebook and track how the policies announced by these companies to fight ‘fake news’ and disinformation affect the relative prominence of these sources over time.
We propose to develop methodologies to monitor and document how the results returned in response to frequent climate queries change over time, and to quantify search engines’ “filter bubble” effect, by which the personalization of results might either enclose users in their pre-existing beliefs or expose them to challenging ideas.
The candidate will help promote the team’s work by contributing to the development of open-source software and to academic or technical communications (conferences and research papers).
Key objectives of the position
- Map the landscape of information sources and media controversy about climate change:
  - identify, rank and list influential websites on the topic,
  - build a database of articles on the topic using tools such as MediaCloud,
  - build a database of associated shares on social media using APIs of services such as Buzzsumo, CrowdTangle, Newswhip...;
- Identify networks of websites that share misleading information on these topics (e.g. using the Hyphe platform) as well as networks of social media accounts that amplify their message. Quantify their share of the total online discussion and identify influencers;
- Characterize key messages in articles (topic modelling, similarity detection, diffusion and transformation of key claims);
- Develop crawlers for search engines (e.g. Google and YouTube) that perform queries on climate topics and record the results recommended by their algorithms;
- Develop a plugin that participants in our experiments will install in their browser, allowing us to run a set of search queries, measure the results that users get in their own browsing environment (history of queries, location, browsing history…) and compare them to non-personalized results.
Data Engineer, Research Engineer, Data Scientist, PhD or Postdoc
Key skills required
- Web data harvesting (web crawling/scraping, backlink network analysis, social media APIs)
- Crawling of search engines and social media platforms to emulate the behaviour of users making queries
- Text and network analysis (basic level)
Technologies to be used
- Python in async or multithreaded mode (see Hyphe, minet, ural and gazouilloire)
- Text analysis: NLP with NLTK and/or spaCy, indexing with Elasticsearch
- Network analysis: networkx or igraph
- Code and data versioning in a collaborative environment with git
You don’t need to master all of these technologies, but we will evaluate your experience with them. The médialab’s team of research engineers will help you strengthen your skills where needed.
The position / Compensation
You will join the médialab SciencesPo in Paris and work with its team of researchers, software engineers and data scientists.
This is a 2-year position, starting as soon as possible and no later than spring 2020.
The proposed gross salary is €40k/year, and the position includes full health care coverage and 40 days of paid holidays per year, along with 50% funding of public transportation and lunch tickets.
Note for non-European candidates: living costs in France (social security, schools…) are much lower than in the US, for instance, so this salary would offer a standard of living similar to what roughly $80k/year would in San Francisco.
Remote work is possible up to one day per week.
Send a CV and cover letter detailing your motivation and relevant skills for the position to this email address: firstname.lastname@example.org
Applications will be reviewed as they arrive until the position is filled.