OpenRefine

Recommended by the médialab

A free, open-source desktop application for working with messy data

Tools – Software

Metaweb, Google

OpenRefine is most useful when your data is in a simple tabular format, such as a spreadsheet, a comma-separated values (CSV) file, or a tab-separated values (TSV) file, but contains internal inconsistencies in data formats, in where data appears, or in the terminology used. It can standardize and clean data across your file without modifying the original raw data. It can help you:

  • Get an overview of a data set
  • Resolve inconsistencies in a data set, for example standardizing date formatting
  • Split data into more granular parts or sort it
  • Match local data up to other data sets
  • Complete a data set with data from other sources

It keeps your data private on your own computer until you choose to share it.
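
To make the idea of standardizing date formatting without touching the raw data more concrete, here is a minimal sketch in plain Python. It is not OpenRefine's own code or API; the sample rows, the list of accepted formats, and the names `standardize_date` and `clean_rows` are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical sample rows; the "date" values mix several formats.
RAW_ROWS = [
    {"name": "Alice", "date": "2010-03-05"},
    {"name": "Bob", "date": "05/03/2010"},
    {"name": "Carol", "date": "March 5, 2010"},
    {"name": "Dave", "date": "unknown"},
]

# Formats to try, in order. Assumption: slash dates are day/month/year.
KNOWN_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"]


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Build new rows with an added 'date_clean' field; inputs are not mutated."""
    return [{**row, "date_clean": standardize_date(row["date"])} for row in rows]


CLEANED_ROWS = clean_rows(RAW_ROWS)
```

The cleaned value is written to a separate `date_clean` field, so the original `date` strings stay available for checking, and values that match no known format are left as `None` rather than guessed.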

processing

all audiences

usable

2010