
xan, made by the médialab

command line tool to efficiently process CSV files

Tools – Code

Andrew Gallant, Guillaume Plique, Laura Miguel, Béatrice Mazoyer, César Pichon, Anna Charles

xan is a command line tool that can be used to process large CSV files efficiently.

Result of the "view" command that can be used to visualize a CSV file in the terminal

This tool is a fork of xsv, which was originally written by Andrew Gallant (aka @BurntSushi).

The tool was heavily rewritten and improved by the lab's engineer to fit our daily use cases.

Among many other features, we added a dynamic scripting language that can be evaluated for each row of a file, external sorting, efficient reverse reading, and k-way merging of already sorted files.
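
For readers unfamiliar with the term, k-way merging combines several inputs that are each already sorted into a single sorted output while holding only one pending row per input in memory. The sketch below illustrates the general technique in Python with the standard library. It is not xan's implementation; the function name `merge_sorted_csv` and the sample data are hypothetical, and keys are assumed to compare as plain strings.

```python
import csv
import heapq
import io


def merge_sorted_csv(sources, key):
    """Lazily merge CSV streams that are each already sorted by column `key`.

    Hypothetical helper for illustration only; values are compared as strings.
    """
    readers = [csv.DictReader(src) for src in sources]
    # heapq.merge pulls rows on demand, so only one row per input is buffered.
    yield from heapq.merge(*readers, key=lambda row: row[key])


# Two small in-memory "files", each sorted by the `name` column (made-up data).
a = io.StringIO("name,count\nalice,3\ncarol,1\n")
b = io.StringIO("name,count\nbob,7\ndave,2\n")

merged_names = [row["name"] for row in merge_sorted_csv([a, b], "name")]
print(merged_names)
```

Because the merge is lazy, the same approach works on files far larger than memory, provided each input really is sorted on the chosen column.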

As many of our other tools produce and consume CSV files, it was only natural for us to look for ways to wrangle those files faster, without resorting to ad hoc scripting.

We therefore encourage anyone dealing with large CSV files to try our fork.

curation and processing