
Ural, made by the médialab

A Python library offering a variety of URL-related utilities

Tools – Code

Guillaume Plique, Jules Farjas, Oubine Perrin, Benjamin Ooghe-Tabanou, Martin Delabre, Pauline Breteau

Ural is a Python library exposing many utilities for processing URLs.

It is the result of many years of experience in the field of web mining (notably through the web crawler Hyphe) and offers its users various heuristics able to tame even the most devious URLs.

For instance, Ural can:

  • normalize URLs
  • parse URLs coming from well-known platforms such as Google, Facebook, YouTube, etc.
  • detect shortened URLs
  • perform hierarchical queries on URLs
  • extract URLs from HTML
  • etc.
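
To give a rough idea of what URL normalization involves, here is a minimal sketch using only the Python standard library. It does not use Ural and is not Ural's implementation: the function name `simple_normalize` and the `TRACKING_PARAMS` set are hypothetical, chosen for illustration, and the library's own heuristics are considerably more thorough.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical list of tracking parameters to drop (an assumption, not Ural's list).
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "fbclid", "gclid",
}


def simple_normalize(url):
    """Return a simplified canonical form of an absolute URL (illustrative only)."""
    parts = urlsplit(url.strip())

    # Lowercase the host and drop a leading "www.".
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Keep an explicit port when one is given.
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Drop trailing slashes from the path.
    path = parts.path.rstrip("/")

    # Remove tracking parameters and sort the rest for a stable order.
    query_items = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    query = urlencode(query_items)

    # The scheme and the fragment are deliberately discarded.
    normalized = host + path
    if query:
        normalized += "?" + query
    return normalized


print(simple_normalize("https://www.Example.com/path/?utm_source=x&b=2&a=1#frag"))
# prints: example.com/path?a=1&b=2
```

With such a function, two URLs that differ only by scheme, `www.` prefix, trailing slash, fragment, or tracking parameters collapse to the same string, which is what makes normalized URLs usable as keys when deduplicating or joining datasets.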

processing

developers

usable

2018