Ural
made by the médialab
Python library offering a variety of URL-related utilities
Tools – Code
Guillaume Plique, Jules Farjas, Oubine Perrin, Benjamin Ooghe-Tabanou, Martin Delabre, Pauline Breteau
Ural is a Python library exposing many utilities for processing URLs.
It is the result of many years of experience in the field of web mining (e.g. through the web crawler Hyphe) and offers its users various heuristics able to tame even the most devious URLs.
For instance, Ural is able to:
- normalize URLs
- parse URLs coming from well-known platforms such as Google, Facebook, YouTube, etc.
- detect shortened URLs
- perform hierarchical queries on URLs
- extract URLs from HTML
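
To give a rough idea of what three of these tasks involve (normalization, shortened URL detection and extraction from HTML), here is a minimal sketch that relies only on the Python standard library. It does not use Ural's API and does not reproduce its heuristics: the function names (`naive_normalize`, `naive_is_shortened`, `naive_urls_from_html`), the list of tracking parameters and the list of shortener domains are all hypothetical and chosen for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical sample of tracking parameters to drop (illustrative only).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}

# Hypothetical, tiny sample of shortener domains (illustrative only).
SHORTENER_DOMAINS = {"bit.ly", "t.co", "tinyurl.com"}


def naive_normalize(url):
    """Lowercase the host, strip a leading 'www.', drop tracking
    parameters and the fragment, sort the query, trim trailing slashes."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    query = urlencode(sorted(kept))
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, host, path, query, ""))


def naive_is_shortened(url):
    """Return True if the hostname belongs to the sample shortener list."""
    return urlsplit(url).hostname in SHORTENER_DOMAINS


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def naive_urls_from_html(html):
    """Return the raw href values found in <a> tags of an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.urls


if __name__ == "__main__":
    print(naive_normalize("http://WWW.Example.com/path/?utm_source=x&b=2&a=1#top"))
    print(naive_is_shortened("https://bit.ly/abc"))
    print(naive_urls_from_html('<a href="https://example.com/a">x</a>'))
```

A sketch like this breaks quickly on real-world data (redirect wrappers, platform-specific URL schemes, unlisted shorteners, links written outside of `<a>` tags), which is the kind of case a dedicated library with field-tested heuristics is meant to handle.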