juro oravec

Data loaders, web scrapers, text mining

Data loaders, web scrapers, and text mining on public data, and data from Reddit, CrunchBase, or SoundCloud.

updated 19 Feb 2022
web scraping, scrapy, text mining, data loader
archived

Over the years I’ve worked on various data loaders and web scrapers, integrating the following sources (among others):

  • Public registers from the Slovak, French, and UK governments (scrapy)
  • Public images from NASA, European Space Agency, and Solar Dynamics Observatory (scrapy)
  • Public event listings from various Biotech websites (scrapy)
  • Reddit (praw)
  • SoundCloud (custom JavaScript)
  • CrunchBase (custom JavaScript)
  • Slant.co (custom JavaScript)
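For illustration only, the general shape of a listing scraper like the ones above can be sketched with Python's standard library. This is not the code used in these projects (those relied on scrapy, praw, or custom JavaScript); the HTML markup, the `event` class name, and the names `EventListingParser` and `parse_events` are all hypothetical.

```python
from html.parser import HTMLParser


class EventListingParser(HTMLParser):
    """Collects title/url pairs from <a> tags inside <li class="event">.

    The markup convention is an assumption made for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.events = []
        self._in_event = False  # True while inside a matching <li>
        self._href = None       # href of the link currently being read
        self._text = []         # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "event" in (attrs.get("class") or "").split():
            self._in_event = True
        elif tag == "a" and self._in_event:
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that belongs to a link inside an event item.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.events.append({"title": title, "url": self._href})
            self._href = None
        elif tag == "li":
            self._in_event = False


def parse_events(html):
    """Return a list of {"title", "url"} dicts found in the given HTML."""
    parser = EventListingParser()
    parser.feed(html)
    parser.close()
    return parser.events


# Hypothetical sample page; a real loader would fetch this over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="event"><a href="/events/1">Example Biotech Meetup</a></li>
  <li class="promo"><a href="/ads/9">Not an event</a></li>
  <li class="event"><a href="/events/2">Example Lab Automation Day</a></li>
</ul>
"""

events = parse_events(SAMPLE_HTML)
```

In a framework such as scrapy, the fetching, retrying, and item export would be handled by the framework, and only the extraction step would resemble the parser above.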