The UK Parliament has a bunch of committees tasked with conducting inquiries and producing reports on a range of topics. Each inquiry normally has a fixed duration and hears evidence from a range of individuals known as witnesses. The UK Parliament website maintains a list of open inquiries that …
On The Buses II: Fuzzy String Matching
This is the second part of a series of posts about my pet data science project exploring the availability of transport across different areas of Manchester. For those playing catch-up, you might want to take a look at the first post in this series before continuing. In the first post I looked at how to …
On the Buses
I've started a little project to look at availability of public transport across Manchester. To keep things easy to follow I'm going to split up the different elements of the project between blog posts. First off, I want to know all the bus routes in Greater Manchester and I want to know which ones go …
Mining Twitter with Selenium
This is great for freaking people out. It looks like a ghost is typing in your web browser. Web crawling using HTML parsers to grab links and navigate to new pages with the requests library is all very well, but when you want to physically submit search terms, or login details, or click buttons (etc.) …
Web Scraping YouTube Videos in Python
Web crawling and web scraping are two sides of the same coin. Web scraping is simply extracting information from the internet in an automated fashion. Web crawling is about indexing information on webpages and - normally - using it to access other webpages where the thing you actually want to scrape is located. YouTube is …