If you're like me, you've probably been using Google Books without really thinking about it. I never considered that there might be a philosophy or purpose to it. I just assumed that people put books online because... well, they can. It turns out that there's quite a lot more to it. However, if you've …
Friends of Fiends
It's not a typo. Well, it is, but this time it's deliberate. I mistype friends-of-friends so often that I've decided to just give in and call my algorithm "friends-of-fiends" (FOF) instead. Problem statement: Student X is analysing data towards a galaxy cluster that doesn't have a known redshift, i.e. a known distance away from us. …
Making Faces
Recently I've been thinking about classification again, this time with more of an emphasis on object detection. Even surprisingly complex and diverse objects can be detected using machine learning algorithms: trees, people, fruit... First off, let's draw a line between detection and recognition. For example, when it comes to people, facial detection is simply telling …
Twitter in Python
There's a lot of social media analysis going on these days. One of the prime sources of data is Twitter - and it's available to everyone. Well, mostly. Twitter data flows continuously in a stream. Around the world people Tweet about 6,000 times a second. That's about 500 million Tweets per day. If you want …
Classified Locations
Classification is not something I do a lot of day to day, so when I was asked to give a lecture on random forest classification using scikit-learn I went looking for a random example to help me play around. In my lecture I (of course) started off with the canonical scikit-learn classification example using Fisher's …
Uber Python
A short while ago I was giving a lecture on how to use Python to query Twitter. Afterwards someone came up to me and asked if I knew that Uber also provided a Python API. I didn't, so I immediately went away and had a look at how it worked... The Uber Python API is …
Hello, world.
Physics. Data Science. General Geekery.