Wherein I fail to make Python read medieval Chinese calligraphy correctly, but I learn a lot about Chinese calligraphy and optical character recognition. The Plan It seems sensible that if we can read Chinese in our Jupyter notebook, we should try to translate the writing that's actually on the National Palace Museum exhibits and not …
Gaussian Processes in Python
I'm guessing that most people are pretty comfortable with the concept of uncorrelated Gaussian noise. It's the most frequently assumed noise. Even if you don't realise it, you're probably assuming Gaussian noise. Quick check: Are you using a chi-squared test to fit your data? Yes? Well there you go. Co-variate Gaussian Noise Here I'm going …
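A minimal sketch of the chi-squared point in the teaser above, and not code from the post itself: for independent Gaussian noise with a known sigma, the log-likelihood is just -0.5 times chi-squared plus a constant, so minimising one maximises the other. The straight-line model, the noise level and the grid of trial slopes are all assumptions made for illustration.

```python
import math
import random


def chi_squared(y, model, sigma):
    """Sum of squared, noise-normalised residuals."""
    return sum(((yi - mi) / sigma) ** 2 for yi, mi in zip(y, model))


def gaussian_log_likelihood(y, model, sigma):
    """Log-likelihood for independent Gaussian noise of known width sigma."""
    norm = len(y) * math.log(2 * math.pi * sigma ** 2)
    return -0.5 * (chi_squared(y, model, sigma) + norm)


# Hypothetical data: a line of slope 2 with uncorrelated Gaussian noise.
random.seed(0)
sigma = 0.1
x = [i / 49 for i in range(50)]
y = [2.0 * xi + random.gauss(0.0, sigma) for xi in x]

# Scan a grid of trial slopes and pick the best one under each criterion.
slopes = [1.5 + 0.01 * i for i in range(101)]
best_chi2 = min(slopes, key=lambda m: chi_squared(y, [m * xi for xi in x], sigma))
best_logl = max(slopes, key=lambda m: gaussian_log_likelihood(y, [m * xi for xi in x], sigma))

print(best_chi2, best_logl)
```

Both criteria select the same slope, which is the sense in which a chi-squared fit quietly assumes Gaussian noise.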
Night at the Museum: Translation in Python
I love Taipei. I also love Open Data. So I was very happy to read that the National Palace Museum in Taipei had an open data project. According to the article, the museum has put images and meta-data for 70,000 items online. So what do you get if you download the information on a particular …
Document Scraping with Python
Tired of reading all those documents everyone keeps sending you? Why not get your Jupyter Notebook to do it for you and condense the information? I'm joking of course... but if, say, you did want to read PDF documents directly in Python, how would you do it? Recently I had a go at doing just …
In the PYNQ: Set-up
Recently I got hold of a PYNQ-Z1 board and accessories kit from Digilent via the Xilinx University Program. This nifty piece of kit supposedly lets you program an FPGA (the Xilinx ZYNQ) using a Jupyter notebook. What's in the Box? Here's a picture of what's inside: Getting Started I'm following the PYNQ Guide to Getting …
Google Books from Jupyter
If you're like me you've probably been using Google Books without really thinking about it. I never really considered that there might be a philosophy or purpose to it. I just assumed that people put books online because... well, they can. It turns out that there's quite a lot more to it. However, if you've …
Friends of Fiends
It's not a typo. Well it is, but this time it's deliberate. I mistype friends-of-friends so often that I've decided to just give in and call my algorithm "friends-of-fiends" (FOF) instead. Problem statement: Student X is analysing data towards a galaxy cluster that doesn't have a known redshift, i.e. a known distance away from us. …
Making Faces
Recently I've been thinking about classification again, this time with more of an emphasis on object detection. Even surprisingly complex and diverse objects can be detected using machine learning algorithms: trees, people, fruit... First off, let's draw a line between detection and recognition. For example, when it comes to people, facial detection is simply telling …
Twitter in Python
There's a lot of social media analysis going on these days. One of the prime sources of data is Twitter - and it's available to everyone. Well, mostly. Twitter data flows continuously in a stream. Around the world people Tweet about 6,000 times a second. That's about 500 million Tweets per day. If you want …
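As a quick sanity check on the numbers quoted above, here is the arithmetic as a small sketch. It assumes a constant rate of 6,000 Tweets per second over a full day, which is a simplification.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_second = 6_000  # rate quoted in the teaser
tweets_per_day = tweets_per_second * SECONDS_PER_DAY

print(f"{tweets_per_day:,} Tweets per day")  # 518,400,000 Tweets per day
```

That is a little over the quoted 500 million, so the two figures agree to within rounding.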
Classified Locations
Classification is not something I do a lot of day to day, so when I was asked to give a lecture on random forest classification using scikit-learn I went looking for a random example to help me play around. In my lecture I (of course) started off with the canonical scikit-learn classification example using Fisher's …