A short while ago I was giving a lecture on how to use Python to query Twitter. Afterwards someone came up to me and asked if I knew that Uber also provided a Python API. I didn’t, so I immediately went away and had a look at how it worked…
The Uber Python API is more poorly documented than the Twitter API, but I still managed to track down a bunch of functionality. Here’s my guide to querying Uber using Python.
For starters we’ll probably need some standard Python libraries:
import json                      # for reading JSON format data
import pylab as pl               # for plotting
import matplotlib.dates as md    # for date formats
from datetime import datetime    # for date formats
from datetime import timedelta   # for date manipulation
import time                      # to get the time
import numpy as np               # for array manipulation
Now let’s import the Uber-specific libraries from the uber_rides package:
# stuff for querying uber:
from uber_rides.session import Session
from uber_rides.client import UberRidesClient
Once we have all the libraries in place we need to establish our credentials with Uber. These are similar in style to those used by Twitter and to get them we need to create an App on the Uber developers’ website. When we do this we should obtain four pieces of basic identification:
1. A client ID
2. A client secret
3. A server token
4. An access token
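I tend to put all these keys into a file creatively named keys.txt, one key per line, in the order the function below reads them: client ID, server token, client secret, access token.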
For simple queries we will only need the server token.
Start by creating a function that starts an instance of the Uber authorization client:
def get_api():
    '''
    Creates an instance of the uber-rides Session class
    '''
    with open('keys.txt') as f:
        client_id = f.readline().strip()
        server_token = f.readline().strip()
        client_secret = f.readline().strip()
        access_token = f.readline().strip()
    session = Session(server_token)
    client = UberRidesClient(session)
    return session, client
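Reading the keys with four bare readline() calls fails silently if a line is missing. As a hedged alternative, here is a minimal sketch of a key reader that checks the file has enough lines. The name read_keys is hypothetical (it is not part of uber_rides), the placeholder values are made up, and the sketch assumes Python 3.

```python
import os
import tempfile

def read_keys(path='keys.txt'):
    """Read the four Uber credentials from a text file, one per line."""
    # Same order as get_api() reads them
    names = ['client_id', 'server_token', 'client_secret', 'access_token']
    with open(path) as f:
        # Ignore blank lines so a trailing newline does not count as a key
        values = [line.strip() for line in f if line.strip()]
    if len(values) < len(names):
        raise ValueError("expected %d keys in %s, found %d"
                         % (len(names), path, len(values)))
    return dict(zip(names, values))

# Demonstration with placeholder values written to a temporary file
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("my-client-id\nmy-server-token\nmy-client-secret\nmy-access-token\n")
keys = read_keys(tmp.name)
os.remove(tmp.name)
print(sorted(keys))
```

The returned dictionary could then be used as Session(server_token=keys['server_token']).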
We can now call this function to start our Uber access:
session,client = get_api()
Identify Uber Products
The simplest request we can make to the Uber API is to find the available Uber products at a particular location. Uber has four products: uberX, uberXL, uberEXEC and assist.
To find out which ones are available at our location we need to specify the location in terms of latitude and longitude, stored here in the variables lat and lon.
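The coordinates themselves are not shown here, so as an assumption this sketch simply reuses the journey start point from the price estimate section further down. Substitute your own location.

```python
# Assumed example location: the journey start point used later in this guide
lat = 53.4477    # latitude in decimal degrees
lon = -2.22437   # longitude in decimal degrees (negative means west)
```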
We can then query the Uber products available:
response = client.get_products(lat, lon)
products = response.json.get('products')
print("Number of products available: %d" % len(products))
for i, product in enumerate(products):
    print("Product %d: %s" % (i, product['display_name']))
Ask for Pick-Up Time Estimates
We can also ask Uber how long it would take for each of these products to reach our location at the current time:
response = client.get_pickup_time_estimates(
    start_latitude=lat,
    start_longitude=lon,
)
times = response.json.get('times')
for i in range(0, len(times)):
    # the estimate is given in seconds, so convert it to minutes:
    print("Product %d: %s; Arrival time: %.1f min"
          % (i, times[i]['display_name'], float(times[i]['estimate']) / 60.))
Get Some Price Estimates
Now let’s specify a journey. We do this using a start and end point in terms of longitude and latitude:
inlat = 53.4477
inlon = -2.22437
outlat = 53.4680
outlon = -2.2315
We can pass these values to the Uber client and get a price estimate for our journey, including details of any surge multiplier:
response = client.get_price_estimates(
    start_latitude=inlat,
    start_longitude=inlon,
    end_latitude=outlat,
    end_longitude=outlon,
    seat_count=2
)
prices = response.json.get('prices')
for i in range(0, len(prices)):
    print("Product %d: %s; Fare estimate: %s"
          % (i, prices[i]['display_name'], prices[i]['estimate']))
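The fare estimate comes back as a string: a currency symbol followed by a low-high range. Here is a rough sketch of turning that string into numbers and reading the surge multiplier alongside it. The helper name parse_estimate and the sample entry are made up for illustration, and the 'surge_multiplier' key is my assumption about the response layout, so check it against a real response. The sketch assumes Python 3.

```python
def parse_estimate(estimate):
    """Turn a fare string like '£5-7' into a (low, high) pair of floats."""
    # Drop the leading currency symbol, then split the range on the dash
    parts = estimate[1:].split('-')
    low = float(parts[0])
    # Assumption: a single-value estimate has no dash, so low equals high
    high = float(parts[-1])
    return low, high

# Hypothetical entry shaped like one element of response.json.get('prices')
sample = {'display_name': 'uberX', 'estimate': '£5-7', 'surge_multiplier': 1.0}

low, high = parse_estimate(sample['estimate'])
surge = sample.get('surge_multiplier', 1.0)  # assumed key name
print("%s: %.2f to %.2f (surge x%.1f)"
      % (sample['display_name'], low, high, surge))
```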
This is all very interesting, but what can we do with it? Well, it seems to me that whenever I want to go to the airport there is an Uber surge multiplier being applied. I go to the airport at all times of the day and night so I’ve found this a bit surprising. So… I’m going to track the price of a journey as a function of time and see if there’s a cheaper time of day to travel.
We can only do this in real time at the moment because Uber doesn’t give access to historic data, but we can build our own database and use that.
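One way to build that database is with the sqlite3 module from the standard library. This is only a sketch of the idea: the table layout, the function names open_db and save_estimate, and the sample values are all made up for illustration.

```python
import sqlite3
from datetime import datetime

def open_db(path=':memory:'):
    """Open (or create) a database with a table for price estimates."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS estimates ("
        "taken_at TEXT, product TEXT, low REAL, high REAL)"
    )
    return conn

def save_estimate(conn, taken_at, product, low, high):
    """Store one price estimate with the time it was taken."""
    conn.execute(
        "INSERT INTO estimates VALUES (?, ?, ?, ?)",
        (taken_at.isoformat(), product, low, high),
    )
    conn.commit()

# In-memory database and made-up values, purely for demonstration
conn = open_db()
save_estimate(conn, datetime(2017, 1, 1, 8, 30), 'uberX', 5.0, 7.0)
rows = conn.execute("SELECT product, low, high FROM estimates").fetchall()
print(rows)
```

Passing a filename instead of ':memory:' keeps the data between runs.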
Let’s have a look at how the price for our journey varies as a function of time…
The Uber API limits us to 2000 queries per hour. That’s about one query every 2 seconds.
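For the record, the arithmetic behind that figure is below, along with the query rate we actually use when sleeping 60 seconds between queries, as the tracking loop later in this guide does.

```python
LIMIT_PER_HOUR = 2000        # rate limit quoted above
SECONDS_PER_HOUR = 3600

# Shortest allowed gap between queries, in seconds
min_interval = SECONDS_PER_HOUR / float(LIMIT_PER_HOUR)

# Queries per hour if we make one query and then sleep for 60 seconds
queries_at_60s = SECONDS_PER_HOUR // 60

print("Minimum interval: %.1f s" % min_interval)
print("Queries per hour at one per minute: %d" % queries_at_60s)
```

So one query a minute sits comfortably inside the limit.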
We’re going to plot the results in real time using the matplotlib interactive plotter. To begin, we need to specify our starting time and initiate the plotter:
# get an absolute start time:
init_time = time.time()
# initiate interactive plotting:
pl.ion()
There’s always the possibility of interruptions so we’ll also save the data points into a text file:
ofile = open('uberout.txt','a')
…and then we’re good to go:
inlat = 53.4477
inlon = -2.22437
outlat = 53.4680
outlon = -2.2315
# there are four categories of uber:
labels = np.array(['uberX', 'uberXL', 'uberEXEC', 'assist'])
# plotting colors:
colors = np.array(['b', 'r', 'g', 'y'])
# start tracking:
while True:
    response = client.get_price_estimates(
        start_latitude=inlat,
        start_longitude=inlon,
        end_latitude=outlat,
        end_longitude=outlon,
        seat_count=2
    )
    estimate = response.json.get('prices')
    # get the time:
    now = datetime.fromtimestamp(time.time())
    # extract the minimum and maximum price estimate. These
    # also have their own key word JSON values, but meh...
    for i in range(0, len(estimate)):
        # drop the currency symbol, then split the "low-high" range:
        bounds = estimate[i]['estimate'][1:].split('-')
        low = float(bounds[0])
        high = float(bounds[1])
        # express this as a mean and an error:
        mean = 0.5*(low+high)
        err = 0.5*(high-low)
        # this bit updates the plot:
        pl.errorbar(now, mean, yerr=err, c=colors[i], label=labels[i])
        # print to screen so we can see the progress:
        print("%s %s %s %s" % (now, mean, err, labels[i]))
        # print to file so we have a saved copy:
        ofile.write(str(now)+" "+str(mean)+" "+str(err)+" "+labels[i]+"\n")
    # flush so the saved copy survives an interruption:
    ofile.flush()
    # label the axes:
    pl.xlabel("Time")
    pl.ylabel("Price")
    pl.title("Uber Watcher")
    # update the limits:
    pl.xlim([datetime.fromtimestamp(init_time)-timedelta(minutes=20),
             now+timedelta(minutes=20)])
    pl.ylim([0., 20.])
    # format the dates a little to just show time:
    xfmt = md.DateFormatter('%H:%M:%S')
    pl.gca().xaxis.set_major_formatter(xfmt)
    # make sure the xlabels fit on the plot properly:
    pl.gcf().autofmt_xdate()
    pl.xticks(rotation=80)
    # add a legend:
    pl.legend(loc="best")
    # this bit displays it:
    pl.pause(0.1)
    # this bit sets the timestep between queries
    # Note that Uber limits us to 2000 queries per hour
    time.sleep(60)
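Once uberout.txt has collected a day or so of data, we can go back to the original question of whether there is a cheaper time of day to travel. The sketch below reads lines in the layout written by the loop above (timestamp, mean, error, label) and averages the price per hour of the day. The function names and the sample lines are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

def parse_line(line):
    """Split one saved line into (timestamp, mean, err, label)."""
    # str(datetime) contains a space, so date and time arrive as two fields
    date, clock, mean, err, label = line.split()
    # The fractional seconds are omitted when they are exactly zero
    fmt = '%Y-%m-%d %H:%M:%S.%f' if '.' in clock else '%Y-%m-%d %H:%M:%S'
    stamp = datetime.strptime(date + ' ' + clock, fmt)
    return stamp, float(mean), float(err), label

def mean_price_by_hour(lines, label):
    """Average the saved mean price of one product for each hour of the day."""
    buckets = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        stamp, mean, _err, name = parse_line(line)
        if name == label:
            buckets[stamp.hour].append(mean)
    return {hour: sum(v) / len(v) for hour, v in buckets.items()}

# Made-up lines in the same layout as uberout.txt
sample = [
    "2017-01-01 08:00:00.000001 6.0 1.0 uberX",
    "2017-01-01 08:30:00.000001 8.0 1.0 uberX",
    "2017-01-01 23:00:00 5.0 1.0 uberX",
]
by_hour = mean_price_by_hour(sample, 'uberX')
cheapest = min(by_hour, key=by_hour.get)
print(cheapest, by_hour[cheapest])
```

With real data, pass open('uberout.txt') as the first argument instead of the sample list.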