Application Programming Interfaces (APIs) provide convenient interfaces between different applications, independent of each application's architecture and the programming language it is written in.
15.1 Intro to APIs
A web application API allows the user to request some type of data via HTTP, and the API returns that data in a pre-packaged XML or JSON format.
An API request is very much like accessing a website with a browser. Both use the HTTP protocol to download a file. The only difference is that an API follows a much more regulated syntax and returns data in XML (eXtensible Markup Language) or JSON (JavaScript Object Notation) format, as opposed to HTML (HyperText Markup Language).
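To give a sense of what such a response looks like, here is a small, made-up JSON snippet and how it can be parsed with Python's built-in json module; the data itself is purely illustrative and not from a real API:
import json

# A made-up JSON response string, purely for illustration
raw = '{"businesses": [{"name": "Ice Cream Shop", "rating": 4.5, "location": {"city": "Austin"}}]}'

data = json.loads(raw)                   # parse the JSON string into a Python dictionary
print(data['businesses'][0]['name'])     # -> Ice Cream Shop
print(data['businesses'][0]['rating'])   # -> 4.5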
An API uses four separate methods to access a web server via HTTP:
GET
POST
PUT
DELETE
The GET method is used to download data. The POST method is used to submit information to the server, for example the contents of a form. PUT is similar to POST and is used to update already existing objects; most APIs use POST instead of PUT, so PUT requests are relatively rare. Finally, the DELETE method is used to delete information/data from a web server.
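As a rough sketch, the two most common methods can be issued from Python with the widely used requests library (which is not used elsewhere in this chapter); the URL and parameters below are placeholders, not a real service:
import requests

# GET: retrieve data from a (hypothetical) endpoint
r = requests.get('https://api.example.com/v1/items', params={'limit': 5})
print(r.status_code)   # 200 indicates success
print(r.json())        # parse the JSON response into a Python dictionary

# POST: submit data to the server, similar to filling out a web form
r = requests.post('https://api.example.com/v1/items', data={'name': 'new item'})
print(r.status_code)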
Most websites that provide APIs require prior authentication before you can interact with the web server via the API. In other words, you first need to 'apply for an API' by registering with the website. After registering you may have to wait a few days. Once your application is granted, you are provided with an api_key value that allows the server to identify you when you make your calls.
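How the api_key is passed differs from service to service; a common pattern (sketched here with placeholder names, again using the requests library, since the details depend on the particular API) is to send it either as a query parameter or in a request header:
import requests

API_KEY = 'YOUR_API_KEY'   # the key issued to you after registration

# Option 1: pass the key as a query parameter (the parameter name varies by service)
r = requests.get('https://api.example.com/v1/data', params={'api_key': API_KEY})

# Option 2: pass the key in an authorization header (a very common convention)
r = requests.get('https://api.example.com/v1/data',
                 headers={'Authorization': 'Bearer ' + API_KEY})

print(r.status_code)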
Some APIs require a more detailed form of identification using the OAuth protocol, which adds the following parameters to each request:
OAuth Parameter          Value
oauth_consumer_key       Your OAuth consumer key (from Manage API Access)
oauth_token              The access token obtained (from Manage API Access)
oauth_signature_method   hmac-sha1
oauth_signature          The generated request signature, signed with the oauth_token_secret obtained
oauth_timestamp          Timestamp for the request in seconds since the Unix epoch
oauth_nonce              A unique string, randomly generated for each request
This seems a bit complicated, but most larger websites provide sample code for using their APIs in various popular programming languages. Sometimes they even provide small Python libraries to facilitate the interaction with their website.
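For example, the OAuth 1.0 signing described above can be handled by the requests-oauthlib package (an assumption here, not a library used elsewhere in this chapter), which generates the signature, timestamp, and nonce for you:
import requests
from requests_oauthlib import OAuth1   # pip install requests-oauthlib

# The four credentials come from the website's 'Manage API Access' page
auth = OAuth1('YOUR_CONSUMER_KEY', 'YOUR_CONSUMER_SECRET',
              'YOUR_ACCESS_TOKEN', 'YOUR_TOKEN_SECRET')

# The library signs the request (HMAC-SHA1 by default) and adds the
# oauth_signature, oauth_timestamp, and oauth_nonce parameters automatically
r = requests.get('https://api.example.com/v1/data', auth=auth)
print(r.status_code)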
15.2 Yelp-API
For interacting with the Yelp API, for instance, you can find information on Yelp's developer pages. There is also a Python library, yelpapi, that you can install.
If you want to install this library, simply open a command line window (in Windows, click on the Start menu, type cmd, and press Enter) and type:
pip install yelpapi
Before you can use this library in a Python script, you need a Yelp account, so create one and log in. You then need to apply for a 'new app', which usually only takes a few minutes. You will then be issued a unique client_id as well as a unique client_secret key. These two keys identify you as the user. Do not make these keys public.
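One way to keep these keys out of your script files (a suggestion only; the yelpapi library does not require this) is to store them in environment variables and read them at run time:
import os

# Read the credentials from environment variables instead of writing them
# directly into the script; set these variables on your own machine first.
client_id = os.environ.get('YELP_CLIENT_ID')
client_secret = os.environ.get('YELP_CLIENT_SECRET')

if client_id is None or client_secret is None:
    raise RuntimeError('Please set YELP_CLIENT_ID and YELP_CLIENT_SECRET first.')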
Now, in order to use this library, you can simply write a small script as follows:
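A minimal version of such a script (using the same placeholder keys as in the complete example further below) looks like this:
from yelpapi import YelpAPI
from pprint import pprint

client_id = 'YOUR_CLIENT_ID_FROM_YOUR_YELP_APPLICATION'
client_secret = 'YOUR_CLIENT_SECRET_FROM_YOUR_YELP_APPLICATION'

# Access Yelp with your identification codes
yelp_api = YelpAPI(client_id, client_secret)

# Search for the 5 top rated ice cream places in Austin, TX
response = yelp_api.search_query(term='ice cream',
                                 location='austin, tx',
                                 sort_by='rating', limit=5)

# Print the resulting dictionary with nice indentation
pprint(response)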
This script searches for the top 5 rated ice cream places in Austin, TX and orders the results by rating. The pprint() function prints the dictionary data with nice indentation so that you can see the data structure better.
Note
The Yelp API returns data in the form of a Python dictionary object. Go back to the earlier chapter on data types to refresh your memory. It is often convenient to transform the dictionary data into a Pandas DataFrame, since data is ultimately easier to analyse in a DataFrame.
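As a quick illustration of this idea (with a small made-up dictionary rather than an actual Yelp response), pandas can flatten a list of nested dictionaries directly with json_normalize(); this is an alternative to the hand-written flattening function used in the full example below:
import pandas as pd

# A made-up, Yelp-like list of dictionaries with one nested sub-dictionary each
businesses = [
    {'name': 'Shop A', 'rating': 4.5, 'location': {'city': 'Austin'}},
    {'name': 'Shop B', 'rating': 5.0, 'location': {'city': 'Austin'}},
]

# json_normalize() flattens the nested 'location' dictionary into
# separate columns such as 'location.city'
df = pd.json_normalize(businesses)
print(df)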
Here is a small example script that shows an entire solution for this:
from yelpapi import YelpAPI
from pprint import pprint
import pandas as pd

client_id = 'YOUR_CLIENT_ID_FROM_YOUR_YELP_APPLICATION'
client_secret = 'YOUR_CLIENT_SECRET_FROM_YOUR_YELP_APPLICATION'

def parse_dict(init, lkey=''):
    """ This function 'flattens' a nested dictionary so that all subkeys
    become primary keys and can then be read into a DataFrame as separate
    columns """
    ret = {}
    for rkey, val in init.items():
        key = lkey + rkey
        if isinstance(val, dict):
            ret.update(parse_dict(val, key + '_'))
        else:
            ret[key] = val
    return ret

# Access Yelp with the user identification codes
yelp_api = YelpAPI(client_id, client_secret)

# Query Yelp via its API and search for 5 top ice cream places in Texas
# This will return a 'dictionary' data object.
response = yelp_api.search_query(term='ice cream',
                                 location='austin, tx',
                                 sort_by='rating', limit=5)

# Print the results nicely with tabs for sub-keys in the resulting
# dictionary
pprint(response)

# Prepare a Pandas DataFrame so we can transform the dictionary-data into
# a Pandas DataFrame object

# Nr. of obs from our search
nrObs = len(response['businesses'])
index = range(nrObs)

# Take first observation as an example for key-extraction
#print(parse_dict(response['businesses'][0], ''))
tempDict = parse_dict(response['businesses'][0], '')

# Make a list that contains all the keys in the dictionary
columns = []
for keys in tempDict:
    columns.append(keys)

# Generate the empty dataframe with all the key-columns
df = pd.DataFrame(index=index, columns=columns)

# Run a loop through the dictionary data and extract it into the DataFrame
for i in range(nrObs):
    tempDict = parse_dict(response['businesses'][i], '')
    for key in tempDict:
        # From dictionary into dataframe
        df.loc[df.index[i], key] = tempDict[key]

# Sort by rating
df = df.sort_values('rating', ascending=False)
15.3 FRED-API
Warning
If you google how to access FRED you may come across a library (or service) called Quandl. Do not use it; it does not work well in my experience.
Instructions for installing the pandas-datareader library can be found in its online documentation.
Simply open a terminal window and type:
conda install -c anaconda pandas-datareader
You will be prompted whether you want to install this library and possibly a couple of others. Simply type y to accept and the pandas-datareader library will be installed. You need internet access for this. Then restart the IPython console within Spyder by simply closing it; Spyder will open a new interactive console (command line window). Now you are good to go.
You can now easily download data from the FRED database as follows:
import datetime as dt
import pandas_datareader.data as web

start = dt.datetime(1947, 1, 1)
end = dt.datetime(2024, 3, 1)

# Quarterly data
# -----------------------------------------------------------------------------
print('Downloading quarterly data ...')
data_list = ['USRECQ', 'GDP', 'GNP', 'GDPC1']
df_q = web.DataReader(data_list, 'fred', start, end)

# Plot the downloaded series (shown inline in Spyder's IPython console)
df_q.plot(subplots=True, figsize=(10, 8))
This will download GDP and a few other variables and plot them. When you download the data, you can specify the time frame; this will of course depend on the specific time series you are interested in. Read the data documentation on the FRED website to see what is available.