If you’re reading this post, you’re probably familiar with data. And if you’re familiar with data, you’ve probably at least heard of R, that great statistical programming tool. We’re going to focus on Google Analytics in this post, but these skills are transferable to other APIs. So how can you integrate the power of R with Google Analytics and other applications accessible with an API?
Interrogating your data within the Google Analytics interface is great, and there is loads of cool stuff you can do there to make the most of the wealth of data. But to combine the analytics data with other data sources and enrich your analysis, you need to pull the data out of Analytics, and you can do this with R and its wealth of packages.
The primary package you’ll need to install in R is the imaginatively named RGoogleAnalytics, but other more creatively titled packages are available. To install and load it in R, use the following commands:
install.packages("RGoogleAnalytics", dependencies = TRUE)
library(RGoogleAnalytics)
This should install all the packages you need to retrieve the data in R. You’ll also need an active developer project so that you have the credentials to access the API. To get them, create a new project in the Google Developer Console and turn on the Analytics API. Then create a new Client ID of the installed application type. You should now have a client ID and a client secret. Store them in R as the character strings client.id and client.secret, and use them to authenticate with the following command:
gaToken <- Auth(client.id, client.secret)
The first time you execute this command to generate the token, you will have to sign in to authorise access. If you then save the token, you only need to do this once.
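One way to avoid typing the client ID and secret directly into a script is to read them from environment variables before calling Auth. The following is a minimal sketch under that assumption. The variable names GA_CLIENT_ID and GA_CLIENT_SECRET are made up for illustration and are not something the package requires.

```r
# Read the credentials from environment variables.
# GA_CLIENT_ID and GA_CLIENT_SECRET are hypothetical names chosen for this sketch.
client.id     <- Sys.getenv("GA_CLIENT_ID")
client.secret <- Sys.getenv("GA_CLIENT_SECRET")

# Sys.getenv returns "" for unset variables, so check both are non-empty.
credentials.set <- nzchar(client.id) && nzchar(client.secret)

if (!credentials.set) {
  message("Set GA_CLIENT_ID and GA_CLIENT_SECRET before authenticating.")
} else if (interactive() && requireNamespace("RGoogleAnalytics", quietly = TRUE)) {
  # Auth opens a browser sign-in, so only attempt it in an interactive session.
  gaToken <- RGoogleAnalytics::Auth(client.id, client.secret)
}
```

Keeping the secret out of the script also means the script can be shared without exposing your credentials.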
Saving the token allows the user to repeatedly access the Google Analytics API without having to input their credentials every time, or even having to sign in with the process in the previous step. If you don’t provide a filepath, like here, then the token will be saved directly into your working directory:
save(gaToken, file = "gaToken")
If you want to access the API with your token in a separate session, use the load function (making sure you include the filepath if you haven’t set your working directory):
load("filepath/gaToken")
You should also validate your token, to make sure it is up to date and can access the API. Make sure you have loaded the RGoogleAnalytics package, as the validation function is included in this package:
ValidateToken(gaToken)
Within the ValidateToken function, use the R object name that was assigned when the token was created with the Auth function. This is the name the object was saved under in the token file, so it is the name R restores when the file is loaded.
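To see why the object name matters, here is a small sketch of the save and load round trip using only base R. The list assigned to gaToken is a stand-in, not a real token, and the file is written to a temporary directory purely for the demonstration.

```r
# A stand-in object; in practice this would be the token returned by Auth.
gaToken <- list(note = "placeholder, not a real token")

# Save it to a temporary location (assumption: you would use your own path).
token.file <- file.path(tempdir(), "gaToken")
save(gaToken, file = token.file)

# Remove the object to mimic starting a fresh session.
rm(gaToken)

# load() recreates the object under its original name and
# returns the names of everything it restored.
restored.names <- load(token.file)
print(restored.names)

# With a real token and the package loaded, validation would follow:
# ValidateToken(gaToken)
```

Because load restores the original name rather than the file name, saving the token as gaToken and then validating a differently named object would fail.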
Now you can access the Google Analytics API, and pull out as much data as Google allows using functions like QueryBuilder and GetReportData. This is one of the more complex processes for accessing an API using R. Other tools use much simpler APIs.
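As a rough sketch of what such a query can look like, the package’s Init function collects the query parameters before QueryBuilder and GetReportData are called. The dates, dimensions, metrics and the table ID ga:XXXXXXXX below are placeholders, and the example assumes a valid gaToken already exists in your session.

```r
# Query parameters; every value here is a placeholder for illustration.
query.params <- list(
  start.date  = "2015-01-01",
  end.date    = "2015-01-31",
  dimensions  = "ga:date,ga:medium",
  metrics     = "ga:sessions,ga:pageviews",
  max.results = 1000,
  table.id    = "ga:XXXXXXXX"  # replace with your own view (profile) ID
)

# Only call the API when the package is installed and a token is available.
if (requireNamespace("RGoogleAnalytics", quietly = TRUE) && exists("gaToken")) {
  query.list <- do.call(RGoogleAnalytics::Init, query.params)
  ga.query   <- RGoogleAnalytics::QueryBuilder(query.list)
  ga.data    <- RGoogleAnalytics::GetReportData(ga.query, gaToken)
  print(head(ga.data))
}
```

Keeping the parameters in a named list makes it easy to rerun the same query later with a different date range.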
For retrieving keyword rankings from Advanced Web Ranking, all you need to do is submit a GET request and you’ll get your data back. Consult the API documentation for your favourite tools to retrieve data more efficiently, with added reproducibility. Doing t-tests in Excel requires vigilance to make sure all of the cell references are correct, unlike a few lines of R code, which you can revisit and reproduce later if you need to.
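To give a feel for how little is involved in a simple GET request, here is a sketch in base R that builds the request URL from a set of parameters. The endpoint https://api.example.com/rankings and the parameter names are invented for illustration and are not the real Advanced Web Ranking API, so check the tool’s documentation for the actual values.

```r
# Build a GET request URL from a base address and a named list of parameters.
build_get_url <- function(base.url, params) {
  # Percent-encode each value so spaces and symbols are safe in a URL.
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base.url, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

# Hypothetical endpoint and parameters.
request.url <- build_get_url(
  "https://api.example.com/rankings",
  list(project = "my site", token = "YOUR_API_TOKEN")
)
print(request.url)

# With a real endpoint, the response could then be read with, for example:
# raw.response <- readLines(request.url, warn = FALSE)
```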
We can easily access APIs and pull a wealth of data, and combine many data sources to gain extremely rich datasets. The real power of R is that we can then analyse the data using any of the statistics and machine learning packages available to us, many of which have no equivalent in other tools.
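As an illustration of combining sources and then analysing them, the sketch below merges a small table of daily sessions with a second table recording whether a campaign was running, and then runs a t-test. All of the numbers are invented for the example and do not come from any real account.

```r
# Made-up daily sessions, standing in for data pulled from Google Analytics.
ga.data <- data.frame(
  date     = as.Date("2015-01-01") + 0:7,
  sessions = c(120, 135, 128, 140, 150, 162, 158, 171)
)

# Made-up second source: whether a campaign was running on each day.
campaign.data <- data.frame(
  date     = as.Date("2015-01-01") + 0:7,
  campaign = factor(rep(c("off", "on"), each = 4))
)

# Combine the two sources on their shared date column.
combined <- merge(ga.data, campaign.data, by = "date")

# Compare sessions between campaign and non-campaign days (Welch t-test).
result <- t.test(sessions ~ campaign, data = combined)
print(result)
```

The whole analysis lives in a script, so rerunning it on next month’s data only means changing the inputs.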
Summary
- Access and extract data from the Google Analytics API (and other APIs) using R
- Combine API access with the vast number of analysis packages
- Create a reproducible workflow for extracting and analysing data using R