How to extract site search terms from subfolders in Google Analytics
Google Analytics has a lot of really nice features. This post combines two of them: filters and internal site search. I am a big fan of monitoring internal site search queries. Google gives good guidance for query parameter-based search and for POST-based search, but there is little to no guidance on how to extract site search data when each search generates a brand new subfolder (or subdirectory, if you’d rather).
You will need:
- Admin access to Google Analytics
- A site whose internal site search generates subfolders
- A testing profile
- Use of advanced filters (example below)
- An understanding of RegEx (example below)
‘A site whose internal site search generates subfolders’: What does that even mean?
I got the idea for this blog post because one of our clients has an internal site search that generates URLs that look like these, where the search query (“foobar”) is within its own subfolder:
This kind of site search generation is not supported by Google Analytics by default, so this post is a tutorial for how to generate a report that looks like this:
Write RegEx to replicate your internal site search URL structure
Using the example above, their URL structure for internal site searches looks like this:
Our end goal is to extract the search term from the second subfolder. To do this, we need to write some RegEx (I have a few good references in another post) that separates the non-search components from the search terms. This can be done by using round brackets () to group parts of the URL together.
So the example above translates to:
- The first section must start the URL with “/search-results/” (the ^ anchors the match to the beginning)
- The second section can be any string of characters (denoted by .*)
- The final section is a number followed by a closing slash at the end of the URL (denoted by [0-9]*/$)
This can be modified depending on how your internal site search folder structure looks.
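As a rough illustration, the three groups described above can be tried out locally before touching Google Analytics. The sketch below uses Python's re module. The exact pattern and the sample URL /search-results/foobar/1/ are assumptions based on the description in this post, not the client's actual URLs, and Google Analytics runs its own regex engine, so treat this as a sanity check only.

```python
import re

# Assumed pattern: three groups mirroring the bullets above
# (fixed prefix, search term, page number). Adjust to your own URL structure.
SEARCH_URL = re.compile(r"^(/search-results/)(.*)/([0-9]*)/$")


def split_search_url(request_uri):
    """Return (prefix, search term, page number), or None if the URI does not match."""
    match = SEARCH_URL.match(request_uri)
    return match.groups() if match else None


# Hypothetical request URIs, for illustration only
print(split_search_url("/search-results/foobar/1/"))  # ('/search-results/', 'foobar', '1')
print(split_search_url("/about-us/"))                 # None
```

The second item in the returned tuple is the part we are after: the search term.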
Use an advanced filter to extract the search term
In Google Analytics, you need to create a new advanced filter to extract the search term from the URL:
PRO TIP: Always test a new filter on a testing profile before adding it to your main reporting profile.
An advanced filter in Google Analytics extracts data from one or two input fields and combines it into a single output field.
Google Analytics reads input fields as RegEx groups, which is why we wrote the RegEx as we did above. Google will then translate your groups into “A1, A2, A3…” and “B1, B2, B3…”. In this example, “/search-results/” becomes “A1”, the search term becomes “A2” and the page depth becomes “A3”. We do not have any “B” terms as we are only interested in extracting data from one dimension.
You then specify what you want to extract, which in this case is the second group within the RegEx. We want to extract “A2” and use it as the search term.
This results in a filter that looks like this:
- Look for URLs (request URI) structured as defined in the RegEx [Field A -> Extract A]
- Extract the second part of that URL [$A2]
- Place this as your search term [Output To -> Constructor]
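To check what the filter would produce before saving it, the same logic can be mimicked locally. This is a minimal sketch, not Google Analytics code: apply_advanced_filter is a hypothetical helper, and the pattern and sample request URIs are assumed for illustration.

```python
import re

# Assumed values for the two filter settings described above
EXTRACT_A = r"^(/search-results/)(.*)/([0-9]*)/$"  # Field A -> Extract A
CONSTRUCTOR = "$A2"                                # Output To -> Constructor


def apply_advanced_filter(request_uri, extract_a=EXTRACT_A, constructor=CONSTRUCTOR):
    """Return the constructed output field, or None when Field A does not match."""
    match = re.search(extract_a, request_uri)
    if match is None:
        return None  # nothing to extract from this request URI

    def fill(token):
        # Replace $A1, $A2, ... with the corresponding RegEx group
        index = int(token.group(1))
        if index > match.re.groups:
            return ""  # the constructor refers to a group that does not exist
        return match.group(index) or ""

    return re.sub(r"\$A([0-9]+)", fill, constructor)


# Hypothetical request URIs, for illustration only
for uri in ["/search-results/foobar/1/", "/search-results/red-shoes/3/", "/contact/"]:
    print(uri, "->", apply_advanced_filter(uri))
```

Running a handful of real request URIs from your own reports through a check like this is a cheap way to catch a RegEx mistake before it reaches even the testing profile.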
Save your filter.
Now if you go to your Behaviour > Site Search > Search Terms report, you should have a list of terms that people are searching for on your site.