How To Write and Test Regex Filters for Google Analytics (With Examples)
As with many of my articles, this one started as research for a client that I then wrote up here. To be honest, there are a couple of reasons why. First, I have a terrible memory and often search my own website for information. Second, it may help others who are searching for the same information.
What is a Regular Expression (Regex)?
A regular expression (regex) is a pattern that describes a sequence of characters, used to search text for matches or to replace them. Virtually all modern programming languages support regular expressions.
I love regular expressions, but they can be frustrating, even infuriating, to learn and test. Google Analytics has some amazing capabilities here: you can create views with regular expressions or filter your data with regular expressions.
For example, if I wanted to see just the traffic on my tag pages, I could filter for /tag/ in my permalink structure by using:
/tag\/
The syntax is critical there. If I just used “tag”, I would get all pages with the term tag anywhere in them. If I used “/tag”, then any URL containing /tag would be included, like /tag-management, because Google Analytics defaults to a partial match and allows any characters after the pattern. So, I need to ensure that I have the trailing slash included, and I escape it with a backslash so it is always read as a literal character.
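To see the difference between those three patterns, here is a minimal Python sketch. It assumes that Python's re.search is a fair stand-in for the partial matching Google Analytics does (the two regex engines are not identical), and the request paths are made up for illustration.

```python
import re

# Hypothetical request paths, invented for this illustration.
paths = ["/tag/analytics/", "/tag-management/", "/2019/heritage-tour/"]

def matching(pattern, candidates):
    """Return the candidates where the pattern matches anywhere in the string."""
    return [p for p in candidates if re.search(pattern, p)]

loose = matching(r"tag", paths)     # also catches "heritage" and "tag-management"
prefix = matching(r"/tag", paths)   # still catches "/tag-management/"
exact = matching(r"/tag\/", paths)  # only the tag archive page

print(loose)
print(prefix)
print(exact)
```

Only the last pattern isolates the tag pages, because the trailing slash rules out /tag-management.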
Regex Syntax Basics
Syntax | Description |
^ | Begins with |
$ | Ends with |
. | A wildcard for any single character |
* | Zero or more of the previous item |
.* | Matches any sequence of characters (zero or more) |
? | Zero or one time of the previous item |
+ | One or more times of the previous item |
| | The OR operator |
[abc] | a or b or c (can be any set of characters) |
[a-z] | Range of a to z (can be any number of characters) |
[A-Z] | Range of A to Z (capitalized) |
[0-9] | Range of 0 to 9 (can be any number) |
[a-zA-Z] | Range of a to z or A to Z |
[a-zA-Z0-9] | All alphanumeric characters |
{1} | Exactly 1 instance (can be any number) |
{1,4} | Range of 1 to 4 instances (can be any numbers) |
{1,} | 1 or more instances (can be any number) |
() | Group your rules |
\ | Escape special characters |
\d | Digit character |
\D | Non-digit character |
\s | White space |
\S | Non-white space |
\w | Word character (letter, digit, or underscore) |
\W | Non-word character (such as punctuation or white space) |
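A few of the rows above are easier to trust once you see them run. This is a small Python sketch, not anything specific to Google Analytics: it uses re.fullmatch so that each sample string has to match the pattern in its entirety, and every pattern and sample is made up for illustration.

```python
import re

# (pattern, sample text, should the whole sample match?)
checks = [
    (r"[0-9]{4}", "2023", True),            # exactly four digits
    (r"[0-9]{4}", "123", False),            # three digits is too few
    (r"[0-9]{1,4}", "123", True),           # one to four digits
    (r"colou?r", "color", True),            # ? makes the "u" optional
    (r"a.c", "abc", True),                  # . stands in for any one character
    (r"\d+\s\w+", "42 shirts", True),       # digits, a space, then word characters
    (r"(cat|dog)s", "dogs", True),          # grouping with the OR operator
    (r"index\.html", "indexZhtml", False),  # an escaped dot only matches a literal dot
]

# True for each row where the observed result equals the expected one.
results = [bool(re.fullmatch(p, s)) == expected for p, s, expected in checks]
print(all(results))
```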
Regex Examples For Google Analytics
So let’s put some examples out there for some Custom Filters. One of my colleagues asked me for help identifying an internal page with the path of /index in addition to all blog posts that were written with the year in the permalink.
My custom filter pattern for the filter field Request URI:
^/(index|[0-9]{4}\/)
In plain terms: match any path that begins with /index OR with four digits followed by a slash. I created a view in Analytics and added this as the filter.
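Before saving a filter like this, it can help to run the pattern against a handful of paths. The following is a minimal Python sketch under the assumption that re.search approximates how the filter matches; the sample paths are hypothetical.

```python
import re

# The filter pattern from above, unchanged.
PATTERN = re.compile(r"^/(index|[0-9]{4}\/)")

# Hypothetical request paths, invented for this illustration.
samples = [
    "/index",
    "/index.html",
    "/2019/my-post/",
    "/about/",
    "/20190/my-post/",  # five digits, so no slash after the fourth digit
    "/blog/2019/",      # the year is not at the start of the path
]

included = [path for path in samples if PATTERN.search(path)]
print(included)
```

Note that /index.html is included as well, because nothing in the pattern anchors the end of the /index branch.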
Here are a few more examples:
- You have a blog with the year in the URL permalink path and you want to filter the list to any year, so you want any 4 numeric digits followed by a trailing slash. Request URI Filter Pattern:
^/[0-9]{4}\/
- You want to compare all of your pages where the title has certificate or certification in it. Page Title Filter Pattern:
(.*)certificat(.*)
- You want to compare two landing pages based on the Campaign Medium passed in the Google Analytics campaign URL as utm_medium=direct mail or utm_medium=paid search. Campaign Medium Filter Pattern:
(direct\smail|paid\ssearch)
- You want to compare all of the products that are men’s shirts based on the URL path. Request URI Filter Pattern:
^/mens/shirt/(.*)
- You want to compare all of the paginated pages, where the URL path ends with the page number. Request URI Filter Pattern:
^/page/[0-9]+/$
- You want to exclude a range of IP addresses (the digits here are placeholders). Exclude IP Address Filter Pattern:
^123\.456\.789\.[0-9]+$
- You want to include a thankyou.html page where a submission was successful based on the query string success=true. Request URI Filter Pattern:
thankyou\.html\?success=true
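Several of the patterns in this list can be tried out together with a short Python sketch. Two assumptions to flag: I'm treating re.search as an approximation of how the filter matches, and I'm matching case-insensitively, which is how I understand these filters behave unless the Case Sensitive option is checked. All of the sample values are made up.

```python
import re

# (name, pattern from the list above, value that should match, value that should not)
examples = [
    ("year permalink", r"^/[0-9]{4}\/", "/2021/regex-filters/", "/about/2021/"),
    ("certificate titles", r"(.*)certificat(.*)", "SSL Certification Guide", "SSL Guide"),
    ("campaign medium", r"(direct\smail|paid\ssearch)", "paid search", "organic"),
    ("mens shirts", r"^/mens/shirt/(.*)", "/mens/shirt/oxford-blue", "/womens/shirt/oxford-blue"),
    ("thank-you page", r"thankyou\.html\?success=true",
     "/thankyou.html?success=true", "/thankyou.html?success=false"),
]

results = {}
for name, pattern, hit, miss in examples:
    compiled = re.compile(pattern, re.IGNORECASE)  # assumption: case-insensitive filter
    # Store (matched the expected hit, matched the expected miss).
    results[name] = (bool(compiled.search(hit)), bool(compiled.search(miss)))

for name, outcome in results.items():
    print(name, outcome)
```

A pattern is behaving as intended when its outcome is (True, False): it matched the value it should and skipped the one it should not.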
How to Test Your Regular Expressions
Rather than relying on trial and error within Google Analytics, I often just jump over to regex101, a fantastic tool for testing your regular expressions. It even breaks down your syntax for you and explains each part of your regular expression.
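If you would rather keep a repeatable check next to your notes, the same idea can be sketched in a few lines of Python. The check_pattern helper below is hypothetical (it is not part of regex101 or Google Analytics), and it assumes case-insensitive partial matching as an approximation of how the filters behave.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failure messages; an empty list means the pattern behaved as expected."""
    compiled = re.compile(pattern, re.IGNORECASE)  # assumption: case-insensitive filter
    failures = []
    for value in should_match:
        if not compiled.search(value):
            failures.append(f"expected a match: {value}")
    for value in should_not_match:
        if compiled.search(value):
            failures.append(f"unexpected match: {value}")
    return failures

# The tag example from earlier: the trailing slash is what keeps /tag-management out.
ok_result = check_pattern(r"/tag\/", ["/tag/analytics/"], ["/tag-management/"])
loose_result = check_pattern(r"/tag", ["/tag/analytics/"], ["/tag-management/"])

print(ok_result)
print(loose_result)
```

Listing both the values you expect to match and the ones you expect to be left out makes it much easier to catch a pattern that is too loose.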