Pantheon Community

Pantheon_stripped

I am having a hard time understanding how Pantheon treats UTM parameters. I migrated to Pantheon from Rackspace, where we were previously using UTM parameters all over the place to track how people get to our site. We also pull those UTMs through form submissions and pass them along to help track source. This is how our business tracks all web attribution from paid ads, organic, social, email, etc.

Since the migration, our UTMs have been replaced with ‘PANTHEON_STRIPPED’. I have found this article about how Pantheon treats URL query strings, but it is going a bit over my head.

I would almost prefer disabling the Varnish cache rather than lose UTM tags, though I know that is an extreme scenario.

Does anyone have any experience with this who can offer suggestions?

@lukesdyer, since Pantheon will always strip the UTM parameters’ values, your application logic (i.e., WordPress or Drupal PHP code) won’t be able to read them. But it sounds like you already know that.

Whether you can disable Pantheon’s Varnish cache layer to eliminate this behavior, I don’t know.

In cases where I’ve needed PHP code to read UTM values, I’ve had to use JavaScript. The JS gets executed in the browser, so it can read the UTM values. In one case, we read the values and packaged them, using a different URL parameter, into an Ajax request to our backend, so that our PHP code could read them.
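
For anyone landing here later, that approach might look roughly like the sketch below. It is only an illustration: the `src_` prefix, the `buildTrackingUrl` helper, and the `track.php` endpoint are made-up names, and it assumes the renamed parameters are not on Pantheon’s stripped list.

```javascript
// Hypothetical prefix for the renamed parameters (an assumption, not a Pantheon convention).
const RENAMED_PREFIX = "src_";

// Copy every utm_* parameter from the page URL onto the backend endpoint
// under a different name, e.g. utm_source -> src_source.
function buildTrackingUrl(pageUrl, endpoint) {
  const incoming = new URL(pageUrl).searchParams;
  const target = new URL(endpoint);
  for (const [name, value] of incoming) {
    if (name.startsWith("utm_")) {
      target.searchParams.set(RENAMED_PREFIX + name.slice("utm_".length), value);
    }
  }
  return target.toString();
}

// In the browser this could be used along these lines (endpoint is hypothetical):
//   fetch(buildTrackingUrl(window.location.href, "https://example.com/track.php"));
```

The backend would then read `src_source`, `src_medium`, and so on instead of the original `utm_*` names.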

Unfortunately, I don’t see any quick fixes for you on Pantheon.
-Chris

Thanks Chris,

It looks like UTMs are making it to GA via the JS script and are just not being passed through the app layer, as mentioned. I think that is all I needed; I was initially freaked out when I saw “pantheon_stripped” in GA, but looking at the data overall it is a small percentage. I am going to keep an eye on it and re-evaluate after more site traffic data is collected. The one thing that I am worried about is that I am passing UTM parameters from a custom short URL that redirects to the main, longer URL. I plan on testing that in full to make sure that the redirect is still passing parameters to GA.

Any ideas on debugging this to identify the source of the stripped traffic that appears in GA? I had a conversion come through GA attributed to PANTHEON_STRIPPED. For the most part Google is grabbing the appropriate UTMs, but there are some outliers that are making it through. I’m not sure if this is part of my code or some wonky redirects…

That’s tricky. Most likely you’d need to find where in the backend code (WordPress, right?) the site is generating a URL with the current query params. E.g., if the code is blindly reading $_GET['utm_source'], that’s a red flag. But more likely, the code will be reading $_GET in its entirety.

If WordPress has a utility function to build URLs, you could debug in there (e.g., url() in Drupal 6/7).

Apparently, this issue is affecting Facebook Ads also. I’m assuming it is because they are redirecting to a URL with GA UTM parameters in them. So frustrating…

The issue is that the FB Ad URL is https://l.facebook.com/l.php?[facebook parameters and UTM parameters], which redirects to https://example.com?[facebook parameters and UTM parameters]

This is still an ongoing struggle for me. I have paid ads that are driving UTM parameters within the URL to landing pages with forms that are programmed to insert the UTM variables into hidden form fields.

Pantheon is stripping some of these out. Seems to be random.

Question: If I disable the Varnish Caching Layer on specific pages would this stop Pantheon from stripping my UTMs? I found an article on how to do this: https://pantheon.io/docs/cache-control

Response from Support Staff:

STAFF: Unfortunately, there isn’t a way to NOT have UTM vars stripped. You can view those here: https://pantheon.io/docs/pantheon_stripped#which-query-parameters-are-optimized

ME: So, to be clear, setting my HTTP headers to disable caching at the edge layer (Varnish) will not prevent the UTMs from getting stripped? It seems to randomly strip out some but not others.

STAFF: We strip that at the CDN edge so bypassing cache won’t defeat that mechanism.

I realize that this is most likely a programmatic issue on my part. But I did not have this issue with Rackspace or Kinsta, and I am probably going to need to switch web hosts if I cannot determine the root cause of the issue.

As you noted, the root cause is that we remove the params at the CDN layer. The primary reason for this is that the Varnish caching engine would otherwise treat every distinct combination of those params as a different page, which prevents it from caching the page effectively (or rather, it creates a unique cache entry for every user that is never requested again).
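
To make the cache-key point concrete, here is a simplified sketch of the idea. It is not Pantheon’s actual implementation: the `cacheKeyFor` helper is hypothetical, and it only handles `utm_*` names, whereas the real list of stripped parameters is longer.

```javascript
// Placeholder value that shows up in place of the original UTM values.
const STRIPPED_VALUE = "PANTHEON_STRIPPED";

// Hypothetical helper: normalize a request URL so that requests differing
// only in their utm_* values map to the same cache key.
function cacheKeyFor(requestUrl) {
  const url = new URL(requestUrl);
  // Copy the keys first so the params can be modified while looping.
  for (const name of [...url.searchParams.keys()]) {
    if (name.startsWith("utm_")) {
      url.searchParams.set(name, STRIPPED_VALUE);
    }
  }
  return url.toString();
}

const keyA = cacheKeyFor("https://example.com/landing?utm_source=facebook");
const keyB = cacheKeyFor("https://example.com/landing?utm_source=newsletter");
console.log(keyA === keyB); // true: both requests can share one cached page
```

Without the normalization step, each of those two URLs would get its own cache entry.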

In general, the reason you’re seeing PANTHEON_STRIPPED in your analytics reports is most likely that there’s a server-side redirect somewhere, and that redirect is sending the user to a link built with the stripped params.
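
One way to hunt for such a redirect might be to check, in the browser, whether the URL the visitor actually landed on already contains the placeholder value. The sketch below is only a starting point; `findStrippedParams` is a hypothetical helper, and what you log alongside it is up to you.

```javascript
// Return the names of query parameters whose value is the literal
// PANTHEON_STRIPPED placeholder in the given page URL.
function findStrippedParams(pageUrl) {
  const names = [];
  for (const [name, value] of new URL(pageUrl).searchParams) {
    if (value === "PANTHEON_STRIPPED") {
      names.push(name);
    }
  }
  return names;
}

// Possible browser usage: log the landing path and referrer whenever it happens.
//   const stripped = findStrippedParams(window.location.href);
//   if (stripped.length > 0) {
//     console.warn("Stripped params in URL:", stripped, window.location.pathname, document.referrer);
//   }
```

If the placeholder is visible in the browser’s address bar, the URL was rebuilt on the server after stripping, which points at a redirect or a URL-building function in the backend.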

It’s normal practice with any caching technology to sanitize these parameters in some fashion. For most use cases the parameters are only used by the Google Analytics (or similar) JavaScript apps, so they’re not needed on the backend. Removing them from the query allows the request to be cached consistently across users, since the URL is part of the cache key. Some folks find that using fragments instead of query params works for them, but keep in mind that fragments are never sent to the server by the user’s browser.
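
As a rough illustration of the fragment idea, client-side code could read the values like this. The `readFragmentParams` helper and the example URL are hypothetical, and your analytics setup would still need to be told where to find the values.

```javascript
// Parse key=value pairs from the fragment (the part after '#') of a URL.
// The fragment never reaches the server, so there is nothing for the CDN to strip.
function readFragmentParams(pageUrl) {
  const fragment = new URL(pageUrl).hash.replace(/^#/, "");
  return Object.fromEntries(new URLSearchParams(fragment));
}

const fromFragment = readFragmentParams(
  "https://example.com/landing#utm_source=facebook&utm_medium=cpc"
);
console.log(fromFragment.utm_source); // "facebook"
```

In the browser, the same function could be called with `window.location.href`.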

There is one way around this, with a major caveat. When the parameters are stripped, they’re moved into a header, X-Pre-Strip-Debug. You can read this header in your application, and parse out the original params and operate on them however you need to. The caveat here is that your application code will only be run on POST requests, or the first run of a page. That means that if you’re expecting the page to render based on those for every new request, it will fail. If you send a no-cache header with that page, it should hit the application every time, but you lose out on the benefits of caching.

Here’s some untested sample code that might help if you understand the implications noted above:

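    <?php
    // Untested sketch: restore stripped params from the X-Pre-Strip-Debug header.
    // Header names are case-insensitive, so normalize the keys before the lookup.
    $headers = array_change_key_case(getallheaders(), CASE_LOWER);
    $header = $headers['x-pre-strip-debug'] ?? '';
    foreach (explode('&', $header) as $param) {
      // Limit to 2 parts so values that contain '=' stay intact.
      $parts = explode('=', $param, 2);
      if ($parts[0] === '') {
        continue; // Skip empty segments (e.g. when the header is missing).
      }
      // Assumes the header carries the raw, still URL-encoded query string.
      $_GET[urldecode($parts[0])] = urldecode($parts[1] ?? '');
    }
    print_r( $_GET );
    // $_GET should now have the expected utm params that were removed.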

Again, I’d recommend finding an alternate way to encode your params so this isn’t an issue, as what I’m describing here isn’t supported, could break unexpectedly, etc. Also, the sample code above is not secure; you’ll want to ensure everything is sanitized just in case.
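
What “sanitized” means will depend on where the values end up, but one common pattern is an allowlist of expected names plus a restriction on the characters in each value. The sketch below shows that idea in JavaScript; the same logic would carry over to PHP. The `sanitizeUtmParams` helper, the character set, and the 100-character cap are all arbitrary choices for illustration.

```javascript
// The five standard UTM parameter names; anything else is dropped.
const ALLOWED_UTM_NAMES = [
  "utm_source",
  "utm_medium",
  "utm_campaign",
  "utm_term",
  "utm_content",
];

// Keep only allowlisted names, strip unexpected characters, and cap the length.
function sanitizeUtmParams(rawQuery) {
  const clean = {};
  for (const [name, value] of new URLSearchParams(rawQuery)) {
    if (!ALLOWED_UTM_NAMES.includes(name)) {
      continue;
    }
    clean[name] = value.replace(/[^A-Za-z0-9_.\- ]/g, "").slice(0, 100);
  }
  return clean;
}
```

Values should still be escaped for whatever context they are written into (HTML, SQL, and so on); the allowlist only narrows what gets through.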

@doug_pantheon I really appreciate you taking the time and effort to put this response together. I think that my next action will be to try to find a developer to help me adjust my WordPress theme and plugin code, so that this is not such a problem for me. I am grateful that you have helped me better understand the problem.

I am having this same issue and trying to find a fix for this.

@Leandra Welcome to the community. Does this page help at all? https://pantheon.io/docs/pantheon_stripped

If not, let us know where you’re stuck and we can look at it.

Thank you for following up. I will have my developers look through the document. Thank you again.

@lukesdyer, I think your best bet is to use JavaScript to read the utm parameters from the URL, then inject those parameters’ values into your hidden form fields.
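
A minimal sketch of that idea follows. It assumes the hidden inputs are named after the UTM parameters themselves (`utm_source`, `utm_medium`, and so on); that naming and the `fillUtmFields` helper are assumptions, so adjust to match your form plugin.

```javascript
// Copy utm_* values from a query string into matching fields.
// `fields` can be any iterable of objects with `name` and `value`,
// such as the NodeList returned by document.querySelectorAll().
function fillUtmFields(queryString, fields) {
  const params = new URLSearchParams(queryString);
  let filled = 0;
  for (const field of fields) {
    if (field.name.startsWith("utm_") && params.has(field.name)) {
      field.value = params.get(field.name);
      filled += 1;
    }
  }
  return filled; // number of fields that received a value
}

// In the browser, run once the form exists in the page.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    fillUtmFields(
      window.location.search,
      document.querySelectorAll('input[type="hidden"]')
    );
  });
}
```

Because this runs in the visitor’s browser, it sees the original URL and keeps working even when the page itself is served from cache.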

Reading the custom header (as mentioned by @doug_pantheon) may work, but you’ll lose the benefits of the CDN and cached pages. Further, your fix would be specific to Pantheon, so you’d need to adjust your code if you moved web hosts.

Hope this helps.