
Turkeys, Fruit, and Spotting Imposters: Keys to Data-Driven Marketing

The explosion of data availability and AI technology has revolutionized advertising over the past decade. Tools such as smart bidding and data-driven attribution barely existed five years ago, yet they border on being industry standards today. Agencies are buying into this trend as well, using data-driven decision making to challenge assumptions and replace “gut instinct” and rules of thumb with hard data.

Threats to Data Interpretation

“A turkey, judging its future by the experience of its previous days, would have no cause for alarm right up until Thanksgiving.”

There is a fundamental challenge posed by this data revolution: the value of a data-driven insight is limited by the quality of the data from which it is derived and by the skill of its interpreters. Everyone has heard the old adage about lies and statistics. Using data poorly can be even worse than not using data at all. Yet a great number of organizations, like our hypothetical turkey, do just that, making the same handful of common mistakes when attempting to become data oriented. Weak data-driven “insights” frequently draw from incomplete or incorrect data, attempt to draw conclusions from too small a data set, or fail to account for changes in external conditions. It is our hope that this article will arm marketers with the tools to either update their own data collection and evaluation processes or to screen the validity of data-driven insights from agencies and other business partners.

Do You Know Where Those Numbers Have Been?

Once a viable data source is established, maintenance becomes the next major hurdle to data quality. A website team and a tagging team that operate in isolation from one another are the bane of anyone interested in stable website analytics. In my own experience, website teams pushing changes without consulting the tag management team is the single most common cause of tracking outages, with all of the lost business potential those outages represent. The importance of consistent data collection only rises as a business becomes more data centric. Advertising tools such as smart bidding and data-driven attribution rely heavily on a consistent stream of data to power AI-driven insights. Errors and outages in the data stream can push erroneous data into the decision-making algorithms, negatively impacting bidding and attribution for weeks after the mishap. Companies that rely on these tools should treat data outages as a genuine risk to their digital advertising returns and should therefore expect website and tagging teams to collaborate to mitigate that risk.

What is That Supposed to Mean?

When communicating both internally and with other business partners, it is important to establish consensus on common definitions of metrics. Different advertising and analytics platforms define similar-sounding metrics differently. Even within the Google space, my own area of expertise, conversion attribution is handled differently between Google Ads and Google Analytics. In Google Analytics 4 (GA4), the default setting assigns attribution for a conversion event (a goal completion or eCommerce transaction in Universal Analytics) to the last non-direct click. Google Ads, however, only “sees” its own contribution to a conversion in its own reporting. It will therefore take credit for any user conversion path that involved a Google Ads click, so its internal conversion numbers will generally be higher than the conversion totals attributed to that Google Ads account in Google Analytics. These metrics only become more divergent when more advanced conversion attribution modeling is implemented.
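
To make that divergence concrete, here is a minimal sketch, using invented conversion paths and channel labels rather than any platform’s actual data or API, that contrasts last non-direct-click attribution with an ads platform crediting itself for any path containing one of its clicks.

```python
# Hypothetical illustration of why GA4-style "last non-direct click" totals
# and Google Ads' own conversion totals diverge. The paths and channel
# labels below are invented for this sketch; no platform API is involved.

def last_non_direct_click(path):
    """Credit the last non-direct touchpoint in a converting path."""
    non_direct = [touch for touch in path if touch != "direct"]
    return non_direct[-1] if non_direct else "direct"

def ads_self_attribution(path):
    """Ads-platform view: count the conversion if any touch was a paid Google click."""
    return "google_ads" in path

converting_paths = [
    ["google_ads", "email", "direct"],   # analytics credits email; Ads still counts it
    ["organic", "google_ads"],           # both credit Google Ads
    ["email", "direct"],                 # neither credits Google Ads
]

analytics_credit = sum(last_non_direct_click(p) == "google_ads" for p in converting_paths)
ads_credit = sum(ads_self_attribution(p) for p in converting_paths)

print(f"Analytics (last non-direct click) credits Google Ads with {analytics_credit} conversion(s)")
print(f"Google Ads' own reporting claims {ads_credit} conversion(s)")
# Same three conversions, different totals: 1 versus 2 in this toy example.
```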

Another common source of divergent data when allegedly measuring the same metric is sourcing data from multiple tags. Different analytics tags use different methods for attribution and user identification (were these two sessions actually performed by the same person?) and will therefore report different numbers for otherwise commonly defined metrics. Other common sources of divergence are session definitions and conversion lookback windows (how long ago can a website session by the same user be and still count towards a conversion?). In short, beware of naively comparing data sourced from different platforms without first accounting for differences in how that data was measured. Your “apples to apples” comparison may actually be a “bananas to blueberries” comparison in disguise.

A Significant Other, or Just an “Other”?

One of the key ideas in statistics is variance. Advertising is full of “random” chances. How many users who searched for one of your keywords are actually ready to buy? How many of those were enticed enough by your ad to click on it? Of those individuals, how many complete a purchase online or visit a physical location to buy your product? An ad may have a 3% CTR over tens of thousands of impressions, but any particular grouping of 100 impressions may feature 0 clicks, 10+ clicks, or anything in between. This “instability” when picking small groups of impressions is called variance, and it means that we cannot take results gathered from a small data set too literally.
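
To see that instability directly, the short simulation below draws many groups of 100 impressions from a hypothetical ad with a true 3% CTR; the numbers are arbitrary and chosen purely for illustration.

```python
import random

random.seed(42)

TRUE_CTR = 0.03       # the ad's "real" click-through rate
GROUP_SIZE = 100      # impressions per small sample
NUM_GROUPS = 1000     # how many small samples to draw

# Simulate the number of clicks in each group of 100 impressions.
clicks_per_group = [
    sum(random.random() < TRUE_CTR for _ in range(GROUP_SIZE))
    for _ in range(NUM_GROUPS)
]

print(f"Minimum clicks in a group: {min(clicks_per_group)}")
print(f"Maximum clicks in a group: {max(clicks_per_group)}")
print(f"Groups with zero clicks:  {clicks_per_group.count(0)}")
# Even though the true CTR is 3%, individual groups of 100 impressions
# can show zero clicks or several times the expected three.
```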

Suppose we want to measure whether some ad copy change has measurably improved CTR. In order to conclusively demonstrate that the new ad copy performs better than the old ad copy, we need to show not only that CTR has increased, but that it has increased by a large enough margin that the difference is unlikely to have emerged from random chance alone. The more data we collect, the smaller this margin becomes. As a rule of thumb, quadrupling the number of impressions and clicks approximately halves the error margin in the CTR measurement. Consequently, more subtle effects require more data to validate.
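
The quadrupling rule of thumb falls out of the standard error of a proportion, which shrinks with the square root of the sample size. A minimal sketch, again with illustrative numbers only:

```python
import math

def ctr_margin_of_error(ctr, impressions, z=1.96):
    """Approximate 95% margin of error for a measured CTR (normal approximation)."""
    return z * math.sqrt(ctr * (1 - ctr) / impressions)

for impressions in (1_000, 4_000, 16_000):
    moe = ctr_margin_of_error(0.03, impressions)
    print(f"{impressions:>6} impressions: 3.0% CTR +/- {moe:.2%}")
# Each 4x increase in impressions roughly halves the +/- margin, which is
# why subtle CTR differences need so much more data to confirm.
```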

Unfortunately, not every agency is as cognizant of this idea as it should be. Your agency may not explicitly lay out exact data collection requirements in its client communications, but it should at least make an effort to communicate that data collection takes time and that this time is required to develop accurate insights. An agency promising “instant” results or claiming that data collected from an overly small sample is indicative of a broader trend should raise a red flag. This could be a sign of institutional ignorance of these concepts, or of an attempt to be intentionally misleading.

To implement these ideas in your decision-making process, make it a point to find out the size of the data set behind any data-driven insight. When evaluating changes, implement a process for estimating the amount of data required to establish statistical significance, and do not evaluate the results of a change until that data has been collected. Challenge “insights” derived from an insufficiently sized data set. Thoroughly implementing such processes may require some persuasion and insistence with less statistically oriented decision makers, but the end result of that persistence is higher quality data-driven insights that actually drive the business forward.

There’s More to It Than Meets the Eye

Another point of potential failure in data-driven decision making is failing to account for external conditions. Nothing in the advertising world occurs in a vacuum. Individual channels can be boosted by the performance of other channels; for example, branded and direct traffic spike significantly when TV and other traditional media are running. Moreover, most searches have some level of seasonality: people typically search for gyms around New Year’s, buy binders and notebooks in the month before school starts, and buy electronics between Cyber Monday and Christmas. A data-driven agency should seek to account for and quantify the impact of external influences on advertising results.

These fluctuations in external conditions are often a nuisance when trying to measure the impact of more subtle changes. Returning to our previous example, a modest increase in ad CTR can easily be buried under the post-back-to-school crash in interest in school supplies. Worse, an actively detrimental change could be unfairly promoted by advantageous external circumstances. This leads to a natural desire to evaluate the performance of intentional changes without having to worry about this sort of fluctuation in external conditions.

The gold standard for this type of measurement is an A/B test: we randomly assign every impression on the related keywords to either the old or the new ad copy. We can then measure the CTR of both ads over the same time period, minimizing the impact of confounding variables such as seasonality. After some data collection threshold is met, we look at the results and determine whether the new ad copy demonstrates conclusively better performance than the old ad copy. In cases where A/B testing is not feasible (e.g., when measuring lead generation rates after a monolithic website overhaul), it is the responsibility of the agency to identify external confounding factors in the data and to attempt to control for them.
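
As a rough illustration of how such an A/B readout might be evaluated, the sketch below runs a simple two-proportion z-test on hypothetical impression and click counts; real programs may use different statistical machinery, and the figures here are invented.

```python
import math

def two_proportion_z_test(clicks_a, imps_a, clicks_b, imps_b):
    """Two-sided z-test for a difference in CTR between two ad variants."""
    p_a, p_b = clicks_a / imps_a, clicks_b / imps_b
    p_pool = (clicks_a + clicks_b) / (imps_a + imps_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / imps_a + 1 / imps_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical split-test results: old copy vs. new copy over the same period.
z, p = two_proportion_z_test(clicks_a=300, imps_a=10_000, clicks_b=360, imps_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
# A p-value below ~0.05 suggests the CTR lift is unlikely to be random noise;
# anything higher means "keep collecting data before declaring a winner."
```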

For many small- and even medium-sized businesses, the volume of data required can take some time to collect. It’s important to keep in mind that advertising might not generate instantaneous results. Impatience here can result in pushing changes that either don’t actually drive improved results over the long term or, worse, are actually detrimental but simply appear better over the short term due to an anomaly of chance. Waiting for an ample amount of data to accumulate before passing judgment on the performance of an advertising change ensures that the changes that stick are those that drive the business forward.

Subdivided Data is Slower Data

The other side of that concern is to be wary of excessive subdivision of data. Producing click volume on high-impact campaigns is fairly quick for most mid-sized or larger advertisers, but as the data is subdivided into increasingly fine slices, such as ad groups, keywords, individual ads, or specific demographic and interest groups, data collection becomes slower. A keyword that only generates 100 impressions per month would take years to demonstrate significance for anything short of a multiple-percentage-point increase in CTR. The slow rate of data collection at this level of subdivision means that a month-to-month or even quarter-to-quarter comparison of individual keyword performance, outside of a small handful of a client’s highest volume keywords, is often highly suspect: any “performance differences” found are far more likely to be due to random statistical scatter than to some attributable cause.
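
To put a rough number on “years,” a standard two-proportion sample-size approximation, applied to an assumed 3% baseline CTR and a hypothetical 100-impression-per-month keyword, looks something like this:

```python
import math

def impressions_needed(baseline_ctr, lifted_ctr, z_alpha=1.96, z_power=0.84):
    """Approximate impressions per variant to detect a CTR lift at roughly
    95% confidence and 80% power (two-proportion sample-size formula)."""
    variance = baseline_ctr * (1 - baseline_ctr) + lifted_ctr * (1 - lifted_ctr)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (lifted_ctr - baseline_ctr) ** 2)

MONTHLY_IMPRESSIONS = 100  # a low-volume keyword

for lift in (0.01, 0.03):  # 1-point vs. 3-point CTR improvement over a 3% baseline
    n = impressions_needed(0.03, 0.03 + lift)
    months = n / MONTHLY_IMPRESSIONS
    print(f"3% -> {3 + lift * 100:.0f}% CTR: ~{n} impressions "
          f"(~{months:.0f} months at {MONTHLY_IMPRESSIONS}/month)")
# A 1-point lift needs thousands of impressions -- years of traffic at this
# volume, and even longer if impressions are split between two variants.
# Only multi-point jumps become detectable on a reasonable timeline.
```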

That said, smart agencies can intelligently aggregate data to identify underlying trends in data that appears fragmented at face value. For example, search queries can be aggregated into clusters of similar terms, e.g., on the basis of containing similar words or phrases. In one of the ad campaigns we’ve worked on, we identified that search queries including the word “how” were 70% less likely to result in a conversion after an ad click than those that did not include that word. All of this data was spread out over 100+ unique search terms, many of which only had a handful of searches over a one-year period, and it would therefore have flown under the radar without aggregation. By blocking searches that included this word, and others like it, we drove a 40% decline in cost per conversion for this campaign over a two-month period.
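
A simplified sketch of that kind of aggregation might look like the following; the search terms and figures are invented stand-ins, not the campaign’s actual data.

```python
# Hypothetical search-term report rows: (query, clicks, conversions).
search_terms = [
    ("how to fix a leaky faucet", 6, 0),
    ("emergency plumber near me", 40, 5),
    ("how much does a plumber cost", 9, 0),
    ("24 hour plumber dallas", 25, 3),
    ("how to unclog a drain yourself", 4, 0),
    # ...plus hundreds more low-volume queries in a real report
]

def conversion_rate(rows):
    clicks = sum(c for _, c, _ in rows)
    conversions = sum(v for _, _, v in rows)
    return conversions / clicks if clicks else 0.0

# Cluster queries by whether they contain the word "how".
how_queries = [row for row in search_terms if "how" in row[0].split()]
other_queries = [row for row in search_terms if "how" not in row[0].split()]

print(f"'how' queries: {conversion_rate(how_queries):.1%} conversion rate")
print(f"other queries: {conversion_rate(other_queries):.1%} conversion rate")
# Individually these queries are too small to judge, but aggregated, the
# pattern (informational "how" searches converting far worse) becomes visible.
```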

Insights or Imposters?

Becoming a data-driven marketing team, and doing it well, is a lengthy process involving many hurdles. It frequently requires revising many of the business processes surrounding data, from tinkering with the nuts and bolts of data collection tooling to developing consistent terminology across department lines. Conscientiousness throughout the data insight process, from establishing metrics to building and maintaining data collection systems to interpreting the data collected, is critical to producing high quality data-driven insights. The end result of this diligence is the difference between finding data-driven insights and winding up with anecdotally driven imposters.
