Running a company without feedback is like driving straight ahead with no destination: you waste fuel and time. The same goes for running a business based on its online presence. That’s why we need user behavior analysis. But what should we do when an analyst misinterprets the data? What are the most common pitfalls, and which inference types can help? We’ve prepared answers based on the work of Joanne Rodrigues, an experienced data analyst and author of the book “Product Analytics.”
What pitfalls do data analysts fall into?
The main task of web analysts is interpreting data and turning it into specific actions aimed at improving products to meet users' needs and expectations.
Thanks to user behavior analysis, analysts can understand who users are, what interests them, how they use products, and how they make purchases. As a result, analysts are able to engage users and increase the company’s profits.
Correct user analysis and the ability to draw accurate conclusions often determine market success or failure. Yet despite their willingness, and usually heavy investment in research, companies struggle to draw useful findings from their analyses and apply them effectively.
Below are the most common pitfalls that data analysts fall into.
Adding a story to unrelated facts
Analysts often start their work with a descriptive presentation of collected data. What does that mean? Let’s look at an example. When studying user behavior on a website, we discover that 500 users visit our site in a month. 50% stop at the home page, 40% delve deeper into the site, but only 10% decide to use the contact form and send an inquiry.
That’s what the descriptive presentation of data looks like. Unfortunately, some analysts stop at this and don’t look deeper into the context and reasons for such behavior. Or worse, they start to draw conclusions based on unrelated facts without further research.
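For reference, a descriptive presentation amounts to little more than counts and shares, as in this minimal sketch (the numbers are taken from the example above; in practice they would come from your analytics events):

```python
# The descriptive funnel from the example above: nothing more than
# counts and shares. The numbers match the text.

monthly_visitors = 500

funnel_shares = {
    "visited the site": 1.00,
    "stopped at the home page": 0.50,
    "browsed deeper": 0.40,
    "sent an inquiry via the form": 0.10,
}

for step, share in funnel_shares.items():
    print(f"{step:<30} {share:>4.0%} ({share * monthly_visitors:.0f} users)")
```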
Moreover, some researchers tend to choose and isolate a particular behavior to explain why it occurs. This approach is inherently incorrect because user behavior always stems from something, has context, and depends on many variables.
Let’s return for a moment to the above example and focus on the 40% of users who delve deeper into the site and go beyond the home page but don’t reach the contact form. An inexperienced data analyst might draw the following conclusion: 40% of users don’t reach the contact form because they can’t find it.
At a glance, this conclusion seems pretty logical and probable. It’s not hard to imagine that the CTA button leading to the form is poorly visible or located in a place where users don’t expect it.
However, if we investigate the matter further, we quickly realize that the analyst has no concrete proof to back up this hypothesis, and the cause of the form’s low conversion rate may lie elsewhere.
Equally plausible explanations include too many text fields to fill out, a mobile version of the form that doesn’t display correctly, or required data that users don’t have on hand. And what about the 50% of users who left the website at the home page?
If we focus on a single data point, or a chosen few, we lose the overall view of the situation and risk overlooking significant variables that influence user behavior. Such selective data analysis results in incorrect conclusions.
The next aspect that analysts should pay attention to is the size of the effect.
Let’s look at an example inspired by Joanne Rodrigues’ book. After conducting further studies, we found that 1,000 users (organic traffic) visit our website in a month: 50% arrive from Google and 50% from Bing. 30% of Google users decide to send an inquiry, but only 10% of Bing users do.
The analyst hypothesizes that if we manage to increase the share of traffic from Google to 100%, this will translate into a higher form conversion rate. Let’s assume our business will receive 12,000 inquiries a year (optimistically assuming that every visitor sends one inquiry).
Following this hypothesis, we invested resources in advertisements targeting Google users. Now, let’s assume our marketing efforts were effective and we achieved 1,000 Google visits a month, i.e., 100% of traffic. However, the number of inquiries increased only to 240 a month. Why? Our ads attracted a user group different from the one that found our site organically through the search engine, and this new group may not be as interested in our offer as the previous one.
In this case, calculating the size of the effect didn’t provide helpful information: attracting more users from Google didn’t produce the projected conversion rate because other factors influenced the new users.
As we can see from the above examples, to increase our business’s conversion it’s not enough to know which variables influence user actions; we also need to understand the size of their effect.
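As a quick sanity check, here’s how the arithmetic of this scenario plays out (all figures are taken from the example above):

```python
# A back-of-the-envelope check of the Google/Bing scenario above.
# All figures come from the example in the text.

visits_per_month = 1000
google_share, bing_share = 0.5, 0.5
google_cr, bing_cr = 0.30, 0.10       # conversion rates (inquiries per visit)

baseline = visits_per_month * (google_share * google_cr + bing_share * bing_cr)
print(f"baseline inquiries/month: {baseline:.0f}")   # 150 from Google + 50 from Bing = 200

# The analyst's optimistic projection: all traffic from Google and
# every visitor sends an inquiry, i.e. 1000/month or 12,000/year.
projected_per_year = visits_per_month * 12
print(f"projected inquiries/year: {projected_per_year}")

# What actually happened after the ad campaign
observed_per_month = 240
print(f"observed inquiries/month: {observed_per_month} "
      f"(an effective conversion rate of {observed_per_month / visits_per_month:.0%})")
```

The gap between the projected 12,000 inquiries a year and the observed 240 a month (an effective conversion rate of 24%, not 100%) is exactly the effect-size error described above.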
Too much focus on the data
Machine learning and artificial intelligence software are indispensable tools for data analysts. While they considerably speed up collecting and analyzing data, they aren’t perfect tools for understanding the user.
One of the primary purposes for which analysts use ML/AI tools is to predict the effects of a particular occurrence. These tools are exceptionally useful when we want to find out how much sales or conversion will increase in the coming months, what the best products to recommend in our online store are, or what the projected decline in subscriptions to our product is.
However, the fundamental problem with machine learning tools is that they focus exclusively on numerical data. They won’t give us the reasons why users behave in a certain way, so relying on them won’t replace traditional causal inference. In other words, quantitative research that answers the question “how much” has little significance without qualitative research that answers “why.”
The point is that data analysts shouldn’t exclusively rely on one of these two methods. Only by combining them will they obtain correct results that show user behavior and the reasons behind it.
Inference types
In the book “Product Analytics,” Joanne Rodrigues presents three inference types that allow us to evaluate the correctness of obtained conclusions.
Inference based on research
One way to determine whether our conclusions are correct is to run a controlled experiment with two variants, commonly known as A/B testing.
It’s a very popular method in all sorts of research, for example clinical trials, where one group of patients receives a drug and the other a placebo. The aim is to observe the effects in both groups and determine whether the drug is effective. In this case, the group that received the placebo serves as the counterfactual.
If we wanted to use this method for our contact form example, we could create two variants of the website. In the first version, the button leading to the form remains in its current location; in the second, it’s moved to a more visible place. Next, we show the unchanged version to one user group and the variant with the relocated CTA button to the other. Finally, we compare how often users in each group find the form and evaluate which version of the site brings better results.
A useful supplement to this study is click analysis (with tools that deliver data in real time), which shows which places on the page users interact with most often.
Conducting A/B testing allows us to observe a particular phenomenon under the same conditions while changing a single variable, so we can see how the outcome changes and which factors drive it. Of course, not all phenomena can be studied this way.
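To make this concrete, here’s a minimal sketch of how such a test could be evaluated with a two-proportion z-test, a standard statistical check (our own addition here, not a method prescribed in the book); the visit and inquiry counts are invented:

```python
# A minimal sketch: comparing two variants of the contact form page
# with a two-proportion z-test. All counts are invented.
from math import sqrt
from statistics import NormalDist

visits_a, inquiries_a = 500, 50   # variant A: original button placement
visits_b, inquiries_b = 500, 75   # variant B: more visible button

p_a = inquiries_a / visits_a
p_b = inquiries_b / visits_b

# Pooled conversion rate under the null hypothesis "no difference"
p_pool = (inquiries_a + inquiries_b) / (visits_a + visits_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / visits_a + 1 / visits_b))

z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test

print(f"conversion A: {p_a:.0%}, B: {p_b:.0%}, z = {z:.2f}, p = {p_value:.4f}")
```

With these made-up numbers, the p-value comes out around 0.017, which suggests the improvement is unlikely to be a coincidence and the relocated button genuinely performs better.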
Inference based on randomness
Inference based on randomness is a part of statistical inference, meaning we draw conclusions based on probability. The idea is that a particular event occurring purely by chance has a very low probability; if the event happened anyway, there is likely a specific reason behind it.
In her book, Joanne Rodrigues uses the example of an earthquake that occurs on a desert island every two years. Since the island is uninhabited, no one can confirm that an earthquake occurred, but we can infer it by observing and measuring algae blooms. Algae around the island increase a month after an earthquake, and such a spike is very unlikely to happen naturally, perhaps once in 100 years. The most probable hypothesis in this case is that an earthquake caused it.
The author suggests we can confirm conclusions obtained with this method using Bayes’ rule, which lets us update the probability of an event based on prior knowledge of the conditions related to it.
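As an illustration, here’s a minimal sketch of Bayes’ rule applied to the island example. The probabilities are assumptions chosen to fit the story (an earthquake roughly every two years, a natural spike roughly once a century), not figures from the book:

```python
# A sketch of Bayes' rule for the island example. All probabilities
# are illustrative assumptions, not values from the book.

p_quake = 1 / 24                    # prior: earthquake in a given month (one every two years)
p_bloom_given_quake = 0.9           # algae spike is very likely after a quake
p_bloom_given_no_quake = 1 / 1200   # natural spike: roughly once in 100 years of months

# Total probability of observing an algae bloom in a given month
p_bloom = (p_bloom_given_quake * p_quake
           + p_bloom_given_no_quake * (1 - p_quake))

# Posterior: probability that a quake occurred, given that we saw the bloom
p_quake_given_bloom = p_bloom_given_quake * p_quake / p_bloom

print(f"P(earthquake | algae bloom) = {p_quake_given_bloom:.3f}")  # about 0.979
```

Even though an earthquake is unlikely in any given month, observing the bloom pushes the probability that one occurred to nearly 98%, which is exactly why the earthquake hypothesis is the most probable one.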
This inference type won’t give us 100% certainty, but it points to conclusions about user behavior that have a high probability and, in most cases, will be correct.
Inference based on prediction
Inference based on prediction (predictive inference) is a handy method used to confirm conclusions. This type of inference is often used when we want to prepare for the future.
For example, we can use this inference type to predict the number of users who will use our application during its first year. This enables us to plan infrastructure that will sustain the traffic and hire enough staff to maintain it.
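As a simple illustration, here’s a minimal sketch that projects month-12 usage from a few observed months with a linear trend. The monthly figures are invented, and a real forecast would usually rely on richer models and more data:

```python
# A minimal sketch of predictive inference: projecting first-year user
# numbers from a few observed months with a simple linear trend.
# The figures below are invented for illustration.

months = [1, 2, 3, 4, 5]                # observed months
users = [1200, 1900, 2500, 3300, 4000]  # monthly active users

# Least-squares fit: users ~ slope * month + intercept
n = len(months)
mean_m = sum(months) / n
mean_u = sum(users) / n
slope = (sum((m - mean_m) * (u - mean_u) for m, u in zip(months, users))
         / sum((m - mean_m) ** 2 for m in months))
intercept = mean_u - slope * mean_m

# Project month 12 to plan infrastructure capacity and staffing
projected = slope * 12 + intercept
print(f"projected users in month 12: {projected:.0f}")  # about 8880
```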
User behavior analysis: summary
Monitoring users and analyzing their behavior gives us valuable data that provides insight into user interaction with a product.
The correct interpretation of user behavior allows us to increase user engagement, the profits a product generates, and the quality of the user experience, as well as optimize conversion, attract new users, and more.
Data analysts shouldn’t stop at preparing descriptive assessments of data. They shouldn’t come up with false narratives or link facts that they can’t prove.
Moreover, they shouldn’t focus too much on the data alone, because numbers can’t give us the reasons behind user behavior. Machine learning tools can speed up some processes, but they shouldn’t make up the analyst’s entire toolkit. It’s worth remembering that causal inference is still a very important pillar of the analyst’s work.
Additionally, different inference types can help us evaluate the correctness of our conclusions.
When analyzing user behavior, it’s not only important to observe a particular behavior but also to understand what context and variables influenced and caused it. Only in this way will we be able to create and improve applications effectively.