The validity, reliability, and credibility of qualitative research are issues that are rarely raised in UX research discussions.
We assume that since something was studied, the results of such research are indisputable, trustworthy, and non-negotiable.
Attitudes toward research results, qualitative and quantitative alike, can be too uncritical. Many people's stance can be summarized in a simple sentence: Data isn't debatable; data is obtained and used.
But there are two sides to every coin. The mere fact that research was carried out doesn't excuse its questionable quality. It's worth remembering the popular maxim, according to which:
The only thing worse than a lack of research is unreliable, non-credible research, whose methodology, method of execution, and thus results arouse justified skepticism.
UX research and usability tests make sense only if they're designed, conducted, analyzed, and reported according to academic methodological standards.
In the social sciences, much attention is paid to methodological correctness and the quality of results.
Methodologists and researchers put much effort into improving research and data collection methods.
It's important to remember that every UX study and every research method (especially in qualitative research) is rooted in academically developed tools, methods, and standards. Therefore, it's worth taking advantage of this potential.
So, what are the internal and external validity of UX research?
How should research be prepared and conducted to have high confidence in the obtained results?
How is quality defined and achieved in qualitative research?
If you are ready for some practical methodological knowledge, we invite you to read!
What is quality in qualitative research?
The Cambridge Dictionary defines quality as a characteristic or feature of something that distinguishes it from others.
In his book Designing Qualitative Research, Uwe Flick very accurately defines quality in the context of qualitative research.
By the way, Uwe Flick's book is a very valuable compendium of knowledge that every UX researcher should use.
According to his approach, the quality of research is one of the main, or perhaps the most important, aspects of the project design stage.
Quality is strongly linked to the standardization and design of the research situation and the factors that influence it.
In his argument, Flick very aptly notes that if researchers can control and remove factors that interfere with the study — external and internal — they will obtain much more accurate, reliable, and objective results.
That will be useful in the design process and the development of a digital product.
We can define quality in qualitative UX research as the attention qualitative researchers pay to designing a study, conducting it, analyzing the data, reporting, and communicating the results.
Naturally, such a focus on quality, on ensuring that a study produces the most reliable results possible, also directs attention to the researcher, in particular:
- Their place, role, and limitations in the research process.
- Their ability to see their own mistakes and limitations of their own perspective.
- Their role in testing validity and validating quality in the research plan in each of its stages.
- Giving credibility validation the same status and priority as research goals, respondent selection criteria, research methods, organization, and time frame.
How to achieve high quality at the design stage of UX research?
Flick has no doubts about this: quality at this stage is ensured by a well-thought-out, justified, and adequate choice of research method. In practice, this means that the researcher's choices should be as non-arbitrary as possible.
In other words, the choice of research method cannot be dictated by the following:
- Convenience
- Personal preferences, likes, or dislikes
- Desire to speed up research, to make it simpler or less expensive
In addition, the researcher needs to carefully consider the chosen method's adequacy. They need to be sure that the method selected will be the best one to capture the fragment of reality that will be studied.
Adequacy is achievable through knowledge and acquired experience, but also through the approach of the researcher, who should, according to Flick, remain open to diversity.
The idea is to look for experiences and ways to study them that differentiate respondents, not just those they share.
At the stage of research implementation, quality will manifest itself primarily in:
- Rigor — the application of a given method in accordance with the art of research and with consistency.
- Creativity — searching for new answers, observations, and inspiration rather than confirming established assumptions.
- Consistency — e.g., interviews are easier to compare and generalize if they've been done consistently.
- Flexibility — the ability to balance consistency, rigor, and creativity.
What are reliability and validity in UX?
Here, we come to the problem of reliability, which Pallabi Roy Singh, in an article titled "Reliability and validity: ensuring a Foolproof UX research plan," defines as the need to ensure the repeatability and reproducibility of research.
In other words, the idea is that other researchers can replicate the research study by conducting it in the same way, under the same conditions, with respondents sharing the same characteristics that were included in the study, and get the same results.
In turn, Pallabi Roy Singh defines validity as the degree to which the chosen research method measures what the researcher wants it to measure. The choice of research method is crucial for the high accuracy of study findings.
The reliability of research and its accuracy are essential for the overall success of a research project.
As Raluca Budiu, a researcher associated with the Nielsen Norman Group, teaches in an article titled "Internal vs. External Validity of UX Research," whenever we prepare UX research (quantitative or qualitative), there is a high risk that the results won't reflect reality.
All because of a flawed research design.
We can also consider validity and reliability in qualitative research from a different perspective, as John W. Creswell proposed in his book "Research Design. Qualitative, Quantitative and Mixed Methods Approaches."
According to Creswell, validity in qualitative research has different connotations than in quantitative research.
It's not directly related to reliability or generalizability. The validity of qualitative research means that the researcher has checked the accuracy of the results using specific procedures.
In turn, Creswell believes that reliability should be understood as the consistency of the researcher's approach across their own current and past projects, as well as with research done by other researchers.
Hence, it's essential in qualitative research to document procedures and create detailed protocols and research databases.
Pallabi Roy Singh, quoted above, distinguishes three methods that allow research results to be considered reliable:
- Test-Retest Reliability
- Parallel Forms Reliability
- Inter-Rater Reliability
Test-Retest Reliability means that similar results are obtained when the same test is given to the same respondents on two separate occasions. Test-Retest works best as a reliability estimator when surveys or questionnaires are used in research.
The Parallel Forms Reliability formula compares two equivalent test forms that measure the same attribute.
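Both of these estimators are commonly expressed as a correlation between two sets of scores. The sketch below is one possible way to compute it, assuming a Pearson correlation is an acceptable estimator for the data at hand; the function name and the questionnaire scores are hypothetical and made up for illustration.

```python
from math import sqrt

def score_correlation(first, second):
    """Pearson correlation between two sets of scores from the same respondents."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("Both score lists must cover the same respondents")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    return covariance / sqrt(spread_first * spread_second)

# Hypothetical questionnaire scores for the same six respondents,
# collected in two sessions (illustration data, not real results).
first_session = [72.5, 65.0, 80.0, 55.0, 90.0, 67.5]
second_session = [70.0, 67.5, 82.5, 52.5, 87.5, 70.0]

r = score_correlation(first_session, second_session)
print(f"Correlation between sessions: {r:.2f}")
```

A value close to 1 suggests that respondents were ranked similarly both times. For parallel forms, the two lists would hold scores from the two equivalent versions of the test instead of two sessions.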
Inter-rater reliability is a measure of reliability used to assess the degree to which different researchers agree on their assessment decisions.
This reliability estimator is recommended for quality validation of observational, field, and contextual studies.
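One common way to quantify agreement between two researchers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article cited above does not prescribe a specific statistic, so treat the following as a minimal sketch; the observation codes and the two researchers' ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same observations."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same, non-empty set of observations")
    n = len(rater_a)
    # Share of observations on which both raters chose the same code
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, based on how often each rater used each code
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two researchers to ten observed task attempts
researcher_1 = ["success", "success", "error", "hesitation", "success",
                "error", "success", "hesitation", "error", "success"]
researcher_2 = ["success", "success", "error", "success", "success",
                "error", "success", "hesitation", "hesitation", "success"]

kappa = cohens_kappa(researcher_1, researcher_2)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.67"
```

Here the researchers agree on 8 of 10 observations, but part of that agreement would occur by chance, which is why kappa is lower than the raw 80%.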
What are internal and external validity errors?
As we've already observed, validity and reliability are two distinct concepts that shouldn't be confused.
Reliable research means that its results will be similar if we repeat it. Thanks to this, we'll be sure that the result isn't coincidental.
However, it's worth remembering that research with high reliability and low validity measures something consistently, just not what we wanted it to measure.
How to deal with this problem? The answer is to pay attention to internal and external validity.
The internal validity of research means that neither the researcher nor the procedure favors any possible results (e.g., reactions, behavior).
To put it a little differently, internal validity is the certainty that the cause-and-effect relationship observed in the course of the research cannot be explained in another way, by other variables or factors.
We should mention the remark made by Arlin Cuncic in the article "Understanding Internal and External Validity. How These Concepts Are Applied in Research".
Research can be considered internally valid if we can eliminate alternative explanations for the results.
We can establish cause-and-effect if the research meets the following three criteria:
- Cause precedes effect in terms of time.
- Cause and effect vary together (a change in the cause is accompanied by a change in the effect).
- There are no other likely explanations for this relationship.
For example, suppose we want to research whether a screen's diagonal influences the speed at which a user performs a task. We divide the respondents into two groups, A and B, each using a different screen size, and then test each group at a different time of day (e.g., group A early in the morning and group B late in the evening). That will cause the results to have low internal validity.
The differences in performance and the speed of execution of the task may be caused by the size of the screen diagonal but also by another variable, such as the time of day, which can affect the result.
Research conducted in this way will have low internal validity because we can't establish the cause-and-effect relationship between task completion speed and a screen's diagonal size without a doubt.
The relationship between these variables may or may not occur, or it may occur and be further conditioned by yet another variable (or variables) that distorts the result.
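One way to keep time of day from being confounded with screen size is to spread every screen condition evenly across the time slots. The sketch below shows such a balanced, randomized assignment; the function name, participant IDs, screen labels, and time slots are all hypothetical.

```python
import random
from collections import Counter
from itertools import product

def balanced_assignment(participants, screens, time_slots, seed=None):
    """Randomly assign participants so each screen is tested in each time slot."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Every combination of screen size and time slot is one "cell"
    cells = list(product(screens, time_slots))
    # Deal the shuffled participants round-robin into the cells
    return {person: cells[i % len(cells)] for i, person in enumerate(shuffled)}

participants = [f"P{i}" for i in range(1, 9)]  # hypothetical participant IDs
assignment = balanced_assignment(
    participants, ["13-inch", "27-inch"], ["morning", "evening"], seed=7
)

for person, (screen, slot) in sorted(assignment.items()):
    print(person, screen, slot)
print(Counter(assignment.values()))  # each screen/time-slot cell gets 2 people
```

With this design, a difference between screen sizes can no longer be attributed simply to one group being tested in the morning and the other in the evening.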
The external validity of the research means that the participants and the research conditions are representative, reflecting a fragment of reality.
You can also encounter the term ecological validity. However, it is more commonly used in psychological research and has little to do with user experience.
Coming back to our example:
If the research is conducted in a laboratory under different conditions from typical user experience and behavior, it will have low external validity.
For instance, the distance from the user's monitor will be unnaturally long, the light level may be far from usual, and the device itself and its method of operation may be unknown to the respondent.
The difference may also stem from the sheer influence of researchers on a respondent (e.g., intrusive, distracting verbal and behavioral interference).
In summary, to ensure good external validity, you should make sure that the study environment is as close as possible to users' actual experiences. High external validity should also mean that the study's findings can be used in other contexts.
Factors that can disrupt external validity most often include:
- Research conditions that deviate far from natural conditions
- Measuring tools (e.g., Head-Stabilized Eye Tracking)
Factors that can disrupt internal validity include:
- Researcher bias — personal beliefs of researchers can affect the data collected from respondents and prompt them to draw wrong conclusions.
- Repeated testing — if you're using the same test to study a respondent repeatedly, then they will naturally get better at it.
- Allowing members of the control group to interact with other participants.
High internal validity in qualitative research can be achieved by:
- Using a random order of tasks
- Controlling the configuration of the study from one session to another
- Detecting and neutralizing variables that distort the results.
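The first point, a random order of tasks, can be sketched as follows. This is only an illustration: the task names, the participant ID format, and the study seed are hypothetical. Seeding the generator per participant keeps each order reproducible, which helps when documenting the procedure.

```python
import random

# Hypothetical usability tasks for illustration
TASKS = ["find product", "add to cart", "apply coupon", "check out"]

def task_order_for(participant_id, tasks, study_seed=2024):
    """Return a shuffled task order that is stable for a given participant."""
    # A string seed makes the order reproducible across runs and machines
    rng = random.Random(f"{study_seed}:{participant_id}")
    order = list(tasks)  # copy so the original task list stays untouched
    rng.shuffle(order)
    return order

for pid in ["P01", "P02", "P03"]:
    print(pid, task_order_for(pid, TASKS))
```

Randomizing the order spreads learning and fatigue effects across tasks instead of letting them always favor or penalize the same task.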
High external validity in qualitative research can be achieved by:
- Defining precisely the criteria for inclusion/exclusion of respondents
- Utilizing statistical methods to regulate any issues (e.g., caused by uneven respondent groups)
- Recruiting respondents that are representative of the target group
- Ensuring that the research conditions, context, and course correspond as much as possible to natural and real processes, conditions, and context.
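As one example of a statistical correction for uneven respondent groups, post-stratification weights rescale each group so that its share in the sample matches its share in the target population. The text above does not name a specific method, so this is an assumption chosen for illustration; the group names, sample counts, and population shares are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Hypothetical study: mobile users are under-represented in the sample
sample_counts = {"mobile": 4, "desktop": 16}
population_shares = {"mobile": 0.6, "desktop": 0.4}  # assumed target audience mix

weights = poststratification_weights(sample_counts, population_shares)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

Each mobile respondent then counts for more and each desktop respondent for less when results are aggregated. With very small groups, such weights amplify individual respondents, so they are better suited to the quantitative parts of a study.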
Internal and external validity in UX research. Summary
- A popular myth and approach to UX research is the belief that data isn't debatable; data is acquired and used.
- It is worth remembering that unreliable and non-credible research is the only thing worse than a lack of research. The validity and reliability of qualitative research are essential for making good business decisions.
- Every research method used in UX research (e.g., UX testing) is rooted in tools, methods, techniques, and standards developed by scientists.
- The quality of research is one of the main aspects to be considered at every stage of the research project.
- Quality is strongly linked to the standardization and design of the research situation and the factors that influence it. Qualitative research is particularly sensitive in this respect.
- Researchers who can control and remove factors interfering with the research — external and internal — can obtain much more relevant, reliable, and objective results.
- Quality is defined as the attention researchers pay to designing the research, conducting it, analyzing the data, reporting, and communicating results.
- The researcher also influences the results' quality, reliability, credibility, and validity, and they must be aware of their influence on them.
- Reliability is defined as the need to ensure the repeatability and reproducibility of research.
- Validity is the degree to which the chosen research method measures what the researcher wants it to measure.
- The choice of research method is crucial for the high accuracy of study findings.
- Paying attention to quality in qualitative research reduces the risk that the study's results don't reflect reality.
- Validity in qualitative research isn't directly related to reliability or generalizability.
- The validity of qualitative research means that the researcher has checked the accuracy of the results using specific procedures.
- Test-Retest Reliability means that similar results are obtained when the same test is given to the same respondents on two separate occasions. Test-Retest works best as a reliability estimator when surveys or questionnaires are used in research.
- The Parallel Forms Reliability formula compares two equivalent test forms that measure the same attribute.
- Inter-rater reliability is a measure of reliability used to assess the degree to which different researchers agree on their assessment decisions. It's recommended for quality validation of observational, field, and contextual research.
- Research with high reliability and low validity measures something consistently, just not what we wanted it to measure.
- The internal validity of research means that neither the researcher nor the procedure favors any possible results.
- Internal validity is the certainty that the cause-and-effect relationship observed in the course of the research cannot be explained in another way, by other variables or factors.
- We can establish cause-and-effect if the research meets the following three criteria: cause precedes effect in terms of time, cause and effect vary together, and there are no other likely explanations for this relationship.
- The external validity of the research means that the participants and the study's conditions are representative, reflecting a fragment of reality.
- We can achieve qualitative research's high internal validity by using a random order of tasks, controlling the study's configuration from one session to another, and detecting and neutralizing variables that distort the results.