
How to handle missing data

Missing data is a recurring challenge in clinical research, as any study can face the absence of experimental observations, potentially undermining the reliability of analyses. Understanding the frequency and nature of missing data is essential for implementing the most appropriate data management strategies.

 

 


Ennio Russo

Medical Writing & Scientific Communication Executive, Ph.D.


In any clinical study, researchers often encounter datasets with missing observations, commonly referred to as “missing data.” Most standard statistical methods require complete information on every study variable for every observation. Managing missing data is therefore crucial, because ignoring them can lead to distorted and unreliable results.

Understanding Missing Data

The first step in managing missing data is to understand how frequently they occur. Intuitively, handling a dataset where missing data represent a small percentage is quite different from dealing with a dataset with a significant amount of missing data.
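As a minimal sketch of this first step, the share of missing values can be tabulated per variable, together with the share of fully complete cases. The dataset, the variable names (age, sbp, hba1c) and the use of None to mark a missing value are illustrative assumptions, not part of any real study.

```python
# Hypothetical toy dataset: one dict per participant, None marks a missing value.
records = [
    {"age": 54, "sbp": 132.0, "hba1c": 6.9},
    {"age": 61, "sbp": None, "hba1c": 7.4},
    {"age": 47, "sbp": 128.0, "hba1c": None},
    {"age": 70, "sbp": None, "hba1c": None},
]


def missing_fraction(rows, variable):
    """Share of rows in which `variable` is missing (None or absent)."""
    missing = sum(1 for row in rows if row.get(variable) is None)
    return missing / len(rows)


def missingness_report(rows):
    """Missing fraction for every variable that appears in the dataset."""
    variables = sorted({key for row in rows for key in row})
    return {v: missing_fraction(rows, v) for v in variables}


def complete_case_fraction(rows):
    """Share of rows with no missing value in any variable."""
    complete = sum(1 for row in rows if all(v is not None for v in row.values()))
    return complete / len(rows)


report = missingness_report(records)
complete = complete_case_fraction(records)

for variable, fraction in report.items():
    print(f"{variable}: {fraction:.0%} missing")
print(f"complete cases: {complete:.0%}")
```

A report of this kind makes it easy to see at a glance whether missing values are a marginal issue or affect a large part of the dataset.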

 

Next, it is essential to understand why the data are missing. This is key to interpreting results, because it lets researchers determine whether the missing data arise purely by chance or are associated with specific experimental factors. On this basis, missing data can be classified into three main categories:

  1. Missing Completely At Random (MCAR): The probability that a value is missing is the same for all observations and is not related to any study variable, observed or unobserved.
  2. Missing At Random (MAR): The probability that a value is missing is related to other observed variables, but not to the missing value itself once those variables are taken into account.
  3. Missing Not At Random (MNAR): The probability that a value is missing depends on the unobserved value itself, even after accounting for the observed study variables.
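The practical difference between the three categories can be illustrated with a small simulation. The sketch below is only an illustration under invented assumptions: the variables (age and systolic blood pressure, sbp), the sample size and the missingness probabilities are all hypothetical.

```python
import random
from statistics import mean

random.seed(0)
N = 20_000

# Simulated "true" data: blood pressure rises with age, plus random noise.
ages = [random.uniform(40, 80) for _ in range(N)]
sbp = [100 + 0.5 * a + random.gauss(0, 10) for a in ages]


def mask(values, prob_missing):
    """Copy of `values` where item i becomes None with probability prob_missing(i)."""
    return [None if random.random() < prob_missing(i) else v
            for i, v in enumerate(values)]


# MCAR: every value has the same chance of being missing.
mcar = mask(sbp, lambda i: 0.3)
# MAR: missingness depends on an observed variable (age), not on sbp itself.
mar = mask(sbp, lambda i: 0.6 if ages[i] > 60 else 0.1)
# MNAR: missingness depends on the (unobserved) sbp value itself.
mnar = mask(sbp, lambda i: 0.6 if sbp[i] > 130 else 0.1)


def observed_mean(values, keep=lambda i: True):
    """Mean of the non-missing values among the rows selected by `keep`."""
    return mean(v for i, v in enumerate(values) if v is not None and keep(i))


def older(i):
    return ages[i] > 60


true_mean = mean(sbp)
true_mean_older = observed_mean(sbp, older)

print("overall mean  true / MCAR / MAR / MNAR:",
      round(true_mean, 1), round(observed_mean(mcar), 1),
      round(observed_mean(mar), 1), round(observed_mean(mnar), 1))
print("older stratum true / MAR / MNAR:",
      round(true_mean_older, 1), round(observed_mean(mar, older), 1),
      round(observed_mean(mnar, older), 1))
```

In this simulation the observed mean under MCAR stays close to the true mean, while under MAR and MNAR the overall observed mean is pulled downward. Under MAR the distortion disappears once the analysis conditions on age (the mean within the older stratum is recovered), whereas under MNAR it persists even within the stratum.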

Managing Missing Data

Ideally, the best way to manage missing data is to prevent them from occurring in the first place. This requires careful study design and accurate data collection. For example, reducing the number of follow-up visits, collecting only essential information at each visit, and designing easy-to-complete forms can all help minimize missing data. Before starting a clinical study, it is advisable to prepare a detailed protocol that covers participant screening, training for researchers and participants, communication among the parties involved, and monitoring of the collected data. It is also possible to define in advance what level of missing data will be considered acceptable.

There are various techniques to handle missing data, fundamentally falling into two approaches: either deleting observations or imputing missing values. Here are some techniques available to researchers:

 

  • Listwise Deletion: This method removes cases with missing data and analyzes only the remaining complete data. If the assumption of MCAR is met, this method can produce unbiased estimates, although the smaller sample reduces statistical power.
  • Pairwise Deletion: This method uses the available data for each specific analysis, preserving more information than listwise deletion. However, because each analysis may be based on a different subset of participants, the resulting estimates can be inconsistent with one another and may lead to analytical issues.
  • Mean Substitution: Missing values are replaced with the mean of the observed values of the variable. This adds no new information: it artificially reduces the variability of the variable, so standard errors tend to be underestimated, and it can introduce bias into the estimates.
  • Regression Imputation: This method estimates missing values using other variables through regression analysis. It allows for more data retention compared to deletion methods.
  • Last Observation Carried Forward (LOCF): Each missing value is replaced with the last known observation for that subject. While simple, this method can produce biased estimates of treatment effects.
  • Maximum Likelihood: This method uses all the observed data to estimate the parameters of an assumed statistical model, from which the missing data can then be estimated. It can be computationally demanding and may yield biased estimates if its assumptions are not met.
  • Multiple Imputation: This technique replaces missing data with several plausible values, generating multiple complete datasets. The results of analyses on these datasets are then combined to obtain a final estimate. It is a robust method that can produce valid estimates even with a small sample or a high number of missing values, provided the imputation model is appropriate.
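Some of the simpler techniques in the list above can be sketched in a few lines. The functions below are illustrative only: the function names, the toy values and the use of None for a missing value are assumptions made for this example, and the regression step uses a plain least-squares line with a single predictor.

```python
from statistics import mean, stdev


def listwise_deletion(rows):
    """Keep only rows in which no variable is missing (None)."""
    return [row for row in rows if all(v is not None for v in row.values())]


def mean_substitution(values):
    """Replace each missing value with the mean of the observed values."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


def locf(visits):
    """Carry the last observed value forward; leading gaps stay missing."""
    filled, last = [], None
    for v in visits:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def regression_imputation(x, y):
    """Fill missing y values from a least-squares line fitted on complete pairs."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(p[0] for p in pairs)
    my = mean(p[1] for p in pairs)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in pairs)
             / sum((xi - mx) ** 2 for xi, _ in pairs))
    intercept = my - slope * mx
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]


rows = [{"age": 54, "sbp": 132.0}, {"age": 61, "sbp": None}]
complete_rows = listwise_deletion(rows)

values = [5.0, None, 7.0, None, 9.0]
imputed = mean_substitution(values)

carried = locf([120, None, None, 110, None])
predicted = regression_imputation([1, 2, 3, 4], [2.0, 4.0, None, 8.0])

print(complete_rows)
print(imputed, "spread before:", stdev([5.0, 7.0, 9.0]), "after:", stdev(imputed))
print(carried)
print(predicted)
```

In the mean substitution example, the standard deviation of the filled-in series is smaller than that of the observed values alone, which illustrates how this technique shrinks the apparent variability of the data.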

 

The choice of method should be evaluated by the researcher in relation to the experimental needs and characteristics of the missing data.
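For multiple imputation, the step in which results from the imputed datasets are combined is commonly carried out with Rubin's rules: the pooled estimate is the average of the per-dataset estimates, and its variance adds the average within-dataset variance to the between-dataset variance, inflated by a factor of (1 + 1/m) for m imputations. The sketch below covers only this pooling step; the function name and the numbers are hypothetical, and producing the imputed datasets themselves is assumed to have been done elsewhere.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances with Rubin's rules.

    estimates: the estimate obtained from each imputed dataset
    variances: the squared standard error of each of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)       # average of the m estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 imputed datasets.
pooled_estimate, total_variance = pool_rubin(
    estimates=[10.0, 12.0, 11.0],
    variances=[4.0, 4.0, 4.0],
)
print(pooled_estimate, total_variance)
```

Because the between-imputation variance enters the total, the pooled standard error reflects the uncertainty due to the missing values, which single imputation methods ignore.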

Conclusion

Missing data present a significant challenge in clinical research, as they can compromise the reliability and validity of analyses. Understanding the nature and frequency of missing data is essential to adopt the best management strategies. Preventing missing data through careful study design and attentive data collection is a crucial first step. If missing data are present, researchers have several techniques at their disposal to manage them, adapting their approach based on the nature of the missing data.

Further Reading:

– Kang H. The prevention and handling of the missing data. *Korean J Anesthesiol*. 2013;64(5):402-406. doi:10.4097/kjae.2013.64.5.402.

– Heymans MW, Twisk JWR. Handling missing data in clinical research. *J Clin Epidemiol*. 2022;151:185-188. doi:10.1016/j.jclinepi.2022.08.016.

 

 
