Treatment comparisons for pretest – posttest experimental studies

Recently, I faced a question on the statistical analysis method for an animal study testing lipid lowering agents. The study was designed along the following lines. Animals were randomized to receive one of the study agents. Prior to initiation of treatment, lipid levels (LDL, HDL and total cholesterol) were measured (baseline). Study drugs were administered daily for xx days and the lipid levels were measured again at the end of the study (post treatment).

This kind of study design is often labelled as a pretest – posttest design and is quite common in the medical field for comparing different treatments.

In many clinical studies of cholesterol lowering agents, the percent change from baseline is analyzed for between group differences. Hence, I suggested an ANOVA with percent change from baseline as the outcome measure as the statistical analysis method for the above animal experiment. On further scanning of the literature, however, I noted that despite the widespread use of the pretest – posttest experimental design, there is a lack of consensus on the best method for the data analysis.

The possible outcome measures for the analysis of data from a pretest – posttest study include the post-treatment values (PV), the change between baseline and post-treatment values, referred to in the literature as change scores or gain scores (DIFF), or any measure of relative change between baseline and post-treatment values. Percentage change from baseline (PC), mentioned above, is one measure of relative change; others include symmetrized percent change (SPC) and the log ratio of post-treatment to baseline (LR). Furthermore, the methods for statistical analysis include parametric and non-parametric versions of ANOVA or ANCOVA on any of the above outcome measures. So there are indeed many possibilities for the analysis of data from a pretest – posttest study!
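To make the outcome measures listed above concrete, here is a minimal Python sketch that computes each of them for a single animal. The function name and the example values are hypothetical. The SPC formula used, 100 × (post − baseline) / (post + baseline), is my reading of the definition in Berry & Ayers and should be checked against the paper before use.

```python
import math


def outcome_measures(baseline, post):
    """Return the candidate outcome measures for one animal.

    Assumes strictly positive measurements (as for lipid levels), so that
    the ratios and the logarithm are defined.
    """
    if baseline <= 0 or post <= 0:
        raise ValueError("baseline and post must be positive")
    return {
        "PV": post,                                         # post-treatment value
        "DIFF": post - baseline,                            # change (gain) score
        "PC": 100.0 * (post - baseline) / baseline,         # percent change
        "SPC": 100.0 * (post - baseline) / (post + baseline),  # assumed SPC form
        "LR": math.log(post / baseline),                    # log ratio
    }


# Hypothetical example: LDL falls from 120 to 90 (arbitrary units).
m = outcome_measures(baseline=120.0, post=90.0)
for name, value in m.items():
    print(name, round(value, 3))
```

Note that swapping baseline and post only flips the sign of DIFF, SPC and LR, whereas PC changes in magnitude as well (a fall from 120 to 90 is -25%, but a rise from 90 to 120 is about +33%). This asymmetry of PC is what the symmetrized version is meant to address.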

Various simulation studies provide pointers to guide the choice of the outcome measure and analysis method. The current thinking seems to be that an ANCOVA on PV has higher power than a simple ANOVA on PC, especially when there is little correlation between baseline and post-treatment values. However, SPC with either the ANOVA or ANCOVA seems to be a good option in the case of additive or multiplicative correlation between baseline and post-treatment values.
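The power comparison between ANCOVA on PV and ANOVA on PC can be explored with a small simulation. The sketch below is not the setup used in the published studies. It assumes two arms, normally distributed data, an additive correlation rho between baseline and post-treatment values, and a fixed cut-off of 2.0 as an approximation to the two-sided 5% critical value of the t distribution. All function names and parameter values (sample size, means, effect size) are hypothetical.

```python
import math
import random
import statistics


def simulate_trial(rng, n, rho, effect, mean=100.0, sd=15.0):
    """Simulate one two-arm trial; return (group, baseline, post) lists."""
    group, base, post = [], [], []
    for g in (0, 1):                      # 0 = control, 1 = treated
        for _ in range(n):
            b = rng.gauss(mean, sd)
            noise = rng.gauss(0.0, sd * math.sqrt(1.0 - rho ** 2))
            # Additive model: post correlates with baseline at level rho.
            p = mean + rho * (b - mean) + noise + effect * g
            group.append(g)
            base.append(b)
            post.append(p)
    return group, base, post


def ancova_stat(group, base, post):
    """t statistic for the group effect in post ~ group + baseline."""
    xs = {g: [b for gi, b in zip(group, base) if gi == g] for g in (0, 1)}
    ys = {g: [p for gi, p in zip(group, post) if gi == g] for g in (0, 1)}
    mx = {g: statistics.fmean(xs[g]) for g in (0, 1)}
    my = {g: statistics.fmean(ys[g]) for g in (0, 1)}
    # Pooled within-group sums of squares and cross-products.
    sxx = sum((x - mx[g]) ** 2 for g in (0, 1) for x in xs[g])
    syy = sum((y - my[g]) ** 2 for g in (0, 1) for y in ys[g])
    sxy = sum((x - mx[g]) * (y - my[g])
              for g in (0, 1) for x, y in zip(xs[g], ys[g]))
    slope = sxy / sxx                     # common within-group slope
    n0, n1 = len(xs[0]), len(xs[1])
    mse = (syy - slope * sxy) / (n0 + n1 - 3)
    est = (my[1] - my[0]) - slope * (mx[1] - mx[0])   # adjusted difference
    se = math.sqrt(mse * (1 / n0 + 1 / n1 + (mx[1] - mx[0]) ** 2 / sxx))
    return est / se


def pc_stat(group, base, post):
    """Pooled two-sample t statistic on percent change (two-group ANOVA)."""
    pc = {g: [100.0 * (p - b) / b
              for gi, b, p in zip(group, base, post) if gi == g]
          for g in (0, 1)}
    n0, n1 = len(pc[0]), len(pc[1])
    m0, m1 = statistics.fmean(pc[0]), statistics.fmean(pc[1])
    ss = sum((v - m0) ** 2 for v in pc[0]) + sum((v - m1) ** 2 for v in pc[1])
    se = math.sqrt(ss / (n0 + n1 - 2) * (1 / n0 + 1 / n1))
    return (m1 - m0) / se


def estimate_power(n_sims=2000, n=30, rho=0.3, effect=8.0, crit=2.0, seed=1):
    """Share of simulated trials in which each analysis exceeds the cut-off."""
    rng = random.Random(seed)
    hits_ancova = hits_pc = 0
    for _ in range(n_sims):
        trial = simulate_trial(rng, n, rho, effect)
        hits_ancova += abs(ancova_stat(*trial)) > crit
        hits_pc += abs(pc_stat(*trial)) > crit
    return hits_ancova / n_sims, hits_pc / n_sims


power_ancova, power_pc = estimate_power()
print(f"ANCOVA on PV: {power_ancova:.2f}, ANOVA on PC: {power_pc:.2f}")
```

Rerunning estimate_power with different values of rho gives a feel for how the gap between the two analyses depends on the baseline to post-treatment correlation. A multiplicative correlation structure, or SPC as the outcome, would need a different data-generating model and is not covered by this sketch.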

The simulation studies have tried to mimic various real scenarios. Vickers has studied in detail the situation where the outcome is continuous and there is an additive correlation between post treatment and baseline values. SPC as an outcome measure was not studied (Vickers A. J. The use of percentage change from baseline as an outcome in a controlled trial is statistically inefficient: a simulation study. BMC Medical Research Methodology (2001) 1:6). Meanwhile, Berry & Ayers consider count data, include SPC as an outcome measure and consider parametric and non-parametric analysis methods in the presence of additive and multiplicative correlation between baseline and post treatment values (Berry D. A. & Ayers G. D. Symmetrized Percent Change for Treatment Comparisons. The American Statistician (2006) 60:1, 27-31).

At the time of design of the statistical analysis plan, there is bound to be little information available on the extent and type of correlation that may exist between the baseline and post treatment values. Hence there is a need to conduct extensive review and/or simulations, taking into account also the type of data and correlation structures encountered in different therapeutic areas, to identify (therapeutic area specific!) best practices for the choice of the outcome measure and analysis method for the pretest – posttest design.

Why Statistics?

Some time back I gave a talk on the topic ‘Why Statistics?’ in the course of a workshop on Clinical Research Methodology. Having found no crisp answers to the question while researching for the talk, and considering the importance of the topic, I thought ‘Why Statistics?’ would also be a good theme for a first blog post.

‘Statistics’ is a term used to refer both to the subject of statistics as well as to data and data summaries. I plan to discuss these different definitions of ‘statistics’ in a future post.

Statistics is a scientific discipline that is important not just in clinical research. Nowadays, with the increased emphasis placed on data analytics and evidence based decisions in research and business, an awareness of the importance and the right use of statistical methods becomes crucial in every domain of application.

As a scientific discipline, statistics can be defined as “the science of collecting, analyzing, presenting, and drawing inference from data”.

We should probably rewrite this definition as “…drawing inference from incomplete data”, because most often we use data from a random sample of a large population to draw conclusions about the population. Moreover, since different random samples drawn from the same population may give slightly different results due to the sampling variability, the definition of statistics can be further expanded as “the science of collecting, analyzing, and drawing inference from incomplete data, in the presence of variability”.

Why do we need to have a basic understanding of statistics?

Nowadays, in our professional as well as personal lives, we are constantly bombarded with ‘statistics’ (here statistics = data + analysis), whether in the course of our work or by the media.

We need a good understanding of statistics so that we are in a position to look critically at the origin of the data (design of study), the data themselves, the analysis of the data and the inference. Most importantly, we also need to know that, due to the sampling variability, there will always be a certain amount of uncertainty in the inference from a statistical analysis. We need to keep this uncertainty in mind to take appropriate informed decisions. Unfortunately, most often, this uncertainty either does not find its way into reports, especially media reports, or is not given sufficient importance (a nice cartoon on this was posted in The LoveStats Blog).

To put it in a nutshell, a right understanding of the science of statistics is essential to filter the truth from the lies and the damned lies!
