Spaghetti plots with ggplot2 and ggvis

This post was motivated by this article that discusses the graphics and statistical analysis for a two-treatment, two-period, two-sequence (2x2x2) crossover drug interaction study of a new drug versus the standard. I wanted to write about implementing those graphics and the statistical analysis in R. This post is devoted to the different ways of generating the spaghetti plot in R; the statistical analysis will follow in the next post.

Spaghetti plots are often used to visualize repeated measures data. These graphs can be used to visualize time trends or to visualize the outcome of different treatments on the same subjects, as in Figure 3 of the article above. Briefly, in spaghetti plots, the responses for the same subject, either over time or over different treatments, are connected by lines to show the subject-wise trends. Sometimes, different line types or colors are used to distinguish each subject profile. The plot looks like a plate of spaghetti, which is probably the reason for the name.
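As a minimal sketch of the idea, the following draws a spaghetti plot from simulated data. The data frame toy and its columns are made up for illustration and are not the dataset used in the rest of this post; the key point is that group = ID asks ggplot2 to draw one line per subject.

```r
library(ggplot2)

## Simulated repeated-measures data (hypothetical, not the ocdrug data):
## 6 subjects, each measured under two treatments
set.seed(1)
toy <- data.frame(
  ID   = rep(1:6, each = 2),
  Tmnt = factor(rep(c("A", "B"), times = 6)),
  resp = rnorm(12, mean = 100, sd = 15)
)

## group = ID tells geom_line() to connect the points of each subject separately
p_toy <- ggplot(toy, aes(x = Tmnt, y = resp, group = ID)) +
  geom_line() +
  geom_point()
print(p_toy)
```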

Dataset:

The dataset for illustrating spaghetti plots can be obtained from ocdrug.dat.txt and a brief description of the dataset is at ocdrug.txt. I first saved the files to a local directory, then read the data into an R data frame and assigned appropriate column labels:

## "workdir" is a variable holding the directory (including the trailing separator) where the data file is stored
ocdrug <- read.table(paste(workdir, "ocdrug.dat.txt", sep = ""), sep = "")
colnames(ocdrug) <- c("ID","Seq","Period","Tmnt","EE_AUC","EE_Cmax","NET_AUC","NET_Cmax")

## Give nice names to the treatments (OCD and OC) and the treatment sequence 
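## Levels are set explicitly so that their order does not depend on the locale's sort order
ocdrug$Seq <- factor(ifelse(ocdrug$Seq == 1,"OCD-OC","OC-OCD"), levels = c("OCD-OC", "OC-OCD"))
ocdrug$Tmnt <- factor(ifelse(ocdrug$Tmnt == 0,"OC","OCD"), levels = c("OCD", "OC"))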
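After recoding, it can be worth checking that the labels agree with the crossover design. The sketch below does this on a small made-up data frame; the numeric codes (Seq 1/2, Tmnt 0/1) and the four subjects are assumptions for illustration, so the same check should be repeated on the real ocdrug data.

```r
## Toy 2x2x2 crossover layout (hypothetical): 4 subjects, 2 periods each
toy <- data.frame(
  ID     = rep(1:4, each = 2),
  Seq    = rep(c(1, 2, 1, 2), each = 2),
  Period = rep(1:2, times = 4),
  Tmnt   = c(1, 0,  0, 1,  1, 0,  0, 1)
)

## Same recoding as above, with explicit factor levels
toy$Seq  <- factor(ifelse(toy$Seq == 1, "OCD-OC", "OC-OCD"),
                   levels = c("OCD-OC", "OC-OCD"))
toy$Tmnt <- factor(ifelse(toy$Tmnt == 0, "OC", "OCD"),
                   levels = c("OCD", "OC"))

## Cross-tabulate: in sequence "OCD-OC", period 1 should contain only OCD
counts <- xtabs(~ Seq + Period + Tmnt, data = toy)
print(ftable(counts))

## Every subject should receive both treatments exactly once
both <- tapply(toy$Tmnt, toy$ID, function(t) length(unique(t)) == 2)
```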

 

Spaghetti plot using ggplot2

It is possible to make a spaghetti plot with base R graphics using the function interaction.plot(). However, we do not discuss this approach here and go directly to the approach using ggplot2. We want to reproduce Figure 3 of the article exactly; it has four sub-figures. In base R, we could lay these out with par(mfrow = c(2, 2)), but in ggplot2 one way to achieve this is to first create the four individual figures and then arrange them using the grid.arrange() function in the gridExtra package. First, we load the required packages,

library(ggplot2)
library(ggvis)
library(dplyr)      ## for group_by(), used in the ggvis section
library(gridExtra)  ## required to arrange ggplot2 plots in a grid

and create a theme common for all the graphs:

mytheme <- theme_classic() %+replace% 
        theme(axis.title.x = element_blank(), 
        axis.title.y = element_text(face="bold",angle=90))  
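For comparison, the base R route mentioned earlier could look roughly like the sketch below. It uses made-up data (the toy data frame is hypothetical) and simply repeats the same panel four times to show how par(mfrow = ...) lays out a 2 x 2 grid; it is not meant to reproduce the figure from the article.

```r
## Toy data (hypothetical): 5 subjects, two treatments each
set.seed(42)
toy <- data.frame(
  ID   = factor(rep(1:5, each = 2)),
  Tmnt = factor(rep(c("OCD", "OC"), times = 5), levels = c("OCD", "OC")),
  resp = exp(rnorm(10, mean = log(3000), sd = 0.3))
)

## 2 x 2 layout of panels; par() returns the old settings so they can be restored
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  ## One trace (line) per subject; each subject has a single value per treatment,
  ## so the default summary function (mean) just returns that value
  with(toy, interaction.plot(x.factor = Tmnt, trace.factor = ID, response = resp,
                             legend = FALSE, lty = 1,
                             xlab = "", ylab = "response",
                             main = paste("Panel", i)))
}
par(op)
```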

We then make the first sub-figure. This is for the EE_AUC. The y-axis is in log10 scale:

p1 <- ggplot(data = ocdrug, aes(x = Tmnt, y = EE_AUC, group = ID, colour = Seq)) +
    mytheme +
    coord_trans(y="log10", limy=c(1000,6000)) +
    labs(title = "AUC", y = paste("EE","\n","pg*hr/mL")) +   ## labels passed directly, not wrapped in list()
    geom_line(size=1) + theme(legend.position="none")
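There are two common ways to get a log axis in ggplot2, and they are not equivalent. coord_trans() transforms the coordinate system after any statistics have been computed, and the ggplot2 documentation notes that straight lines are not guaranteed to stay straight under it; scale_y_log10() transforms the data first. The sketch below builds both variants on made-up data (toy is hypothetical) so they can be compared side by side.

```r
library(ggplot2)

## Toy data (hypothetical): two subjects, two treatments each
toy <- data.frame(
  ID   = rep(1:2, each = 2),
  Tmnt = factor(rep(c("OCD", "OC"), times = 2), levels = c("OCD", "OC")),
  resp = c(1200, 5000, 3000, 1500)
)

base <- ggplot(toy, aes(x = Tmnt, y = resp, group = ID)) + geom_line()

## Option 1 (used in this post): transform the coordinate system
p_coord <- base + coord_trans(y = "log10")

## Option 2: transform the data through the scale
p_scale <- base + scale_y_log10()

print(p_coord)
print(p_scale)
```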

Making the remaining three graphs follows along similar lines. Note that in the graphs p2, p3 and p4 the points for some subjects (outliers?) are labeled. We can get the labels using geom_text() and choosing the subjects to be labeled. We also include a legend below graphs p3 and p4.

p2 <- ggplot(data = ocdrug, aes(x = Tmnt, y = EE_Cmax, group = ID, colour = Seq)) +
    mytheme +
    coord_trans(y="log10", limy=c(100,700)) +
    labs(title = "Cmax", y = paste("EE","\n","pg/mL")) + 
    geom_line(size=1) + 
    geom_text(data=subset(ocdrug, ID %in% c(2,20)), aes(Tmnt,EE_Cmax,label=ID)) +
    theme(legend.position="none")

p3 <- ggplot(data = ocdrug, aes(x = Tmnt, y = NET_AUC, group = ID, colour = Seq)) +
    mytheme +
    coord_trans(y="log10", limy=c(80000,300000)) +    
    labs(y = paste("NET","\n","pg*hr/mL")) + 
    geom_line(size=1) + 
    geom_text(data=subset(ocdrug, ID %in% c(18,22,20)), aes(label=ID), show.legend = FALSE) +
    scale_colour_discrete(name="Sequence: ", labels=c("OCD then OC", "OC then OCD")) + 
    theme(legend.position="bottom")

p4 <- ggplot(data = ocdrug, aes(x = Tmnt, y = NET_Cmax, group = ID, colour = Seq)) +
    mytheme +
    coord_trans(y="log10", limy=c(10000,60000)) +
    labs(y = paste("NET","\n","pg/mL")) + 
    geom_line(size=1) + 
    geom_text(data=subset(ocdrug, ID == 9), aes(label=ID), show.legend = FALSE) +
    scale_colour_discrete(name="Sequence: ", labels=c("OCD then OC", "OC then OCD")) + 
    theme(legend.position="bottom")

Finally, we arrange plots p1 through p4 as a matrix, using the function grid.arrange() and save it to a .png file:

png(filename = paste(workdir,"ByTmnt_ggplot2.png",sep=""), width = 640, height = 640, bg="transparent")
grid.arrange(p1, p2, p3, p4, ncol = 2)
dev.off()
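An alternative to building four separate plots, not used in this post, is to reshape the data to long format and let facet_wrap() create the panels. The sketch below shows the idea on made-up data with two of the four measures (toy and its values are hypothetical). The trade-off is that per-panel axis limits, titles and subject labels are harder to control than with grid.arrange().

```r
library(ggplot2)

## Toy wide-format data (hypothetical values), shaped like ocdrug:
## one row per subject and treatment, one column per measure
toy <- data.frame(
  ID      = rep(1:4, each = 2),
  Seq     = factor(rep(c("OCD-OC", "OC-OCD"), each = 2, times = 2)),
  Tmnt    = factor(rep(c("OCD", "OC"), times = 4), levels = c("OCD", "OC")),
  EE_AUC  = c(2500, 2300, 3100, 3400, 1800, 2000, 4100, 3900),
  EE_Cmax = c(250, 230, 310, 300, 180, 210, 420, 400)
)

## Stack the measure columns into long format: one row per subject, treatment and measure
measures <- c("EE_AUC", "EE_Cmax")
long <- do.call(rbind, lapply(measures, function(m) {
  data.frame(toy[c("ID", "Seq", "Tmnt")], measure = m, value = toy[[m]])
}))

## One panel per measure, each with its own y-axis range
p_facet <- ggplot(long, aes(x = Tmnt, y = value, group = ID, colour = Seq)) +
  geom_line() +
  scale_y_log10() +
  facet_wrap(~ measure, scales = "free_y")
print(p_facet)
```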

Creating an interactive spaghetti plot with ggvis

Having recreated Figure 3 of the article using ggplot2, I then wanted to make an interactive version of the plot. The R package ggvis can be used to provide some interactive features. Here is the user interaction that we wish to add:

  1. To be able to select which of the four plots to view
  2. To provide a tooltip that shows the subject ID when the cursor points at a point or line in the graph

To create a plot in ggvis that includes a tooltip, we need to first create an identifier for each row in the dataset like so:

 
ocdrug$uid <- 1:nrow(ocdrug)  # Add a unique id column to use as the key
all_values <- function(x) {
  if(is.null(x)) return(NULL)
  row <- ocdrug[ocdrug$uid == x$uid,]
  ## Show the subject ID (the first column) in the tooltip
  paste0("ID: ", format(row$ID))
}
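The tooltip function can be tried outside ggvis by calling it by hand. The sketch below uses a small made-up data frame and a hand-built list as the hover payload; the assumption here is that ggvis passes the function a list containing the key column (uid), or NULL when nothing is hovered. The names toy and tooltip_text are hypothetical.

```r
## Toy stand-in for ocdrug (hypothetical values) with the same kind of key column
toy <- data.frame(
  ID     = c(1, 1, 2, 2),
  Tmnt   = c("OCD", "OC", "OCD", "OC"),
  EE_AUC = c(2500, 2300, 3100, 3400)
)
toy$uid <- seq_len(nrow(toy))

## Same logic as all_values(), but looking up rows in the toy data
tooltip_text <- function(x) {
  if (is.null(x)) return(NULL)
  row <- toy[toy$uid == x$uid, ]
  paste0("ID: ", format(row$ID))
}

## Simulate hovering over the mark whose key is uid = 3
tooltip_text(list(uid = 3))   # returns "ID: 2"

## No mark under the cursor
tooltip_text(NULL)            # returns NULL
```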

Then,

 
ocdrug <- group_by(ocdrug, ID)  ## Group the data by subject so that one line is drawn per subject

ocdrug %>% 
  ggvis(x = ~Tmnt, y = input_select(c("EE: AUC" = "EE_AUC", "EE: Cmax" = "EE_Cmax",
            "NET: AUC" = "NET_AUC", "NET: Cmax" = "NET_Cmax"), 
            label="Y-axis variable", map = as.name)) %>%    ## choose which graph to display
  layer_paths(stroke = ~Seq) %>%    ## color lines by treatment sequence as before
  layer_points(fill = ~Seq) %>%        ## color points by treatment sequence as before
  layer_points(fill = ~Seq, key := ~uid) %>%    ## having to do it twice, 
         ## else the points just seemed to appear and disappear. Have not understood why?
  add_axis("x", title = "Group", title_offset = 50, grid=FALSE) %>%    ## Axes and legend
  add_axis("y", title = "", grid=FALSE) %>%
  scale_numeric("y", trans="log") %>%
  hide_legend("stroke") %>%
  add_legend("fill", title = "Sequence") %>%
  add_tooltip(all_values, "hover")    ## Finally add the tooltip

To display the interactive plot, copy the above code and paste it into an R session. The plot will appear in the browser, and the R session must be kept open for the interactivity to work. The plot below is only a screenshot from the browser and is not interactive.

[Figure: Spaghetti_ggvis, screenshot of the interactive ggvis plot]

My original aim was to create an interactive version of the ggplot2 graphic that displays all four graphs at once but uses a tooltip instead of text labels for selected subjects. I also wanted pointing at a subject in one graph to highlight that subject's profile not only in that graph but in the other three as well. However, it looks like ggvis does not currently support multiple graphs on the same page. A full-fledged Shiny app may be a solution for someone with no knowledge of HTML, CSS, JavaScript, etc. I welcome experts to share any other ideas by which such interactivity can be achieved.
