Hand-on_EX02

Author

Zhang Chenbin

2.1 Overview

In this chapter, you will be introduced to several ggplot2 extensions for creating more elegant and effective statistical graphics. By the end of this exercise, you will be able to:

control the placement of annotations on a graph by using functions provided in the ggrepel package,

create professional, publication-quality figures by using functions provided in the ggthemes and hrbrthemes packages, and

plot composite figures by combining ggplot2 graphs with the patchwork package.

2.2 Getting started

2.2.1 Installing and loading the required libraries

pacman::p_load(ggrepel, patchwork, 
               ggthemes, hrbrthemes,
               tidyverse) 

2.2.2 Importing data

exam_data <- read_csv("data/Exam_data.csv")

2.3 Beyond ggplot2 Annotation: ggrepel

ggplot(data=exam_data, 
       aes(x= MATHS, 
           y=ENGLISH)) +
  geom_point() +
  geom_smooth(method=lm, 
              linewidth=0.5) +  
  geom_label(aes(label = ID), 
             hjust = .5, 
             vjust = -.5) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100)) +
  ggtitle("English scores versus Maths scores for Primary 3")

2.3.1 Working with ggrepel

ggplot(data=exam_data, 
       aes(x= MATHS, 
           y=ENGLISH)) +
  geom_point() +
  geom_smooth(method=lm, 
              linewidth=0.5) +  
  geom_label_repel(aes(label = ID), 
                   fontface = "bold") +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100)) +
  ggtitle("English scores versus Maths scores for Primary 3")
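
With many points, even repelled labels can crowd the plot. One common way to reduce clutter is to label only a subset of the observations. The sketch below is a minimal illustration using the built-in mtcars data rather than the exam data; the names cars, top_cars and p_repel, and the mpg > 25 cut-off, are assumptions made for the example.

```r
library(ggplot2)
library(ggrepel)

# Toy data: built-in mtcars with the car names as a label column
cars <- data.frame(name = rownames(mtcars),
                   wt = mtcars$wt,
                   mpg = mtcars$mpg)

# Illustrative subset: label only the most fuel-efficient cars
top_cars <- subset(cars, mpg > 25)

# All points are drawn, but only the subset receives repelled labels
p_repel <- ggplot(data = cars,
                  aes(x = wt,
                      y = mpg)) +
  geom_point() +
  geom_label_repel(data = top_cars,
                   aes(label = name),
                   fontface = "bold")
p_repel
```

The same pattern should carry over to exam_data by passing a filtered copy of it to the data argument of geom_label_repel().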

2.4 Beyond ggplot2 Themes

print(class(exam_data$MATHS))
#> [1] "character"
exam_data$MATHS <- as.numeric(as.character(exam_data$MATHS))
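
Converting a character column with as.numeric() turns any entry that is not a valid number into NA, with a warning. The sketch below uses a made-up vector, not the exam data, to show one way of counting such entries before plotting; the names raw_scores, scores and n_failed are illustrative.

```r
# Hypothetical raw column: scores stored as text, with one non-numeric entry
raw_scores <- c("78", "65", "absent", "90")

# Convert to numeric; "absent" cannot be parsed and becomes NA
scores <- suppressWarnings(as.numeric(raw_scores))

# Count entries that were present as text but failed to convert
n_failed <- sum(is.na(scores) & !is.na(raw_scores))
n_failed
```

If n_failed is greater than zero, it is worth inspecting the offending rows before drawing the histogram, because rows with NA values are dropped from the plot.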

ggplot(data=exam_data, aes(x = MATHS)) +
  geom_histogram(bins=20, boundary = 100, color="grey25", fill="grey90") +
  theme_gray() +
  ggtitle("Distribution of Maths scores")

2.4.1 Working with ggthemes package

ggplot(data=exam_data, 
             aes(x = MATHS)) +
  geom_histogram(bins=20, 
                 boundary = 100,
                 color="grey25", 
                 fill="grey90") +
  ggtitle("Distribution of Maths scores") +
  theme_economist()

2.4.2 Working with hrbrthemes package

ggplot(data=exam_data, 
             aes(x = MATHS)) +
  geom_histogram(bins=20, 
                 boundary = 100,
                 color="grey25", 
                 fill="grey90") +
  ggtitle("Distribution of Maths scores") +
  theme_ipsum()

Besides providing typography-centric themes, hrbrthemes has a second goal that centers on productivity for a production workflow. In fact, this “production workflow” is the context in which the elements of hrbrthemes should be used. Consult the hrbrthemes vignette to learn more.

ggplot(data=exam_data, 
             aes(x = MATHS)) +
  geom_histogram(bins=20, 
                 boundary = 100,
                 color="grey25", 
                 fill="grey90") +
  ggtitle("Distribution of Maths scores") +
  theme_ipsum(axis_title_size = 18,  # font size of the axis titles
              base_size = 15,        # base font size
              grid = "Y")            # keep only the y-axis (horizontal) grid lines

2.5 Beyond Single Graph

p1 <- ggplot(data=exam_data, 
             aes(x = MATHS)) +
  geom_histogram(bins=20, 
                 boundary = 100,
                 color="grey25", 
                 fill="grey90") + 
  coord_cartesian(xlim=c(0,100)) +
  ggtitle("Distribution of Maths scores")
print(p1)

Next, we draw a histogram of English scores.

print(class(exam_data$ENGLISH))
#> [1] "character"
if(is.factor(exam_data$ENGLISH) || is.character(exam_data$ENGLISH)) {
  exam_data$ENGLISH <- as.numeric(as.character(exam_data$ENGLISH))
}

p2 <- ggplot(data=exam_data, aes(x = ENGLISH)) +
  geom_histogram(bins=20, boundary = 100, color="grey25", fill="grey90") +
  coord_cartesian(xlim=c(0, 100)) +
  ggtitle("Distribution of English scores")
print(p2)

Lastly, we draw a scatterplot of English scores versus Maths scores, as shown below.

p3 <- ggplot(data=exam_data, 
             aes(x= MATHS, 
                 y=ENGLISH)) +
  geom_point() +
  geom_smooth(method=lm, 
              linewidth=0.5) +  
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100)) +
  ggtitle("English scores versus Maths scores for Primary 3")
print(p3)

2.5.1 Creating Composite Graphics: patchwork methods

Several ggplot2 extensions provide functions for preparing a composite figure from several graphs, such as grid.arrange() of the gridExtra package and plot_grid() of the cowplot package. This section introduces a ggplot2 extension called patchwork, which is specially designed for combining separate ggplot2 graphs into a single figure.

The patchwork package has a very simple syntax for creating layouts. The general syntax combines:

a two-column layout using the plus sign +,

parentheses () to create a subplot group, and

a two-row layout using the division sign /.

The vertical bar |, used in the three-graph layout later in this section, also places plots side by side.

2.5.2 Combining two ggplot2 graphs

p1 + p2

2.5.3 Combining three ggplot2 graphs

(p1 / p2) | p3

2.5.4 Creating a composite figure with tag

((p1 / p2) | p3) + 
  plot_annotation(tag_levels = 'I')

2.5.5 Creating a figure with an inset

p3 + inset_element(p2, 
                   left = 0.02,   # edges are given as fractions (0 to 1) of the main plot's panel
                   bottom = 0.7, 
                   right = 0.5, 
                   top = 1)

2.5.6 Creating a composite figure by using patchwork and ggthemes

patchwork <- (p1 / p2) | p3
patchwork & theme_economist()
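
The chunk above uses & rather than + to add the theme. In patchwork, + adds a component to the last plot only, whereas & applies it to every plot in the composite. The sketch below illustrates the difference with two toy plots built from the built-in mtcars data; the names g1, g2, themed_last and themed_all are assumptions made for the example, and theme_minimal() from ggplot2 stands in for theme_economist().

```r
library(ggplot2)
library(patchwork)

# Toy plots from the built-in mtcars data (not the exam data)
g1 <- ggplot(data = mtcars,
             aes(x = mpg)) +
  geom_histogram(bins = 10)
g2 <- ggplot(data = mtcars,
             aes(x = wt,
                 y = mpg)) +
  geom_point()

# `+` adds the theme to the last plot in the patchwork only (g2)
themed_last <- (g1 | g2) + theme_minimal()

# `&` applies the theme to every plot in the patchwork
themed_all <- (g1 | g2) & theme_minimal()

themed_all
```

Printing themed_last next to themed_all should make the difference visible: in the first, only the right-hand panel changes its look.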