
Introduction to Data Visualization with ggplot2

# Make the points 40% opaque
ggplot(diamonds, aes(carat, price, color = clarity)) + # size could be mapped instead of color
  geom_point(alpha = 0.4) +
  geom_smooth() # add a smoothed trend line for each clarity level

Attributes

Labels and shape are only applicable to categorical variables.

A common adjustment is position, which specifies how ggplot2 handles overlapping bars or points within a single layer. The available adjustments are identity, dodge, stack, fill, jitter, jitterdodge, and nudge.
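As a rough illustration of how the position adjustment changes a single layer, the sketch below draws the same bar layer three ways. It assumes the built-in mtcars data set, which is not used elsewhere in these notes, and the object names are placeholders.

```r
library(ggplot2)

# mtcars ships with R; cyl and am are wrapped in factor() so they act as categories
base <- ggplot(mtcars, aes(factor(cyl), fill = factor(am)))

# "stack" (the default for bars): counts are stacked on top of each other
p_stack <- base + geom_bar(position = "stack")

# "dodge": bars for each fill group are placed side by side
p_dodge <- base + geom_bar(position = "dodge")

# "fill": bars are stacked and rescaled so every bar has a total height of 1
p_fill <- base + geom_bar(position = "fill")
```

Printing p_stack, p_dodge, and p_fill in turn shows the same counts arranged differently; only the position argument changes.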

Geometries

The jitter geom, geom_jitter(), is a convenient shortcut for geom_point(position = "jitter"). It adds a small amount of random variation to the location of each point, which is a useful way of handling overplotting caused by discreteness in smaller datasets.

p + theme(axis.line = element_line(color = "red", linetype = "dashed"))

# Similarly, element_rect() changes rectangles and element_text() changes text. You can remove a plot element using element_blank().

plt_prop_unemployed_over_time +
  theme(
    rect = element_rect(fill = "grey92"),
    legend.key = element_rect(color = NA),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.grid.major.y = element_line(
      color = "white",
      size = 0.5,
      linetype = "dotted"
    ),
    # Set the axis text color to grey25
    axis.text = element_text(color = "grey25"),
    # Set the plot title font face to italic and font size to 16
    plot.title = element_text(size = 16, face = "italic")
  )
  
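# The theme_*() functions give access to built-in theme templates, e.g. theme_classic()
# The ggthemes package provides additional templates
library(ggthemes)
# theme_set() makes a theme the default for all subsequent plots, e.g. theme_set(theme_classic())
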
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  # Use a jitter position function with width 0.1
  geom_point(alpha = 0.5, position=position_jitter(width=0.1))

# The brewer scales provide sequential, diverging and qualitative colour schemes from ColorBrewer. These are particularly well suited to display discrete values on a map.
scale_fill_brewer()
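A minimal sketch of the brewer scales in use, assuming the built-in mtcars data and the diamonds data that ships with ggplot2. The palette names "Set1" and "Blues" are ColorBrewer palettes chosen here purely for illustration.

```r
library(ggplot2)

# Qualitative palette for an unordered category
p_qual <- ggplot(mtcars, aes(factor(cyl), fill = factor(gear))) +
  geom_bar() +
  scale_fill_brewer(palette = "Set1")

# Sequential palette for an ordered category (cut is an ordered factor)
p_seq <- ggplot(diamonds, aes(cut, fill = cut)) +
  geom_bar() +
  scale_fill_brewer(palette = "Blues")
```

scale_color_brewer() works the same way when the variable is mapped to color rather than fill.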

p + theme(legend.position = new_value)

Here, new_value can be:

"top", "bottom", "left", or "right": place the legend on that side of the plot.
"none": don't draw the legend.
c(x, y): place the legend inside the plot, where c(0, 0) means the bottom-left and c(1, 1) means the top-right.
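The legend options can be tried on any plot with a legend. The sketch below assumes the built-in iris data; the object names are placeholders.

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point()

# Move the legend below the plot
p_bottom <- p + theme(legend.position = "bottom")

# Drop the legend entirely
p_none <- p + theme(legend.position = "none")

# The numeric form, e.g. theme(legend.position = c(0.9, 0.15)), is not run here:
# ggplot2 3.5.0 and later deprecate it in favor of
# theme(legend.position = "inside", legend.position.inside = c(0.9, 0.15))
```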

Whitespace means all the non-visible margins and spacing in the plot.

To set a single whitespace value, use unit(x, unit), where x is the amount and unit is the unit of measure.

Borders require you to set 4 positions, so use margin(top, right, bottom, left, unit). The unit defaults to "pt".
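To make the two whitespace helpers concrete, here is a small sketch that sets one single-value element and one four-value element. It assumes the built-in iris data, and the specific sizes are arbitrary.

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point()

p_spaced <- p +
  theme(
    # Single value: tick marks two text lines long
    axis.ticks.length = unit(2, "lines"),
    # Four values, given in top, right, bottom, left order
    plot.margin = margin(10, 30, 50, 70, "mm")
  )
```

unit() comes from the grid package and is re-exported by ggplot2, so no extra library() call is needed.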

Create a nice plot

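# Set the color scale: keep only the two end colors of the 5-color RdYlBu palette
library(RColorBrewer) # provides brewer.pal()
palette <- brewer.pal(5, "RdYlBu")[-(2:4)]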

# Add a title and caption
ggplot(gm2007, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country), size = 2) +
  geom_text(aes(label = round(lifeExp,1)), color = "white", size = 1.5) +
  scale_x_continuous("", expand = c(0,0), limits = c(30,90), position = "top") +
  scale_color_gradientn(colors = palette) +
  labs(title = "Highest and lowest life expectancies, 2007", caption = "Source: gapminder")