How to Build Data Animation in R

Photo by Gene Devine on Unsplash

Have you ever seen a beautiful racing bar chart animation on YouTube and wondered how it was built?

Animated data visualizations are fun to watch, and there are various libraries for creating spectacular animations. I will show you how to use gganimate in R to animate data, with a racing bar chart as the example.

What is gganimate?

gganimate is an extension package for ggplot2. It aims to be the “Grammar of Animated Graphics.” As I mentioned in my other blog post, “Why ggplot2 is so good for data visualization”, ggplot2 was developed around the ideas of the Grammar of Graphics. ggplot2 does an excellent job of implementing the core ideas of the Grammar of Graphics, but its graphics are static. To show data changing frame by frame, gganimate extends the Grammar of Graphics to fill the animation gap in ggplot2.

gganimate adds additional building blocks on top of ggplot2, which makes it extremely easy to add animation to the existing plots. Those new building blocks are:

  • transition: It’s like the script of a movie. It defines how the data should change over the course of the animation. For example, a lot of data animation is done over time, and transition_time can be used in this case.
  • view: It’s like the camera in a movie. It defines how the axes or zoom should change.
  • shadow: defines how data from other points in time should appear in the current frame. For example, it can trace the movement over time and slowly fade out older data points.
  • enter/exit: defines how data appears and disappears.
  • ease_aes: defines how a value changes into another during tweening (tweening is a filmmaking technique for generating intermediate frames so that one image evolves smoothly into the next).
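The building blocks above can be sketched on a small made-up data frame. This is only an illustration: the `toy` data and its column names are hypothetical, and the settings are not tuned. It builds the animation object without rendering it.

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data: three points ("a", "b", "c") observed at five steps
toy <- data.frame(
  id   = rep(c("a", "b", "c"), each = 5),
  step = rep(1:5, times = 3),
  x    = c(1:5, 2:6, 3:7),
  y    = c((1:5)^2, 2 * (1:5), rep(10, 5))
)

p <- ggplot(toy, aes(x, y, color = id)) +
  geom_point(size = 3) +
  transition_time(step) +           # transition: animate over the `step` column
  view_follow(fixed_y = TRUE) +     # view: let the x-axis follow the data
  shadow_wake(wake_length = 0.2) +  # shadow: leave a short fading trail
  enter_fade() +                    # enter: fade in data that appears
  exit_fade() +                     # exit: fade out data that disappears
  ease_aes("cubic-in-out")          # easing used while tweening between steps

# Printing `p` (or calling animate(p)) would render the animation.
```

Each block is added with `+`, just like a regular ggplot2 layer, so any of them can be dropped without touching the rest of the plot.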

Install gganimate

Installing gganimate is straightforward in R: type install.packages("gganimate") in RStudio. However, if you run any of the examples from the gganimate website, you might encounter the following message:

 

				
					No renderer backend detected. gganimate will default to writing frames to separate files
Consider installing:
- the `gifski` package for gif output
- the `av` package for video output
and restarting the R session
				
			

This is because gganimate needs a renderer backend to produce GIF or video output. Without one, it writes a separate PNG file for each frame. The easiest fix on a MacBook is to install gifski with install.packages("gifski") and restart the R session.
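To see which of the two backends named in the message is already installed, a quick check like the following may help. It is a minimal sketch that uses only base R; the `renderer_status` name is made up for illustration.

```r
# Check whether the renderer packages mentioned in the message are installed.
# requireNamespace() returns TRUE/FALSE without attaching the package.
renderer_status <- c(
  gifski = requireNamespace("gifski", quietly = TRUE),
  av     = requireNamespace("av", quietly = TRUE)
)
print(renderer_status)

if (!any(renderer_status)) {
  message('No renderer found. Try install.packages("gifski") and restart R.')
}
```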

Create a gganimate with COVID-19 data

To work with gganimate, we need to set up a ggplot2 chart with some data. Let’s work with a daily dump of a COVID-19 dataset.

It is very straightforward to create a ggplot2 chart with the data. First, we read the downloaded CSV file into a data frame. Then, we keep only the continent rows and convert the date from a string to a date type. Finally, we draw a scatter plot with total_deaths on the x-axis and total_cases on the y-axis.

				
					library(ggplot2)
library(gganimate)
library(dplyr)
df <- read.csv("~/Downloads/full_data.csv")
df <- df %>% 
 filter(location %in% c("Asia", "Europe", "Africa", "North America", "South America", "Oceania")) %>% 
 mutate(date=as.Date(date, format="%Y-%m-%d"))
ggplot(df, aes(total_deaths, total_cases, color=location)) +
 geom_point()
				
			
ggplot static view | image by author
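If you do not have the CSV at hand, the same preparation steps can be tried on a stand-in data frame. The sketch below is an assumption-laden illustration: the `raw` rows are invented, and the "World" row only stands in for whatever non-continent rows the real file may contain.

```r
library(dplyr)

# Hypothetical stand-in for the CSV: same four columns, invented values
raw <- data.frame(
  date         = rep(c("2020-03-01", "2020-03-02"), times = 3),
  location     = rep(c("Asia", "Europe", "World"), each = 2),
  total_cases  = c(100, 120, 50, 150, 150, 270),
  total_deaths = c(3, 4, 1, 5, 4, 9),
  stringsAsFactors = FALSE
)

continents <- c("Asia", "Europe", "Africa",
                "North America", "South America", "Oceania")

df_toy <- raw %>%
  filter(location %in% continents) %>%               # drop non-continent rows
  mutate(date = as.Date(date, format = "%Y-%m-%d"))  # string -> Date

str(df_toy)
```

Converting `date` to a real Date matters later: transition_time expects a time-like variable rather than plain text.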

Add animation with gganimate plot

To see an animation from the ggplot2 chart above, we need to add one line of code to render the animation as a GIF.

				
					ggplot(df, aes(total_deaths, total_cases, color=location)) +
 geom_point() + 
 # below is the gganimate section
 transition_time(date)
				
			
First animation with gganimate | image by author
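Two small additions are often useful at this stage: showing the current date in the title and saving the result to a file. The sketch below assumes a made-up `toy` data frame and a hypothetical output file name. It uses the `{frame_time}` label variable that transition_time provides, plus animate() and anim_save() from gganimate.

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data: two locations over five days
toy <- data.frame(
  date         = as.Date("2020-03-01") + rep(0:4, times = 2),
  location     = rep(c("Asia", "Europe"), each = 5),
  total_cases  = c(10, 20, 40, 80, 160, 5, 15, 45, 90, 120),
  total_deaths = c(0, 1, 2, 4, 8, 0, 1, 3, 6, 9)
)

anim <- ggplot(toy, aes(total_deaths, total_cases, color = location)) +
  geom_point(size = 3) +
  labs(title = "Date: {frame_time}") +  # filled in for every frame
  transition_time(date)

# Render and save only in an interactive session with gifski installed,
# so that sourcing this script elsewhere does not trigger a slow render.
if (interactive() && requireNamespace("gifski", quietly = TRUE)) {
  gif <- animate(anim, nframes = 50, fps = 10, renderer = gifski_renderer())
  anim_save("first_animation.gif", animation = gif)  # hypothetical file name
}
```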

One very cool thing about the animation from this scatter plot is that you can see how total deaths and total cases grow relative to each other over time. You can also see the sudden spike in total cases for Asia in the middle of the animation.

To make the horizontal bar chart, we can apply the same pattern we used in the first animation above. The code and animation look like this:

				
					ggplot(df, aes(y=location, x=total_cases)) +
 geom_col() + 
 # below is the gganimate section
 transition_time(date)
				
			
First animation for racing bar chart | image by author

It doesn’t look good. There are a couple of issues with the animation above:

  1. The y-axis (location) is not sorted by total cases, and the bars never reorder as the ranking changes.
  2. The bars stretch and shrink back multiple times, which doesn’t make sense for a cumulative value like total_cases.

Now let’s fix this. To solve both problems, we introduce a row_number column that ranks the locations by total_cases within each date. This way, we know the exact order of the y-axis for each day.

				
					df_rank <- df %>%
 group_by(date) %>%
 arrange(date, total_cases) %>%
 mutate(row_number = as.character(row_number()))
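To make the effect of this step concrete, here is the same ranking applied to a tiny invented data frame (the `toy` values are hypothetical). Within each date, the location with the fewest cases gets "1" and the one with the most gets the highest number.

```r
library(dplyr)

# Hypothetical toy data: three locations on two dates
toy <- data.frame(
  date        = as.Date(rep(c("2020-03-01", "2020-03-02"), each = 3)),
  location    = rep(c("Asia", "Europe", "Africa"), times = 2),
  total_cases = c(100, 50, 10, 120, 150, 20)
)

toy_rank <- toy %>%
  group_by(date) %>%
  arrange(date, total_cases) %>%                   # ascending within each date
  mutate(row_number = as.character(row_number())) %>%
  ungroup()

print(toy_rank)
# On 2020-03-01: Africa "1", Europe "2", Asia "3"
# On 2020-03-02: Africa "1", Asia "2", Europe "3"  (Europe overtakes Asia)
```

One caveat with the character ranks: on a discrete axis they sort alphabetically, so "10" would come before "2". That is not a problem with six continents, but with ten or more bars an integer or factor rank would be safer.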
				
			

Now we can create the animation with the new data frame. We also switch from transition_time to transition_states, as it makes the order changes easier to handle.

It will feel familiar if you have created animations in PowerPoint or Keynote. The state transition divides the data into multiple states based on the levels of a given column. In this case, we use each date as a state. The animation moves from one date to the next and uses fades on enter and exit for smooth transitions.
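As a minimal sketch of the idea, assuming an invented `toy` data frame with a `quarter` column as the state variable: transition_states takes the column that defines the states, and its transition_length and state_length arguments are relative lengths rather than seconds. The `{closest_state}` label variable shows which state a frame belongs to.

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data: two teams scored in three quarters
toy <- data.frame(
  quarter = rep(c("Q1", "Q2", "Q3"), each = 2),
  team    = rep(c("A", "B"), times = 3),
  score   = c(1, 3, 4, 2, 5, 6)
)

p <- ggplot(toy, aes(team, score, fill = team)) +
  geom_col() +
  labs(title = "State: {closest_state}") +
  # Spend twice as long moving between states as pausing on each one
  transition_states(quarter, transition_length = 2, state_length = 1) +
  ease_aes("quadratic-in-out")

# Printing `p` (or calling animate(p)) would render the animation.
```

Setting state_length to 0, as in the racing bar chart, removes the pause so the bars keep moving from one date to the next.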

				
					animacion <- ggplot(df_rank, aes(x=row_number, y=total_cases, fill=location)) +
 geom_col() +
 geom_text(aes(x=row_number, y=0, label = location), hjust=1.1) +
 coord_flip(clip = "off", expand = FALSE) + 
 theme_minimal() + 
 theme(
 panel.grid = element_blank(), 
 legend.position = "none",
 axis.ticks.y = element_blank(),
 axis.title.y = element_blank(),
 axis.text.y = element_blank(),
 plot.margin = margin(1, 4, 1, 4, "cm")
 ) +
 # below is the gganimate section
 transition_states(date, state_length = 0, transition_length = 2) +
 enter_fade() +
 exit_fade() + 
 ease_aes('quadratic-in-out') 
 
animate(animacion, width = 640, height = 480, fps = 60, duration = 20, rewind = FALSE)
				
			
The Complete Animation with gganimate | image by author
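A note on rendering time: the animate() call above sets fps = 60 and duration = 20. As far as I understand the animate() documentation, the number of frames rendered is then the product of the two, so a quick calculation shows why this render can be slow. The draft values below are only a suggestion, not part of the original code.

```r
# Frame budget for the final render used above
fps      <- 60
duration <- 20               # seconds
nframes  <- fps * duration   # 1200 frames to render

# A lighter draft setting for iterating on the design (assumed values)
draft_fps    <- 10
draft_frames <- draft_fps * duration  # 200 frames

cat("final frames:", nframes, "| draft frames:", draft_frames, "\n")

# The draft render would then look like:
# animate(animacion, width = 640, height = 480,
#         fps = draft_fps, duration = duration, rewind = FALSE)
```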

Final Thoughts

gganimate brings static ggplot2 charts to life and makes it easy to build creative data visualizations like the racing bar chart. However, while an animation is eye-catching at first glance, it doesn’t show all the information at once, so it cannot replace static plots entirely. How do you like gganimate? Are you ready to try it in your next project?

About Me

I hope my stories are helpful to you. 

For more data engineering posts, you can subscribe to my new articles or become a referred Medium member, which also gives you full access to stories on Medium.

If you have questions or comments, do not hesitate to write in the comments of this story or reach me directly through LinkedIn or Twitter.
