Have you seen a beautiful racing bar chart animation on YouTube and wondered how it was built?
Animated data visualizations are fun to watch, and there are various libraries for creating spectacular ones. I will show you how to use gganimate in R to animate data, using a racing bar chart as the example.
What is gganimate
gganimate is an extension package for ggplot2. It aims to be the “Grammar of Animated Graphics.” As I mentioned in my other blog post, “Why ggplot2 is so good for data visualization”, ggplot2 was built on the ideas of the Grammar of Graphics. ggplot2 does an excellent job of implementing those core ideas, but its graphics are static. To show data changing frame by frame, gganimate extends the Grammar of Graphics and fills the animation gap in ggplot2.
gganimate adds additional building blocks on top of ggplot2, which makes it extremely easy to add animation to the existing plots. Those new building blocks are:
- transition: It’s like a script in the movie. It defines how the data changes from frame to frame. For example, a lot of data animation is done over time, and transition_time can be used in this case.
- view: It’s like a camera in the movie. It defines how the axes and zoom level change during the animation.
- shadow: defines how data from other points in time should appear in the current frame. For example, it can trace movement over time and slowly fade out older data points.
- enter/exit: define how new data appears and how old data disappears.
- ease_aes: defines how a value changes into another during tweening (tweening is an animation technique for generating intermediate frames so that one image evolves smoothly into the next).
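As a rough sketch of how these building blocks stack on top of a ggplot2 chart, the example below combines one function from each family. The `demo` data frame and its columns are made up for illustration; the functions themselves (`transition_time`, `view_follow`, `shadow_wake`, `enter_fade`, `exit_fade`, `ease_aes`) come from gganimate.

```r
library(ggplot2)
library(gganimate)

# Made-up data: three groups observed over ten days (illustration only).
demo <- data.frame(
  day   = rep(1:10, times = 3),
  group = rep(c("a", "b", "c"), each = 10),
  value = c(1:10, (1:10) * 2, (1:10)^1.5)
)

p <- ggplot(demo, aes(day, value, colour = group)) +
  geom_point(size = 3) +
  transition_time(day) +            # transition: animate over time
  view_follow() +                   # view: let the axes follow the data
  shadow_wake(wake_length = 0.1) +  # shadow: leave a short fading trail
  enter_fade() +                    # enter: fade new data in
  exit_fade() +                     # exit: fade old data out
  ease_aes("linear")                # easing: how values move between frames

# Printing `p` (or calling animate(p)) renders the animation.
```

Each added layer is optional: dropping `view_follow()` or `shadow_wake()` still leaves a valid animation, which makes it easy to experiment with one block at a time.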
Installing gganimate is straightforward in R: type install.packages("gganimate") in RStudio. However, if you run any of the examples from the gganimate website, you might encounter the following message:
No renderer backend detected. gganimate will default to writing frames to separate files
Consider installing:
- the `gifski` package for gif output
- the `av` package for video output
and restarting the R session
This is because gganimate needs a renderer backend to produce a GIF or video output. Without one, it writes a separate PNG file for every frame. The easiest fix on a MacBook is to install gifski with install.packages("gifski") and restart the R session.
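Before rendering, it can help to check which backend is actually installed. The helper below is a small sketch using only base R; `available_renderers` is a hypothetical name, not part of gganimate.

```r
# Return the renderer backend packages that are installed.
# `available_renderers` is a hypothetical helper, not a gganimate function.
available_renderers <- function(pkgs = c("gifski", "av")) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[installed]
}

backends <- available_renderers()
if (length(backends) == 0) {
  message("No renderer found; try install.packages(\"gifski\").")
} else {
  message("Available renderer backends: ", paste(backends, collapse = ", "))
}
```

If gifski is installed, you can also request it explicitly when rendering, for example animate(p, renderer = gifski_renderer()).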
Create a gganimate animation with COVID-19 data
To work with gganimate, we need to set up a ggplot2 chart with some data. Let’s use the COVID-19 dataset, which is available as a daily dump.
It is very straightforward to create a ggplot2 chart with the data. First, we read the downloaded CSV file into a data frame. Then, let’s filter on only the continents and convert the date from string to date type. Finally, we choose a scatter plot with the total_deaths and total_cases fields.
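library(ggplot2)
library(gganimate)
library(dplyr)

# Read the downloaded CSV file into a data frame
df <- read.csv("~/Downloads/full_data.csv")

# Keep only the continent rows and convert date from string to Date
df <- df %>%
  filter(location %in% c("Asia", "Europe", "Africa", "North America", "South America", "Oceania")) %>%
  mutate(date = as.Date(date, format = "%Y-%m-%d"))

# Scatter plot of total deaths against total cases, colored by continent
ggplot(df, aes(total_deaths, total_cases, color = location)) +
  geom_point()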
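To see what the filter and the date conversion do without downloading the file, here is a sketch on a made-up data frame. The `toy` rows and their values are invented for illustration; only the column names follow the ones used above.

```r
library(dplyr)

# Made-up rows mimicking the columns used above (illustration only).
toy <- data.frame(
  location     = c("Asia", "Europe", "World", "France"),
  date         = c("2021-01-01", "2021-01-01", "2021-01-01", "2021-01-01"),
  total_cases  = c(100, 80, 200, 20),
  total_deaths = c(5, 4, 10, 1)
)

continents <- c("Asia", "Europe", "Africa",
                "North America", "South America", "Oceania")

toy_clean <- toy %>%
  filter(location %in% continents) %>%               # drops "World" and "France"
  mutate(date = as.Date(date, format = "%Y-%m-%d"))  # string -> Date

str(toy_clean)
```

Only the Asia and Europe rows survive the filter, and the date column is a proper Date afterwards, which is what transition_time needs to interpolate between days.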
Add animation to the plot with gganimate
To animate the ggplot2 chart above, we only need to add one line of code. gganimate then renders the animation as a GIF.
ggplot(df, aes(total_deaths, total_cases, color = location)) +
  geom_point() +
  # below is gganimation section
  transition_time(date)
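One optional addition, sketched here on made-up data, is to show the current date in the title. transition_time exposes a `frame_time` variable that can be used inside labels; the `toy` data frame and the output file name are assumptions for illustration.

```r
library(ggplot2)
library(gganimate)

# Made-up cumulative counts for two locations over five days (illustration only).
toy <- data.frame(
  location     = rep(c("Asia", "Europe"), each = 5),
  date         = rep(as.Date("2021-01-01") + 0:4, times = 2),
  total_cases  = c(10, 20, 40, 80, 160, 5, 15, 30, 50, 70),
  total_deaths = c(1, 2, 3, 5, 8, 0, 1, 2, 3, 4)
)

anim <- ggplot(toy, aes(total_deaths, total_cases, color = location)) +
  geom_point() +
  labs(title = "Date: {frame_time}") +  # label variable provided by transition_time
  transition_time(date)

# animate(anim) renders it;
# anim_save("covid_scatter.gif", animate(anim)) would write the GIF to disk.
```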
One very cool thing about this animated scatter plot is that you can see how total cases grow relative to total deaths over time. You can also see the sudden spike in total cases for Asia in the middle of the animation.
To make a horizontal bar chart, we can apply the same pattern we used in the first animation above. The code and animation are shown below:
ggplot(df, aes(y = location, x = total_cases)) +
  geom_col() +
  # below is gganimation section
  transition_time(date)
It doesn’t look good. There are a couple of issues with the animation above:
- The y-axis (location) is not sorted by total cases, and the bars don’t reorder as the ranking changes.
- The bars stretch and shrink back multiple times, which doesn’t make sense for a running total like total_cases.
Now let’s fix this. To solve both problems, we add a row_number column that ranks the locations by total_cases within each date. This way, we know the exact order of the y-axis for each day.
df_rank <- df %>%
  group_by(date) %>%
  arrange(date, total_cases) %>%
  mutate(row_number = as.character(row_number()))
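To make the ranking step concrete, here is the same pipeline on a tiny made-up data frame (the `toy` values are invented for illustration) where Asia overtakes Europe on the second day.

```r
library(dplyr)

# Made-up data: three locations on two dates (illustration only).
toy <- data.frame(
  date        = as.Date(c("2021-01-01", "2021-01-01", "2021-01-01",
                          "2021-01-02", "2021-01-02", "2021-01-02")),
  location    = c("Asia", "Europe", "Africa", "Asia", "Europe", "Africa"),
  total_cases = c(30, 50, 10, 90, 60, 20)
)

toy_rank <- toy %>%
  group_by(date) %>%
  arrange(date, total_cases) %>%                       # smallest total first
  mutate(row_number = as.character(row_number())) %>%  # rank within each date
  ungroup()

print(toy_rank)
```

Asia moves from row_number "2" on the first day to "3" on the second, while Europe drops from "3" to "2". Plotting row_number instead of location is what lets the bars swap places.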
Now we can create the animation with the new data frame. This time we use transition_states, as it makes the order changes easier to handle.
If you have created animations in PowerPoint or Keynote, the idea will feel familiar. The state transition divides the data into multiple states based on the levels of a given column. In this case, we use the date as the state. It animates each date as a state and uses fades on enter and exit for smooth transitions.
animacion <- ggplot(df_rank, aes(x = row_number, y = total_cases, fill = location)) +
  geom_col() +
  geom_text(aes(x = row_number, y = 0, label = location), hjust = 1.1) +
  coord_flip(clip = "off", expand = FALSE) +
  theme_minimal() +
  theme(
    panel.grid = element_blank(),
    legend.position = "none",
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    plot.margin = margin(1, 4, 1, 4, "cm")
  ) +
  # below is gganimation section
  transition_states(date, state_length = 0, transition_length = 2) +
  enter_fade() +
  exit_fade() +
  ease_aes('quadratic-in-out')

animate(animacion, width = 640, height = 480, fps = 60, duration = 20, rewind = FALSE)
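With fps = 60 and duration = 20, animate has to produce 60 × 20 = 1200 frames, which can take a while. While iterating on the design, a smaller preview may be more convenient. The sketch below assumes nothing beyond gganimate's animate function; `preview_animation` and its default values are hypothetical choices, not part of the package.

```r
library(gganimate)

fps      <- 60
duration <- 20
n_frames <- fps * duration  # frames needed for the full-quality render

# Hypothetical helper: render a quick, low-frame-count preview
# of any gganimate object before committing to the full render.
preview_animation <- function(anim, nframes = 50, fps = 10) {
  animate(anim, nframes = nframes, fps = fps, width = 320, height = 240)
}

# Usage, assuming `animacion` from above exists:
# preview_animation(animacion)
```

Once the preview looks right, the full render can be written to disk with anim_save, for example anim_save("racing_bar.gif", animation = animate(animacion, fps = 60, duration = 20)).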
gganimate brings static ggplot2 charts to life and makes it easy to build creative data visualizations like the racing bar chart. However, while the animation is striking at first glance, it doesn’t show all the information at once, so it cannot replace static plots entirely. How do you like gganimate? Are you ready to try it in your next project?
I hope my stories are helpful to you.