Animated Maps
Maps present the opportunity for immensely valuable data visualizations. Animations can help to elevate these visuals.
For all of the basic animations, we'll use the txhousing
dataset within the ggplot2
library, for simplicity. This dataset contains "information about the housing market in Texas provided by the TAMU real estate center".
Our goal is to reproduce this plot:

Like always, we start by importing libraries. This time, there's two new ones.
library(ggplot2) #plotting and getting the txhousing dataset
library(dplyr) #sorting the data to fit our graph
library(gganimate) #animating the ggplot
library(sf) #sf = shapefile, specifically used to plot maps
library(tidygeocoder) #used to get latitude and longitude points from locations
We'll start by getting the backdrop for our plot.
usa <- st_as_sf(maps::map("state", fill=TRUE, plot=FALSE)) #requires 'maps' package is installed
This line is farily complicated. The maps::map()
call uses the maps package to draw a geographical map. The state
parameter means that we are drawing the continental U.S. with states as polygons. Fill=TRUE
indicates that we'd like the polygons to (eventually) have colors. Plot=FALSE
just means we won't plot the map right now.
maps::map()
returns a map object, which must be turned into an sf
if we want to plot it within ggplot
. To do this we call st_as_sf()
to turn the map (st
: spacial type) into an sf
(simple features). This object is then stored in the variable usa
.

Going back to the basics, we read txhousing into a new data frame for viewing.
txhousing_data <- txhousing

To plot the data on a map, we'll need each city to correspond to a point with an appropriate latitude and longitude. Let's prepare to do this my making a new data frame exclusively for cities, that we can then merge back with txhousing_data
. In our new dataset, to save time geocoding, each city should only be featured once:
cities <- data.frame(unique(txhousing_data$city)) #df of unique cities
names(cities)[names(cities) == 'unique.txhousing_data.city.'] <- 'city' #rename column name for simplicitydataframe

Essentially, we're taking all the unique city names and making them a data frame, then renaming the column name to save some typing later on.
We're now going to take each of the cities, and find both the latitude and longitude using the geo_osm
function in the tidygeocoder
package. The geo_osm
function returns a tibble with the name, latitude, and longitude, so putting it into the corresponding cities
dataframe is simple. I've concatenated ", Texas" to the end of the cities to make it easier for the function to identify certain ambiguous names (i.e. "Paris", which is a city in both Texas and France).
cities <- geo_osm(paste0(cities$city, ", Texas"))

Let's join this data frame back with the original. First, we'll need to rename the cities
column in the txhousing_data
to match the change we made to our geo_osm
call:
txhousing_data$city <- paste0(txhousing_data$city,", Texas")

We'll then join the data together, connecting our city
column to the address
column:
txhousing_data <- left_join(txhousing_data, cities, by=join_by('city'=='address'))

Now that we have the latitude and longitude points for each city in our original data frame, we can convert these points to geometry in R.
sf_txhousing_data <- txhousing_data |>
filter(!is.na(long)) |> #NAs are not allowed to be converted
filter(!is.na(lat)) |> #NAs are not allowed to be converted
st_as_sf(coords = c('long','lat')) |> #convert the 'lat' and 'long' coordinates to points
st_set_crs(4326) #set a standard coordinate system
For each coordinate pair, we use the st_as_sf
function to convert the "regular" lat and long numbers into actual geometric points. Then, to make sure alignment is proper, we set a Coordinate Reference System using st_set_crs
, since there are a number of ways to align latitude and longitude along a 2D plane. For most situations, 4326 (World Geodetic System) is your go-to.

Now that we've got all our data in order, we can start to plot. We're now using geom_sf()
to plot "simple features" using ggplot
. We have a few specifications: the color
of the inner circle should be the median sale price, and both the inner and outer circle should have size
s that reflect the number of listings/sales for that city. Obviously, there are always at least as many listings as there are sales, so listings will be the outer circle and sales the inner.
Note that the shape parameter in the first geom_sf
just means the circle will be unfilled (the outline will be gray).
ggplot()+
geom_sf(data=usa)+
geom_sf(data=sf_txhousing_data, aes(size=listings), shape=1, color="gray")+
geom_sf(data=sf_txhousing_data, aes(size=sales, color=median))

We're almost done. Let's clean up this chart a little by fixing the scales and adding titles.
ggplot()+
geom_sf(data=usa)+
geom_sf(data=sf_txhousing_data, aes(size=listings), shape=1, color="gray")+
geom_sf(data=sf_txhousing_data, aes(size=sales, color=median))+
viridis::scale_color_viridis(option="B", label=scales::comma)+ #requires the 'viridis', 'scales' packages are installed
scale_size_continuous(label=scales::comma, range = c(1,15))+
theme_void()+
labs(title="Texas Housing Data from 2000-2015",
subtitle="The outer circle is how many houses were listed.<br>The inner circle is how many houses were sold and for what price.",
color="Median Sale Price",
size="Number of Listings/Sales")+
theme(plot.title = ggtext::element_markdown(size = 22, hjust =0.5, face = "bold"),
plot.subtitle = ggtext::element_markdown(size = 15, hjust =0.5, face = "bold")) #requires the 'ggtext' package is installed

Having all of America here isn't really necessary as all our data is within Texas, so we can cut off the coordinates of our graph with a self-explanatory coord_sf
call.
+coord_sf(xlim = c(-107, -90), ylim = c(25, 37))
Here, the X (longitude) goes from -107 to -90 (or 107 W to 90 W) and Y (latitude) goes from 25 to 37 (or 25 N to 37 N).
Finally, we can add transition_time(date)
as usual, to tell R that each frame we see are cycling/seeing data points from different points of time, as specified by the date
(and smoothing the in-between areas). We'll also add the same {as.integer(frame_time)}
as before, to update the viewer on the current data they're seeing.
ggplot()+
geom_sf(data=usa)+
geom_sf(data=sf_txhousing_data, aes(size=listings), shape=1, color="gray")+
geom_sf(data=sf_txhousing_data, aes(size=sales, color=median))+
viridis::scale_color_viridis(option="B", label=scales::comma)+
scale_size_continuous(label=scales::comma, range = c(1,15))+
theme_void()+
labs(title="Texas Housing Data from 2000-2015",
subtitle="The outer circle is how many houses were listed.<br>The inner circle is how many houses were sold and for what price.
<br><br>Year: {as.integer(frame_time)}<br>", #changed line
color="Median Sale Price",
size="Number of Listings/Sales")+
theme(plot.title = ggtext::element_markdown(size = 22, hjust =0.5, face = "bold"),
plot.subtitle = ggtext::element_markdown(size = 15, hjust =0.5, face = "bold"))+
coord_sf(xlim = c(-107, -90), ylim = c(25, 37))+ #added line
transition_time(date) #added line
We're done! Now we can just give our animation to an object and animate the object using the animate function, with some appropriate parameters.
animation <- ggplot()+
#continued from above
animate(animation, fps=10, duration=15, end_pause=30, height = 8,
width = 9, units = "in", res = 200)
Congrats, you've made it through all the basic animations, and are ready to tackle the advanced ones.
Last updated