Animated Maps
Maps present the opportunity for immensely valuable data visualizations. Animations can help to elevate these visuals.
For all of the basic animations, we'll use the txhousing dataset within the ggplot2 library, for simplicity. This dataset contains "information about the housing market in Texas provided by the TAMU real estate center".
Our goal is to reproduce this plot:
As always, we start by importing libraries. This time, there are two new ones.
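A minimal sketch of the imports this page relies on. The exact list is an assumption pieced together from the functions used below, and the text does not say which two packages are the new ones.

```r
# Assumed package list, inferred from the functions used on this page
library(ggplot2)       # plotting, plus the txhousing dataset
library(gganimate)     # transition_time() and animate()
library(dplyr)         # joins and general data wrangling
library(sf)            # simple features: st_as_sf(), st_set_crs()
library(maps)          # maps::map() for the state outlines
library(tidygeocoder)  # turning city names into latitude/longitude
```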
We'll start by getting the backdrop for our plot.
This line is fairly complicated. The maps::map() call uses the maps package to draw a geographical map. The "state" argument means that we are drawing the continental U.S. with states as polygons. fill = TRUE indicates that we'd like the polygons to (eventually) have colors. plot = FALSE just means we won't plot the map right now.
maps::map() returns a map object, which must be turned into an sf object if we want to plot it within ggplot. To do this we call st_as_sf() (the st_ prefix stands for "spatial type") to turn the map into an sf (simple features) object. This object is then stored in the variable usa.
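The call described above can be sketched as follows. It is reconstructed from the description, so the original line may have been formatted differently.

```r
library(sf)
library(maps)

# "state": continental U.S. with one polygon per state
# fill = TRUE: return closed polygons that can be filled later
# plot = FALSE: return the map object instead of drawing it
usa <- st_as_sf(maps::map("state", fill = TRUE, plot = FALSE))

class(usa)  # includes "sf", so geom_sf() can draw it
```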
Going back to the basics, we read txhousing into a new data frame for viewing.
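A short sketch of this step. The name txhousing_data is the one used later on this page.

```r
library(ggplot2)

# Copy the built-in dataset into our own data frame
txhousing_data <- txhousing

# Peek at the first few rows and the column names
head(txhousing_data)
names(txhousing_data)
```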
To plot the data on a map, we'll need each city to correspond to a point with an appropriate latitude and longitude. Let's prepare to do this by making a new data frame exclusively for cities, which we can then merge back with txhousing_data. In our new dataset, to save time geocoding, each city should only be featured once:
Essentially, we're taking all the unique city names and making them a data frame, then renaming the column name to save some typing later on.
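One way this could look, as a sketch. The column name city is an assumption based on the join described later.

```r
library(ggplot2)

txhousing_data <- txhousing

# One row per distinct city name, so each city is geocoded only once
cities <- data.frame(unique(txhousing_data$city))

# Give the single column a short, predictable name
colnames(cities) <- "city"

head(cities)
```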
We're now going to take each of the cities and find both its latitude and longitude using the geo_osm function in the tidygeocoder package. The geo_osm function returns a tibble with the name, latitude, and longitude, so putting it into the corresponding cities data frame is simple. I've concatenated ", Texas" to the end of each city name to make it easier for the function to identify certain ambiguous names (e.g. "Paris", which is a city in both Texas and France).
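A hedged sketch of the geocoding step. The text names geo_osm; this sketch swaps in geo() with method = "osm" from the same package, which to my understanding is the current form of that call. The helper names make_queries and geocode_cities are made up for illustration. The lookup is wrapped in a function because it needs network access and sends one request per city.

```r
library(tidygeocoder)

# Hypothetical helper: "Paris" becomes "Paris, Texas" to avoid ambiguity
make_queries <- function(city_names) {
  paste0(city_names, ", Texas")
}

# Hypothetical helper: look up every city and attach the results
geocode_cities <- function(cities) {
  # geo() returns a tibble with columns address, lat and long,
  # one row per query, in the same order as the input
  coords <- geo(address = make_queries(cities$city), method = "osm")
  cities$address <- coords$address
  cities$lat <- coords$lat
  cities$long <- coords$long
  cities
}

# Usage (requires an internet connection):
# cities <- geocode_cities(cities)
```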
Let's join this data frame back with the original. First, we'll need to update the city column in txhousing_data to match the change we made for our geo_osm call:
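A sketch of that change, assuming it means appending the same ", Texas" suffix that was added to the geocoding queries.

```r
library(ggplot2)

txhousing_data <- txhousing

# Make the city names look like the geocoded addresses,
# e.g. "Abilene" becomes "Abilene, Texas"
txhousing_data$city <- paste0(txhousing_data$city, ", Texas")

head(unique(txhousing_data$city))
```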
We'll then join the data together, connecting our city column to the address column:
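The join can be sketched like this. To keep the example runnable offline, the cities table here is a small hard-coded stand-in for the geocoder output with approximate coordinates, and inner_join is used so only those three cities survive. With the full geocoded table, a left join would keep every row.

```r
library(ggplot2)
library(dplyr)

txhousing_data <- txhousing
txhousing_data$city <- paste0(txhousing_data$city, ", Texas")

# Stand-in for the geocoded table (approximate, hard-coded coordinates)
cities <- data.frame(
  address = c("Austin, Texas", "Dallas, Texas", "Houston, Texas"),
  lat     = c(30.27, 32.78, 29.76),
  long    = c(-97.74, -96.80, -95.37)
)

# Match txhousing_data$city against cities$address
txhousing_data <- inner_join(txhousing_data, cities,
                             by = c("city" = "address"))

head(txhousing_data)
```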
Now that we have the latitude and longitude points for each city in our original data frame, we can convert these points to geometry in R.
For each coordinate pair, we use the st_as_sf function to convert the "regular" lat and long numbers into actual geometric points. Then, to make sure alignment is proper, we set a Coordinate Reference System using st_set_crs, since there are a number of ways to align latitude and longitude along a 2D plane. For most situations, EPSG code 4326 (WGS 84, the World Geodetic System 1984) is your go-to.
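A small sketch of the conversion. The three rows are hard-coded, approximate coordinates used only so the example runs on its own. Note that coords expects the x column (longitude) first.

```r
library(sf)

# Approximate coordinates, hard-coded for illustration
pts <- data.frame(
  city = c("Austin", "Dallas", "Houston"),
  lat  = c(30.27, 32.78, 29.76),
  long = c(-97.74, -96.80, -95.37)
)

# Turn the numeric columns into point geometries (x = long, y = lat)
pts_sf <- st_as_sf(pts, coords = c("long", "lat"))

# Declare that the numbers are WGS 84 longitude/latitude
pts_sf <- st_set_crs(pts_sf, 4326)

pts_sf
```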
Now that we've got all our data in order, we can start to plot. We're now using geom_sf() to plot "simple features" using ggplot. We have a few specifications: the color of the inner circle should be the median sale price, and both the inner and outer circle should have sizes that reflect the number of listings/sales for that city. Obviously, there are always at least as many listings as there are sales, so listings will be the outer circle and sales the inner.
Note that the shape parameter in the first geom_sf just means the circle will be unfilled (the outline will be gray).
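The layers described above might be stacked like this. It is a sketch for a single month and three cities with hard-coded, approximate coordinates, not the full geocoded dataset. shape = 1 is the open circle.

```r
library(ggplot2)
library(sf)
library(maps)

usa <- st_as_sf(maps::map("state", fill = TRUE, plot = FALSE))

# Approximate coordinates, hard-coded so the sketch runs offline
coords <- data.frame(
  city = c("Austin", "Dallas", "Houston"),
  lat  = c(30.27, 32.78, 29.76),
  long = c(-97.74, -96.80, -95.37)
)
housing <- merge(subset(txhousing, year == 2010 & month == 1),
                 coords, by = "city")
housing_sf <- st_as_sf(housing, coords = c("long", "lat"), crs = 4326)

p <- ggplot() +
  geom_sf(data = usa) +                                # backdrop
  geom_sf(data = housing_sf, aes(size = listings),
          shape = 1, color = "gray") +                 # outer ring: listings
  geom_sf(data = housing_sf,
          aes(size = sales, color = median))           # inner dot: sales

p
```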
We're almost done. Let's clean up this chart a little by fixing the scales and adding titles.
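A sketch of what that cleanup could look like. The size range, color choices and all label text are placeholders rather than the values used for the target plot, and the three cities use hard-coded, approximate coordinates.

```r
library(ggplot2)
library(sf)

# Approximate coordinates, hard-coded so the sketch runs offline
coords <- data.frame(
  city = c("Austin", "Dallas", "Houston"),
  lat  = c(30.27, 32.78, 29.76),
  long = c(-97.74, -96.80, -95.37)
)
housing <- merge(subset(txhousing, year == 2010 & month == 1),
                 coords, by = "city")
housing_sf <- st_as_sf(housing, coords = c("long", "lat"), crs = 4326)

p <- ggplot(housing_sf) +
  geom_sf(aes(size = sales, color = median)) +
  # Placeholder scale settings: control the circle sizes and the palette
  scale_size_continuous(range = c(2, 12), name = "Count") +
  scale_color_gradient(low = "lightblue", high = "darkblue",
                       name = "Median sale price",
                       labels = scales::label_dollar()) +
  # Placeholder title text
  labs(title = "Texas housing market",
       subtitle = "Listings (outer) and sales (inner) by city")

p
```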
Having all of America here isn't really necessary as all our data is within Texas, so we can crop the visible area of our graph with a self-explanatory coord_sf call.
Here, the X axis (longitude) goes from -107 to -90 (or 107° W to 90° W) and the Y axis (latitude) goes from 25 to 37 (or 25° N to 37° N).
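With the limits quoted above, the call can be sketched as follows, shown here on the backdrop alone.

```r
library(ggplot2)
library(sf)
library(maps)

usa <- st_as_sf(maps::map("state", fill = TRUE, plot = FALSE))

p <- ggplot(usa) +
  geom_sf() +
  # Zoom to Texas: longitude -107 to -90, latitude 25 to 37
  coord_sf(xlim = c(-107, -90), ylim = c(25, 37))

p
```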
Finally, we can add transition_time(date) as usual, to tell R that each frame shows the data points for a different point in time, as specified by the date column (smoothing the in-between areas). We'll also add the same {as.integer(frame_time)} as before, to update the viewer on the current data they're seeing.
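A stripped-down sketch of these two additions, again on three cities with hard-coded, approximate coordinates. The date column in txhousing is a decimal year, so as.integer(frame_time) shows the year. The title wording is a placeholder.

```r
library(ggplot2)
library(gganimate)
library(sf)

# Approximate coordinates, hard-coded so the sketch runs offline
coords <- data.frame(
  city = c("Austin", "Dallas", "Houston"),
  lat  = c(30.27, 32.78, 29.76),
  long = c(-97.74, -96.80, -95.37)
)
housing <- merge(txhousing, coords, by = "city")
housing_sf <- st_as_sf(housing, coords = c("long", "lat"), crs = 4326)

anim <- ggplot(housing_sf) +
  geom_sf(aes(size = sales, color = median)) +
  # frame_time is filled in by gganimate for every frame
  labs(title = "Year: {as.integer(frame_time)}") +
  # One state per value of date, interpolated in between
  transition_time(date)
```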
We're done! Now we can just assign our plot to an object and render it using the animate function, with some appropriate parameters.
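A minimal rendering sketch. The frame count, frame rate and dimensions are placeholder values, and the plot is a cut-down version with three cities at hard-coded, approximate coordinates. It assumes the transformr package is installed, which gganimate needs in order to tween sf layers.

```r
library(ggplot2)
library(gganimate)
library(sf)

# Approximate coordinates, hard-coded so the sketch runs offline
coords <- data.frame(
  city = c("Austin", "Dallas", "Houston"),
  lat  = c(30.27, 32.78, 29.76),
  long = c(-97.74, -96.80, -95.37)
)
housing <- merge(txhousing, coords, by = "city")
housing_sf <- st_as_sf(housing, coords = c("long", "lat"), crs = 4326)

# Store the animated plot in an object
animation <- ggplot(housing_sf) +
  geom_sf(aes(size = sales, color = median)) +
  labs(title = "Year: {as.integer(frame_time)}") +
  transition_time(date)

# Render it; placeholder settings keep this quick
# (requires the transformr package for sf layers)
rendered <- animate(animation, nframes = 20, fps = 5,
                    width = 600, height = 450)
```

Raising nframes gives a smoother result at the cost of a longer render.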
Congrats, you've made it through all the basic animations, and are ready to tackle the advanced ones.