The NBA and other sports leagues are full of interesting data and trends. We can visualize how professional basketball has changed with heat maps that track shot location.
The animation we make will look like this:
The dataset is available for download here, and it is originally from GitHub.
As always, let's load in our libraries.
library(tidyverse) #used to get the data
library(dplyr) #used to wrangle the data
library(gganimate) #used to animate the plot
library(ggplot2) #used to make the plot
Before we get started, we need to load some functions so that we can call them when we plot. These were taken directly from the author of the data, with slight modifications to make them a bit easier to understand.
There's no need to try to decipher all of that; just know that it plots the background lines for our court and that we call it with plot_court().
plot_court()
I recommend that you paste this chunk of code into your workspace, run it, then minimize it by clicking the downward arrow next to the line number where the comment ####plotting the court#### appears. Running it loads the functions into R's memory so that we can call them later.
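If you don't have the original court-drawing chunk handy, a much simpler stand-in can serve while you follow along. The sketch below is not the data author's function: plot_court_simple() is a hypothetical name, and it assumes the coordinates are in feet, with LOC_X running from -25 to 25 and the baseline at LOC_Y = 0. It only draws the court outline, the paint, and an approximate hoop.

```r
library(ggplot2)

# Hypothetical simplified stand-in for the author's plot_court().
# Assumes coordinates in feet: LOC_X from -25 to 25, baseline at LOC_Y = 0.
plot_court_simple <- function() {
  ggplot() +
    # outer boundary of the half court (50 ft wide, 47 ft deep)
    annotate("rect", xmin = -25, xmax = 25, ymin = 0, ymax = 47,
             fill = NA, colour = "black") +
    # the paint (16 ft wide, 19 ft deep)
    annotate("rect", xmin = -8, xmax = 8, ymin = 0, ymax = 19,
             fill = NA, colour = "black") +
    # approximate hoop position, 5.25 ft from the baseline
    annotate("point", x = 0, y = 5.25, shape = 1, size = 3)
}

court <- plot_court_simple()
```

Printing court should show the bare outline. The rest of the walkthrough keeps using the full plot_court().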
Let's now get started with the data.
shots <- read_csv("nba_shots.csv")
Once we have all the shots, we can filter them based on location. We only want shots that are within half-court, so we'll only keep those whose LOC_Y is 40 or less.
shots_plot <- shots |> filter(LOC_Y <= 40)
For simplicity and to allow the plot to look good, let's round each LOC_X and LOC_Y location down to the nearest integer (the floor). This is effectively making "bins" for our data to be grouped into, which makes the heat map more readable (and easier to render - there are fewer points to plot). Without this, each shot location would be a tiny point instead of an easy-to-read large square.
We'll also convert the SEASON_1 column to integers, so that it can be properly read by gganimate.
Now that the data has the right class, we can group each shot by its location (both LOC_X and LOC_Y) and the season in which it was taken. Then, we count how many shots were taken at each location in each season.
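The three steps above (binning, integer conversion, and counting) can be sketched as one small pipeline. This is a minimal sketch rather than the original code: bin_and_count() is an illustrative helper name, and toy_shots is made-up data standing in for shots_plot so the example runs on its own. The Count column name matches what the plotting code uses later.

```r
library(dplyr)

# Hypothetical toy data standing in for shots_plot (values are made up)
toy_shots <- tibble(
  LOC_X    = c(1.2, 1.9, -3.4, 1.5),
  LOC_Y    = c(5.1, 5.8, 10.2, 5.3),
  SEASON_1 = c(2004, 2004, 2004, 2005)
)

# bin_and_count() is an illustrative helper name, not from the original code
bin_and_count <- function(df) {
  df |>
    mutate(
      LOC_X    = floor(LOC_X),          # round down to form 1-unit bins
      LOC_Y    = floor(LOC_Y),
      SEASON_1 = as.integer(SEASON_1)   # integer seasons for gganimate
    ) |>
    group_by(LOC_X, LOC_Y, SEASON_1) |>
    summarize(Count = n(), .groups = "drop")  # shots per bin per season
}

toy_counts <- bin_and_count(toy_shots)
```

With the real data, the same helper would be applied as shots_plot <- bin_and_count(shots_plot), leaving one row per location and season.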
Nice. Now we're on to the plot. Let's start with a static version. Note that we call plot_court() first, since it creates the background which we'll plot over.
geom_tile is the ggplot2 geom used for heat maps; it draws a tile at each given location. In this case, it's making a tile at each binned LOC_X and LOC_Y. The fill=Count tells R that we want the tiles to be filled according to how many shots were taken at that location. The group=interaction(LOC_Y, LOC_X) is a bit tricky to understand, but it essentially means that every group/tile is one unique combination of LOC_X and LOC_Y. Finally, the alpha of 0.8 is given to make the background court visible behind our data.
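Here is a minimal sketch of the static layer just described, pieced together from the arguments named above. To keep it runnable on its own, it makes two substitutions that are assumptions of this sketch: toy_counts is made-up data standing in for the binned shots_plot, and a blank ggplot() stands in for plot_court().

```r
library(ggplot2)

# Hypothetical toy counts standing in for the binned shots_plot
toy_counts <- data.frame(
  LOC_X = c(1, 1, -4),
  LOC_Y = c(5, 6, 10),
  Count = c(20, 5, 1)
)

# ggplot() stands in for plot_court() so the sketch runs on its own;
# in the tutorial the first term is plot_court() and the data is shots_plot
static_plot <- ggplot() +
  geom_tile(
    data = toy_counts,
    aes(x = LOC_X, y = LOC_Y, fill = Count,
        group = interaction(LOC_Y, LOC_X)),  # one group per unique tile
    alpha = 0.8                              # let the court show through
  )
```

Printing static_plot draws the tiles with ggplot2's default gradient, which is why the first version of the plot looks plain.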
This doesn't look too promising, but don't worry, we're on the right track. To fix it, we can enhance some aspects of our plot, including the legend, gradient, titles, and theme. Note that we're using a log2 scale, so that layups (close shots near the basket) don't dominate the color scale.
The scale_fill_gradient2 line is important, since it sets the color scheme we'll be using. While the colors are self-explanatory, it should be noted that since we're using a log2 scale, the midpoint must take that into account. That is, the midpoint of 6 really means 2 to the 6th power (i.e. the midpoint is 64 shots).
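The arithmetic is easy to check in base R. This small sketch assumes, as the paragraph above does, that the midpoint is read on the log2 scale; the variable names are just for illustration. If your legend looks off, it may be worth checking how your installed ggplot2 version interprets midpoint.

```r
# Base R only: a midpoint of 6 on a log2 scale corresponds to 2^6 shots
midpoint_log2  <- 6
midpoint_shots <- 2^midpoint_log2

# Going the other way, log2() maps a raw count back onto the transformed scale
back_to_log2 <- log2(midpoint_shots)

# Large counts are compressed: a bin with 1024 shots sits at only 10 on this scale
busy_bin_log2 <- log2(1024)
```

This compression is what stops a handful of very busy bins near the basket from washing out everything else.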
The coord_fixed(ratio=1) line just makes sure that our squares remain squares: one unit on the y-axis is drawn the same length as one unit on the x-axis (the ratio between height and width is 1).
We're now ready to animate. Again, we're going to use transition_time for our animation, cycling over the year of the season. The new range argument is a way of avoiding as.integer in our subtitle. All it does is tell R that our animation runs over integers (the L suffix marks an integer) from 2004 to 2024.
Since we're doing this, we can just use {frame_time} in our plot, without any class conversions.
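If the L suffix is new to you, this base R sketch shows what it does. The variable names are only for illustration.

```r
year_integer <- 2004L   # the L suffix makes this an integer
year_double  <- 2004    # without it, R stores a double

is_int_with_L    <- is.integer(year_integer)   # TRUE
is_int_without_L <- is.integer(year_double)    # FALSE

# The same idea applies to the range we pass to transition_time
season_range <- c(2004L, 2024L)
```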
plot_court()+
  geom_tile(data=shots_plot, aes(x=LOC_X, y=LOC_Y, fill=Count, group=interaction(LOC_Y, LOC_X)), alpha=0.8)+
  scale_fill_gradient2(low="white", mid="white", high="red", trans="log2", midpoint=6)+
  coord_fixed(ratio=1)+
  theme_void()+
  labs(subtitle="Year: {frame_time}", #changed line
       title="NBA Shots Heatmap: Switch from Mid-range to 3-pointers",
       fill="# of Shots")+
  theme(plot.title=ggtext::element_markdown(size=16, hjust=0.5, face="bold"),
        plot.subtitle=ggtext::element_markdown(size=14, hjust=0.5, face="bold"))+
  transition_time(SEASON_1, range=c(2004L, 2024L)) #added line
That's it for the animation! Now we just feed it into an object, and animate that object with some simple specifications.
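As a rough sketch of that last step, the code below stores an animated plot in an object and wraps the render call in a small helper. Several things here are assumptions rather than the original settings: toy_counts is made-up data, ggplot() stands in for plot_court(), render_heatmap() is a hypothetical helper name, and the frame count, speed, and size are placeholder values (21 frames would give one frame per season from 2004 to 2024).

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy counts; in the tutorial this is the binned shots_plot
toy_counts <- data.frame(
  LOC_X    = c(1, -4, 1, -4),
  LOC_Y    = c(5, 10, 5, 10),
  SEASON_1 = c(2004L, 2004L, 2005L, 2005L),
  Count    = c(20, 4, 12, 9)
)

# Feed the animated plot into an object (ggplot() stands in for plot_court())
heatmap_anim <- ggplot() +
  geom_tile(
    data = toy_counts,
    aes(x = LOC_X, y = LOC_Y, fill = Count,
        group = interaction(LOC_Y, LOC_X)),
    alpha = 0.8
  ) +
  labs(subtitle = "Year: {frame_time}") +
  transition_time(SEASON_1, range = c(2004L, 2005L))

# Hypothetical helper: the nframes, fps, width, and height values are
# assumptions to adjust, not the original specifications
render_heatmap <- function(anim, nframes = 21, fps = 2) {
  animate(anim, nframes = nframes, fps = fps, width = 600, height = 600)
}
```

Calling render_heatmap(heatmap_anim) renders the animation, and gganimate's anim_save("nba_heatmap.gif") can then write the most recent animation to disk (the file name is just an example).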