Animated Word Clouds
Word clouds can help viewers easily identify data points. Animating can help to show trends over time.
For all of the basic animations, we'll use the txhousing dataset from the ggplot2 library, for simplicity. This dataset contains "information about the housing market in Texas provided by the TAMU real estate center".
We're trying to make this graph:
Let's start off like always by loading the libraries we'll be using to make the word cloud. Note that we've now added ggwordcloud to our imports.
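A minimal sketch of the imports, assuming the packages are already installed. dplyr is listed here as an assumption: it is where slice_head lives, which we rely on further down.

```r
library(ggplot2)      # plotting functions and the txhousing dataset
library(gganimate)    # transition_time() and animate()
library(dplyr)        # group_by(), arrange(), slice_head()
library(ggwordcloud)  # geom_text_wordcloud()
```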
With our libraries in place, we load the dataset the same way as before.
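As a rough sketch, the dataset ships with ggplot2, so it is available once the library is loaded. The variable name housing is only an illustrative choice.

```r
library(ggplot2)

# txhousing is bundled with ggplot2; copy it under a shorter, hypothetical name
housing <- txhousing

# Peek at the three columns the word cloud relies on
head(housing[, c("city", "date", "volume")])
```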
Now that we have our data, we need to format it to make a suitable word cloud. We'll take the volume column, which is the total value of house sales, roughly the number of sales multiplied by their price (i.e. how much money was spent), and use it as an indicator for size.
Our data is already formatted in this manner, but there are too many data points (cities). If we were to plot a word cloud with the dataset as is, it would look like this:
There's just too much going on for the viewer to follow, so we need to cut down how many cities we show per frame. Let's only show the top five cities by housing market volume for every date. We can shape our dataset to this specification by using the slice_head function:
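One way to write this, assuming dplyr is loaded. The name top_cities is hypothetical, and dropping rows with a missing volume is an extra precaution rather than something the text requires.

```r
library(ggplot2)
library(dplyr)

top_cities <- txhousing %>%
  filter(!is.na(volume)) %>%                     # ignore months with no reported volume
  group_by(date) %>%                             # one group per month
  arrange(desc(volume), .by_group = TRUE) %>%    # largest volume first within each month
  slice_head(n = 5) %>%                          # keep the first five rows of each group
  ungroup()
```

Sorting inside each group before calling slice_head is what turns "the first five rows" into "the top five cities".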
Our data now looks like this:
Perfect! We have the top five for every month. Let's get started on our word cloud.
Since word clouds are built by shuffling the words around the biggest one, it might be useful to set a seed for our plot. This way, we always follow the same randomly generated shuffle, which can be useful for debugging.
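The general idea can be illustrated with base R's set.seed. This is only a sketch of how seeding makes random draws repeatable; the value 42 is arbitrary.

```r
# Seeding before a random draw makes the draw repeatable
set.seed(42)
first_draw <- runif(3)

set.seed(42)
second_draw <- runif(3)

identical(first_draw, second_draw)  # TRUE: same seed, same "random" numbers
```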
We're now using geom_text_wordcloud to make our plot, with the label being the city name and the size being the sales volume (for a given month). We use scale_size_area to scale the city with the largest volume up to a max_size of 30, which is a somewhat arbitrary size, but I've found it shows large, readable words without any word being too big to fit inside the plot.
Finally, we'll add transition_time(date) to tell gganimate to show only the words for a given date, cycling through the dates in order.
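Put together, the plot described above might look like the following sketch. The names top_cities and cloud and the seed value are assumptions made for illustration; the data preparation is repeated so the snippet runs on its own.

```r
library(ggplot2)
library(gganimate)
library(dplyr)
library(ggwordcloud)

# Top five cities by volume for every date
top_cities <- txhousing %>%
  filter(!is.na(volume)) %>%
  group_by(date) %>%
  arrange(desc(volume), .by_group = TRUE) %>%
  slice_head(n = 5) %>%
  ungroup()

set.seed(42)  # arbitrary seed so the word placement is repeatable

cloud <- ggplot(top_cities, aes(label = city, size = volume)) +
  geom_text_wordcloud() +             # one word per city, sized by volume
  scale_size_area(max_size = 30) +    # largest word is drawn at size 30
  transition_time(date)               # one animation state per date
```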
This looks like a good start, but we're missing some key information. Let's add a descriptive title and the current year that is being displayed.
The syntax to do this is the exact same as what we've been doing earlier.
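A possible version with labels is sketched below. The title text is made up for illustration. The subtitle uses frame_time, the label variable that transition_time provides; since date in txhousing is a decimal year, floor() is used here to display the whole year.

```r
library(ggplot2)
library(gganimate)
library(dplyr)
library(ggwordcloud)

top_cities <- txhousing %>%
  filter(!is.na(volume)) %>%
  group_by(date) %>%
  arrange(desc(volume), .by_group = TRUE) %>%
  slice_head(n = 5) %>%
  ungroup()

set.seed(42)  # arbitrary seed

cloud <- ggplot(top_cities, aes(label = city, size = volume)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 30) +
  labs(
    title = "Top five Texas housing markets by sales volume",  # hypothetical title
    subtitle = "Year: {floor(frame_time)}"                      # filled in per frame
  ) +
  transition_time(date)
```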
As we finalize the plot, note that words will jump around from frame to frame, and unfortunately there is no way to stop this. Keeping the FPS low minimizes the number of jumps per second, which improves readability. Because we still have to show every frame of data we have, a lower FPS means a longer animation.
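The trade-off between FPS and duration can be sketched as follows. The value of 4 frames per second and the helper render_cloud are assumptions for illustration, not settings taken from the original. Rendering can be slow, so the sketch wraps the animate call in a function instead of running it immediately.

```r
library(ggplot2)
library(gganimate)

# One frame per distinct date in the data
n_frames <- length(unique(txhousing$date))

fps <- 4                             # assumed low frame rate; tune to taste
duration_seconds <- n_frames / fps   # fewer frames per second means a longer animation

# Hypothetical helper: render a gganimate plot with the settings above
render_cloud <- function(plot) {
  animate(plot, nframes = n_frames, fps = fps)
}
```

Calling render_cloud on the animated word cloud plot would then render one frame per date at the chosen frame rate. Halving fps doubles duration_seconds, which is the cost of keeping the word jumps readable.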
That's it! You've successfully made an animated word cloud.