Animated 2D Point Maps
Looking at flight data can reveal the busiest times for air traffic, as well as the times when the sky is nearly empty.
In this module, we'll be making this animation:
The data is available here. Download all the folders into one folder, then extract each .tar file to generate the CSV for each hour. Note that the files total around 14 GB. If you want to work with smaller files, consider following the tutorial below but using this file as a reference instead (it only requires about 400 MB of space).
Let's load in the libraries:
Since the data is a bit scattered, our first step is organizing it into a single data frame. We can do this with a simple function.
Since the data is divided into hours, and each file has a label that corresponds to the hour of data it's carrying, we can write a simple sprintf call that takes an integer representing an hour and builds the path to that hour's data.
While the syntax might look a bit daunting, the first line of the function simply reads the data at a location built from two hour numbers, each always formatted as two digits (with a leading zero if needed).
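As a minimal sketch of the padding behaviour described above, the helper below builds a path with sprintf. The function name hour_path and the folder and file names are hypothetical placeholders, not the real names in the download.

```r
# Hypothetical layout: each hour lives in its own folder holding one CSV.
# The folder and file names here are placeholders, not the real ones.
hour_path <- function(hour) {
  # %02d pads the integer to two digits, so 5 becomes "05"
  sprintf("data/hour-%02d/states-%02d.csv", hour, hour)
}

hour_path(5)
hour_path(17)
```

The resulting string could then be passed to a reader such as read.csv.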
Now that the data for the hour is loaded, we can handle the conversions in the same function, to get things done all at once. We'll start by converting the time given for all the data in each hour into a date-time that R can read, using as_datetime.
Next, we'll take the floor of every time in the dataset, rounding it down to the hour. The original data has a record of every flight every 10 seconds, which is very powerful, but a bit too much for our plotting methods in R. So we're cutting things down by up to 360-fold, keeping only one data point per plane for every hour.
Once the time column is at hourly precision, we filter out any rows that are missing the characteristics we care about. Next, we group by icao24, or each plane's 'serial number', arrange the column in descending order so it's easier to view, and choose the first (arbitrary) logged location for the plane, so that for every hour we have exactly one location for each plane that was flying at some point during that time.
All together, this function reads in an hour, and gives a formatted data frame with a single location for all the planes logged during that hour, which will decrease our plotting time drastically, without sacrificing too much detail.
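The reduction steps above can be sketched on a toy data frame. This is an illustration under assumptions, not the original function: the name reduce_hour and the column names (time, lat, lon, heading) are assumed, the input is a data frame rather than a file, and the sorting step is left out because the text does not say which column is sorted.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for one hour of raw records; the column names are assumptions.
raw <- data.frame(
  time    = c(1546300800, 1546300810, 1546300820, 1546300830),
  icao24  = c("a1b2c3", "a1b2c3", "d4e5f6", "d4e5f6"),
  lat     = c(40.1, 40.2, NA, 33.9),
  lon     = c(-75.0, -75.1, -118.3, -118.4),
  heading = c(90, 92, 270, 268)
)

reduce_hour <- function(df) {
  df %>%
    # Unix seconds -> date-time, then round down to the start of the hour
    mutate(time = floor_date(as_datetime(time), unit = "hour")) %>%
    # Drop records missing any field needed for plotting
    filter(!is.na(lat), !is.na(lon), !is.na(heading)) %>%
    # Keep the first remaining record for each aircraft
    group_by(icao24) %>%
    slice(1) %>%
    ungroup()
}

hourly <- reduce_hour(raw)
```

Each aircraft ends up with a single row whose time is the start of the hour.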
With our function in place, we can run it over a 24-hour iteration, to bind together a large dataset with hourly locations for all planes during the 24-hour window.
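A minimal sketch of such a loop is below. The function load_hour is a hypothetical stand-in that returns two made-up rows per hour; in practice it would be the per-hour function described earlier.

```r
# Hypothetical stand-in for the per-hour function; the real one would
# read and reduce that hour's CSV instead of returning made-up rows.
load_hour <- function(hour) {
  data.frame(hour = hour, icao24 = c("a1b2c3", "d4e5f6"))
}

flights <- data.frame()
for (hour in 0:23) {
  print(hour)                                  # progress indicator
  flights <- rbind(flights, load_hour(hour))   # append this hour's rows
}
```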
We're printing here to track our progress: this step takes about three minutes to run, so it helps to know exactly how far along we are in that process (and whether there's an error). Once that's done, we'll get a data frame that looks like this:
Even though we cut the data down to just one log per plane every hour, the dataset is still very large: nearly 300,000 rows. To narrow it further, we can keep only American Airlines flights, which have a call sign that starts with "AAL":
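One way to express this filter, sketched here in base R on made-up rows (the column name callsign is an assumption):

```r
# Made-up rows; the column name callsign is an assumption.
flights <- data.frame(
  icao24   = c("a1b2c3", "d4e5f6", "0a1b2c", "3d4e5f"),
  callsign = c("AAL100", "UAL200", "AAL2451", "DAL89"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose call sign begins with "AAL"
aal <- flights[startsWith(flights$callsign, "AAL"), ]
```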
There are two parts to each plane we're showing on our plot: one is the location of the aircraft itself, and the other is a tail extending from the aircraft in the direction opposite to its heading, to show direction.
The former of these two parts is easy to implement:
We just convert the data frame's latitude and longitude points into an sf object using st_as_sf, and set the Coordinate Reference System to the standard EPSG:4326 (WGS 84).
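As a small sketch of that conversion, assuming columns named lon and lat and using made-up coordinates:

```r
library(sf)

# Made-up positions; the column names lon and lat are assumptions.
planes <- data.frame(
  icao24 = c("a1b2c3", "d4e5f6"),
  lon    = c(-75.0, -118.4),
  lat    = c(40.1, 33.9)
)

# coords names the x (longitude) column first, then the y (latitude) column
points <- st_as_sf(planes, coords = c("lon", "lat"), crs = 4326)
```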
The latter part of each aircraft is a bit more complicated to create:
Essentially, we're just adding a longitude and latitude pair to act as an end point for the line we're making to represent the trail. The longitude and latitude are created using some trigonometry:
Since the heading is measured as the angle from the vertical line (clockwise), we manipulate it as described in the image above to align it within our triangle. From here, it's simple to see how our formula is generated. Since R's cosine and sine functions take radians, we multiply 90 - heading by pi/180 to convert it from degrees. Then we divide the result by 100 to scale it down (otherwise the tails would be too long) and subtract it from the original lon and lat points to put our tail behind the point.
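The arithmetic above can be checked on two made-up planes, one heading north (0 degrees) and one heading east (90 degrees). The column names lon2 and lat2 for the tail end point are assumptions.

```r
# Two made-up planes at the same spot: one heading north, one heading east.
planes <- data.frame(
  lon     = c(-100, -100),
  lat     = c(40, 40),
  heading = c(0, 90)
)

# Convert the compass heading to a standard math angle in radians
theta <- (90 - planes$heading) * pi / 180

# Step 1/100 of a degree backwards along the heading to get the tail end
planes$lon2 <- planes$lon - cos(theta) / 100
planes$lat2 <- planes$lat - sin(theta) / 100
```

The northbound plane gets a tail end directly south of it, and the eastbound plane gets one directly west of it, so each tail trails behind its point.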
With these newly created columns, we can convert our two points into a linestring using the sprintf and st_as_sf functions. The sprintf call makes a linestring character string with each of our floating-point coordinates (referenced by a %f). Then, we indicate to R that we want to convert the data held within the "geom" column to an sf object. As always, we then set our Coordinate Reference System to the standard 4326.
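A sketch of that step on a single made-up row follows. The "geom" column name comes from the text; lon2 and lat2 are assumed names for the tail end point.

```r
library(sf)

# One made-up plane with its tail end point already computed.
planes <- data.frame(
  icao24 = "a1b2c3",
  lon = -100, lat = 40,
  lon2 = -100.01, lat2 = 40
)

# Build a well-known text (WKT) string; each %f is one coordinate
planes$geom <- sprintf(
  "LINESTRING(%f %f, %f %f)",
  planes$lon, planes$lat, planes$lon2, planes$lat2
)

# Parse the WKT column into geometries, then assign the CRS
tails <- st_as_sf(planes, wkt = "geom")
tails <- st_set_crs(tails, 4326)
```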
With all our data in place, we'll now get the data for the plot of the U.S. underneath the planes, with a standard call to the maps package, then converting that call to a plottable sf object.
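One possible form of that call is sketched below. The choice of the "usa" database is an assumption; the "state" database would work the same way if state borders are wanted.

```r
library(sf)
library(maps)

# plot = FALSE returns the outline instead of drawing it;
# fill = TRUE returns closed polygons rather than loose line segments.
usa <- st_as_sf(map("usa", plot = FALSE, fill = TRUE))
```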
We're now ready to plot everything. Since this plot takes some time to load, I'll just jump to the final iteration of our plot.
The geom_sf calls are all fairly standard: remember that we're including the group attribute to indicate we want each plane to animate over its own locations. The alpha is set below 1 so that we can see the underlying map.
We'll use theme_void to get rid of the background axes and format our frame_time variable to the XX:XX AM/PM format, using %I:%M %p. Next, we'll limit our plot to only display the continental U.S., with limits at the vertical and horizontal end points of the country. Since we're not displaying the axes, we also leave the X and Y axes unlabeled. After we format the title and subtitle, we call transition_time to indicate we're cycling through different times continuously. Finally, we call enter_fade and exit_fade with alphas of 0 so that planes fade in and out rather than appearing and disappearing abruptly.
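The pieces described above might fit together roughly as follows. This is a sketch on toy data, not the original plot: the coordinate limits, alpha value, title text, and column names are assumptions, and the tail layer is left out for brevity (it would be a second geom_sf call with the same group aesthetic).

```r
library(sf)
library(maps)
library(ggplot2)
library(gganimate)

usa <- st_as_sf(map("usa", plot = FALSE, fill = TRUE))

# Toy data: one plane at two hourly positions (all values are made up).
planes <- data.frame(
  icao24 = "a1b2c3",
  time   = as.POSIXct(c("2019-01-01 00:00:00", "2019-01-01 01:00:00"),
                      tz = "UTC"),
  lon    = c(-100, -95),
  lat    = c(40, 40)
)
points <- st_as_sf(planes, coords = c("lon", "lat"), crs = 4326)

p <- ggplot() +
  geom_sf(data = usa) +
  # group tells gganimate which rows belong to the same plane
  geom_sf(data = points, aes(group = icao24), alpha = 0.5) +
  theme_void() +
  # Approximate bounding box of the continental U.S. (assumed values)
  coord_sf(xlim = c(-125, -66), ylim = c(24, 50)) +
  labs(
    title    = "American Airlines flights",
    subtitle = "{format(frame_time, '%I:%M %p')}",
    x = NULL, y = NULL
  ) +
  transition_time(time) +
  enter_fade(alpha = 0) +
  exit_fade(alpha = 0)
```

Building p does not render anything yet. Rendering would then be a call such as animate(p), which can be followed by anim_save to write the result to a file.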
We can now render the animation.
After a few minutes, we'll get our finished animation.