Want to arrive on time? Don’t fly out of Chicago.
All data tell a story, but when observations number in the millions or more, the story can be harder to find. While mathematical differences can guide investigation of smaller data sets, large data sets require different tools. Heike Hofmann, professor of statistics, is an expert at exploring large data sets through images and animations that tell stories numbers alone simply can’t.
Each dot represents a commercial flight on March 13, 1993.
The Bureau of Transportation Statistics publishes a large data set of all commercial flights in the United States between October 1987 and December 2008. To find the big story within 123 million flight arrivals and departures, Hofmann started small and selected snapshots of 24-hour periods – about 3 a.m. to 3 a.m. Each snapshot consisted of about 25,000 cross-continental flights.
One of the 24-hour snapshots contained March 13, 1993, the day a historic snowstorm engulfed the northeastern part of the country. Hofmann created an animation to visualize flights traversing the country that day. Dots representing flights move across a map of the United States. The size of the dot represents how delayed the flight was: the larger the dot, the more delayed the flight, ranging from 15 minutes to four hours – or more.
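For readers curious how such a size encoding might work, here is a minimal sketch in Python. It is not Hofmann's code: the function name `dot_size`, the linear scaling, and the size range of 1 to 10 are assumptions chosen for illustration. Only the 15-minute and four-hour bounds come from the description above.

```python
def dot_size(delay_minutes, min_delay=15, max_delay=240,
             min_size=1.0, max_size=10.0):
    """Map a flight delay in minutes to a dot size.

    Hypothetical helper: the size range and the linear scaling are
    illustrative assumptions, not the method used in the animation.
    """
    # Delays under 15 minutes get the smallest dot; delays of four
    # hours or more all get the largest dot.
    clamped = max(min_delay, min(delay_minutes, max_delay))
    # Fraction of the way from the shortest to the longest plotted delay.
    fraction = (clamped - min_delay) / (max_delay - min_delay)
    return min_size + fraction * (max_size - min_size)


if __name__ == "__main__":
    for delay in (0, 15, 60, 240, 600):
        print(delay, round(dot_size(delay), 2))
```

Capping the scale at four hours keeps one extremely late flight from dwarfing every other dot on the map, which matches the "four hours – or more" upper bound in the description.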
At the beginning of the animation (3 a.m.), tiny dots the size of a pinhead flit across the map, but as the day moves on, increasingly large dots appear, some the size of an eraser.
On that particular day, East Coast airports shut down in the early afternoon. After 10 a.m., there was no airport activity in the northeast corner of the United States. Where did the large dots accumulate?
You guessed it: the Windy City.
At the end of the animation, a burst of large dots springs out of the Chicago area like fireworks, meaning the stranded travelers are finally on their way.
Of course, weather can’t hold all the blame for a delayed flight, but as Hofmann explored more data from different days, she found that Chicago was still prone to delays.
It was a quiet day for weather on January 19, 2006, yet the story in Chicago was the same: a burst of large dots floats out of the city at the end of the 24-hour period, representing numerous delayed flights.
“The ability to visualize pieces of data sets allows us a clear understanding of the data,” Hofmann said. “It puts the data to good use, sparking questions and, hopefully, answers to a problem.”
And now we can see evidence in support of the intuition to avoid flying out of Chicago. [/autop]
[feature_footer author="Will Stone" read_more="alumni"]