Introduction

It is no secret that COVID-19 has had, and continues to have, an impact on many aspects of life in our country. This project focuses on one of them: tourism and spending. An article written in December 2020 estimated that worldwide tourism would not return to 2019 levels until 2023, and reported that global tourism fell by nearly 65% in the first half of 2020 (Behsudi, 2020). The decline was driven by business closures, capacity restrictions, and people’s reluctance to come into close contact with others. One of the most common ways to travel for vacation is by plane, which puts passengers in close contact with many strangers, some of whom did not adhere to mask guidelines. But what about destinations that people could reach by driving? What about outdoor destinations, where social distancing is easier? These questions have led to studies of national parks. Did national parks see drastic changes in tourism during COVID-19 and in the years that followed? One study examines this question in depth by analyzing cell phone data.

The paper “COVID-19’s impact on visitation behavior to US national parks from communities of color: Evidence from mobile phone data” found that national parks located less than 347 km from individuals saw an increase in visits during the pandemic, while the opposite was true for parks more than 347 km away. The authors also examined race and found that people of color were disproportionately underrepresented among those who visited national parks during the study period (Alba et al., 2022). One reason national parks matter to our economy is the number of jobs they provide. Figure 1 shows the number of jobs supported by national parks and how many of those are local jobs. It also shows the total output per year, that is, the amount of money national parks contributed to the economy in that year. In 2020 alone, national parks contributed $28.6 billion to the economy, which shows that they are more important to our economy than some may think. According to these statistics, in 2022 national parks supported a record number of jobs and made a record contribution to the economy. Set against the recovery timeline in Behsudi’s article, national parks appear to have rebounded more quickly than expected. Figure 2 shows the number of recreational visits from 2019 to 2022.

Another thing to consider is how tourism affects the environment. The National Park Service (NPS) website states: “Over the last 30 years air quality has improved significantly in national parks and across the U.S. This is great news because parks need clean air. It is essential for the health of our visitors and employees, clean clear views of park scenery, and a healthy natural environment. But, it is also known that almost all national parks are still affected by air pollution.” (https://www.nps.gov/subjects/air/airqualityparks.htm). Do tourists have a negative impact on air quality? Does air quality affect the number of visitors a park receives? This project hopes to answer those questions. In addition to examining whether the number of visitors affects air quality, this project will look at what influences the number of visitors. The NPS website provides a large amount of data that can be used to describe the characteristics of each park.

There are many benefits to a project like this. The information mined from this data is useful for park managers and the government as well as for tourists like us. From a business perspective, parks want to attract as many visitors as possible. If this project finds that certain activities are positively correlated with the number of visitors, that would be valuable for park leadership to know. Where the landscape and geography allow, adding one of these activities might increase a park’s visitor numbers. Such a correlation could also help leadership decide how to allocate resources. For example, if parks that offer biking see more visitors than parks that don’t, it would make sense to invest in biking infrastructure, both to improve the experience of current users and to attract new visitors. This type of project can also help tourists decide where to visit. If someone loves horseback riding, for example, and one region turns out to have many parks that offer it, that would narrow down the parks for them to choose from. Ideally, the analysis will also show which activities commonly occur together, so that people can pick regions or parks to visit based on those combinations and their own interests.

This project might also uncover unexpected relationships between variables in the national park data. Something other than activities might affect visitor numbers, and it might not have come to light without this analysis. Once some relationships between visitor numbers and other variables are found, it could be helpful for the NPS to have a model that predicts how many people visit each park. This could allow the NPS to hire more employees or move more resources to a more popular park. Whatever patterns are or are not found, this project will lead to a better understanding of the characteristics of different NPS parks. It is important to understand what makes up these parks, as they are popular tourist destinations that protect the environment, provide many jobs, and have a positive impact on the economy. Digging deeper into that analysis could help identify ways to improve certain parks, or raise further questions and point to other data worth examining.

Q&A

  • Are there any features that can predict the number of activities a park offers?
    • Acreage was not a good indicator of the number of activities a park offered. Visitation numbers were a reasonable predictor of the number of activities, but not the best one. Parks designated as Historic Sites, Historical Parks, Memorials, and Monuments offered few activities. National Parks, on the other hand, offered a high number of activities.
  • Can the designation of a park (national park vs. national monument, etc.) or region be predicted by the types of activities offered?
    • Unsupervised learning, such as clustering and association rule mining (ARM), offered some insight on this. The likelihood of an activity being offered differed between regions, which is discussed in more detail in a later question. Clustering showed that National Battlefields, National Battlefield Parks, National Historic Sites, National Historical Parks, and National Monuments do not offer mountain climbing or surfing. No predictive models were built to see whether the activities offered can predict region or park type, but there seem to be patterns in the data.
  • Does the geographical area (region) of the park influence the number of visitors?
    • Region did not appear to influence the number of visitors a park had, with one exception. Parks in the Alaska region tended to have lower visitor numbers than parks in other regions, which makes sense given how remote and hard to reach Alaska is. Park designation appeared to be a better predictor of the number of visitors. Throughout the project, National Parks had more visitors than other parks, while National Historic Sites/Parks and National Battlefields/Battlefield Parks were the least visited in 2022.
  • Are there any activities that are more likely to be offered in a particular park or region?
    • The Northeast and Intermountain regions were examined since they were the two most frequent regions in the dataset. It was found that parks in the Northeast region are more likely to offer activities such as geocaching, live music, golfing, and arts and crafts to name a few. When looking at parks in the Intermountain region, they were more likely to offer canyoneering, jet skiing, whitewater rafting, and rock climbing. As you can see, these are very different types of activities and seem specific to the region.
  • Is there a difference in visitation numbers pre and post COVID-19? (If so, which parks?)
    • This question could not be answered during this project. In the data cleaning and exploratory data analysis stage, visitor trends for Great Smoky Mountains National Park were examined, since it was the most visited National Park in 2021 and 2022. The trends showed a slight dip in 2020, followed by visitor numbers higher than those before 2020. Unfortunately, the models built in this project were mostly classification based, so the trend in the time series data could not be analyzed. To answer this question properly, the average visitor numbers before and after COVID-19 could be compared using a statistical test. A more time-consuming approach would be to plot the trend for each park individually. It is also hard to know the true answer from the years 2016-2022 alone. Many more years of pre-COVID-19 data would be needed.
  • Are there any variables that can be used to predict park type or region?
    • The number of activities a park offered and its 2022 visitor numbers were found to be strong predictors of whether or not a park was classified as a National Park. Parks that offered more activities and had higher visitor numbers in 2022 were more likely to be National Parks.
  • Is air quality in a specific park correlated with the number of visitors? (Will look into parks that have air quality measuring devices)
    • This question could not be answered. The air quality data contained an ozone concentration measurement for every hour of every day from January 1, 2016 to December 31, 2022. Given this format, there was no easy way to incorporate the data into any of the models built. Had data been available for every park, the average ozone concentration for each year for each park could have been added as a column in the main dataset. With ozone data for only a handful of parks, no real analysis was possible.
  • Do activities or specific region of the park influence the number of visitors?
    • Overall, parks located in Alaska had fewer visitors and National Parks had more visitors. Among activities specifically, horse trekking was the most important. Parks that offered horse trekking were more likely to have average visitor numbers in 2022, whereas parks that didn’t offer this activity were more likely to have higher visitor numbers in 2022.
  • Do parks of similar size share any characteristics?
    • Throughout the analysis, a park’s acreage did not show any interesting relationships, and parks of similar size did not appear to share characteristics. Park location, park type, and activities appear to be better at grouping certain parks together.
  • Do the park descriptions differ based on park type?
    • For each park, the words in the park description were examined. Eleven clusters were formed, many of which showed no distinction between park types. National Parks, however, made up a very large portion of one cluster, while National Historic Sites, National Historical Parks, and National Monuments made up a large portion of another. It appears that there is a distinction in the park descriptions between these two clusters and that Historical Parks/Historic Sites and Monuments have similar types of descriptions.
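
The statistical comparison suggested for the pre- and post-COVID-19 question could be sketched as a paired test on per-park averages. The following is a minimal illustration, not the analysis performed in this project: the park names and visit counts are made up, the choice of 2017-2019 as "pre" and 2021-2022 as "post" (leaving out 2020) is an assumption, and `paired_t_statistic` is a hypothetical helper written for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical yearly recreation visits (in thousands) per park.
# These numbers are invented for illustration and are NOT NPS data.
visits = {
    "Park A": {2017: 90, 2018: 100, 2019: 110, 2021: 115, 2022: 125},
    "Park B": {2017: 190, 2018: 200, 2019: 210, 2021: 205, 2022: 215},
    "Park C": {2017: 40, 2018: 50, 2019: 60, 2021: 75, 2022: 85},
    "Park D": {2017: 290, 2018: 300, 2019: 310, 2021: 310, 2022: 330},
}

# Assumed split: 2020 is left out as the disruption year.
PRE_YEARS = (2017, 2018, 2019)
POST_YEARS = (2021, 2022)


def paired_t_statistic(visits_by_park, pre_years, post_years):
    """Return (mean difference, t statistic, degrees of freedom).

    Each park contributes one difference: its average post-period
    visits minus its average pre-period visits.
    """
    diffs = []
    for by_year in visits_by_park.values():
        pre_avg = mean(by_year[y] for y in pre_years)
        post_avg = mean(by_year[y] for y in post_years)
        diffs.append(post_avg - pre_avg)

    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample std dev of the differences
    return mean(diffs), mean(diffs) / standard_error, n - 1


mean_diff, t_stat, dof = paired_t_statistic(visits, PRE_YEARS, POST_YEARS)
print(f"mean change: {mean_diff}, t = {t_stat:.3f}, df = {dof}")
```

The resulting statistic would be compared against a t distribution with the returned degrees of freedom. With only a few pre-COVID-19 years available, any such result should be read cautiously.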
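
The yearly ozone averages mentioned for the air quality question could be derived from hourly readings roughly as follows. This is only a sketch under assumptions: the record layout (park label, ISO timestamp, ozone value), the park labels, and the readings themselves are hypothetical and do not reflect the actual format of the NPS air quality files.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical hourly readings: (park label, ISO timestamp, ozone value).
# Labels and values are invented for illustration only.
readings = [
    ("PARK_A", "2016-01-01T00:00", 30.0),
    ("PARK_A", "2016-01-01T01:00", 34.0),
    ("PARK_A", "2017-06-01T12:00", 50.0),
    ("PARK_B", "2016-01-01T00:00", 40.0),
    ("PARK_B", "2016-01-01T01:00", 44.0),
]


def yearly_mean_ozone(hourly_readings):
    """Collapse hourly readings into one mean value per (park, year)."""
    buckets = defaultdict(list)
    for park, timestamp, ozone in hourly_readings:
        year = datetime.fromisoformat(timestamp).year
        buckets[(park, year)].append(ozone)
    return {key: mean(values) for key, values in buckets.items()}


for (park, year), value in sorted(yearly_mean_ozone(readings).items()):
    print(park, year, value)
```

Each (park, year) average could then be joined to the main dataset as a column, although, as noted above, this only helps for the handful of parks that have ozone monitors.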

Sources:

Alba, C., Pan, B., Yin, J., Rice, W. L., Mitra, P., Lin, M. S., & Liang, Y. (2022, August 4). COVID-19’s impact on visitation behavior to US national parks from communities of color: Evidence from mobile phone data. Scientific Reports. https://www.nature.com/articles/s41598-022-16330-z#citeas

Behsudi, A. (2020, December 1). Impact of the pandemic on tourism – IMF F&D. IMF. https://www.imf.org/en/Publications/fandd/issues/2020/12/impact-of-the-pandemic-on-tourism-behsudi#:~:text=In%20the%20first%20half%20of,in%20a%20post%2Dpandemic%20world.