My Earth Science Data Portfolio


Earth Data Analytics Certificate Course 2024

View My GitHub Profile

Documenting my Learning Adventure

This adventure was made possible through the University of Colorado Boulder and the Environmental Data Science Innovation and Inclusion Lab.

Contact Information

I have been the Project Coordinator for the Smoky Mountain STEM Collaborative (SMSC) at SCC since 2018. SMSC is part of NASA’s SciAct community. Southwestern is the nation’s only community college to have a collaborative partnership with NASA, which started in 2015. My background is in science education, with 16 years of experience teaching high school, and I have an M.S. in Biology with many years of teaching at the college level. I also worked as the assistant director for Western Carolina University’s Upward Bound Math & Science program in the 1990s. My passions are inspiring young people to answer their own questions and supporting those who live and work in western North Carolina.

My Projects for the Earth Data Analytics Professional Graduate Certificate

Adding a Map - we learned to:

SMSC is located at Southwestern Community College, shown in the map below.

The land where SCC is located is the ancestral home of the Eastern Band of Cherokee Indians in the mountains of western North Carolina, not far from the Qualla Boundary where the Cherokee People reside. This is my home too, and I hope to foster stewardship of this land through data. Creating an interactive map using OpenStreetMap was our first assignment.

Climate Coding Challenge - we learned to:

A graph showing a linear regression of mean annual temperatures in Denver, CO

A graph showing a linear regression of mean annual temperatures at Coweeta Hydrologic Lab

Here is my Python code showing extraction and analysis of data from Coweeta Hydrologic Lab (elevation 2200-3800 ft).

I chose Coweeta because it is near where I live and work and has been collecting data for a long time. Note that the two trendlines are hard to compare directly because one is in degrees Fahrenheit and the other is in degrees Celsius. However, the slopes on both plots show a gradual upward trend. If I weren’t trying to finish the rest of the assignments, I might run some additional statistics, but for now I’m thrilled I can share this part of my work, especially since Hurricane Helene caused the NOAA data center to be offline for a little while.

Mapping Migration - Species Distribution challenge - we learned to:

This Interactive Map of Veery Migration was our guided learning exercise and then we picked another species to try on our own.

The Veery, or Catharus fuscescens, is part of the Turdidae family. It is found in the southeastern US during migration, and it may be able to anticipate hurricanes in the Atlantic according to a study by Christopher Heckscher. Unfortunately, Atlantic hurricanes tend to coincide with Veery migration and have a negative impact on their breeding season. This is an example of a species that is studied in phenology - the impact of a changing climate on the cyclical pattern of an organism’s life history.

Habitat Suitability Project - we learned to:

My work on this project can be viewed here.

This is a work in progress which I plan to complete eventually. Previously I worked with SCC student Stella Walborn using the TourIt platform from Infiniscope and a wealth of resources from regional groups to create a virtual tour of rivercane in WNC. You can find that here.

Sources:

Modeling Urban Asthma Rates

The learning goals for this project were:

In urban areas, vegetation can help clean the air of traffic emissions and other pollutants, although simple measures of vegetation health such as NDVI (Normalized Difference Vegetation Index) have shown mixed results. The relationship between human health and landscape metrics such as mean patch size, edge density, and fragmentation may help quantify the benefit of greenspaces in an urban environment. This project investigated that idea.
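For readers new to NDVI, it is computed per pixel from the near-infrared (NIR) and red reflectance bands as (NIR - Red) / (NIR + Red). Here is a minimal sketch using NumPy; the function name and sample values are illustrative assumptions and not taken from the project notebook.

```python
import numpy as np


def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red), with NaN where NIR + Red is 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Start from NaN so pixels with a zero denominator stay undefined.
    result = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=result, where=total != 0)
    return result


# Illustrative reflectance values: a vegetated pixel, a bare pixel, an empty pixel.
nir_band = np.array([0.50, 0.30, 0.00])
red_band = np.array([0.10, 0.25, 0.00])
print(ndvi(nir_band, red_band))
```

Values near 1 indicate dense, healthy vegetation, while values near 0 indicate bare or built-up surfaces.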

A comparison of asthma rates with the geographic distribution of healthy vegetation as determined by NDVI. Note that in the side-by-side maps, some areas with lots of vegetation (dark green) correspond to areas with low rates of asthma (purple).

To test the strength of this correlation, I fit a linear ordinary least squares (OLS) regression model. It shows that access to greenspace can explain some of the geographic distribution of asthma, but other factors are involved.

Here is my Python code for the model. Please note there are some data gaps due to changes made on the CDC website.
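Separately from the linked notebook, the idea that greenspace explains "some" of the variation can be made concrete with the coefficient of determination (R²), the share of variance in the outcome that the model accounts for. The sketch below fits a one-predictor OLS model with NumPy. The function name, the choice of edge density as the predictor, and all of the numbers are hypothetical and only illustrate the technique.

```python
import numpy as np


def fit_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with a column of ones for the intercept term.
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    predicted = design @ coef
    ss_res = np.sum((y - predicted) ** 2)   # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation
    return coef[1], coef[0], 1.0 - ss_res / ss_tot


# Hypothetical values: a greenspace metric per tract and an asthma rate (%).
edge_density = [1.0, 2.0, 3.0, 4.0, 5.0]
asthma_rate = [10.0, 9.5, 9.8, 8.9, 9.1]

slope, intercept, r2 = fit_ols(edge_density, asthma_rate)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

An R² well below 1 is what "other factors are involved" looks like numerically: the predictor tracks the outcome, but much of the variation is left unexplained.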

Land Classification Project

The learning goals for this project were:

This project used an unsupervised K-means clustering algorithm to group land cover pixels by similar spectral signatures. The K-means algorithm helps to reveal patterns or clusters that have minimal within-cluster variation. This study used the Harmonized Landsat Sentinel-2 (HLS) multispectral dataset to look at patterns in vegetation data. The HUC region 6 watershed covers the drainage of the Tennessee River Basin from Kentucky to the Gulf of Mexico. Most of the Tuckasegee River basin, where I live, lies within this region and is forested.

The image above shows the location of Jackson County in North Carolina, where the Tuckasegee River originates.

Land Cover Interpretation based on Spectral Data

According to a publication by the North Carolina Department of Environmental Quality [1], the middle of the Tuckasegee River watershed shown in this analysis “…drains the west-central portion of Jackson County…[and] traditionally, land use in the watershed was agricultural with light residential and commercial activity along the transportation corridors”. In 2008, the NC Department of Mitigation Services identified Savannah Creek, along with 18 other tributaries, for “…restoring wetland and stream functions such as maintaining and enhancing water quality, restoring hydrology, and improving fish and wildlife habitat.”[2]

Looking at the cluster analysis, cluster 1 may be vegetation along the river itself due to limited riparian buffer zones. Between 2001 and 2011, impervious surfaces in unit 06010203 increased by an average of 27 acres, with forest converted by development (31 acres) or agriculture (2 acres). Clusters 2 and 3 dominate the plot and are most likely different forest types, while clusters 4 and 5 are probably tied to residential and agricultural regions.

Here is my Python code for the cluster analysis.
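For anyone curious how K-means groups pixels, here is a small from-scratch sketch in NumPy. It is not the linked notebook (which may well use a library implementation); the function name, the two-band pixel values, and the choice of two clusters are assumptions made only to show the assign-then-update loop.

```python
import numpy as np


def kmeans(pixels, k, n_iter=50, seed=0):
    """Cluster rows of `pixels` (n_pixels x n_bands) into k groups.

    Returns (labels, centers). Each iteration assigns every pixel to its
    nearest center, then moves each center to the mean of its pixels.
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    # Initialize centers from k distinct, randomly chosen pixels.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every pixel to every center: shape (n_pixels, k).
        distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Keep the old center if a cluster happens to end up empty.
        new_centers = np.array([
            pixels[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Illustrative (red, NIR) reflectance pairs: three vegetated, three bare pixels.
sample_pixels = np.array([
    [0.05, 0.50], [0.06, 0.52], [0.04, 0.48],
    [0.30, 0.32], [0.32, 0.35], [0.31, 0.33],
])
labels, centers = kmeans(sample_pixels, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary, which is why the clusters in a land cover map still need to be interpreted against what is known about the landscape.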

Sources

  1. Tuckasegee River Subbasin HUC 06010203. 2006 available online HERE
  2. Little Tennessee River Basin Restoration Priorities. June 2008. Amended 2018. available online HERE

Habitat Suitability Project

The learning goals for this project were:

Rhododendron maximum is found in North Carolina and West Virginia, both of which lie partly within the Appalachian Mountains. Dudleya et al. identify Rhododendron maximum as an emerging foundation species following the decline of the American Chestnut and Eastern Hemlock. In addition, “Rhododendron affects numerous riparian forest ecosystem processes, including decomposition and nutrient cycling.” Of the four hardiness divisions established by Sakai et al., Rhododendron maximum is listed in the very hardy category, with a tolerance for temperatures down to -40 °F, although the USDA recommends a minimum temperature of -13 °F with 150 frost-free days. Rhododendron maximum is adapted to medium- and coarse-textured soils, with drought tolerance and medium tolerance for fire. It is found at elevations up to 6200 feet, which includes all of Wayne County, WV and most of Jackson County, NC except for the highest peak, Richland Balsam on the Blue Ridge Parkway.

This project was intended to compare different climate models using MACAv2 data; however, various roadblocks left it incomplete, with some steps written only as pseudocode. I was able to establish the current distribution of Rhododendron maximum as shown below using data from the Global Biodiversity Information Facility.

In addition, the elevation of both counties provides suitable habitat as shown below.

If I had been successful in extracting minimum temperature data from the MACAv2 data, the next step would be to harmonize all rasters with the soil pH and percent sand composition of soils - shown below - along with the elevation data to analyze a consensus model of habitat suitability.
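As a rough illustration of what that final step could look like, the sketch below scores each grid cell by the fraction of criteria it meets. It assumes the rasters have already been harmonized onto the same grid and loaded as NumPy arrays. The elevation and minimum temperature thresholds come from the description above, while the soil pH range is a placeholder I made up for the example, not a sourced value; percent sand is omitted because no threshold is given.

```python
import numpy as np

MAX_ELEVATION_FT = 6200.0    # upper elevation limit quoted in the text
MIN_TEMPERATURE_F = -13.0    # USDA minimum temperature quoted in the text


def suitability(elevation_ft, min_temp_f, soil_ph, ph_range=(4.5, 6.0)):
    """Return the fraction of criteria met per cell (0.0 to 1.0).

    All inputs must share the same shape (already harmonized rasters).
    `ph_range` is a hypothetical placeholder, not a published tolerance.
    """
    elevation_ft = np.asarray(elevation_ft, dtype=float)
    min_temp_f = np.asarray(min_temp_f, dtype=float)
    soil_ph = np.asarray(soil_ph, dtype=float)
    criteria = [
        elevation_ft <= MAX_ELEVATION_FT,
        min_temp_f >= MIN_TEMPERATURE_F,
        (soil_ph >= ph_range[0]) & (soil_ph <= ph_range[1]),
    ]
    # Averaging the boolean layers gives the share of criteria satisfied.
    return np.mean(criteria, axis=0)


# Two made-up cells: one meets every criterion, one is above 6200 ft.
score = suitability(
    elevation_ft=[[3000.0, 6500.0]],
    min_temp_f=[[-5.0, -5.0]],
    soil_ph=[[5.0, 5.0]],
)
print(score)
```

A stricter alternative would be to multiply the layers so that a cell is suitable only when every criterion is met.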

Information Sources

Capstone Project: Comparison of Surface Mineral Alteration by Fire at Two Different Scales

Project Team: Fellow classmate Hannah Rieder and NEON data science advisor Bridget Hass

Project Overview:

We will investigate the mineral content of the Earth’s surface based upon reflectance data and develop a Python tutorial for others seeking to use these two datasets. We expect to find a correlation between NEON and EMIT data and therefore expand the data products available to NEON data users. In addition, we hope to analyze differences in mineral content before and after burning by wildfires. Using two different scales will allow us to use detailed information from NEON to interpret broader patterns found in EMIT data. “Ground truthing” with NEON data will help identify sources of variability in EMIT data and reduce the influence of outliers. Mineral identification for the EMIT library is effective for areas without dense vegetation and moisture, so comparison with NEON reflectance data may help minimize potential issues by clarifying signal interference sources.

A previous study by Park and Sim (2023) showed that Landsat and Lidar measures of burn severity were comparable and Lidar has been used to document tree mortality in difficult forest terrains (Bueno et al., 2025). The EMIT instrument on the International Space Station introduces the potential of extending Lidar measurements to include surface minerals. Handler (2019) noted that “…fire hazard pose threats to physical, biological, and social values in the project area such as: soil stability, hydrology and air quality, [and] wildlife habitat…” This project aims to develop another tool for evaluating forest restoration needs based upon soil characteristics using the EMIT mineral spectral library to classify local NEON spectral data. Understanding the severity of fire at the mineral level as well as the landscape level will be valuable to forest managers.

Project Workflow

Our general method will be to identify co-located EMIT L2B Estimated Mineral Identification data and NEON Airborne Observation Platform data at the Soaproot Saddle site in the Sierra National Forest, where the Creek Fire (2020) and the Blue Fire (2021) occurred. After being orthorectified, the downloaded EMIT data will be used to create distribution maps based upon the library for 10 specific minerals. Cluster analysis will be used to classify the NEON data, which will then be compared with the EMIT maps. Ultimately, we would like to understand how fire disturbances affect Earth surface minerals and create a road map for future explorations of co-located data from these two sources.

Below is an image (from Google Earth Engine) showing an RGB band combo of the 2024 AOP SOAP hyperspectral data with the southern part of the Creek Fire boundary.

Listed below are a variety of resources used to support our project.

EMIT Data & Resources

EMITL2BMIN “provides estimated mineral identification and band depths in a spatially raw, non-orthocorrected format” (Green, 2023). Each granule has two NetCDF4 files with a 60 m spatial resolution that include mineral identification and uncertainty estimates. Minerals identified in this dataset include: calcite, chlorite, dolomite, goethite, gypsum, hematite, illite+muscovite, kaolinite, montmorillonite, and vermiculite.

NEON Data & Resources

This is hyperspectral raster data distributed in an open HDF5 format in UTM projection showing scaled reflectance. Each file contains all 426 reflectance bands for a single 1 km by 1 km tile.
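Because the reflectance is stored as scaled integers, it has to be converted back to fractional reflectance before analysis. The sketch below shows that step on a plain NumPy array. The scale factor of 10000 and the no-data value of -9999 are assumptions on my part; in practice they should be read from the attributes stored in each HDF5 file rather than hard-coded.

```python
import numpy as np


def unscale_reflectance(scaled, scale_factor=10000.0, no_data=-9999):
    """Convert scaled integer reflectance to fractional reflectance.

    `scale_factor` and `no_data` are assumed defaults; check the file's
    own metadata for the actual values. No-data cells become NaN.
    """
    scaled = np.asarray(scaled)
    reflectance = scaled.astype(float) / scale_factor
    reflectance[scaled == no_data] = np.nan
    return reflectance


# A made-up spectrum of four bands for one pixel, including a no-data value.
pixel_spectrum = np.array([2500, 4300, -9999, 120])
print(unscale_reflectance(pixel_spectrum))
```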

My Jupyter Notebook for our project can be found HERE

Note: We are in the early stages of this project, which will be completed in August 2025. In this group collaboration, Bridget Hass is guiding us on how to access and structure our work in Jupyter Notebook, which is reflected on the main branch. To date, I have contributed research on the science of fire impacts on soil composition, and Hannah Rieder has worked on the spectral data, which is shown on her branch.

Sources: