Maps Begin with Data
By Scott Malone
This summer, I worked as a volunteer intern with the Environmental Protection Agency in Lenexa, Kansas. I learned that making a time-series map takes more than just getting data and putting it into ArcGIS. For my summer project, I created a time-series animation of water quality trends using data from the Kansas City Urban Stream Network.
Hosted at KCWaters.org, the project is “dedicated to promoting greater awareness about the quality of water in the Greater Kansas City Area.” Using data acquired every 15 minutes over a four-month period (March 1 through July 10, 2013), I created an animation showing water quality trends over our wet spring and early summer months. The process of constructing a visually informative and appealing animation from raw data was full of challenges. Unlike the canned projects I was accustomed to from my GIS courses in college, this one involved a significant amount of data manipulation before I could ever open an ArcMap project and begin map-making.
Track stream conditions hourly using KCWaterbug.
The Urban Stream Network consists of eighteen sites spaced across sixteen streams in the Kansas City metro area. Each site consists of a stream probe and telemetry box that collects readings of water temperature, conductivity, turbidity, and water depth. The readings are transmitted via satellite and compiled into a database using software called WISKI from a company named KISTERS. E. coli concentrations are modeled for each stream from the other variables the instruments collect (you can find out more at http://pubs.usgs.gov/sir/2008/5014/). With readings taken every fifteen minutes over four months, I started with a dataset of more than 700,000 records, grouped by station and parameter and all wrapped into one fun text file. I definitely experienced the joys of running data through multiple processing steps before enjoying the fruits of my labor in a GIS-friendly database.
First, I removed the header information (station name, number, and other identifiers) provided for each parameter and converted the data into a spreadsheet-friendly format. I painstakingly created a spreadsheet for each stream (sixteen, remember), transposed the data, and added stream and parameter names. With over 40,000 records per stream, the process was time consuming. Unfortunately, such data processing can become necessary when working with data extracted for a purpose different from your own. Once each stream's data was standardized, I combined the spreadsheets back into a single GIS-usable table.
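To give a feel for those cleanup steps, here is a minimal Python sketch of stripping per-parameter header blocks and flattening everything into one GIS-friendly table. The station name, field names, and file layout below are invented for illustration; the actual WISKI export format was different, but the idea (drop headers, tag each reading with its stream and parameter, combine) is the same.

```python
import csv
import io

# Hypothetical excerpt of an export: a header block per parameter,
# followed by timestamp,value rows. Invented for illustration only.
RAW = """\
#Station: Indian Creek
#Parameter: Turbidity
2013-03-01 00:00,12.4
2013-03-01 00:15,12.1
#Station: Indian Creek
#Parameter: Conductivity
2013-03-01 00:00,540
2013-03-01 00:15,538
"""

def parse_export(text):
    """Strip header blocks and emit flat rows:
    (stream, parameter, timestamp, value)."""
    rows, stream, param = [], None, None
    for line in text.splitlines():
        if line.startswith("#Station:"):
            stream = line.split(":", 1)[1].strip()
        elif line.startswith("#Parameter:"):
            param = line.split(":", 1)[1].strip()
        elif line.strip():
            ts, value = line.split(",")
            rows.append((stream, param, ts, float(value)))
    return rows

rows = parse_export(RAW)

# Combine every stream's rows into one CSV table for GIS use.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["stream", "parameter", "timestamp", "value"])
writer.writerows(rows)
table = out.getvalue()
```

In practice each of the sixteen streams would be parsed this way and appended to the same table before it ever goes near ArcGIS.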
Adding time, or rather converting time, was another detail I learned isn't always simple. Creating a time-series map requires time stamps in a format ArcGIS can use to build a time-aware dataset.
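A conversion like that might look something like this sketch; the input pattern ('%m/%d/%Y %H:%M') is an assumption standing in for whatever format the export actually used, and the output is one of the date formats ArcGIS recognizes when a layer is enabled as time-aware.

```python
from datetime import datetime

def to_arcgis_time(raw):
    """Convert an exported timestamp (assumed 'MM/DD/YYYY HH:MM')
    into 'YYYY-MM-DD HH:MM:SS', a form ArcGIS parses reliably."""
    return datetime.strptime(raw, "%m/%d/%Y %H:%M").strftime("%Y-%m-%d %H:%M:%S")

converted = to_arcgis_time("03/01/2013 00:15")
print(converted)  # 2013-03-01 00:15:00
```

Running every record through one converter like this keeps the whole four-month table consistent, which matters when the animation steps through readings fifteen minutes at a time.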
After running through this data manipulation exercise, I have a much greater understanding of data management. I fully appreciate the value of keeping data in databases and extracting it for an intended purpose. I also appreciate that the *I* in GIS is there for a very important reason! My next post will review how I took the telemetry data and started looking for interesting and useful trends.
Scott Malone is a graduate of the University of Kansas with a degree in Environmental Studies. He spent part of summer 2012 as a volunteer intern with the Environmental Services Division, where he worked with LiDAR, land cover, and water quality telemetry data.
Editor's Note: The opinions expressed in Greenversations are those of the author. They do not reflect EPA policy, endorsement, or action, and EPA does not verify the accuracy or science of the contents of the blog.