Suggested Data Sources

2009 June 14

Tell us what data sources should be in Data Finder, or tell us what kinds of information you need to find. We will review your suggestions and revise Data Finder accordingly.

Editor's Note: The opinions expressed in Greenversations are those of the author. They do not reflect EPA policy, endorsement, or action, and EPA does not verify the accuracy or science of the contents of the blog.

31 Responses
  1. Mark Corrales
    July 6, 2009

    For numerous suggested data sources,
    take a look at the extensive appendix, table 2, in this document:

    Georgopoulos P. (2008). A multiscale approach for assessing the interactions of environmental and biological systems in a holistic health risk assessment framework. Water, Air, and Soil Pollution: Focus 8(1): 3-21. DOI:10.1007/s11267-007-9137-7
    http://dx.doi.org/10.1007/s11267-007-9137-7
    http://www.springerlink.com/content/5r8100723p704h16/fulltext.pdf

  2. EPA staff
    June 22, 2009

    Additional suggestions for links in Data Finder.

    Here are some things that NEIC uses that might be of interest:

    Pacific Northwest National Laboratory Infrared Spectral Library
    IRIS database (toxicological information)
    NIST CHEMBOOK
    Envirofacts (TRI)
    RCRA Online (policy letters)
    SW846 online (hazardous waste methods)
    OW methods databases
    Henry’s Law constant database

    They also use EPA Models such as Minteqa2, Screen3, ISC, Water9, IWAIR, and AERMOD, but these may not fit within the database framework of the Data Finder site.

  3. Craig Alvord
    June 5, 2009

    There are some good data available through the site. Access might be easier for the public if searches for terms such as marine, ocean, and vessel discharges returned results.

  4. Steve W
    June 4, 2009

    Have you tapped the information obtained during the National Dialog? Much of our input was internal EPA, especially during the Jam sessions. I believe I recall specific data systems being mentioned.

  5. EPA Staff
    June 4, 2009

    I think the Nutrients Database would be a useful addition to this meta-database.
    As a future addition, at least for nutrients, the ability to coordinate with universities and other research facilities would be of immense value.

  6. Steve W
    June 4, 2009

    Add and report out metrics –
    We are collecting data in real time. We should also be capable of analyzing the data in real or near-real time.

    Examples – Daily frequency of comments, where comments are coming from, how many new datasets were suggested, how many suggestions were incorporated / pending / turned down, how many bugs were identified. Later we will also want to know: how many datasets were downloaded? How many datasets were accessed? By whom (generally)? We have the technology. How will we measure success or failure?
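
A minimal sketch of how two of the metrics named above (daily frequency of suggestions and counts by review status) could be tallied. The records, dates, and function names here are hypothetical and for illustration only; they are not taken from Data Finder or from this thread.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (date received, review status). The statuses mirror
# the categories named in the comment; the values themselves are made up.
suggestions = [
    (date(2009, 5, 27), "incorporated"),
    (date(2009, 5, 27), "pending"),
    (date(2009, 6, 2), "pending"),
    (date(2009, 6, 2), "turned down"),
    (date(2009, 6, 2), "incorporated"),
]


def daily_frequency(records):
    """Count how many suggestions arrived on each day."""
    return Counter(day for day, _ in records)


def status_totals(records):
    """Count suggestions by review status."""
    return Counter(status for _, status in records)


per_day = daily_frequency(suggestions)
by_status = status_totals(suggestions)
```

The same counting approach would extend to downloads or bug reports, assuming each event is logged with a date and a category.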

  7. Lucy Stanfield
    June 2, 2009

    Surface Water – link to GLENDA (I see it’s under Monitoring as well)

    Hazardous Waste – link to RCRAInfo rather than (or in addition to) the RCRAOnline document database. Your definition of data stresses websites where numerical data can be downloaded. I would think RCRAInfo would be the better link for DataFinder to highlight.
    http://www.epa.gov/enviro/html/rcris/rcris_query_java.html

    I searched for PCBs and no results were found. What about linking to http://www.epa.gov/epawaste/hazard/tsd/pcbs/pubs/data.htm?

    For the Climate Change topic, you might want to include a link to this page on Renewable Energy.
    http://www.epa.gov/renewableenergyland/
    The data is stored in an Excel file and posted as a Google Earth file also.

  8. Thomas McCurdy
    June 2, 2009

    Nice website in general.  

    CHAD, of which I am the developer, is a database of human activities, and as such best fits in: Health Risks / Exposure. It has almost nothing about Health Effects in it, but is used to do exposure assessments. It should be located where HEDS is; HEDS was also developed by our Division in NERL. By the way, CHAD will soon be part of HEDS, which will be used as a portal.

  9. EPA staff
    June 2, 2009

    TITLE: ACToR

    DESCRIPTION: ACToR (Aggregated Computational Toxicology Resource) is a collection of databases collated or developed by the US EPA National Center for Computational Toxicology (NCCT). More than 200 sources of publicly available data on environmental chemicals have been brought together and made searchable by chemical name and other identifiers, and by chemical structure. Data includes chemical structure, physico-chemical values, in vitro assay data and in vivo toxicology data. Chemicals include, but are not limited to, high and medium production volume industrial chemicals, pesticides (active and inert ingredients), and potential ground and drinking water contaminants.

    URL: http://actor.epa.gov/actor/faces/ACToRHome.jsp

    ORGANIZATION: ORD

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None
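
Several suggestions in this thread use the same one-field-per-line template (TITLE, DESCRIPTION, URL, and so on). As a rough sketch, assuming each field sits on its own line as `KEY: value`, such a record could be read into a dictionary as follows. The function name `parse_record` is hypothetical and is not part of Data Finder.

```python
def parse_record(text):
    """Parse 'KEY: value' lines into a dict; blank fields become ''."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep and key.isupper():
            record[key] = value.strip()
    return record


sample = """
TITLE: ACToR
URL: http://actor.epa.gov/actor/faces/ACToRHome.jsp
ORGANIZATION: ORD
GEOGRAPHIC_SCALE: National
REGISTRY_SYSTEM: No
REGISTRATION: No
SERVICE: None
"""

record = parse_record(sample)
```

This sketch ignores free-text lines that continue a previous field, so multi-paragraph descriptions or comments would need extra handling.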

  10. EPA staff
    June 2, 2009

    TITLE: environmental enforcement results

    DESCRIPTION:

    URL:

    ORGANIZATION: Region 5’s Office of Regional Counsel

    GEOGRAPHIC_SCALE: Regional

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

    COMMENT: I would like to see this tool be used to retrieve environmental enforcement data (e.g. injunctive relief, penalties, supplemental environmental projects)

  11. EPA Staff
    June 2, 2009

    TITLE: Pollution Abatement Costs and Expenditures (PACE)

    DESCRIPTION: The Pollution Abatement Costs and Expenditures (PACE) survey is the most comprehensive national source of pollution abatement costs and expenditures related to environmental protection for the manufacturing sector of the United States. The PACE survey collects facility-level data on pollution abatement capital expenditures and operating costs associated with compliance with local, state, and federal regulations and with voluntary or market-driven pollution abatement activities. Because the facility-level data contains confidential information, user access to the data at this level of detail is managed by the Census Bureau. More aggregate data and summary statistics have been published and are available from Census and the EPA.

    URL: http://yosemite.epa.gov/ee/epa/eed.nsf/pages/pace2005.html

    ORGANIZATION: AO, OPEI, NCEE

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

    COMMENT: This data is collected by the Census Bureau working under an IAG with the EPA. The original data series began back in the early 1970s and was collected on a mostly annual basis up until the mid-1990s. At that point, largely due to budget issues (Census had been paying for the survey), Census discontinued the survey. After a period of time passed, EPA agreed to begin paying for the data to be collected, and a redesigned and renewed survey was implemented to collect 1999 data. Again, there was a lapse in the survey due to budget constraints and efforts to reassess and redesign the survey, so the next survey collected data from 2005 (what the EPA and Census websites both focus on). The Census Bureau’s website on the PACE survey is http://www.census.gov/mcd/pace.htm
    The EPA website is: http://yosemite.epa.gov/ee/epa/eed.nsf/pages/pace2005.html

    The Census Bureau is responsible for securing the plant-level data and survey responses, but can provide access to the micro-level data to EPA and other organizations or researchers for a fee. Historically, researchers have made considerable use of the micro-level data, including linking the plant-level data to other economic and environmental data for research purposes.

    Possible keywords: pollution abatement, economic cost

    Contacts: Brett Snyder, Cynthia Morgan, and Ron Shadbegian

  12. EPA staff
    June 2, 2009

    TITLE: Beach Advisory Data

    DESCRIPTION: Database of beach advisories at coastal and Great Lakes beaches that receive EPA Beach Act grants.

    URL: http://iaspub.epa.gov/waters10/beacon_national_page.main

    ORGANIZATION: Office of Water

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

    COMMENT: Contact is Bill Kramer

  13. EPA staff
    June 2, 2009

    TITLE: National Listing of Fish Advisories

    DESCRIPTION: Database of state and tribal advisories on consumption of fish

    URL: http://134.67.99.49/scripts/esrimap.dll?name=Listing&Cmd=Map

    ORGANIZATION:

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

    COMMENT: Contact is Jeff Bigler

  14. June 2, 2009

    Lots of services are published on geodata.epa.gov. I think it would be helpful to list the services for the data systems as they do in http://www.data.gov

  15. Michael Hessling
    May 29, 2009

    I second BASINS. There’s a ton of data there.

    Also, PRAWNS. It basically outputs to BEACON, which you can see at http://iaspub.epa.gov/waters10/beacon_national_page.main . It’s possible to download this raw data, for each state, from here: http://www.epa.gov/waterscience/beaches/seasons/

    • emcmah02
      May 31, 2009

      Michael, Thanks for suggesting data sources for Data Finder. I looked at BEACHES and BEACON and it appears that they provide data about beach condition and closure. You can find data about a beach of interest by clicking progressively deeper in the site.

      For this version of Data Finder we’ve defined data sources as sites where you can download numerical data. I’m not sure whether these sites fit that definition because you have to click a few levels deep in order to get to the data. People told us that they needed a place to find numerical data, so we’ve tried not to include databases of non-numerical information, like lists of Superfund sites. Based on user feedback we may change the site to include sites that are more query oriented, help people find specific data sets, or find tools. Do you have any suggestions for what sites should be included or excluded?

      • Michael Hessling
        June 1, 2009

        Understood, Ethan. Thanks.

        You can download the raw BEACON data by visiting the seasons pages and clicking on a state (example: http://www.epa.gov/ost/beaches/seasons/2007/xl/ca.xls ). OW doesn’t yet provide a one-file download of all the data, but that, I’m sure, can be done. (I know that xls isn’t the best format, but it’s still the raw data. Also, 2008 data will go up this summer.)

        The Nutrients DB, which you do include in Data Finder, is the same way. You have to drill down a few pages to get to the data.

  16. May 28, 2009

    TITLE: GIS Data

    DESCRIPTION: EPA’s Geospatial Data Catalog

    URL: http://www.epa.gov/geospatial/data.html#catalog

    ORGANIZATION: US EPA

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: Data Connection

    COMMENT: Has data services and GIS data downloads

  17. Lisa Jenkins
    May 27, 2009

    I’ve sent you comments using yellow stickies on PDF files, but here are most of them for the blog:

    Link to ACRES in CIMC or Envirofacts for Brownfields data.
    Link to RCRAInfo query in Envirofacts or to CIMC for RCRA and RCRA Corrective Action sites – not to RCRA Online which is documents, not sites.
    SuperCPAD is a better link for Superfund sites than what you have – it gets people directly to the data.
    New ideas:
    What about the geospatial download? It is part of Envirofacts, but really works with Google Earth and other mashups.
    What about the system Jerry Johnston has set up to create data sets for mashups?

  18. May 27, 2009

    TITLE: BSAF Data Set

    DESCRIPTION: EPA MED researchers developed a data set of approximately 20,000 biota-sediment accumulation factors (BSAFs) from 20 locations (mostly Superfund sites) for nonionic organic chemicals, e.g., PCBs, PCDDs, PCDFs, DDTs, PAHs, and pesticides. Fresh, tidal, and marine ecosystems are included in the data set, and species in the data set include fish and benthic species (e.g., lobster, crayfish, and benthic invertebrates). The data set serves five purposes: it i) provides tools for evaluating the reasonableness of BSAFs from other locations, ii) provides a tool for building a BSAF data set for locations of your interest, iii) provides data for performing bounding assessments of risks for locations where limited or no bioaccumulation data are available, iv) permits inquiry into underlying relationships and dependences of BSAFs upon ecosystem conditions and parameters, and v) allows comparison of PCB, PCDD, and PCDF residues to residue-effects data downloaded from PCBRes (see http://www.epa.gov/med/prods_pubs.htm).

    URL: http://www.epa.gov/med/Prods_Pubs/bsaf.htm

    ORGANIZATION: ORD, NHEERL, MED

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

  19. EPA Staff
    May 27, 2009

    TITLE: PCBRes Database

    DESCRIPTION: The PCBRes database is used by scientists and risk assessors in correlating PCB and dioxin-like compound residues with toxic effects. The purpose is to develop PCB critical residue values for fish, mammals and birds, especially as these relate to aquatic and aquatic-dependent species. This database includes expression of critical residue values based upon PCB Aroclors and total PCB-based congener-specific methods because PCBs occur as complex mixtures. Because PCB toxicity occurs via the aryl hydrocarbon receptor (AhR), PCB toxicity has also been expressed using the sum of the dioxin-like PCBs after adjustment using toxicity equivalence factors (TEF). Limited dioxin and furan compounds in single and mixture studies are also included.

    URL: http://www.epa.gov/med/Prods_Pubs/pcbres.htm

    ORGANIZATION: ORD, NHEERL, MED

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

    COMMENT: As I mentioned in an email, I think a general category on environmental toxicology would be useful.

  20. EPA Staff
    May 27, 2009

    TITLE: Enforcement and Compliance History Online (ECHO)

    DESCRIPTION: Enforcement data

    URL: http://www.epa-echo.gov/echo/index.html

    ORGANIZATION:

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

  21. EPA Staff
    May 27, 2009

    TITLE: Envirofacts

    DESCRIPTION: Various data sets. Could be referenced individually in Data Finder. Don’t necessarily need Envirofacts.

    URL: http://www.epa.gov/enviro/index.html

    ORGANIZATION: Various

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

  22. EPA Staff
    May 27, 2009

    TITLE: Cleanups in My Community

    DESCRIPTION: Cleanups in My Community is a mapping and listing tool that shows sites where pollution is being or has been cleaned up throughout the United States. It maps, lists and provides cleanup progress profiles for:

    Sites, facilities and properties that have been contaminated by hazardous materials and are being, or have been, cleaned up under EPA’s Superfund, RCRA and/or Brownfields cleanup programs.
    Federal facilities that have been contaminated by hazardous materials and are being, or have been, cleaned up under EPA’s Superfund and/or RCRA cleanup programs.

    URL: http://iaspub.epa.gov/Cleanups/

    ORGANIZATION: OSWER

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

  23. EPA Staff
    May 27, 2009

    TITLE: Geo Spatial Data

    DESCRIPTION: Geospatial data for various EPA data sets, available at http://www.epa.gov/enviro/geo_data.html

    URL: http://www.epa.gov/enviro/geo_data.html

    ORGANIZATION: various

    GEOGRAPHIC_SCALE: National

    REGISTRY_SYSTEM: No

    REGISTRATION: No

    SERVICE: None

  24. Debbie Westerman
    May 27, 2009

    Ecosystems is not on your list. Climate change is on it twice. It doesn’t look like you are following the EPA Web Standards; i.e. I think the pictures should have the grey border not green and quickfinder. I think it is a great idea.

    - Debbie Westerman

  25. Cindy Walke
    May 27, 2009

    1) The Clean Air Markets Division (OAR) has a few data sources and I don’t see them listed. We have “Data and Maps” at http://camddataandmaps.epa.gov/gdm/ and we have CASTNet (deposition data) at http://www.epa.gov/castnet/

    - Cindy Walke

  26. Maureen O’Neill
    May 27, 2009

    Consider including the ACE (America’s Children and the Environment) Summary List of Measures. While these are data summaries and not raw data, they’re useful for seeing how EPA data are combined with other sets. If you go to the measures themselves, they list the way EPA sets are combined with CDC and Census sets, for example.

    For those of us that work on environmental health issues, EPA data are great to have, but not much help if we don’t have the health end combined. These combinations are important and what can add power to the enviro set.

    While the EPA links may be repetitive to what you already have, combining with outside federal agency datasets is not repetitive and it does provide links. Here’s a link to an example.
    http://www.epa.gov/economics/children/contaminants/e1-sources.htm

    Thanks again for sharing. I’m glad you are doing this.

    - Maureen O’Neill

  27. Zenny Sadlon
    May 27, 2009

    Two of my favorites that are not found are BASINS for water and NATA for air.

    Zenny Sadlon

  28. emcmah02
    May 26, 2009

    Thanks for these ideas. We’re looking for “data sources,” sites where you can download numerical data from EPA.

    I see that AFS is owned by OECA and we’ll note it as such. I think you’re right about AFS and its focus on facilities rather than data about air. We’ll look into which sites are data sources (AFS, AQS, NEI, AirData).

  29. Tom Dessent
    May 26, 2009

    Additional sites for air pollution / air quality data:
    Air Compare – http://www.epa.gov/aircompare/
    Air Data – http://www.epa.gov/air/data/
    Air Emission Sources – http://www.epa.gov/air/emissions/
    Air Explorer – http://www.epa.gov/airexplorer/
    Clean Air Markets Data and Maps – http://camddataandmaps.epa.gov/gdm/
    National-Scale Air Toxics Assessment – http://www.epa.gov/ttn/atw/natamain/
    (See http://www.epa.gov/air/airpolldata.html for thumbnail descriptions of these sites.)

    Correction/suggestion:
    The AIRS/AFS database (listed in Data Finder) is mainly regulatory compliance information, and does not have much data about air releases. Metadata should say AIRS/AFS is owned by the Office of Compliance, rather than OAR.
