Seamless Long-Tail and Big Data Access via the EarthCube Brokering Cyberinfrastructure BALTO
  • D. Sarah Stamps,
  • James Gallagher,
  • Scott Peckham,
  • Anne Sheehan,
  • Nathan Potter,
  • Kodi Neumiller,
  • Emmanuel Njinju,
  • Maria Stoica,
  • Zachary Easton,
  • Daniel Fuka,
  • David Fulker
D. Sarah Stamps
Virginia Tech
Corresponding Author: dstamps@vt.edu

James Gallagher
OPeNDAP

Scott Peckham
University of Colorado Boulder

Anne Sheehan
University of Colorado Boulder

Nathan Potter
OPeNDAP

Kodi Neumiller
OPeNDAP

Emmanuel Njinju
Virginia Tech

Maria Stoica
University of Colorado Boulder

Zachary Easton
Virginia Tech

Daniel Fuka
Virginia Tech

David Fulker
OPeNDAP

Abstract

The EarthCube BALTO broker (Brokered Alignment of Long-Tail Observations) provides streamlined access to both long-tail and big data using Web Services through several distinct mechanisms. First, we updated Hyrax, the OPeNDAP framework that serves big data from USGS, NASA, and other sources, with a BALTO extension that automatically tags dataset landing pages with JSON-LD encoding. As a result, the big data made available through Hyrax are now searchable via EarthCube GeoCODES (formerly P418) and Google Dataset Search, which makes thousands of datasets easy to find and access. Second, we addressed a geodynamics use case aimed at advancing our understanding of continental rifting processes with ASPECT, an NSF-supported mantle convection code. For this use case, we implemented a web services brokering capability in ASPECT that allows datasets to be accessed remotely via a URL defined in an ASPECT parameter file. Third, through another ASPECT use case aimed at testing hypotheses involving global mantle flow, we developed a brokering mechanism for a "plug-in" that accesses NetCDF seismic tomography data from the NSF seismology facility IRIS and then transforms the data into the format ASPECT needs to run global mantle flow models constrained by seismic tomography. Fourth, we demonstrate methods that allow any scientist or citizen scientist to make their in situ, IoT-based sensor data collection efforts available to the world. Finally, we are developing a Jupyter Notebook with a GUI that allows users to search Hyrax servers for big datasets and long-tail data. Together, these cyberinfrastructure developments constitute the EarthCube BALTO brokering capabilities.
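To make the JSON-LD tagging step more concrete, the following is a minimal Python sketch of how a dataset landing page might carry a schema.org Dataset record that crawlers such as Google Dataset Search can read. It is an illustration only and not the BALTO extension itself: the abstract does not specify which fields Hyrax emits, so the helper names (`dataset_jsonld`, `as_script_tag`), the chosen properties, and the example.org URLs are all assumptions.

```python
import json


def dataset_jsonld(name, description, landing_url, access_url):
    """Build a minimal schema.org Dataset record as a Python dict.

    The selection of properties is an assumption made for illustration;
    the fields actually written by the BALTO extension may differ.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": landing_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": access_url,
                "encodingFormat": "application/x-netcdf",
            }
        ],
    }


def as_script_tag(record):
    """Serialize the record as a JSON-LD <script> element for an HTML page."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset and server, not a real BALTO or Hyrax endpoint.
record = dataset_jsonld(
    name="Example seismic tomography model",
    description="Placeholder description of a NetCDF tomography grid.",
    landing_url="https://example.org/opendap/tomography.nc.html",
    access_url="https://example.org/opendap/tomography.nc",
)
print(as_script_tag(record))
```

Embedding the record in a `<script type="application/ld+json">` element is the general convention for publishing schema.org metadata in a web page, which is why the sketch wraps the serialized dictionary that way.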