Towards Deeper, Faster Voluminous Spatiotemporal Data Explorations

The Sustain project is a collaboration between Colorado State University; Arizona State University; the University of California, Irvine; and the University of Maryland, Baltimore County.

The project has an advisory board with representation from academia, industry, and citizen science.

The United States is highly urbanized, with more than 80% of the population residing in cities. Cities draw from and impact natural resources and ecosystems while utilizing vast, expensive infrastructures to meet economic, social, and environmental needs. The NSF has invested in several strategic research efforts in the area of urban sustainability, all of which generate, collect, and manage large volumes of spatiotemporal data.

Voluminous datasets are also made available by governmental agencies and NGOs in domains such as climate, ecology, health, and census. These data can spur exploration of new questions and hypotheses, particularly across traditionally disparate disciplines, and offer unprecedented opportunities for discovery and innovation. However, the data are encoded in diverse formats and managed using a multiplicity of data management frameworks, all of which contribute to a Balkanization of the observational space that inhibits discovery. A scientist must not only reconcile the encoding and storage frameworks but also negotiate authorizations to access the data.

A consequence is that data are locked in institutional silos, each of which represents only a sliver of the observational space. This project, SUSTAIN (Software for Urban Sustainability to Tailor Analyses over Interconnected Networks), facilitates and accelerates discovery by alleviating data-induced inefficiencies in the following areas:

  • Federating data across diverse administrative domains.
  • Data encoded in diverse formats such as netCDF, BUFR, HDF 4/5, and CSV.
  • Data stored using diverse storage frameworks such as file systems, relational databases, NoSQL systems, and document stores.
  • Imputations of missing data at diverse spatiotemporal scopes.
  • Visualization of voluminous spatiotemporal datasets.
  • Interoperation with commercial spatial analysis software such as ESRI’s ArcGIS and Google Earth Engine.

© The Sustain Project
This research has been supported by funding from the US National Science Foundation’s CSSI program through awards 1931363, 1931324, 1931335, and 1931283. The project is a joint effort involving Colorado State University; Arizona State University; the University of California, Irvine; and the University of Maryland, Baltimore County.