Ruth Duerr

and 5 more

The five divisions of NASA’s Science Mission Directorate (SMD) represent a very broad spectrum of academic disciplines, ranging from Astronomy to Planetary science, Heliophysics, Earth science, and Biology and Physical science, with measurement scales ranging from components of atoms to the structure of the entire universe. In addition, the systems that support access to these data range from systems based on formal and broadly accepted OWL ontologies, to those based on current and historical disciplinary metadata standards, to ad hoc or bespoke systems dating back to NASA’s very earliest missions, all generally developed to support the mission or, more recently, discipline-focused data users. Consequently, the access mechanisms, data structures, vocabularies, terms in use, etc. vary widely across the divisions, making cross-disciplinary research difficult at best, if not impossible. Currently, NASA SMD is working to improve support for cross-disciplinary and transdisciplinary research by developing a system that supports discovery across all of SMD’s data products, a model that can be extended to all forms of scientific output including software, tools, models, publications, etc. The core underpinning of such a system is an information model being developed using the methodology developed by Dr. Peter Fox and Dr. Deborah McGuinness. Here we discuss the model (a knowledge graph), lessons learned along the way, and key findings for other systems attempting to bridge broad disciplinary boundaries.
As the problems humanity faces become ever more obvious and dangerous, the need for interdisciplinary, cross-disciplinary and transdisciplinary research and solutions becomes ever more apparent. The problems themselves are often intertwined in complex ways: for example, the impact of climate change on human health, food and water security, disasters, and so on, and how all of these are exacerbated by human population growth and a general lack of recognition that humanity is part of the ecosystem upon which Earthly life depends. Underpinning our ability to understand and solve these complex problems are data of all kinds; more importantly, the information that has been and continues to be garnered by obtaining and analyzing these data; and, most importantly, the inclusion of that information in the larger body of knowledge and ways of knowing that hopefully leads to the wisdom needed to directly address the root causes of each problem. Over the past decades, a number of principles and lessons learned have come to the fore through involvement in projects on science topics as diverse as quasars and stellar pediatrics, solar physics, social science, and a wide variety of Earth science topics. These projects involved informatics topics such as data management and curation and systems and framework development, and tools and methodologies such as Use Cases, Natural Language Processing, Deep Learning, Knowledge Graphs, Information Models, etc.; the principles and lessons unite their technical underpinnings. They will be discussed in this presentation.

Rebecca Koskela

and 4 more

The EarthCube Technology & Architecture Committee formed a Resource Registry Working Group (WG) to develop a framework for a registry of EarthCube (EC) resources, enabling users to discover scientific and technical resources (software, tools, vocabularies, etc.) that are relevant to their research. The registry will promote EC investments, reduce time to science, help enable interdisciplinary research, define more clearly what EC is, and provide a vehicle for tool and software producers to notify the community about new products, increase visibility, and gain recognition. A primary requirement is to enable systematic description of EarthCube computational resources in terms of their functionality and their interfaces, so that users can identify components that can work together in integrated workflows. This requires understanding the specifics of how a software component communicates: both the messaging protocol and the syntax and semantics of the information formats used to get data into and out of a component. This registry would work in conjunction with the schema.org dataset descriptions being developed by the community to streamline linkage of data and software components for research workflows. The WG created definitions for a set of resources to include in a first iteration of the registry, a set of properties that should be specified for all resources, and properties specific to particular resource types. The suggested resource types are: Software, Interface/API, Interchange format, Dataset, Repository, Service, Platform, Vocabulary/ontology/Information model, Specification, Catalog/registry, and Use Case. Registration of Dataset and Use Case resources is out of scope for the WG project and is to be handled separately. Elaboration of this registry is in the workplan for EarthCube, with the goal of maximum reuse of existing vocabularies and technology and compatibility with related registry activities.
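
To make the registry idea more concrete, the following is a minimal Python sketch of how typed resources with shared and type-specific properties might be recorded, and how matching information formats could indicate that two components can work together. Only the eleven resource type names and the out-of-scope rule come from the abstract; the class names, the property names (`description`, `input_formats`, `output_formats`), the `can_chain` helper, and the example resources are hypothetical illustrations, not the WG's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class ResourceType(Enum):
    """The resource types suggested by the WG for the first iteration."""
    SOFTWARE = "Software"
    INTERFACE_API = "Interface/API"
    INTERCHANGE_FORMAT = "Interchange format"
    DATASET = "Dataset"
    REPOSITORY = "Repository"
    SERVICE = "Service"
    PLATFORM = "Platform"
    VOCABULARY = "Vocabulary/ontology/Information model"
    SPECIFICATION = "Specification"
    CATALOG_REGISTRY = "Catalog/registry"
    USE_CASE = "Use Case"


# Registration of these types is out of scope for the WG project.
OUT_OF_SCOPE = {ResourceType.DATASET, ResourceType.USE_CASE}


@dataclass
class Resource:
    # Assumed properties common to all resources (hypothetical names).
    name: str
    resource_type: ResourceType
    description: str = ""
    # Assumed type-specific properties: formats a component reads and writes.
    input_formats: Set[str] = field(default_factory=set)
    output_formats: Set[str] = field(default_factory=set)


class Registry:
    """A toy in-memory registry keyed by resource name."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        if resource.resource_type in OUT_OF_SCOPE:
            raise ValueError(
                f"{resource.resource_type.value} registration is handled separately"
            )
        self._resources[resource.name] = resource

    def find_by_type(self, resource_type: ResourceType) -> List[Resource]:
        # Return matching resources sorted by name for stable output.
        matches = [r for r in self._resources.values()
                   if r.resource_type == resource_type]
        return sorted(matches, key=lambda r: r.name)


def can_chain(producer: Resource, consumer: Resource) -> bool:
    """True if the producer writes at least one format the consumer reads."""
    return bool(producer.output_formats & consumer.input_formats)
```

In this sketch, workflow compatibility is reduced to a shared format name. A fuller description would also need to capture the messaging protocol and the semantics of the exchanged information, as the abstract notes.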