SemSci 2018: Enabling Open Semantic Science

2nd International Workshop co-located with ISWC 2018, October 2018, Monterey, California, USA


NEWS: Check out the follow-up Special Issue on Semantic Science in the Semantic Web Journal (deadline 16 Nov 2018): http://www.semantic-web-journal.net/blog/call-papers-special-issue-semantic-escience-methods-tools-and-applications#


In the past few years, a push for open, reproducible research has led to a proliferation of community efforts for publishing the datasets, software and methods described in scientific publications. These efforts make the underpinnings of research outcomes much more explicitly accessible. However, the actual time and effort required to achieve this new form of scientific communication remains a key barrier to reproducibility. Furthermore, scientific experiments are becoming increasingly complex, and ensuring that research outcomes become understandable, interpretable, reusable and reproducible is still a challenge. The goal of this workshop is to incentivise practical solutions and fundamental thinking to bridge the gap between existing scientific communication methods and the vision of a reproducible and accountable open science.

Semantic Web technologies provide a promising means for achieving this goal, enabling more transparent and well-defined descriptions for all scientific objects required for this envisioned form of science and communication. We are particularly interested in four kinds of contributions:

  1. Novel approaches to analyze scientific publications in order to explicitly describe the relationship between their methods and research outputs
  2. Novel approaches that use the research outputs of a scientific publication to facilitate its understanding and reuse (e.g., by generating explanations of results, interactive visualizations or linking datasets and methods)
  3. Novel approaches that help compare and relate software, datasets and methods used in different publications
  4. Novel approaches to apply Semantic Web and Linked Data techniques to scientific workflows used in research

Topics of Interest

Topics for submissions include, but are not limited to:

  • Tools, methods and use cases for linking existing papers to their research products: data, software, methods and execution traces.
  • New methods for linking scientific papers to other papers (e.g., papers that use similar approaches, similar methods, common software, common data, etc.)
  • New methods for visualizing and presenting scientific information to scientists (e.g., provenance-based visualizations, summaries, presenting results at different levels of granularity, etc.)
  • New approaches for extracting the specific steps used in a method described in a scientific paper.
  • New methods for generating automated explanations of scientific results.
  • New approaches for comparing methods, protocols and methodologies expressed in scientific papers.
  • New methods to highlight the differences between execution runs of a scientific experiment (based on their configuration, performance, results, etc.)
  • Tools and methods for discovering data and software used in similar publications or to address similar problems.
  • Vocabularies and ontologies that help relate and describe software, data, methods and provenance used in a scientific publication.
  • Vocabularies and ontologies that help capture and present experiment information to scientists.
  • Automatic annotation of scientific research
  • Provenance, quality, privacy and trust of scientific information
  • Novel visualizations of scientific data
  • Novel approaches to apply Linked Data and Semantic Web techniques to scientific workflows

Workshop schedule - Tuesday, October 9th

09:00-09:10 Introduction (slides)
Session 1: Knowledge Graphs in Semantic Science
09:10-09:50 Keynote speaker: Paul Groth. The Challenge of Deeper Knowledge Graphs for Science (slides). Over the past 5 years, we have seen multiple successes in the development of knowledge graphs for supporting science in domains ranging from drug discovery to social science. However, in order to really improve scientific productivity, we need to expand and deepen our knowledge graphs. To do so, I believe we need to address two critical challenges: 1) dealing with low-resource domains; and 2) improving quality. In this talk, I describe these challenges in detail and discuss some efforts to overcome them through the application of techniques such as unsupervised learning, the use of non-experts in expert domains, and the integration of action-oriented knowledge (i.e., experiments) into knowledge graphs.
09:50-10:10 Jim McCusker, Sabbir Rashid, Nkechinyere Agu, Kristin Bennett and Deborah McGuinness. Developing Scientific Knowledge Graphs Using Whyis
10:10-10:30 Chun Lin, Hang Su, Craig Knoblock, Yao-Yi Chiang, Weiwei Duan, Stefan Leyk and Johannes Uhl. Building Linked Data from Historical Maps
10:30-11:00 Coffee break
Session 2: Reproducibility of scientific experiments
11:00-11:40 Keynote speaker: Hala Skaf. From Scientific Workflows to Linked Experiment Reports. Scientific workflow management systems have been widely adopted by data-intensive science communities. PROV has been adopted by a number of workflow systems for encoding the traces of workflow executions. Exploiting these provenance traces is hampered by their heterogeneity across workflow systems and by the difficulty for a human user of browsing and understanding the large provenance graphs that are generated. In this talk, I present SHARP, a Linked Data approach for harmonizing cross-workflow provenance and mining provenance graphs. SHARP makes it possible to produce linked in silico domain-specific experiment reports represented as Micropublications or nanopublications. Experimental results using real-world omic experiments involving workflow traces generated by the Taverna and Galaxy systems demonstrate the feasibility of the approach.
11:40-12:00 Alasdair Gray. Using a Jupyter Notebook to perform a reproducible scientific analysis over semantic web sources
12:00-12:20 Carlos Buil Aranda and Maximiliano Osorio. Reproducibility of computational environments for Scientific Experiments using Container-based virtualization
12:20-14:00 Lunch
Session 3: Disseminating Open Semantic Science
14:00-14:20 Marilena Daquino, Ilaria Tiddi, Silvio Peroni and David Shotton. Creating Open Citation Data with BCite
14:20-15:20 Round table: Challenges for communication and dissemination of Open Science
15:20-16:00 Coffee break
Session 4: Understandability of experiment results
16:00-16:20 Gully Burns, Xiangyang Shi, Yue Wu, Huaigu Cao and Premkumar Natarajan. Towards Evidence Extraction: Analysis of Scientific Figures from Studies of Molecular Interactions
16:20-16:40 Raul Alejandro Vargas Acosta, Luis Garnica Chavira, Natalia Villanueva Rosales and Deana Pennington. Towards SWIM Narratives for Sustainable Water Management
16:40-17:20 Keynote speaker: Yolanda Gil. Computational Knowledge Graphs. This talk proposes Computational Knowledge Graphs (CKGs) as a new paradigm that combines the structure of knowledge graphs and the reasoning power of semantic workflows. CKGs connect physical entities and variables of interest via computations that reflect natural laws and constraints. CKGs can provide important capabilities to understand complex dynamic systems in science.
17:20-17:30 Wrap up and town hall.

Accepted papers and reviews

Workshop proceedings: http://ceur-ws.org/Vol-2184/

Accepted papers:

  • Marilena Daquino, Ilaria Tiddi, Silvio Peroni and David Shotton. Creating Open Citation Data with BCite. [PDF, HTML, Reviews]
  • Alasdair Gray. Using a Jupyter Notebook to perform a reproducible scientific analysis over semantic web sources. [PDF, HTML, Notebook, Reviews]
  • Raul Alejandro Vargas Acosta, Luis Garnica Chavira, Natalia Villanueva Rosales and Deana Pennington. Towards SWIM Narratives for Sustainable Water Management. [PDF, Reviews]
  • Gully Burns, Xiangyang Shi, Yue Wu, Huaigu Cao and Premkumar Natarajan. Towards Evidence Extraction: Analysis of Scientific Figures from Studies of Molecular Interactions. [PDF, Reviews]
  • Carlos Buil Aranda and Maximiliano Osorio. Reproducibility of computational environments for Scientific Experiments using Container-based virtualization. [PDF, Reviews]
  • Jim McCusker, Sabbir Rashid, Nkechinyere Agu, Kristin Bennett and Deborah McGuinness. Developing Scientific Knowledge Graphs Using Whyis. [PDF, Reviews]
  • Chun Lin, Hang Su, Craig Knoblock, Yao-Yi Chiang, Weiwei Duan, Stefan Leyk and Johannes Uhl. Building Linked Data from Historical Maps. [PDF, Reviews]

Submission Guidelines

Paper submission and reviewing for this workshop will be electronic via EasyChair. The papers should be written in English, follow the Springer LNCS format, and be submitted in PDF on or before June 8th, 2018 (extended from June 1st). SemSci 2018 explicitly encourages alternative and enhanced submission formats such as HTML or communicative online materials. Authors who are preparing such a submission should contact the workshop organizers in advance to make sure we can accommodate them in the submission and review process. All deadlines are midnight Hawaii time.

Authors of papers submitted to the workshop are also encouraged to share their research products online, assigning a DOI when necessary. Workshop organizers will provide pointers and guidelines for this purpose, based on the ISWC Resources Track submission guidelines.

The following types of contributions are welcome.

  • Full research papers (8 pages)
  • Position papers (4-6 pages)
  • Short research papers (4-6 pages)
  • System/tool papers (4-6 pages)
  • Posters (2 pages)

Accepted papers will be published in the CEUR Workshop Proceedings series and on the SemSci website.

Open review: Reviewers and authors are encouraged to participate in an open review process to make the discussion as transparent as possible.

Important Dates

  • Workshop papers due: June 8, 2018 (extended from June 1)
  • Notification of accepted workshop papers: June 29-30, 2018 (originally June 27)
  • Camera ready workshop papers: July 31, 2018
  • Publication of workshop proceedings: August 15, 2018
  • Workshop held: October 9th, 2018

Invited Speakers

This year SemSci will have three invited speakers:
  • Paul Groth, Elsevier Labs.
    Title: The Challenge of Deeper Knowledge Graphs for Science.
    Abstract: Over the past 5 years, we have seen multiple successes in the development of knowledge graphs for supporting science in domains ranging from drug discovery to social science. However, in order to really improve scientific productivity, we need to expand and deepen our knowledge graphs. To do so, I believe we need to address two critical challenges: 1) dealing with low-resource domains; and 2) improving quality. In this talk, I describe these challenges in detail and discuss some efforts to overcome them through the application of techniques such as unsupervised learning, the use of non-experts in expert domains, and the integration of action-oriented knowledge (i.e., experiments) into knowledge graphs.

  • Hala Skaf, Associate Professor, Nantes University.
    Title: From Scientific Workflows to Linked Experiment Reports.
    Abstract: Scientific workflow management systems have been widely adopted by data-intensive science communities. PROV has been adopted by a number of workflow systems for encoding the traces of workflow executions. Exploiting these provenance traces is hampered by their heterogeneity across workflow systems and by the difficulty for a human user of browsing and understanding the large provenance graphs that are generated. In this talk, I present SHARP, a Linked Data approach for harmonizing cross-workflow provenance and mining provenance graphs. SHARP makes it possible to produce linked in silico domain-specific experiment reports represented as Micropublications or nanopublications. Experimental results using real-world omic experiments involving workflow traces generated by the Taverna and Galaxy systems demonstrate the feasibility of the approach.

  • Yolanda Gil, Information Sciences Institute, USC.
    Title: Thoughtful Artificial Intelligence: Forging a New Partnership for Data Science and Scientific Discovery.
    Abstract: We face increasingly complex integrative problems of societal importance that are orders of magnitude more challenging every decade. Computing has had a prime role in handling that complexity by scaling up calculations over data, more recently through powerful artificial intelligence techniques such as deep learning. I foresee qualitatively different advances stemming from new artificial intelligence approaches for scaling up reasoning over knowledge to systematically search through complex information spaces. In this talk, I will describe recent research to develop intelligent systems capable of automating hypothesis-driven discovery by capturing knowledge about experiment design strategies that determine what data and analysis methods can be used to test and revise a given hypothesis. I will propose seven principles and a research agenda for developing “thoughtful artificial intelligence” with capabilities that would significantly augment our ability to tackle fundamental problems in data science and scientific discovery that have been a barrier for progress in many areas.

Program Chairs

Program Committee
