The Mythical Data Lake
What is a data lake and why is it "a unicorn"?
FAIR-EASE is dedicated to opening gateways for the field of earth and environmental sciences by addressing the limitations of the current digital architecture. Our primary focus is to facilitate integrated use of environmental data, overcoming the challenges posed by distributed and domain-specific data repositories.
In our inaugural webinar, titled "First steps towards integrated use of environmental data", we unveiled the FAIR-EASE project. This webinar featured insightful discussions on various use cases and key areas of emphasis. One highlight was the presentation by Marc Portier, whose interview during the FAIR-EASE Kick Off meeting can be viewed below. Marc introduced the FAIR-EASE technical architecture, which revolves around the concept of a 'data lake.' Identified as a distinct work package in our project proposal, this architecture sparked fundamental inquiries right from the project's beginning, the most crucial being: 'What exactly is a Data Lake?' Marc guided us through the team's thought process, elucidating the architecture's components and demonstrating how they can be tailored to specific processes. Crucially, the approach and outcomes of this infrastructure hold potential for adaptation and reuse in various contexts.
In this second edition, we'll dive deeper into this FAIR-EASE component and how it is being developed to improve access to cross-domain research data.
Objective of the webinar
In this dedicated webinar on 12 July 2023, 15:00 CEST, FAIR-EASE will discuss the "Data Lake" infrastructure and how it improves access to data, both through data harmonisation and through more technically efficient data access.
Our discussion will focus on the transformative journey from a 'data lake' to a dynamic 'data space.' We will also take a modern web-based perspective on this architecture, where the FAIR-EASE team's vision has been shaped by both the project's unique challenges and insights from computer science research. In that context, we will explore innovative approaches to federated queries on a distributed knowledge graph. Finally, we will shine a spotlight on Examind, with a talk on geospatial interoperability.
The objective of this meeting is to engage external communities, communicate our vision, and solicit valuable feedback.
Who should attend and why?
Anyone interested in Earth systems or environmental data, and in accessing and integrating those data easily across domains, should attend, but most especially:
- Research and academia representatives who are interested in discovering the potential outcomes of integrating multidisciplinary data in their research;
- Policy makers at the European, national and even local levels who seek integrated cross-domain data to support decision-making;
- EOSC Ecosystem members who want to know more about how FAIR-EASE fits into EOSC’s FAIR data and services;
- Earth and environmental data providers who want to see how to promote further use of their data;
- Members of the general public with an interest in earth systems and environmental data.
Agenda
- 15:00 - 15:05: Welcome
- 15:05 - 15:20: From Data Lake to Data Space - Marc Portier (VLIZ)
- 15:20 - 15:35: Novel approaches for decentralized knowledge graph querying in Data Space contexts - Julian Andres Rojas Melendez (UGent-imec)
- 15:35 - 15:50: Examind and Geospatial Interoperability - Dorian Ginane (Geomatys)
- 15:50 - 16:00: Q&A, Wrap-up & closure