Graph-Kaleidoscope: Handling Multiple Perspectives in Graph Databases

People involved:

Jaudete Daltio

Person in charge:

Project Overview

The goal is to specify and implement a framework for building multiple perspectives and for correlating these perspectives and resources, using graphs as the main underlying representation. Perspectives are defined by adapting the concept of views in relational databases, i.e., each perspective is handled as a view in the graph database. Following relational theory, a graph view is defined in terms of a view generating function: a combination of operations that represents queries applied to existing database objects, i.e., vertices and edges. Since there is no consensus on a formal definition of a graph data model, the first challenge addressed by our research was to formalize the operators for graph data that underpin our framework. As a result, we define PGDM, a property graph data model, together with its operators. For uniformity, all results of graph operators in PGDM are graphs. Our second contribution lies in the definition of the framework itself, based on the use of PGDM.
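The idea of a view generating function can be illustrated with a minimal sketch. It is not the PGDM operator set: the `Graph` class and the `select_vertices`, `select_edges` and `compose` helpers are hypothetical names chosen for illustration. The only properties carried over from the description above are that every operator takes a graph and returns a graph, and that a view is a combination of such operators.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    """Toy property graph (illustrative only, not the PGDM definition)."""
    vertices: dict  # vertex id -> property dict
    edges: dict     # (source id, target id) -> property dict


def select_vertices(graph, predicate):
    """Keep vertices whose properties satisfy predicate, plus edges between them."""
    kept = {v: p for v, p in graph.vertices.items() if predicate(p)}
    edges = {(s, t): p for (s, t), p in graph.edges.items()
             if s in kept and t in kept}
    return Graph(kept, edges)


def select_edges(graph, predicate):
    """Keep edges whose properties satisfy predicate, plus their endpoints."""
    edges = {e: p for e, p in graph.edges.items() if predicate(p)}
    touched = {v for e in edges for v in e}
    vertices = {v: p for v, p in graph.vertices.items() if v in touched}
    return Graph(vertices, edges)


def compose(*operators):
    """Build a view generating function by chaining graph-to-graph operators."""
    def view(graph):
        for operator in operators:
            graph = operator(graph)
        return graph
    return view


# Hypothetical base graph and perspective.
base = Graph(
    vertices={1: {"kind": "A"}, 2: {"kind": "B"}, 3: {"kind": "A"}},
    edges={(1, 2): {"label": "x"}, (1, 3): {"label": "y"}},
)
only_a = compose(lambda g: select_vertices(g, lambda p: p["kind"] == "A"))
perspective = only_a(base)
```

Because each operator returns a `Graph`, the result of one view can be fed into another, which is the practical benefit of keeping all operator results in the same model.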

The project aims to solve problems of multi-perspective research in applications characterized by interdisciplinarity (and thus by multiple ways of analyzing a problem). It does so by using graph databases to store and analyze datasets of highly connected data, and by adapting the concept of views from relational databases to represent the idea of focus.

2017

Daltio, Jaudete; Medeiros, Claudia Bauzer

Views over Graph Databases: A Multifocus Approach for Heterogeneous Data (PhD Thesis)

University of Campinas - Institute of Computing, 2017.

2016

Daltio, Jaudete; Medeiros, Claudia Bauzer

A View Handler for Semantic Graphs (Conference)

Proceedings 10th IEEE ICSC, Los Angeles, 2016.

2015

Daltio, Jaudete; Medeiros, Claudia Bauzer

Hydrograph: Exploring Geographic Data in Graph Databases (Conference)

XVI Brazilian Symposium on Geoinformatics (GEOINFO), Campos do Jordao, 2015.

2014

Daltio, Jaudete; Medeiros, Claudia Bauzer

Handling Multiple Foci in Graph Databases (Conference)

Lecture Notes in Bioinformatics (LNBI), vol. 8574 - Proceedings of the 10th International Conference on Data Integration in the Life Sciences, Lisboa, Portugal, 2014.

To organize the data and support management tasks, the Brazilian National Water Agency (ANA) built a relational database to manage Brazilian water resources, representing the hydrography as a drainage network, i.e., a set of drainage points and stretches. This network is represented as a binary tree graph whose edges (i.e., the drainage stretches) go from upstream to downstream.

The Brazilian drainage network is currently composed of 620,280 drainage points and 620,279 drainage stretches. Drainage points represent diverse geographic entities, such as a watercourse start/end point or a stream mouth point. The drainage stretches represent only one geographic entity: the connection between two drainage points.
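The counts above are consistent with a tree, which always has one edge fewer than it has vertices. A minimal sketch of a structural check follows. The function name `is_drainage_tree` and the sample data are hypothetical, and reading "binary" as "at most two stretches flow into any point" is an assumption, not something stated by ANA.

```python
from collections import defaultdict, deque


def is_drainage_tree(points, stretches):
    """Check that stretches (upstream, downstream) form a binary tree
    draining into a single outlet. points must contain unique ids."""
    upstream = defaultdict(list)  # downstream point -> points flowing into it
    has_outflow = set()
    for up, down in stretches:
        upstream[down].append(up)
        has_outflow.add(up)

    outlets = [p for p in points if p not in has_outflow]
    # A tree has exactly one edge fewer than vertices, and here one outlet.
    if len(stretches) != len(points) - 1 or len(outlets) != 1:
        return False
    # Assumed meaning of "binary": at most two inflowing stretches per point.
    if any(len(ups) > 2 for ups in upstream.values()):
        return False

    # Walk upstream from the outlet; every point must be reached.
    seen, queue = {outlets[0]}, deque(outlets)
    while queue:
        for up in upstream[queue.popleft()]:
            if up not in seen:
                seen.add(up)
                queue.append(up)
    return len(seen) == len(points)


# Hypothetical five-point network: a and b join at c; c and d join at e.
sample_points = ["a", "b", "c", "d", "e"]
sample_stretches = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
sample_ok = is_drainage_tree(sample_points, sample_stretches)
```

The reachability walk matters: the edge count alone would also accept a network that contains a cycle disconnected from the outlet.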

Despite this organization, the official territorial unit for the management of water resources adopted by ANA is the watershed. A watershed delimits a drainage system channel and comprises a set of drainage stretches and points. The drainage network can be repeatedly split, giving rise to watersheds and sub-watersheds. Rivers are another element in this database: connected drainage stretches that share the same waterbody name.
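The definition of a river given above (connected stretches with the same waterbody name) can be sketched as a grouping problem. This is an illustration under stated assumptions, not ANA's procedure: the `derive_rivers` function, the tuple layout and the waterbody names are hypothetical, and two stretches are treated as connected when they share a drainage point.

```python
from collections import defaultdict


def derive_rivers(stretches):
    """Group stretches (upstream, downstream, waterbody_name) into rivers.
    Returns a list of sets of stretch indices."""
    parent = list(range(len(stretches)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Stretches that touch the same point under the same name are merged.
    first_seen = {}
    for i, (up, down, name) in enumerate(stretches):
        for point in (up, down):
            key = (point, name)
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    rivers = defaultdict(set)
    for i in range(len(stretches)):
        rivers[find(i)].add(i)
    return list(rivers.values())


# Hypothetical data: stretches 0 and 1 are connected and share a name;
# stretch 3 has the same name but is not connected to them.
named_stretches = [
    ("a", "c", "River A"),
    ("c", "e", "River A"),
    ("b", "c", "River B"),
    ("x", "y", "River A"),
]
rivers = derive_rivers(named_stretches)
```

With this data the two connected "River A" stretches form one river, while the disconnected stretch with the same name is kept separate.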

Rivers and watersheds are only two examples of the many useful elements that need to be derived from this water resources database. These elements are calculated from drainage network attributes and may change whenever the data is updated. Updates occur, for instance, to reflect human actions (e.g., river transposition or construction of artificial channels), natural disasters or cartographic refinement processes (more accurate scales).
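The dependence of derived elements on the underlying data can be shown with a small sketch. It assumes, for illustration only, that the watershed of a point is the set of stretches that drain into it directly or indirectly. The `watershed` function and the sample network are hypothetical.

```python
from collections import defaultdict


def watershed(stretches, outlet):
    """Return the set of stretches (upstream, downstream) that drain,
    directly or indirectly, into outlet."""
    inflowing = defaultdict(list)  # point -> stretches ending at that point
    for up, down in stretches:
        inflowing[down].append((up, down))

    result, stack = set(), [outlet]
    while stack:
        for stretch in inflowing[stack.pop()]:
            if stretch not in result:
                result.add(stretch)
                stack.append(stretch[0])  # continue from the upstream point
    return result


network = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
before = watershed(network, "c")

# Hypothetical update: a new artificial channel flows into point b.
network.append(("f", "b"))
after = watershed(network, "c")
```

Recomputing the derived element after the update yields a different result, which is why such elements are better treated as views over the network than as data stored once.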