June 9, 2020
Stefania Dumbrava
Modern graph query engines are gaining momentum, driven by exponentially growing, interconnected data volumes that populate scientific and industrial repositories, as part of the Linked Open Data and the Semantic Web. While successful commercial implementations exist, no standard graph query language has emerged and, despite the mature tools developed in the formal methods community and the security-sensitive applications currently involving graph-shaped data, no principled framework for the reliable evaluation of such queries has been proposed. In this talk, I will explain how we can address these issues, by employing the Coq proof assistant to develop a mechanically-certified engine for evaluating graph queries and for incrementally maintaining materialized graph instances, also called views. The language we use for defining queries and views is Regular Datalog (RD), a notable fragment of non-recursive Datalog that can express complex navigational queries, with transitive closure as a native operator. To this end, we mechanize an RD-specific evaluation algorithm capable of fine-grained, incremental graph view computation, which we prove sound with respect to the declarative RD semantics. By using the Coq program extraction mechanism, we test an OCaml version of the verified engine on a set of preliminary benchmarks, confirming the feasibility of obtaining a unified, machine-verified, formal framework for graph query processing.