U.S. Semantic Technologies Symposium Series

4th U.S. Semantic Technologies Symposium
Sept. 29 - Oct. 1, 2022 at Michigan State University, East Lansing, MI

Keynote Speaker - Dr. Joshua Shinavier

Dragon: Schema Integration at Uber Scale

Slides: Dragon: Schema Integration at Uber Scale


The compelling vision of a “giant global graph” of interconnected things, or entities identified by symbols, has been somewhat overshadowed in recent years by the successes of sub-symbolic techniques in big data processing. Certainly, at large tech companies like Uber, you will hear the words “machine learning” far more often than you will hear the words “ontology” or “semantics”. However, most of our data is in fact symbolic at its core, conforming to schemas that can be described in terms of various type algebras. There is enormous untapped potential for data integration and data discovery using controlled vocabularies, the main obstacles being the sheer number and heterogeneity of the schemas, languages, and data modeling practices in use, as well as their conceptual distance from familiar ontology frameworks. What is the simplest unifying abstraction for graph schemas that will carry strongly typed entities and well-defined relationships into every component of our data infrastructure? Does a “knowledge graph” ultimately belong at the periphery or the center of data modeling efforts at this scale? In this talk, we will explore an algebraic approach to schema integration and a new open-source tool named Dragon, both part of a broader data standardization and metadata management effort currently underway at Uber.

Biography - Dr. Joshua Shinavier

Joshua Shinavier

Research Scientist, Uber

Dr. Joshua Shinavier is a research scientist at Uber and a co-founder of what is now Apache TinkerPop. Starting in 2008, he contributed to the first common APIs for graph databases, the original TinkerPop query language that led to Gremlin, and the first tools that aligned the property graph and RDF data models. At Uber, he leads a company-wide effort to unify data models and schemas across RPC, streaming, and storage. As much as possible, Joshua tries to stand with one foot in industry, another in open source software, and yet another in academia. He feels that these communities have a lot to learn from each other with respect to graphs and knowledge representation. Joshua holds a PhD in computer science from RPI’s Tetherless World Constellation.
