Graph databases are made up of nodes, edges (relationships between nodes), and properties. Patterns are identified by searching or “traversing” the graph, and use cases include recommendation engines, social networks, and fraud detection.
Microsoft have made Graph Database functionality available in both SQL Server 2017 and Azure Database, and the following is an assessment.
In Microsoft Graph Database (Graph DB), both nodes and edges are logically represented by database tables and are configured using new T-SQL extensions. For example, CREATE TABLE X AS NODE, CREATE TABLE Y AS EDGE. When an edge is created, associated graph specific metadata including edge direction are added, e.g. “from node X, to node Y”. This makes configuration relatively straightforward given that if you are already familiar with MS SQL Server, then it’s a small step to start working with Graph DB.
The syntax is loosely based on a subset of the openCypher standard which is used by e.g. the Neo4J graph database. While the T-SQL MATCH statement extension plus ascii operators are available, graph traversal is a little clunky; each time a node or edge is traversed, it needs to be included in the FROM part of the query statement. This means that complex queries become verbose, and it won’t be long before you’re asking whether it would be easier to manage this in SQL Server using the relational database approach. This shortcoming is not due to the openCypher standard, but rather the MS Graph DB implementation.
I do like the fact that out of the box MS SQL Server functionality can be used with Graph DB, such as Row Level Security, Data Masking, and Clustered Column Store indexes. Going by Books on Line, here you can expect up to 6x performance improvements using Clustered Column Store indexes. However other functionality such as In-Memory tables, Temporal tables (where a table can track its own history), and cross database queries are currently not supported.
Visualizing graphs in Graph DB is a challenge. On paper you can use Machine Learning Services combined with the R igraph package. I could not get this to work, but your mileage may vary. A PowerBI Custom Visual called the “Force Directed Graph” does offer a workaround of sorts.
Interestingly, a separate Microsoft offering, Cosmos DB, also offers Graph DB functionality, but follows the Gremlin syntax which is quite different to openCypher. While Cosmos DB also has different use cases, it’s clear that both SQL Server Graph DB and the graph offering in Cosmos DB have grown independently of each other, and a few alignment wrinkles may need to be ironed out.
From a data model perspective, there are no constraints to stop a node or edge from being deleted, so if you didn’t know what you were doing your data could look like Swiss cheese before you knew it.
Going by the SQL Bits conference in London last week, there is no indication of enhanced functionality via MS SQL Server 2017 Cumulative updates. Considering Microsoft’s cloud first approach, Graph DB updates may appear in Azure DB down the line, but there’s nothing definite on the horizon.
In summary, MS Graph Database appears to have been added to MS SQL Server with the intention of enriching it’s multi-model offering. I like the positioning of the technology i.e. T-SQL extensions within the familiar environs of Management Studio and SQL Server Data Tools. However, as mentioned traversing the model is not all that pretty e.g. working out the shortest route between nodes, or inferred relationships. So, if I had a requirement for a Graph Database, would I use Graph DB in SQL ’17 or Azure Database? Right now, no, I would not. Instead I would look further afield.
However it’s useful to be aware of the technical offering as it stands, and a solid foundation has been set for what could be powerful graph functionality in the future.