SQL Server 2017图形数据库简介

最新推荐文章于 2023-01-04 16:10:39 发布

culuo4781

最新推荐文章于 2023-01-04 16:10:39 发布

阅读量842

点赞数

文章标签：数据库 python java 大数据人工智能

原文链接：https://www.sqlshack.com/introduction-sql-server-2017-graph-database/

版权

The graph database is a critically important new technology for data professionals. As a database technologist always keen to know and understand the latest innovations happening around the cutting edge or next-generation technologies, and after working with traditional relational database systems and NoSQL databases, I feel that the graph database has a significant role to play in the growth of an organization. Not only are traditional database systems generally inefficient in displaying complex hierarchical data, but even NoSQL lags a little. We usually see a degradation in performance with the number of levels of relationship and database size. Also, depending on the relationship, the number of joins may increase as well.

图形数据库是对数据专业人员至关重要的新技术。作为一名数据库技术专家，一直热衷于了解和理解围绕前沿技术或下一代技术的最新创新，并且在与传统的关系数据库系统和NoSQL数据库一起工作之后，我觉得图数据库在数据库开发中起着重要的作用。组织的成长。传统的数据库系统通常不仅不能有效地显示复杂的分层数据，而且NoSQL也有一些滞后。通常，关系级别和数据库大小的数量会降低性能。同样，根据关系，联接的数量也可能会增加。

There are many workaround ways to display the relationship (such as using Recursive CTE) but that’s still a workaround. Having said that, the SQL graph feature is really capable of drilling through many levels of relationships with ease. The design of the model and the execution of query has made the process much simpler and seamless, and thereby, efficient. The amount of coding has been brought down significantly.

有很多解决方法来显示关系（例如使用递归CTE），但这仍然是解决方法。话虽这么说，SQL图功能确实能够轻松地钻研许多层次的关系。模型的设计和查询的执行使过程变得更加简单和无缝，从而变得高效。编码量已大大降低。

The graph database is an expressive language for representing complex database management systems. This technology already has a strong footprint in the IT industry. It has the capability to influence various fields such as social networking, fraud detection, IT network analysis, social recommendations, product recommendation, and content recommendation.

图形数据库是一种表示复杂数据库管理系统的表达语言。这项技术已经在IT行业中占有重要的地位。它具有影响各种领域的能力，例如社交网络，欺诈检测，IT网络分析，社交推荐，产品推荐和内容推荐。

The SQl graph feature is suited in scenarios where data is more interconnected and has strongly defined relationships.

SQl图形功能适用于数据之间相互联系更紧密且关系定义明确的场景。

总览 (Overview)

A graph is composed of two elements: a NODE (vertices) and an EDGE (relationship). Each node represents entities, and the nodes are connected to one another with edges; these provide details on the relationship between two nodes with their own set of attributes and properties.

图由两个元素组成：一个节点（顶点）和一个边缘（关系）。每个节点代表实体，并且节点之间通过边相互连接。这些提供了有关两个节点之间具有自己的一组属性和特性的关系的详细信息。

The graph database can be defined as the data structure representation of an entity modeled as graphs. It is derived from the graph theory. The data structures are the Node and the Edge. The attributes are the properties of the node or the edge. The relationship defines the interconnection between the nodes.

图数据库可以定义为建模为图的实体的数据结构表示。它源自图论。数据结构是节点和边缘。属性是节点或边的属性。该关系定义了节点之间的互连。

Relationships are prioritized in graph databases, unlike other databases. Therefore, no data inference using foreign keys or out-of-band processing is needed. We can build sophisticated data models simply by assembling abstractions of nodes and edges into a structure. Given the priority for relationships over data, the development stack receives the biggest value here.

与其他数据库不同，关系在图数据库中具有优先级。因此，不需要使用外键或带外处理进行数据推断。我们可以简单地通过将节点和边的抽象组装到结构中来构建复杂的数据模型。在关系优先于数据的情况下，开发堆栈在此处获得最大价值。

In today’s world, relationship modeling requires more sophisticated techniques. SQL Server 2017 offers graph capabilities to model relationships. Graph DB has nodes and edges—two new table types NODE and EDGE. And a new TSQL function called MATCH(). The Node and Edge (relationships) represent entities of the graph database. And since this capability is built into SQL Server 2017, already-existing databases don’t have to be ported to another system, so to speak.

在当今世界，关系建模需要更复杂的技术。 SQL Server 2017提供了图功能来建模关系。 Graph DB具有节点和边线-两种新的表类型NODE和EDGE。还有一个新的TSQL函数，称为MATCH（）。 节点和边缘（关系）表示图数据库的实体。而且由于此功能内置于SQL Server 2017中，因此可以说已经不必将现有数据库移植到另一个系统上。

目的 (Purpose)

Today’s business and user requirements demand applications that interconnect more and more of the world’s data, yet still expect high levels of performance and data reliability. Graph representation offers a convenient means of handling complex relationships. This technique provides solutions to many important problems and helps derive the results within the scope of the given context.

当今的业务和用户需求要求应用程序能够互连越来越多的世界数据，但仍希望获得高水平的性能和数据可靠性。图表示法提供了一种处理复杂关系的便捷方法。该技术为许多重要问题提供了解决方案，并有助于在给定上下文范围内得出结果。

From what it seems like, many applications of the future will benefit a lot, since they’d be built using graph databases.

从表面上看，未来的许多应用程序将从中受益，因为它们将使用图形数据库进行构建。

数据建模–与图相关 (Data modeling – relational to graph)

示范 (Demonstration)

It’s demo time!

现在是演示时间！

Let’s consider an example of an organization where an employee is mapped to Manager, Manager is mapped to Senior Manager, and so on. The hierarchy may go on and on, depending on the number of levels in the organization. As the number of levels increases, deriving a relationship in a relational database becomes a daunting task. Representing an employee hierarchy or marketing hierarchy or the social network connections are quite difficult using a relational database. Let’s see how SQL Graph can offer a solution when we drive down to the various levels of the hierarchy.

让我们考虑一个组织的示例，在该组织中，员工映射到Manager，Manager映射到Senior Manager，依此类推。层次结构可能会持续不断，这取决于组织中的级别数。随着级别数的增加，在关系数据库中推导关系成为一项艰巨的任务。使用关系数据库来表示员工层次结构或营销层次结构或社交网络连接是非常困难的。让我们看看当深入到层次结构的各个级别时，SQL Graph如何提供解决方案。

Let’s consider the simple employee data model for the entire illustration. In the emp table, the employee is identified with a unique identifier (empno), and the MGR column indicates an employee supervisor. The employee is a self-contained entity and can be queried using empno and MGR column

让我们考虑整个插图的简单员工数据模型。在emp表中，使用唯一标识符（ empno ）标识员工，而MGR列指示员工主管。该员工是一个独立的实体，可以使用empno和MGR列进行查询

The following organization diagram depicts the most famous employee relationship model. It has four levels. The employee is a NODE of the employee table. The employee node is self-connected pointer with a ReportsTo relationship. In graph terms, the relationship is termed as the EDGE of the employee NODE

以下组织图描述了最著名的员工关系模型。它有四个级别。雇员是雇员表的节点。员工节点是具有ReportsTo关系的自连接指针。用图形表示，该关系称为员工NODE的EDGE。

Let’s build a relational EMP table and insert the relational values as per the values of the organization chart

让我们建立一个关系EMP表，并根据组织结构图的值插入关系值

CREATE TABLE EMP
(EMPNO INT NOT NULL,
ENAME VARCHAR(20),
JOB VARCHAR(10),
MGR INT,
JOINDATE DATETIME,
SALARY DECIMAL(7, 2),
COMMISIION DECIMAL(7, 2),
DNO INT)
 
INSERT INTO EMP VALUES 
(7369, 'SMITH', 'CLERK', 7902, '02-MAR-1970', 8000, NULL, 2),
(7499, 'ALLEN', 'SALESMAN', 7698, '20-MAR-1971', 1600, 3000, 3),
(7521, 'WARD', 'SALESMAN', 7698, '07-FEB-1983', 1250, 5000, 3),
(7566, 'JONES', 'MANAGER', 7839, '02-JUN-1961', 2975, 50000, 2),
(7654, 'MARTIN', 'SALESMAN', 7698, '28-FEB-1971', 1250, 14000, 3),
(7698, 'BLAKE', 'MANAGER', 7839, '01-JAN-1988', 2850, 12000, 3),
(7782, 'CLARK', 'MANAGER', 7839, '09-APR-1971', 2450, 13000, 1),
(7788, 'SCOTT', 'ANALYST', 7566, '09-DEC-1982', 3000, 1200, 2),
(7839, 'KING', 'PRESIDENT', NULL, '17-JUL-1971', 5000, 1456, 1),
(7844, 'TURNER', 'SALESMAN', 7698, '08-AUG-1971', 1500, 0, 3),
(7876, 'ADAMS', 'CLERK', 7788, '12-MAR-1973', 1100, 0, 2),
(7900, 'JAMES', 'CLERK', 7698, '03-NOV-1971', 950, 0, 3),
(7902, 'FORD', 'ANALYST', 7566, '04-MAR-1961', 3000, 0, 2),
(7934, 'MILLER', 'CLERK', 7782, '21-JAN-1972', 1300, 0, 1)

In the below show figure, the empno 7369, ReportsTo 7902, 7902 Reports To 7566, and 7566 ReportsTo 7839

在下面的显示图中，empno 7369，ReportsTo 7902、7902 Reports 7566和7566 Reports 7839

Now, let’s look at the Graph representation of the same data. The EMPLOYEE node has a several attributes. The employee node is connected to itself with a reportsTo relationship. ReportsTo is the name of the EDGE.

现在，让我们看一下相同数据的图形表示。 EMPLOYEE节点具有多个属性。员工节点通过reportsTo关系连接到自身。 ReportsTo是EDGE的名称。

We can have attributes on the EDGE table as well

我们也可以在EDGE表上具有属性

Create the node table, EmpNode

创建节点表EmpNode

The syntax of creating a node is pretty straight forward: the create table syntax with AS NODE construct at the end of the table creation step

创建节点的语法非常简单：在表创建步骤的最后使用AS NODE构造创建表语法

CREATE TABLE dbo.EmpNode(
ID Int Identity(1,1),
EMPNO NUMERIC(4) NOT NULL,
ENAME VARCHAR(10),
MGR NUMERIC(4),
DNO INT
) AS NODE;

Let’s now convert the relational table data into graph data. The following insert statement inserts the data from the emp relational table.

现在让我们将关系表数据转换为图形数据。以下插入语句从emp关系表中插入数据。

INSERT INTO EmpNode(EMPNO,ENAME,MGR,DNO) select empno,ename,MGR,dno from emp

The Node table metadata field $node_id_* stores the nodeId values as JSON. The other columns of the NODE table are the attributes or the actual values of the node.

节点表元数据字段$ node_id_ *将nodeId值存储为JSON。 NODE表的其他列是节点的属性或实际值。

Create the EDGE

创建边缘

Creating an EDGE is similar to creating a node, except the use of keyword AS EDGE at the end of the edge creation.

创建EDGE与创建节点相似，不同之处在于在边缘创建结束时使用关键字AS EDGE 。

CREATE TABLE empReportsTo(Deptno int) AS EDGE

创建表empReportsTo（Deptno int）作为EDGE

Now, let’s define the relationship between the employees using EMPNO and MGR columns. The organization chart gives a good idea of defining the insert the statement

现在，让我们使用EMPNO和MGR列定义员工之间的关系。组织结构图给出了定义插入语句的好主意

INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 1), 
       (SELECT $node_id FROM EmpNode WHERE id = 13),20);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 2), 
       (SELECT $node_id FROM EmpNode WHERE id = 6),10);
	   INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 3), 
       (SELECT $node_id FROM EmpNode WHERE id = 6),10)
   INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 4), 
       (SELECT $node_id FROM EmpNode WHERE id = 9),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 5), 
       (SELECT $node_id FROM EmpNode WHERE id = 6),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 6), 
       (SELECT $node_id FROM EmpNode WHERE id = 9),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 7), 
       (SELECT $node_id FROM EmpNode WHERE id = 9),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 8), 
       (SELECT $node_id FROM EmpNode WHERE id = 4),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 9), 
       (SELECT $node_id FROM EmpNode WHERE id = 9),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 10), 
       (SELECT $node_id FROM EmpNode WHERE id = 6),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 11), 
       (SELECT $node_id FROM EmpNode WHERE id = 8),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 12), 
       (SELECT $node_id FROM EmpNode WHERE id = 6),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 13), 
       (SELECT $node_id FROM EmpNode WHERE id = 4),30);
INSERT INTO empReportsTo  VALUES ((SELECT $node_id FROM EmpNode WHERE ID = 14), 
       (SELECT $node_id FROM EmpNode WHERE id = 7),30);OR 
 
INSERT INTO empReportsTo 
SELECT e.$node_id, m.$node_id ,e.dno
FROM dbo.EmpNode e 
inner JOIN dbo.EmpNode m 
ON e.empno = m.mgr;

The Edge table, by default, has three columns. First, the $edge_id represents the identity of the edge in JSON. The other two columns $from_id and $to_id represents the relationship between the edges. Also, the edges may have additional properties as well. In this case its deptno.

默认情况下，Edge表具有三列。首先， $ edge_id表示JSON中边缘的身份。其他两列$ from_id和$ to_id表示边之间的关系。同样，边缘也可以具有其他属性。在这种情况下，它是deptno 。

系统视图 (System view)

The system view, sys.tables, has two new columns:

系统视图sys.tables包含两个新列：

is_edge
is_edge
is_node
is_node

 
  SELECT t.is_edge,t.is_node,*
FROM sys.tables t
WHERE name like 'emp%'

SSMS视图 (SSMS view)

The objects of the Graph database are located in the graph tables folder of the SQLShackDemo database. The Node of the table is represented by dot, the edge is represented by open connectors (that look a little like a pair of glasses)

Graph数据库的对象位于SQLShackDemo数据库的graph表文件夹中。表格的节点用点表示，边缘用开放式连接器表示（看上去有点像一副眼镜）

匹配条款 (Match clause)

The match clause is derived from CQL (Cypher query language). It’s an efficient way of querying graph properties. The CQL starts with the match clause. Using T-SQL the matching data nodes are traversed through the relationship.

match子句派生自CQL（密码查询语言）。这是查询图形属性的有效方法。 CQL以match子句开头。使用T-SQL，通过关系遍历匹配的数据节点。

The syntax

语法

MATCH (<graph_search_pattern>)
 
<graph_search_pattern>::=
    {<node_alias> { 
                     { <-( <edge_alias> )- } 
                   | { -( <edge_alias> )-> }
                 <node_alias> 
                 } 
     }
     [ { AND } { ( <graph_search_pattern> ) } ]
     [ ,...n ]
 
<node_alias> ::=
    node_table_name | node_alias 
 
<edge_alias> ::=
    edge_table_name | edge_alias

Examples

例子

Let’s now look at some examples.

现在让我们看一些示例。

Here’s the query to display first-level employees who report to Smith, and Smith’s manager

这是显示向史密斯报告的第一级员工和史密斯经理的查询

SELECT 
E.EMPNO,E.ENAME,E.MGR,E1.EMPNO,E1.ENAME,E1.MGR
FROM 
    empnode e, empnode e1, empReportsTo m 
WHERE 
    MATCH(e-(m)->e1)
and e.ENAME='SMITH'

And here’s the query to display second-level employee and manager details for Smith. If we take out the where clause, the result will be displayed for all the employees

这是显示Smith的二级员工和经理详细信息的查询。如果我们删除where子句，则将为所有员工显示结果

SELECT 
E.EMPNO,E.ENAME,E.MGR,E1.EMPNO,E1.ENAME,E1.MGR,E2.EMPNO,e2.ENAME,E2.MGR
FROM 
    empnode e, empnode e1, empReportsTo m ,empReportsTo m1, empnode e2
WHERE 
    MATCH(e-(m)->e1-(m1)->e2)
and e.ENAME='SMITH'

And now, the query to display the third level employees and manager details

现在，查询显示第三级员工和经理详细信息

SELECT 
E.EMPNO,E.ENAME,E.MGR,E1.EMPNO,E1.ENAME,E1.MGR,E2.EMPNO,e2.ENAME,E2.MGR,E3.EMPNO,e3.ENAME,E3.MGR
FROM 
    empnode e, empnode e1, empReportsTo m ,empReportsTo m1, empnode e2, empReportsTo M2, empnode e3
WHERE 
    MATCH(e-(m)->e1-(m1)->e2-(m2)->e3)
 and e.ENAME='SMITH'

Now, let’s change the direction to fetch all the managers details for Smith

现在，让我们更改方向以获取Smith的所有经理详细信息

SELECT 
E.EMPNO,E.ENAME,E.MGR,E1.EMPNO,E1.ENAME,E1.MGR,E2.EMPNO,e2.ENAME,E2.MGR,E3.EMPNO,e3.ENAME,E3.MGR
FROM 
    empnode e, empnode e1, empReportsTo m ,empReportsTo m1, empnode e2, empReportsTo M2, empnode e3
WHERE 
    MATCH(e<-(m)-e1<-(m1)-e2<-(m2)-e3)

结语 (Wrapping up)

SQL Server 2017 is proving to be a complete enterprise solution for various Business IT needs. The first version of SQL Graph very is promising, even though there are a quite some limitations, there is enough room to explore the graph features so far to be hopeful that Microsoft can deliver a fully-functional graph database within SQL Server.

事实证明，SQL Server 2017是满足各种业务IT需求的完整企业解决方案。 SQL Graph的第一个版本非常有前途，尽管存在很多限制，但到目前为止，仍有足够的空间来探索图形功能，以期希望Microsoft可以在SQL Server中提供功能齐全的图形数据库。

The SQL Graph feature is fully integrated into the SQL Engine. As I mentioned, though, there are some limitations in this feature on SQL Server 2017, as of writing this article:

SQL Graph功能已完全集成到SQL Engine中。但是，正如我提到的那样，在撰写本文时，SQL Server 2017的此功能有一些限制：

No support for polymorphism
不支持多态
SQL supports only unidirectional mapping
SQL仅支持单向映射
Update on the edge columns is not allowed
不允许在边缘列上更新
Transitive closure is not supported, but we can still achieve this using CTE
不支持传递闭包，但我们仍然可以使用CTE来实现
Support for In-Memory OLTP objects is limited
对内存中OLTP对象的支持是有限的
System table, Temporary table, and Global Temporary tables are not supported
不支持系统表，临时表和全局临时表
Table types and table variables are not declared as NODE or EDGE
表类型和表变量未声明为NODE或EDGE
Cross-database queries are not supported
不支持跨数据库查询
There is no direct way or a wizard available to convert existing traditional database tables to graph
没有直接的方法或向导可以将现有的传统数据库表转换为图形
There is no GUI, so we have to rely on Power BI to plot and view the graph
没有GUI，因此我们必须依靠Power BI来绘制和查看图形

Stay tuned for more updates on this topic….

请继续关注有关此主题的更多更新…。

目录 (Table of contents)

A Quick start Guide to Managing SQL Server 2017 on CentOS/RHEL Using the SSH Protocol

How to use Python in SQL Server 2017 to obtain advanced data analytics

Data Interpolation and Transformation using Python in SQL Server 2017

An introduction to a SQL Server 2017 graph database

Top string functions in SQL Server 2017

Top 8 new (or enhanced) SQL Server 2017 DMVs and DMFs for DBAs

Overview of Resumable Indexes in SQL Server 2017

Understanding automatic tuning in SQL Server 2017

使用SSH协议在CentOS / RHEL上管理SQL Server 2017的快速入门指南

如何在SQL Server 2017中使用Python获得高级数据分析

在SQL Server 2017中使用Python进行数据插值和转换

SQL Server 2017图形数据库简介

SQL Server 2017中的顶级字符串函数

适用于DBA的前8大新（或增强）SQL Server 2017 DMV和DMF

SQL Server 2017中的可恢复索引概述

了解SQL Server 2017中的自动调整

翻译自: https://www.sqlshack.com/introduction-sql-server-2017-graph-database/

culuo4781

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
SQL Server 2017图形数据库简介

The graph database is a critically important new technology for data professionals. As a database technologist always keen to know and understand the latest innovations happening around the cutting...
复制链接

扫一扫