SQL Server中的XML数据类型简介

The XML data type is a very common data type that is used to store unstructured or heterogeneous data in SQL Server. In this article, we will discuss the use of the XML data type along with its benefits, disadvantages, and limitations within various use cases.

XML数据类型是一种非常常见的数据类型,用于在SQL Server中存储非结构化或异构数据。 在本文中,我们将讨论XML数据类型的使用以及它在各种用例中的优点,缺点和局限性。

Prior to the use of the JSON data type in SQL Server, developers worked with the XML data type. These days, we store all data in the relational model with a dynamic column structure. However, early data models may not fit with the relational model. The conventional approach was to store information as XML and then store the XML document in the file system. Sometimes in worse situations, a physical disk can raise problems with the read/write operation with the XML document. So, it makes more data availability to store XML files in the database within the XML data typed column.

在SQL Server中使用JSON数据类型之前,开发人员使用XML数据类型。 如今,我们将所有数据存储在具有动态列结构的关系模型中。 但是,早期的数据模型可能不适合关系模型。 常规方法是将信息存储为XML,然后将XML文档存储在文件系统中。 有时在更糟的情况下,物理磁盘可能会对XML文档的读/写操作产生问题。 因此,它提供了更多的数据可用性,可以将XML文件存储在XML数据类型列中的数据库中。

The XML data type was introduced with SQL Server 2005 and it was dominant in the enterprise products. Before SQL Server 2005, the XML document was stored inside the VARCHAR or TEXT information type. However, it raised XML validation issues. Indeed, even with an XML data type, we confronted numerous performance issues. For example, querying with an XML tag, performing data manipulations within a document or other documents in the same table or cross table, or making updates resulted in cumbersome XML. The main problem was the cost of parsing the fragment each time it was accessed.

XML数据类型是在SQL Server 2005中引入的,在企业产品中占主导地位。 在SQL Server 2005之前,XML文档存储在VARCHAR或TEXT信息类型内。 但是,它提出了XML验证问题。 确实,即使使用XML数据类型,我们也遇到了许多性能问题。 例如,使用XML标记进行查询,在同一表或交叉表中的一个文档或其他文档中执行数据操作,或者进行更新都会导致XML繁琐。 主要问题是每次访问片段时都要解析的开销。

XML has many different use cases in SQL Server. Without XML, a developer would use the VARCHAR or TEXT data types on a table column to store the XML document. As an industry-standard, we need to store unstructured data in a database, where the end-user chooses the structure of information. Unfortunately, this is a tedious way to manage and apply logic or metrics to unstructured information.

XML在SQL Server中有许多不同的用例。 如果没有XML,开发人员将在表列上使用VARCHAR或TEXT数据类型来存储XML文档。 作为行业标准,我们需要将非结构化数据存储在数据库中,最终用户可以在其中选择信息结构。 不幸的是,这是管理逻辑或度量并将其应用于非结构化信息的乏味方法。

XML data type usage

These cases increase the data processing time on the back-end platform. As we can see in scenario 1 above, the static pieces of information are being stored in a database with the normal tabular form. To display that form of data to the user, the back-end application needs to retrieve it from the database and then process it for return to the front-end application. With this method, it will take considerable data processing time and consume application resources to prepare large information sets.

这些情况会增加后端平台上的数据处理时间。 正如我们在上面的方案1中所看到的,静态信息以标准表格形式存储在数据库中。 为了向用户显示该形式的数据,后端应用程序需要从数据库中检索它,然后对其进行处理以返回到前端应用程序。 使用这种方法,将花费大量的数据处理时间并消耗应用程序资源来准备大型信息集。

In scenario 2, we show storing the same kind of details in XML format in the database. In this scenario, we do not process the information for return to the front-end application.

在方案2中,我们显示了以XML格式在数据库中存储相同类型的详细信息。 在这种情况下,我们不处理返回前端应用程序的信息。

Today’s platforms, such as those for e-commerce, dynamic site builders, audit and others, are pushing user-defined forms of data to the database. Nowadays, we also have the NoSQL database engines which can store JSON based documents; those were not available earlier. In SQL Server, the XML and JSON documents may have little overhead with the taking up more space compared to other data typed data considering repetitive Element and Attribute names in the XML / JSON data. Even, XML data may not be quickly processable as compare to the regular SQL statements. Because it doesn’t follow a SQL Server standard to store the data into the table. However, we can fetch and manipulate the data from the XML with the business logic using the XPATH method with XQuery.

当今的平台,例如用于电子商务,动态站点构建器,审计和其他平台,正在将用户定义的数据形式推入数据库。 如今,我们还有NoSQL数据库引擎,可以存储基于JSON的文档。 那些以前没有。 在SQL Server中,考虑到XML / JSON数据中的重复元素和属性名称,与其他数据类型的数据相比,XML和JSON文档占用的空间可能很少。 甚至,与常规SQL语句相比,XML数据也可能无法快速处理。 因为它不遵循SQL Server标准将数据存储到表中。 但是,我们可以使用带有XQuery的XPATH方法和业务逻辑来从XML中获取和操作数据。

在SQL Server数据库中存储非结构化数据的最佳方法是什么? (What is the best way to store unstructured data in the SQL Server database?)

Most of the user-defined forms related platforms stores the XML template into the database with XML data type and reference to that template user information will fill-up in the XML document. A template will be the blank structure to the user for filling the details and filled XML will be stored into the database as a user response.

与用户定义的表单相关的大多数平台都将XML模板以XML数据类型存储到数据库中,对该模板用户信息的引用将在XML文档中填写。 模板将是用户的空白结构,用于填充详细信息,并且已填充的XML将作为用户响应存储在数据库中。

XML Template:

XML模板:

<employee>
    <firstname type="textbox" />
    <lastname type="textbox" />
    <email type="textbox" />
    <title type="textbox" />
    <division_list type="dropdown">
      <division isselected="1">Network</division>
      <division isselected="0">Database</division>
      <division isselected="0">Development</division>
    </division_list>
    <Employee_type_list type="dropdown">
      <Employee_type isselected="1">Permenant</Employee_type>
      <Employee_type isselected="0">Contract</Employee_type>
      <Employee_type isselected="0">Internship</Employee_type>
    </Employee_type_list>
    <address type="textbox" />
    <Age type="textbox" />
</employee>

Here, the same XML has been filled by the end user’s input. A user can update the same XML again.

在这里,最终用户的输入已填写了相同的XML。 用户可以再次更新相同的XML。

XML with user Information:

带有用户信息的XML:

<employee>
    <firstname type="textbox">Jimmy</firstname>
    <lastname type="textbox">Bischoff</lastname>
    <email type="textbox">jimmy0@adventure-works.com</email>
    <title type="textbox">Database Engineer</title>
    <division_list type="dropdown">
      <division isselected="0">Network</division>
      <division isselected="0">Database</division>
      <division isselected="1">Development</division>
    </division_list>
    <Employee_type_list type="dropdown">
      <Employee_type isselected="0">Permanent</Employee_type>
      <Employee_type isselected="1">Contract</Employee_type>
      <Employee_type isselected="0">Internship</Employee_type>
    </Employee_type_list>
    <address type="textbox">51, United States</room>
    <Age type="textbox">27</Age>
</employee>

This is considered one of the best approaches to using the XML data type in SQL Server; however, each approach has its own limitations, advantages and disadvantages:

这被认为是在SQL Server中使用XML数据类型的最佳方法之一。 但是,每种方法都有其自身的局限性,优点和缺点:

SQL Server中XML数据类型的优点: (Advantages of XML data type in SQL Server:)

  • XML stores unstructured data in a relational database model

    XML将非结构化数据存储在关系数据库模型中
  • XML itself validates the information inside the file with tags and its values using DTD (Document Type Definition) and schema

    XML本身使用DTD(文档类型定义)和模式使用标签及其值来验证文件内的信息
  • XML data type stores independent data structure, hence it can be integrated or used in other database sources easily

    XML数据类型存储独立的数据结构,因此可以轻松集成或在其他数据库源中使用
  • In a way, the XML data type reduces the back-end application burden as the XML can be easily used with a UI

    在某种程度上,XML数据类型减轻了后端应用程序的负担,因为XML可以轻松地与UI一起使用
  • The XML data type can be used with the input parameter in a function or stored procedure

    XML数据类型可以与函数或存储过程中的输入参数一起使用

SQL Server中XML数据类型的缺点 (Disadvantages of XML data type in SQL Server)

  • The XML data type consumes more space when compared with the relational (row/column) format in SQL Server data pages because the XML format is redundant with the XML tag-value set

    与SQL Server数据页中的关系(行/列)格式相比,XML数据类型占用更多空间,因为XML格式对于XML标记值集是多余的
  • The query execution time and data manipulation process could take longer compared to a normalized structure

    与规范化结构相比,查询执行时间和数据处理过程可能需要更长的时间
  • A large XML document will use more server resources like CPU, memory, IO and few others in SQL Server

    大型XML文档将使用更多服务器资源,例如CPU,内存,IO和SQL Server中的其他资源
  • XML query structure is complex

    XML查询结构很复杂

XML数据类型的局限性 (Limitation of XML data type)

  • Primary, unique or foreign constraints cannot be applied on an XML-based column

    主约束,唯一约束或外部约束不能应用于基于XML的列
  • The XML data type cannot be used in remote query execution; it needs to be cast or converted with VARCHAR in a query

    XML数据类型不能用于远程查询执行。 它需要在查询中使用VARCHAR进行转换或转换
  • Index performance cannot be achieved on the key of XML with a pre-defined XML path

    具有预定义的XML路径的XML键无法实现索引性能
  • To perform any action (SELECT, INSERT, UPDATE or DELETE) on the XML column of the table, a user needs to follow the XML query standard

    要对表的XML列执行任何操作(SELECT,INSERT,UPDATE或DELETE),用户需要遵循XML查询标准

SQL Server中的XML声明 (XML declaration in SQL Server)

XML is a SQL Server system data type. The variable and column of the table can be defined with the XML column in SQL Server. The current SQL Server databases are native XML integrated and allow the user to store big unstructured data into the XML format and index over the XML data type in order to enhance the Query performance:

XML是SQL Server系统数据类型。 可以使用SQL Server中的XML列定义表的变量和列。 当前SQL Server数据库是本机XML集成的,并允许用户将大的非结构化数据存储为XML格式并在XML数据类型上建立索引,以增强查询性能:

DECLARE @var_XML XML = ''
SET @var_XML = '<employee>
  <firstname type="textbox">Jimmy</firstname>
  <lastname type="textbox">Bischoff</lastname>
  <email type="textbox">jimmy0@adventure-works.com</email>
  <title type="textbox">Database Engineer</title>
</employee>'
    
SELECT @var_XML as employee_xml

XML Result

The XML data type itself validates the input values while assigning value to the XML column or variable. If something is wrong with the input, the query will return an error. For example, I removed one closing tag (</title>) in the above XML, while assigning the value to the variable. It will return an error “end tag does not match start tag” while assigning the value to the variable:

XML数据类型本身会在将值分配给XML列或变量时验证输入值。 如果输入有问题,查询将返回错误。 例如,在将值分配给变量的同时,我删除了上述XML中的一个结束标记(</ title>)。 将值分配给变量时,它将返回错误“结束标记与起始标记不匹配”:

DECLARE @var_XML XML = ''
SET @var_XML = '<employee>
  <firstname type="textbox">Jimmy</firstname>
  <lastname type="textbox">Bischoff</lastname>
  <email type="textbox">jimmy0@adventure-works.com</email>
  <title type="textbox">Database Engineer
</employee>'
    
SELECT @var_XML as employee_xml

Error message:

错误信息:

XML Parsing Error

As explained in the above sample data, the way of storing the data into the XML is in plain text format. XML is the easiest way to exchange data between multiple different systems or applications.

如以上示例数据中所述,将数据存储到XML中的方式为纯文本格式。 XML是在多个不同系统或应用程序之间交换数据的最简单方法。

结论 (Conclusion)

XML is quite useful for a hierarchical data structure, which could be very difficult to manage in an RDBMS. For unstructured data, it’s an out-of-the-box process to extract it with XML if you are using SQL Server. Here, we explained the basics of XML and the role of the XML data type in SQL Server. We will have more detail about the SELECT, INSERT, UPDATE and DELETE functions in upcoming XQUERY articles.

XML对于分层数据结构非常有用,这在RDBMS中可能很难管理。 对于非结构化数据,如果您使用的是SQL Server,则是使用XML提取数据的现成过程。 在这里,我们解释了XML的基础知识以及XML数据类型在SQL Server中的作用。 在即将到来的XQUERY文章中,我们将更详细地介绍SELECT,INSERT,UPDATE和DELETE函数。

翻译自: https://www.sqlshack.com/introduction-to-xml-data-types-in-sql-server/

  • 0
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值