XML should be used only where it is really required. Using XML where
relational data would
best be suited is not a good idea. Similarly,
using a relational model where XML might run
better won’t produce the desired
results.
XML is good for storing semi-structured or unstructured
data. XML is a better choice if the
physical order of values is
significant and the data represents a hierarchy. If the values are
valid
XML documents and need to be queried, storing them on an XML column will be
a
better choice over VARCHAR, NVARCHAR, or VARBINARY columns.
If the
structure of the XML documents is defined, using typed XML columns will be a
better
choice. Typed XML columns provide better metadata information and
allow SQL Server to
optimize queries running over typed XML columns.
Furthermore, typed XML provides storage
optimization and static type
checking.
Creating a primary XML index and a secondary XML index (or
more, depending on the workload)
might help improve XQuery performance. An
XML primary index usually uses up to
three times the storage space than the
data in the base table. This indicates that, just like SQL
indexes, XML
indexes also should be used wisely. Keep in mind that a full-text index can
be
created on an XML column. A wise combination of a full-text index with XML
indexes might
be a better choice in many situations.
Creating property
tables to promote multi-valued properties from the XML column may be
a good
idea in many cases. One or more property tables may be created from the
data in an
XML column, and these tables can be indexed to improve
performance further.
Two common mistakes that add a lot of overhead to
XQuery processing are usage of wildcards
in the path expression and
using a parent node accessor to read information from
upper-level
nodes.
Using specific markups instead of generic markups
will enhance performance significantly.
Generic markups do not perform well
and do not allow XML index lookups to be
done
efficiently.
Attribute-centric markup is a better choice than
element-centric markup. Processing information
from attributes is much more
efficient than processing information from elements.
Attribute-centric
markups take less storage space than element-centric markups, and
the
evaluation of predicates is more efficient because the attribute’s value
is stored in the same row
as its markup in the primary XML index.
An
in-place update of the XML data type gives better performance in most
cases. If the update
operation requires modifying the value of one or
more elements or attributes, it is a better
practice to modify those elements
and attributes using XML DML functions, rather than
replace the whole
document.
Using the exist() method to check for the existence of a value
is much more efficient than
using the value() method. Parameterizing XQuery
and XML DML expressions is much more
efficient than executing dynamic SQL
statements.