Embedding XML

XML Attributes and Elements

Like a TermL term, an XML Element contains a sequence of child Elements, so the two concepts map directly. An XML Element also carries a set (not a sequence) of name-value pairs, its attributes. An example valid in both HTML and XHTML (HTML4 encoded in XML):

<font size="3" color="#FF0000">foo<b>bar</b>baz</font>

This can be written in TermL as follows:

font({size: "3", color: "#FF0000"},
     "foo", b("bar"), "baz")

The curly brackets indicate that the order of the terms inside them is not significant. The ‘:’ shorthand is used for attributes to emphasize that an attribute has no self-contained meaning; it encodes a property of the enclosing Element.

The above is syntactic shorthand for:

font(.bag.(.attr.(size("3")), .attr.(color("#FF0000"))),
     "foo", b("bar"), "baz")
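The expanded form is straightforward to process mechanically. As a minimal sketch in Python, a reader could treat a leading .bag. argument as the attribute set and every remaining argument as a child. The Term class and the split_element function are hypothetical names chosen for illustration; they are not part of TermL.

```python
from dataclasses import dataclass


@dataclass
class Term:
    """A term: a functor (tag) applied to a sequence of arguments."""
    functor: str
    args: tuple = ()


def split_element(term):
    """Return (attributes, children) for a term read as an XML Element."""
    attrs = {}
    children = term.args
    first = children[0] if children else None
    if isinstance(first, Term) and first.functor == ".bag.":
        for attr in first.args:
            if not isinstance(attr, Term) or attr.functor != ".attr.":
                raise ValueError("expected .attr.(name(value)) inside .bag.")
            (pair,) = attr.args      # the single name(value) term
            (value,) = pair.args     # the attribute value
            attrs[pair.functor] = value
        children = children[1:]      # everything after the bag is a child
    return attrs, children


# The expanded font(...) term from above, built by hand.
font = Term("font", (
    Term(".bag.", (Term(".attr.", (Term("size", ("3",)),)),
                   Term(".attr.", (Term("color", ("#FF0000",)),)))),
    "foo", Term("b", ("bar",)), "baz"))

attrs, children = split_element(font)
```

Here attrs holds the two attributes as a dictionary, and children holds "foo", the b term, and "baz" in their original order.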

Since the .bag. functor cannot correspond to any XML-QName, an XML-oriented interpreter of this term-tree knows that this first argument is to be interpreted as holding attributes. If an Element has no attributes, this optional first argument may be omitted. If an Element has no children, such as the following XHTML,

<img alt="ELM architecture" src="eLanguageMachine.gif" />

then the parentheses which would enclose only the attributes can be left out:

img{alt: "ELM architecture", src: "eLanguageMachine.gif"}

which is shorthand for

img(.bag.(.attr.(alt("ELM architecture")), .attr.(src("eLanguageMachine.gif"))))
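The mapping in the other direction, from XML to the shorthand notation, can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under stated assumptions, not a definitive translator: the function names to_terml and quote are hypothetical, the string-escaping rule is a guess, whitespace-only text is dropped, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET


def quote(text):
    # Assumed string-literal rule: escape backslashes and double quotes only.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_terml(elem):
    """Render an ElementTree element in the shorthand notation shown above."""
    attrs = ", ".join(f"{name}: {quote(value)}"
                      for name, value in elem.attrib.items())
    bag = "{" + attrs + "}" if attrs else ""
    children = []
    if elem.text and elem.text.strip():
        children.append(quote(elem.text))
    for child in elem:
        children.append(to_terml(child))
        # Text following a child element is stored as that child's tail.
        if child.tail and child.tail.strip():
            children.append(quote(child.tail))
    if not children:
        # No children: the parentheses around the attribute bag are dropped.
        return elem.tag + bag
    args = ([bag] if bag else []) + children
    return f"{elem.tag}({', '.join(args)})"


font = ET.fromstring('<font size="3" color="#FF0000">foo<b>bar</b>baz</font>')
img = ET.fromstring('<img alt="ELM architecture" src="eLanguageMachine.gif" />')
print(to_terml(font))
print(to_terml(img))
```

The two print calls reproduce the font and img terms given earlier, each on a single line.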

XML Namespaces

A tag can be used to represent both unresolved and resolved XML-QNames. An unresolved XML-QName is written as it is in XML – as a Prefix, a separator, and a LocalPart. However, in TermL the separator is ‘::’ instead of ‘:’. As explained by XML Namespaces, the semantics of an unresolved XML-QName is represented by the corresponding resolved XML-QName. An XML-QName is resolved by substituting for the Prefix the URI which is the namespace name found by evaluating this Prefix in the current namespace scope. An example from XML-NS:

<?xml version="1.0"?>
<!-- both namespace prefixes are available throughout -->
<bk:book xmlns:bk='urn:loc.gov:books'
         xmlns:isbn='urn:ISBN:0-395-36341-6'>
    <bk:title>Cheaper by the Dozen</bk:title>
    <isbn:number>1568491379</isbn:number>
</bk:book>

Written manually in TermL without resolving XML-QNames:

xml(.pi.("version", "1.0"),
    .comment.("both namespace prefixes are available throughout"),
    .letns.({bk: <urn:loc.gov:books>,
             isbn: <urn:ISBN:0-395-36341-6>},
            bk::book(bk::title: "Cheaper by the Dozen",
                     isbn::number: 1568491379)))

As seen above, we follow SXML’s example and represent additional XML node types, like processing instructions or comments, by using special tags as functors, like .pi. or .comment.

An automated translation would need some kind of schema or type information in order to translate the isbn number to 1568491379 rather than “1568491379”.
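As a rough illustration of the kind of type information meant here, a translator could consult a table keyed by resolved name and fall back to a string when nothing is declared. The table TYPES and the function typed_value are hypothetical; no such schema mechanism is defined in this document.

```python
# Hypothetical type table: (namespace URI, local name) -> converter.
TYPES = {
    ("urn:ISBN:0-395-36341-6", "number"): int,
}


def typed_value(namespace, local, text, types=TYPES):
    """Convert element text using declared type information, if any."""
    convert = types.get((namespace, local), str)  # default: keep as a string
    return convert(text)


isbn = typed_value("urn:ISBN:0-395-36341-6", "number", "1568491379")
title = typed_value("urn:loc.gov:books", "title", "Cheaper by the Dozen")
```

With this table, isbn becomes the integer 1568491379, while title stays a string because no type is declared for it.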

Despite the syntax of XML, modern XML infosets (XPath and XML-Infoset) do not treat namespace definitions as attributes. Rather, the set of namespace definitions in scope at an Element is yet another first-class part of the Element, in addition to the Element’s tag, attributes, and child Elements. Namespace definitions are lexically scoped and shadowed in the conventional manner, though the syntax of XML obscures this as well: in the XML example above, the “bk” in the first “bk:book” is resolved in the scope of the “bk” definition that occurs textually to its right. To embed XML in TermL, we emphasize the conventional lexical semantics of these definitions by introducing a separate .letns. construct, similar to the Scheme let.

Transforming the above term-tree through a (yet to be written) .letns. resolver tool, we’d get:

xml(.pi.("version", "1.0"),
    .comment.("both namespace prefixes are available throughout"),
    <urn:loc.gov:books>::book(<urn:loc.gov:books>::title: "Cheaper by the Dozen",
                              <urn:ISBN:0-395-36341-6>::number: 1568491379))

The first form is easier for humans to read and (especially) write. The second is easier for programs to manipulate.

Note that the latter does not retain all the information defined to be significant by modern XML infosets. These require that the Prefix and redundant namespace definitions be retained. Therefore, it is an application-specific decision to employ the resolver tool. Most applications do not care about the information that would be thrown away by such a tool.
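Since the .letns. resolver is described above as yet to be written, the following is only a sketch of how one might behave, assuming a simple in-memory term representation. The classes Term and LetNs and the functions resolve_name and resolve are hypothetical names. Names without a prefix are left untouched, because the handling of the default namespace is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Term:
    functor: str
    args: tuple = ()


@dataclass
class LetNs:
    bindings: dict   # prefix -> namespace URI
    body: object     # the single term the bindings apply to


def resolve_name(name, scope):
    """Replace the Prefix of an unresolved name with its bracketed URI."""
    prefix, sep, local = name.partition("::")
    if not sep or prefix.startswith("<"):
        return name  # no prefix, or already resolved
    if prefix not in scope:
        raise KeyError(f"unbound namespace prefix: {prefix}")
    return f"<{scope[prefix]}>::{local}"


def resolve(node, scope=None):
    """Resolve every prefixed functor and drop the .letns. wrappers."""
    scope = scope or {}
    if isinstance(node, LetNs):
        # Inner definitions shadow outer ones, as with a lexical let.
        return resolve(node.body, {**scope, **node.bindings})
    if isinstance(node, Term):
        return Term(resolve_name(node.functor, scope),
                    tuple(resolve(arg, scope) for arg in node.args))
    return node  # strings and numbers pass through unchanged


book = LetNs(
    {"bk": "urn:loc.gov:books", "isbn": "urn:ISBN:0-395-36341-6"},
    Term("bk::book", (Term("bk::title", ("Cheaper by the Dozen",)),
                      Term("isbn::number", (1568491379,)))))
resolved = resolve(book)
```

As the note above says, the result no longer records which Prefix was used, so a tool like this is suitable only for applications that do not need that information.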

Speculative: With this embedding of XML into TermL, an XML DTD or a Relax-NG Schema can be translated into a TermL Schema (yet to be defined), such that an XML document D1 valid according to Schema S1 would translate to a TermL term-tree D2 valid according to translated Schema S2.

*** We have yet to specify the embedding of the definition and use of the XML default namespace.
