XML

In its simplest form, XML is nothing more than a text file containing a single well-formed XML document. Come to think of it, the same is pretty much true in its most complex form as well. Looking past all the hype surrounding XML, it is easy to see that XML is merely the text representation of self-describing data in a tree data structure. When you understand this, all that is left are the nitty-gritty little details, as in "What's a tree data structure?" and "How exactly does data describe itself?"


This chapter covers the following topics:
  • Elements:
    A Well-Formed XML Document
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <one>
          <two>
                <three>
                      <four/>
                </three>
          </two>
    </one>
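
"Well-formed" is something you can check mechanically. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the two sample strings are my own illustrations, not part of the XML specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for overlapping tags, missing end tags, multiple roots, etc.
        return False


# The nested document from the listing above, written on one line.
good = "<one><two><three><four/></three></two></one>"
# Tags that overlap instead of nesting are not well formed.
bad = "<one><two></one></two>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A parser that rejects the second string is only telling you that the tags do not nest; it says nothing about whether the data makes sense.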
    
    
    

    An XML Document with Text Data
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <library>
          <book>
                <series>The Lord of the Rings</series>
                <title>The Fellowship of the Ring</title>
                <author>J.R.R. Tolkien</author>
          </book>
          <book>
                <series>The Lord of the Rings</series>
                <title>The Two Towers</title>
                <author>J.R.R. Tolkien</author>
          </book>
          <book>
                <series>The Lord of the Rings</series>
                <title>The Return of the King</title>
                <author>J.R.R. Tolkien</author>
          </book>
    </library>
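
To make the "tree data structure" idea concrete, the sketch below parses a shortened copy of the library document and walks it depth-first. It assumes Python's standard xml.etree.ElementTree module; the names LIBRARY and outline are hypothetical helpers I made up for the illustration.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the library document (two books instead of three).
LIBRARY = """<library>
  <book>
    <series>The Lord of the Rings</series>
    <title>The Fellowship of the Ring</title>
    <author>J.R.R. Tolkien</author>
  </book>
  <book>
    <series>The Lord of the Rings</series>
    <title>The Two Towers</title>
    <author>J.R.R. Tolkien</author>
  </book>
</library>"""


def outline(element, depth=0):
    """Yield one indented line per element, visiting the tree depth-first."""
    text = (element.text or "").strip()
    yield "  " * depth + element.tag + (": " + text if text else "")
    for child in element:
        yield from outline(child, depth + 1)


root = ET.fromstring(LIBRARY)

# Each <book> is a child node of <library>; each <title> is a child of a <book>.
titles = [book.findtext("title") for book in root.findall("book")]

print("\n".join(outline(root)))
print(titles)
```

The element names are what make the data "self-describing": the code asks for a title by name rather than by position.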
    




  • Attributes:
    An attribute is a name-value pair contained in an element's start tag.
    An XML Document with Attributes
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <library>
          <book series="The Lord of the Rings" title="The Fellowship of the Ring"
                author="J.R.R. Tolkien"/>
          <book series="The Lord of the Rings" title="The Two Towers"
                author="J.R.R. Tolkien"/>
          <book series="The Lord of the Rings" title="The Return of the King"
                author="J.R.R. Tolkien"/>
    </library>
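
Reading attributes back out looks a little different from reading child elements. The following is a rough sketch with Python's standard xml.etree.ElementTree module; the isbn attribute is hypothetical and deliberately absent, just to show what a missing attribute looks like.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the attribute-based library document.
LIBRARY = """<library>
  <book series="The Lord of the Rings" title="The Fellowship of the Ring" author="J.R.R. Tolkien"/>
  <book series="The Lord of the Rings" title="The Two Towers" author="J.R.R. Tolkien"/>
</library>"""

root = ET.fromstring(LIBRARY)
first = root.find("book")

# Attributes arrive as a plain dictionary on the element.
print(first.attrib)

# get() returns the value, or a fallback when the attribute is missing.
print(first.get("title"))
print(first.get("isbn", "unknown"))  # "isbn" is not in the document

# The element itself is empty: no child elements and no text.
print(len(first), first.text)
```

The same three facts about each book are still there; they have simply moved from child elements into the start tag.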
    


  • Handling Verboten Characters:
    Entities
    Character   Entity    Description
    <           &lt;      Less than
    >           &gt;      Greater than
    '           &apos;    Apostrophe/single quote
    "           &quot;    Double quote
    &           &amp;     Ampersand
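
In practice you rarely type these entities by hand. As a small sketch, Python's standard xml.sax.saxutils module can do the substitution for you; escape() handles &, < and > on its own, and the QUOTES mapping below is my own addition to cover the two quote characters. The sample string is made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = 'Tom & Jerry <"cat" vs. \'mouse\'>'

# escape() always replaces &, < and >; the extra mapping adds the two quotes.
QUOTES = {'"': "&quot;", "'": "&apos;"}
escaped = escape(raw, QUOTES)
print(escaped)

# unescape() reverses the process when given the inverse mapping.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)

# An XML parser resolves the five predefined entities by itself.
parsed = ET.fromstring("<note>" + escaped + "</note>").text
print(parsed == raw)
```

Note that the ampersand has to be escaped first (and unescaped last), otherwise the & in an entity you just produced would be escaped a second time.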




  • XML declarations:
    Before proceeding any further, I want to explain a little about the stuff between the <? and the ?>. It is called the XML declaration, a piece of metadata that, when present, must appear at the very beginning of an XML document. Its purpose is to specify the version of XML, the character encoding, and whether the document depends on external markup declarations.
    Determining whether the XML document depends on external markup declarations (standalone="no") or not (standalone="yes") comes down to three rules. A document must be declared standalone="no" if external declarations supply default values for attributes, define entities other than the five predefined ones that the document uses, or declare elements or attributes that are subject to whitespace normalization.
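
If you want to see what a parser makes of the declaration, Python's standard xml.parsers.expat module reports it through its XmlDeclHandler callback. The helper read_declaration below is a hypothetical name of my own, and this is only a sketch of one way to peek at the three values.

```python
import xml.parsers.expat


def read_declaration(document: bytes):
    """Return (version, encoding, standalone) from the XML declaration, or None."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once, when it parses the XML declaration.
    parser.XmlDeclHandler = lambda version, encoding, standalone: found.append(
        (version, encoding, standalone)
    )
    parser.Parse(document, True)
    return found[0] if found else None


doc = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><one/>'
version, encoding, standalone = read_declaration(doc)

# expat reports standalone as 1 for "yes", 0 for "no", and -1 when it is omitted.
print(version, encoding, standalone)
```

A document with no declaration at all never triggers the handler, which is why the helper returns None in that case.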

