XML and the Document Object Model

XML, short for eXtensible Markup Language, was one of the first formats used to store and transport data on the web without the use of a database. The language is software independent, so it can be used with a wide variety of programming languages. As a result, it is often used as a medium to transport data from server-side applications built in languages like Python and Java to public-facing websites.

What is XML

XML is a markup language designed for storing and transporting data. It is also easy to read: you should be able to look at an XML document and grasp the meaning of its contents quickly and intuitively.

The readability of XML is due to the fact that XML has no predefined tags. The author of an XML document is free to dream up their own tags based on the specific meaning of the data contained in that file. Let's look at an example:


<?xml version="1.0" encoding="UTF-8"?>
<pets>
  <pet>
    <name>Max</name>
    <type>Dog</type>
    <birthday>July 7, 2014</birthday>
  </pet>
  <pet>
    <name>Lucy</name>
    <type>Cat</type>
    <birthday>October 12, 2008</birthday>
  </pet>
  <pet>
    <name>Oscar</name>
    <type>Giant Tortoise</type>
    <birthday>Unknown, estimated 1916</birthday>
  </pet>
</pets>

You could look at that document and tell pretty quickly that it is a list of pets that includes the name, type, and date of birth (if known) of each pet. A well-written XML document is easy to understand even if you've never written a line of code.
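Because XML is plain text, any language with an XML parser can read the same document. Here is a minimal sketch, assuming Python's standard-library xml.dom.minidom module and the pets document held in a string. The names PETS_XML and list_pets are ours, chosen for illustration.

```python
from xml.dom import minidom

# The pets document from above, stored in a string for this sketch.
PETS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<pets>
  <pet>
    <name>Max</name>
    <type>Dog</type>
    <birthday>July 7, 2014</birthday>
  </pet>
  <pet>
    <name>Lucy</name>
    <type>Cat</type>
    <birthday>October 12, 2008</birthday>
  </pet>
  <pet>
    <name>Oscar</name>
    <type>Giant Tortoise</type>
    <birthday>Unknown, estimated 1916</birthday>
  </pet>
</pets>"""


def list_pets(xml_text):
    """Return a (name, type, birthday) tuple for each <pet> element."""
    document = minidom.parseString(xml_text)
    pets = []
    for pet in document.getElementsByTagName("pet"):
        fields = []
        for tag in ("name", "type", "birthday"):
            element = pet.getElementsByTagName(tag)[0]
            # The text lives in a text node inside the element.
            fields.append(element.firstChild.nodeValue)
        pets.append(tuple(fields))
    return pets


for name, kind, birthday in list_pets(PETS_XML):
    print(f"{name} is a {kind} (birthday: {birthday})")
```

The tag names the author invented (pet, name, type, birthday) are exactly what the program uses to find the data.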

XML and HTML

The first time you hear of XML, you might think of XML as an alternative to Hypertext Markup Language (HTML). While XML can be used in that way, that's pretty unusual. In most cases, XML isn't used as an alternative to HTML but as a data storage format for pulling data into an HTML document.

Let's look at a conceptual example.


<!DOCTYPE html>
<html>
<div id="data"></div>
<script>
  function getXMLData() {
    /* insert JavaScript function to get data from an XML file */
  }
  document.getElementById("data").innerHTML = getXMLData();
</script>
</html>

Alright, so that code doesn't actually do anything, but we can use it to explain how HTML and XML work together. In the code above, the HTML defines an empty div that will serve as a container for data from an XML file. Then, a JavaScript function is defined. The function is empty, but in a practical application it would locate an XML file, pull data out of the file, and wrap that data in HTML tags so that the browser renders it properly.

With a properly written function in place, when this bit of HTML was loaded the data div would not be empty, but instead would contain the contents defined by the JavaScript function.

Are you starting to see the power of XML? With this arrangement, the data displayed on a webpage can be updated dynamically by updating the referenced XML file, much in the same way a database can be used to update the contents of a webpage.
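To make the idea concrete, here is a minimal sketch of the work such a function performs: parse the XML, pull the data out, and wrap it in HTML tags. It is written in Python with the standard-library xml.dom.minidom module rather than as browser JavaScript, so that it runs anywhere. The pets_to_html and text_of names, the shortened XML, and the choice of an unordered list are all assumptions made for illustration.

```python
from html import escape
from xml.dom import minidom

# A shortened pets document, inlined so the sketch is self-contained.
PETS_XML = """<pets>
  <pet><name>Max</name><type>Dog</type></pet>
  <pet><name>Lucy</name><type>Cat</type></pet>
</pets>"""


def text_of(parent, tag):
    """Return the text inside the first <tag> element under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.nodeValue


def pets_to_html(xml_text):
    """Build an HTML list with one <li> per <pet> element."""
    document = minidom.parseString(xml_text)
    items = []
    for pet in document.getElementsByTagName("pet"):
        # Escape the text so characters like < or & cannot break the HTML.
        name = escape(text_of(pet, "name"))
        kind = escape(text_of(pet, "type"))
        items.append(f"<li>{name} ({kind})</li>")
    return "<ul>" + "".join(items) + "</ul>"


print(pets_to_html(PETS_XML))
# <ul><li>Max (Dog)</li><li>Lucy (Cat)</li></ul>
```

The returned string plays the role of the value that the HTML example assigns to the data div. Change the XML and the generated markup changes with it.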

What is the Document Object Model

The Document Object Model (DOM) is the programming interface used to manipulate HTML and XML documents. When you use JavaScript, or another scripting language, to manipulate an element on a webpage, what you're actually manipulating is the DOM, not the HTML document itself.

The DOM is the virtual layer between the source documents used to build a webpage and the scripting that modifies that webpage. Think of the DOM as the browser's in-memory version of the page: a dynamic, tree-structured representation that scripting, most commonly JavaScript, can access and modify.

Conceptualizing the XML DOM

The contents of the XML DOM can be manipulated with scripting. However, we have to understand the relationships between XML DOM elements, called nodes, before we can do anything with them.

Let's look at a simplified version of our earlier XML example code:


<?xml version="1.0" encoding="UTF-8"?>
<pets>
  <pet>
    <name>Max</name>
    <type>Dog</type>
    <birthday>July 7, 2014</birthday>
  </pet>
</pets>

The XML DOM is built of nodes. Every part of the XML DOM is a node.

  • Document node: The document node represents the entire XML document and sits at the top of the node tree.
  • Root node: The outermost element in an XML document, the one that contains all the others, is called the root node (or root element). In this case, the root node is <pets>.
  • Parent and child nodes: The terms parent and child are used to describe the relationship between DOM elements and the elements nested within them. In our sample code, the <pets> element node is the parent of the <pet> node, and the <pet> node has three child elements: name, type, and birthday. Every element node has exactly one parent node (the root node's parent is the document node) and may have any number of child nodes.
  • Sibling nodes: When two nodes are both the children of the same parent they are referred to as sibling nodes. In our example, name, type, and birthday are sibling nodes.
  • Text node: The text contained within an element is defined as a text node within the XML DOM. This is an important distinction. If we want to get at the text, we need to refer to the value of the text node, not the value of the element node that contains it. In other words, the path to the text "Max" looks like this: pets > pet > name > text node > value:"Max"
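The path to "Max" can be walked step by step. This is a minimal sketch, assuming Python's standard-library xml.dom.minidom module as a stand-in for the DOM a browser would build. The variable names are ours.

```python
from xml.dom import minidom

XML = """<?xml version="1.0" encoding="UTF-8"?>
<pets>
  <pet>
    <name>Max</name>
    <type>Dog</type>
    <birthday>July 7, 2014</birthday>
  </pet>
</pets>"""

document = minidom.parseString(XML)           # document node
root = document.documentElement               # root node: <pets>
pet = root.getElementsByTagName("pet")[0]     # child of <pets>
name = pet.getElementsByTagName("name")[0]    # child of <pet>
text_node = name.firstChild                   # text node inside <name>

print(root.tagName)                 # pets
print(root.parentNode is document)  # True: the root's parent is the document node
print(name.parentNode is pet)       # True
print(name.nodeValue)               # None: an element node has no value of its own
print(text_node.nodeValue)          # Max
```

The last two lines show the distinction from the list above: the element node called name carries no value, while the text node inside it does.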

Manipulating the XML DOM

In general, JavaScript is used to manipulate the XML DOM. It can retrieve a variety of properties from the nodes in the XML DOM. Commonly accessed XML DOM properties include:

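  • nodeValue: Gets the value of the node. For a text node this is the text itself; for an element node it is null, which is why an element's text has to be read from its text node.
  • parentNode: References the parent node. If we were to apply this property to the name node in our sample XML, we would be referring to the pet node.
  • childNodes: References a node's children. If applied to the pet node in our code above, this property would return the name, type, and birthday nodes, along with a text node for each run of whitespace between them.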
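These properties can be tried outside the browser as well. Here is a minimal sketch, assuming Python's standard-library xml.dom.minidom module, which exposes the same property names. It also shows the whitespace text nodes that appear in childNodes when the XML is indented. The variable names are ours.

```python
from xml.dom import minidom

XML = """<pets>
  <pet>
    <name>Max</name>
    <type>Dog</type>
    <birthday>July 7, 2014</birthday>
  </pet>
</pets>"""

document = minidom.parseString(XML)
pet = document.getElementsByTagName("pet")[0]

# childNodes holds every child, including whitespace-only text nodes.
all_children = pet.childNodes
# Keep only the element nodes to get name, type, and birthday.
element_children = [n for n in all_children if n.nodeType == n.ELEMENT_NODE]

print(len(all_children))                          # 7: 3 elements + 4 whitespace text nodes
print([n.tagName for n in element_children])      # ['name', 'type', 'birthday']
print(element_children[0].parentNode.tagName)     # pet
print(element_children[0].firstChild.nodeValue)   # Max
```

Filtering on nodeType is one common way to skip the whitespace nodes. Whether you need to depends on how the XML file is formatted.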

JavaScript can be used to do more than just reference the properties of XML DOM nodes. Here are some of the most common JavaScript methods used to actively manipulate the XML DOM.

  • getElementsByTagName: You might recognize this method if you've ever used JavaScript to manipulate HTML elements. Pass in the name of any XML DOM element, such as "pet" or "name" from our example XML code, to access every element with that name.
  • appendChild: This method is used to add child nodes to a node.
  • removeChild: Removes a node from its parent node. Keep in mind that the data remains in the original XML file; it is only removed from the DOM built by the browser.
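The three methods can be seen working together in a short sketch. It assumes Python's standard-library xml.dom.minidom module, which provides the same method names. createElement and createTextNode are two further standard DOM methods, not covered above, that are needed to build the node being appended. toxml is specific to minidom and is used here only to print the result.

```python
from xml.dom import minidom

# The source XML, kept in a string so we can compare it afterwards.
XML = "<pets><pet><name>Max</name><type>Dog</type></pet></pets>"

document = minidom.parseString(XML)

# getElementsByTagName: find every <pet> element and take the first one.
pet = document.getElementsByTagName("pet")[0]

# appendChild: build a <birthday> element with a text node and attach it.
birthday = document.createElement("birthday")
birthday.appendChild(document.createTextNode("July 7, 2014"))
pet.appendChild(birthday)

# removeChild: detach the <type> element from its parent.
type_node = pet.getElementsByTagName("type")[0]
pet.removeChild(type_node)

print(pet.toxml())  # <pet><name>Max</name><birthday>July 7, 2014</birthday></pet>
print(XML)          # the source text is unchanged
```

Only the in-memory tree changes. The original XML text still contains the type element and has no birthday, just as an XML file on disk is unaffected by changes to the DOM.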

There are many additional XML DOM methods and properties. However, to get much further with this topic you need a strong grasp of JavaScript and XML, and a clear idea of how you plan to use XML data.

Conclusion

XML is a powerful and simple language for transporting data in a format that can be used in many different ways. The XML DOM is the model built by the browser to interact with and manipulate XML data. Once you understand how to work with the XML DOM you'll be able to get, change, and style XML data for use in webpages and applications.

