XML Resources and Validators
XML is short for Extensible Markup Language. It is a highly structured markup language that is designed to be both human and machine readable. But XML is not a language in the way that HTML is a language. XML has no predefined tags.
Instead, XML allows the coder to create any tags at all. And, more important, it allows those tags to be related to each other. So XML allows you to store data in a powerful way. But it doesn't provide any information on what ought to be done with that data. That's where XML-based languages come in: things like XHTML, RSS, and SOAP. XML is also a common way for programs like word processors and spreadsheets to save data in an application-independent way.
A Brief History of Markup Languages
Markup languages started as a way to combine the best elements of text files (readability of data) and binary files (precise description of data). So in the mid-1980s, the Standard Generalized Markup Language (SGML) was created. It was a text-based language that allowed data and its display to be precisely described. HTML was a very simple system that was based on SGML.
But when HTML became hugely popular as the basis of the World Wide Web, it became apparent that something better was needed. HTML was limited, and it was written so loosely that browsers had to parse all kinds of sloppy code. For example, closing tags were often omitted and tag attributes were not placed inside quotation marks. Remember code like this?
<ul type=square>
  <li>Bugs Bunny
  <li>Daffy Duck
  <li>Foghorn Leghorn
</ul>
Poorly structured HTML couldn't be replaced with SGML, because SGML is ridiculously complicated. It would have been something like replacing HTML with PostScript. So in the mid-1990s, work began on XML. XML is a subset of SGML that allows coders to describe data and its relationships. And with the use of style sheets, it can be used to format and transmit data in almost any way imaginable. But unlike SGML, writing parsing programs for it is fairly simple. And in early 1998, the W3C released the first XML standard.
Why Use XML?
This may all sound kind of abstract. After all, regardless of how powerful XML is at storing data, how does a web browser display anything but a list of data? But that's the point. The big problem with HTML in the early days was that data and layout information were scattered throughout a document. Remember when any kind of page layout had to be done with tables, making HTML code almost unreadable? Today, we use style sheets to separate the layout code from the information presented. Thus, once the layout is completed, it is a simple matter to maintain and add data.
But XML is not a replacement for HTML. In the most general sense, XML is a kind of human readable database. But it can be turned into an HTML webpage (and a whole lot more!) by using another tool, Extensible Stylesheet Language Transformations (or XSLT). It converts XML documents into other XML documents, for example, XHTML documents. But even more interestingly, XML is used for things like RSS and SOAP.
A Basic Example
Let's start with a very basic example of how data is entered into an XML file.
<?xml version="1.0" ?>
<cartoon_characters>
  <character>
    <name>Bullwinkle</name>
    <intelligence>2</intelligence>
    <luck>10</luck>
  </character>
  <character>
    <name>Boris Badenov</name>
    <intelligence>4</intelligence>
    <luck>0</luck>
  </character>
</cartoon_characters>
Notice that none of these tags are defined by XML. They are defined by the coder. What XML does know (and this is critical) is that each character belongs to cartoon_characters and that each character has the characteristics name, intelligence, and luck. Other characteristics (like species) as well as more characters (like Wrongway Peachfuzz) could be added and it wouldn't affect any XML parser.
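To see how a parser treats this data, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Python itself, the read_characters helper, and the extra species element are our own assumptions for illustration; they are not part of the example above. The point is simply that the parser follows the nesting of the tags, and an added tag doesn't get in its way.

```python
import xml.etree.ElementTree as ET

# The sample data, with a hypothetical <species> element added to the
# first character to show that extra tags do not break parsing.
XML_DATA = """<?xml version="1.0" ?>
<cartoon_characters>
  <character>
    <name>Bullwinkle</name>
    <species>moose</species>
    <intelligence>2</intelligence>
    <luck>10</luck>
  </character>
  <character>
    <name>Boris Badenov</name>
    <intelligence>4</intelligence>
    <luck>0</luck>
  </character>
</cartoon_characters>"""


def read_characters(xml_text):
    """Return a list of (name, intelligence, luck) tuples."""
    root = ET.fromstring(xml_text)
    result = []
    # Each <character> is a direct child of the root <cartoon_characters>.
    for character in root.findall("character"):
        name = character.findtext("name")
        intelligence = int(character.findtext("intelligence"))
        luck = int(character.findtext("luck"))
        result.append((name, intelligence, luck))
    return result


characters = read_characters(XML_DATA)
for name, intelligence, luck in characters:
    print(f"{name}: intelligence={intelligence}, luck={luck}")
```

The code that reads name, intelligence, and luck never looks at species, so it keeps working as the data grows.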
We can take this a step further by creating an XSL transformation file that will create an XHTML file that displays the characters' names in an unordered list. First, we would have to add an extra line of code to the previous XML code, right after the first line that defines the file as XML. It would look like this:
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="bullwinkle.xsl"?>
<cartoon_characters>
. . .
Next, create an XSL file with the name "bullwinkle.xsl":
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <xsl:template match="/">
    <html>
      <head>
        <title>Rocky and Bullwinkle Show</title>
      </head>
      <body>
        <h1>Cartoon Characters</h1>
        <ul>
          <xsl:for-each select="cartoon_characters/character">
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
Then load the original XML file in a web browser, and it will display just like an XHTML file.
You can experiment with these files to get a better idea of what's going on. But what's most important is that you can leave the XSL file alone, while you add more and more data to the XML file.
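If it helps to see the loop in the stylesheet spelled out, here is a rough Python equivalent of its for-each step. To be clear, this is not XSLT and it is not how a browser applies the stylesheet. It is just a sketch, using the standard xml.etree.ElementTree module and a hypothetical names_to_xhtml_list helper, of the same idea: walk every character, pull out its name, and wrap it in a list item.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample data: only the names matter here.
XML_DATA = """<cartoon_characters>
  <character><name>Bullwinkle</name></character>
  <character><name>Boris Badenov</name></character>
</cartoon_characters>"""


def names_to_xhtml_list(xml_text):
    """Build a <ul> with one <li> per character name."""
    root = ET.fromstring(xml_text)
    ul = ET.Element("ul")
    # Plays the role of select="cartoon_characters/character" in the
    # stylesheet, evaluated here relative to the root element.
    for character in root.findall("character"):
        li = ET.SubElement(ul, "li")
        # Plays the role of <xsl:value-of select="name"/>.
        li.text = character.findtext("name")
    return ET.tostring(ul, encoding="unicode")


html_list = names_to_xhtml_list(XML_DATA)
print(html_list)
```

Just as with the XSL file, the function stays the same no matter how many characters are added to the data.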
XML is a huge subject. We've just dipped a toe into some very deep waters. Wikipedia lists roughly 200 XML languages. These include things like XHTML, of course. But they also include closely related XML tools like XML Encryption (for data encryption) and XML Signature (for digital signatures). But more than that, there are various important aspects to the language:
- Namespaces: a way to allow different datasets to exist in a single XML file without naming conflicts.
- Document Type Definitions: the dreaded DTD that website coders normally just copy and paste into their documents without understanding.
- Schema: a way of defining which elements and attributes an XML document may contain, and how they must be arranged.
- Database: a non-SQL approach to database storage. There are a number of different ones available.
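Namespaces are the easiest of these to show in a few lines. The sketch below is only an illustration: the two example.com namespace URIs, the c and s prefixes, and the studio name are all made up. It uses Python's standard xml.etree.ElementTree module to read a document in which two different vocabularies both define a name tag.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a <name> element.
XML_DATA = """<show xmlns:c="http://example.com/characters"
      xmlns:s="http://example.com/studios">
  <c:name>Bullwinkle</c:name>
  <s:name>Example Studio</s:name>
</show>"""

# Prefixes used in our queries; they map to the same URIs as in the document.
NAMESPACES = {
    "c": "http://example.com/characters",
    "s": "http://example.com/studios",
}

root = ET.fromstring(XML_DATA)
character_name = root.findtext("c:name", namespaces=NAMESPACES)
studio_name = root.findtext("s:name", namespaces=NAMESPACES)

print(character_name)
print(studio_name)
# Internally, the parser stores each tag together with its namespace URI.
print(root[0].tag)
```

The prefixes are just shorthand. What keeps the two name tags apart is the URI that each prefix stands for.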
There is an amazing amount of XML related material online. In fact, there is so much that it is overwhelming. As a result, we've tried to stick to just the core XML topics. But you will find links here that will answer just about any question you will ever have in your XML coding career.
- Introduction to XML: the basic W3 Schools introduction to XML — easy to understand with lots of examples
- XML Basics - An Introduction to XML: an old introduction, but one that takes you a long way with some advanced examples.
- Møller and Schwartzbach XML Tutorial: a basic but very broad introduction on XML.
- XML Master Basic Edition: a certification oriented tutorial that is very clear.
- Webucator's XML Free Tutorial: a detailed tutorial — an excellent choice after you run through one of the more simple tutorials.
- The Skew.org XML Tutorial: another advanced tutorial.
- XML Tutorial for Beginners: Portnov Computer School's introduction to XML.
- XML with Java: a free online course consisting of 13 video lectures by David J Malan.
- XML Tutorial Video For Beginners 2015: a three-hour-long lecture that goes far into XML including a lot of information on XSLT.
- Computer Science E-75 Lecture 3: from the Harvard extension course "Building Dynamic Websites." This lecture focuses on XML. In less than two hours, it provides everything you need to know to create your own XML-based webpages. Note: it assumes knowledge of PHP.
- W3C XML Page: everything about XML — especially upcoming events.
- W3C Archive: lots of recommendations and group notes. The general page has links to information about other XML related topics.
- Annotated XML 1.0 Specification: the raw specification can be hard to get through, but this version provides extra history, technical details, advice, and a whole lot more.
- Extensible Markup Language Frequently Asked Questions: a very basic FAQ for quick answers.
- The XML FAQ: a great collection of questions and answers about basic and advanced aspects of XML.
- XML Namespaces FAQ: detailed questions and answers about namespaces.
- XML and Databases: Ronald Bourret's thorough introduction on XML Databases. It includes an exhaustive list of links. Many of them are dead, but can be found on the Internet Archive.
- The Skew.org XML & XSLT Resources: mostly a bunch of XSLT examples, but other information as well, including its excellent list of links to all things XML related.
Given what a large subject XML is, it can be really helpful to have a book or two: to learn and for reference.
- Beginning XML, 4th Edition by David Hunter and Jeff Rafter: excellent introduction with detailed sections on things like RSS and SOAP.
- Learning XML, Second Edition by Erik Ray: a thorough introduction to XML.
- Beginning XML by Fawcett and Ayers: a basic introduction to XML.
- XML in a Nutshell by Harold and Means: a classic, but out of print and generally expensive. But you might be able to find a copy at a yard sale.
- XML Pocket Reference by St Laurent and Fitzgerald: just what it says — a booklet you can keep in your shirt pocket for reference.
- XML: The Complete Reference by Heather Williamson: an old thousand-page reference; good to have around.
XML Coding Tools
- Altova XMLSpy: a complete XML integrated development environment for Microsoft Windows. It's fairly expensive, but for the professional developer, a good investment.
- <oXygen/> XML Editor: more than an editor, it provides debugging, profiling, and other tools. It is Java-based and so will run on any platform. It is also expensive, although it has reasonably priced academic and personal licenses available.
- Stylus Studio: a Microsoft Windows based XML development suite including editor and XSLT visual mapping tool. It is fairly expensive, but offers a reasonably priced home edition.
- EditiX XML Edit: a reasonably priced editor, debugger, and so on. It also offers a free EditiX Lite Version.
- Wikipedia's List of XML Editors: there are lots of editors available from open source to proprietary to web based.
XML is a great tool in part because it is highly standardized. This means that it is picky. So it is critical that you make sure that your code is valid XML. Many of the tools that we've highlighted here contain their own XML validators. But there are plenty of free XML validators to help you with your coding projects.
- The W3C Markup Validation Service: a general tool that allows you to validate by URI, file upload, and direct input.
- W3 Schools XML Validator: an easy to use, online validator.
- XML Validation: a simple online validator that allows direct input or file upload.
- Code Beautify XML Validator: a simple validator that also formats your code so that it is easy to read.
- XML Check: a standalone Windows XML validator.
- XML Schema Validator: a validator for your XML and schema definition.
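For a quick check on your own machine, a few lines of code will also do. The sketch below assumes Python and its standard xml.etree.ElementTree module, and the is_well_formed helper is our own invention. Note that it only tests whether a document is well formed (tags closed, attributes quoted, and so on). It does not validate against a DTD or schema; for that you would need one of the validators above or a library that supports it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Properly closed tags: parses without complaint.
good = "<ul><li>Bugs Bunny</li><li>Daffy Duck</li></ul>"

# Old-style HTML habits: unquoted attribute and unclosed <li> tags.
bad = "<ul type=square><li>Bugs Bunny<li>Daffy Duck</ul>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second string is the kind of code that old browsers happily accepted as HTML, and it is exactly what an XML parser refuses.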
XML itself is pretty straightforward. For an experienced XHTML coder, it can seem almost trivial. But there are so many related technologies and so much that can be done with it that you could spend the rest of your life doing nothing else. We've just scratched the surface here.