What Is XML, and Why Should You Care?
By Thad McIlroy
If you've been following the latest developments in the world of electronic publishing, you've probably come across the abbreviation "XML" (short for Extensible Markup Language). If you've seen it, you're probably confused about it. Most people are: it's very complicated, and there's a lot of imprecise and overly technical information out there. If you haven't heard of XML before, well, I guess you just did. Let's take a look at what XML is, and why it might matter to printers and publishers.
Style Sheets
Do you know style sheets? We've had them for years now in our desktop applications, from Microsoft Word to PageMaker to QuarkXPress. Style sheets are used to simplify the process of formatting text. See that sub-heading just above? It says "Style Sheets" and it's in bold text. If I were working in Word and wanted to change the format of that sub-heading and all the other ones in this article to bold-italic, I'd have two choices. The hard way would be to do it manually: Select each one, and change the format on each from bold to bold-italic. The easy way is to assign a "style" to each one, let's say a style called
If I didn't like the look of bold-italic, I could change
Power of a Database
Now let's think about databases for a moment. When you set up a database, every field receives a name. If it's a contact database, you'd create a named field for each piece of data. The first field could be
What about a database used to generate invoices at a printing company? Here you could specify a field
Databases for Text
I think of XML as offering a combination of the power of style sheets with the power of databases. In the world of XML, software can understand the text's defined style, along with its significance. Using XML-encoded documents, software can recognize that a field called
Much of the excitement around XML stems from its strength in supporting commerce, and e-commerce applications in particular. That's why heavyweights like Microsoft, IBM and Sun endorse it.
Publishers appreciate e-commerce as much as the next Web user, but have some of their own reasons for falling in love with XML's promise. Among the primary reasons is that XML encoding would offer the basis for dynamic, cross-media publishing. Web publishing software can work with XML codes as easily as print software (perhaps even more so). Once a stream of text has been coded with XML tags, it can be published very rapidly into both print and Web formats, without manual intervention.
This is the Holy Grail for publishers—the ability to publish quickly and dynamically to multiple media, without having to hire a troop of designers to rework each paragraph or page by hand.
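To make the cross-media idea concrete, here is a minimal sketch in Python using the standard library's XML parser. The tag names (article, headline, subhead, para) and both rule tables are invented for illustration and don't come from any publishing standard. The point is only that one tagged source can feed two outputs without anyone touching the text by hand.

```python
import textwrap
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article markup; the tag names are assumptions made for this sketch.
ARTICLE_XML = """
<article>
  <headline>What is XML?</headline>
  <subhead>Style Sheets</subhead>
  <para>Style sheets simplify the formatting of text.</para>
  <subhead>Power of a Database</subhead>
  <para>Every field in a database receives a name.</para>
</article>
"""

# One rule table per medium: each tag name maps to a formatting function.
HTML_RULES = {
    "headline": lambda text: f"<h1>{escape(text)}</h1>",
    "subhead": lambda text: f"<h2>{escape(text)}</h2>",
    "para": lambda text: f"<p>{escape(text)}</p>",
}

PRINT_RULES = {
    "headline": lambda text: text.upper(),
    "subhead": lambda text: f"{text}\n{'-' * len(text)}",
    "para": lambda text: textwrap.fill(text, width=40),
}


def render(xml_text, rules):
    """Format each child of the root element with the rule for its tag."""
    root = ET.fromstring(xml_text.strip())
    return "\n".join(rules[el.tag]((el.text or "").strip()) for el in root)


html_output = render(ARTICLE_XML, HTML_RULES)    # Web version
print_output = render(ARTICLE_XML, PRINT_RULES)  # plain-text stand-in for print

print(html_output)
print()
print(print_output)
```

Changing how every sub-heading looks in either medium means editing one entry in one rule table, much as a style sheet would, while the tagged text itself stays untouched.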
There's another reason publishers are getting excited about XML. Because an XML file works like a kind of database, it can carry data about the file itself, in addition to the text and pictures.
Let's say a publisher wanted to specify that a QuarkXPress file should be printed four-color in a quantity of 10,000, trimmed and bound saddle-stitched, and then shipped by next Friday to its Chicago warehouse. All of those instructions could also be coded with XML tags and included in the same file with the QuarkXPress graphic data. The possibility of error declines as efficiency increases, and the possibilities for full print automation expand rapidly.
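Here is a rough sketch of what such a job ticket might look like, and how software on the printer's side could read it. Everything specific is an assumption made for illustration: the element names, the file name brochure.qxd and the list of required fields are invented, and real job-ticket formats would differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical job ticket; element names and the file name are invented.
JOB_XML = """
<job>
  <file>brochure.qxd</file>
  <colors>4</colors>
  <quantity>10000</quantity>
  <binding>saddle-stitched</binding>
  <ship-by>next Friday</ship-by>
  <ship-to>Chicago warehouse</ship-to>
</job>
"""

# Fields this sketch insists on before a job is accepted (an assumption).
REQUIRED_FIELDS = ("file", "colors", "quantity", "binding", "ship-by", "ship-to")


def read_job(xml_text):
    """Parse a job ticket into a dict, rejecting tickets with missing fields."""
    root = ET.fromstring(xml_text.strip())
    fields = {child.tag: (child.text or "").strip() for child in root}
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("job ticket is missing: " + ", ".join(missing))
    # Numeric fields become integers so software can calculate with them.
    fields["colors"] = int(fields["colors"])
    fields["quantity"] = int(fields["quantity"])
    return fields


job = read_job(JOB_XML)
print(
    f"Print {job['quantity']:,} copies of {job['file']} in {job['colors']} colors, "
    f"{job['binding']}, ship to {job['ship-to']} by {job['ship-by']}."
)
```

Because each instruction sits in its own named field, an incomplete ticket can be rejected automatically instead of being discovered on the pressroom floor.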
Why Isn't XML Everywhere?
So if XML has all of these powerful features, why is it not already in use everywhere? There are several reasons.
First of all, it's very new. The full specifications for the format are still under review (although a partial specification is now available). With the specifications still in flux, the available software tools are few, far between and pricey. Mainstream software like PageMaker and QuarkXPress has little or no XML support. XML remains complicated, and we've seen again and again that the market moves slowly when technology gets complex.
When I talk to people who are familiar with XML, I find that they divide into one of two camps.
The first group, let's call them the Technophiles, loves the promise and power of XML, and feels that it's just a matter of time (and not too much time) before XML is widely supported and widely adopted.
The second group, let's call them the Once-burned, Twice-shy group, thinks that XML will find only limited adoption, and that even this will take a long time to develop.
As a consultant, I, of course, fall in between. On the one hand, I see the promise of XML, and have dreams of what we could do with its power. On the other hand, until we have inexpensive and simple tools to harness its power (and they're not going to be easy to design), I think XML will have a relatively small group of adherents.
Companies that have the infrastructure and the wherewithal to harness XML early will reap great benefits. The rest will play catch-up, driven inexorably toward a technology that's optimized for both publishing and commerce.
About the Author
Thad McIlroy is a San Francisco-based electronic publishing consultant and author, and serves as program director of Seybold Seminars. He welcomes comments at firstname.lastname@example.org.