What is XML? Programming in XML
Although no one has yet made real inroads into putting these things together, the most likely candidate for data interchange over the next decade looks like XML, and it's primarily this technology that we'll be covering in this category. Here we shall discuss what XML is and how to publish data in XML.
Extensible Markup Language – XML
Let's begin by looking at Extensible Markup Language (XML) in its historical context, before we see how we use it and what it looks like.
A markup language is a set of rules that you use to add special meaning or extra information to a document.
An example of a markup language that we are all familiar with is HTML. In HTML we define how a document should be rendered by adding tags such as <FONT>, <I>, and <DIV>. But HTML is itself defined by another markup language, called SGML.
Standard Generalized Markup Language (SGML) is a markup language that is used to create other markup languages. Its roots go back to the 1960s, and it became an international standard in 1986 (ISO 8879). However, SGML is a very complex language and many of its features are rarely used.
HTML was originally created by Tim Berners-Lee in 1991 as a way of marking up technical papers so that they could easily be transferred across different platforms. He created a set of tags that he used to mark up a document so that it could be easily rendered by others into a usable format.
When HTML was used exclusively by the scientific community, its users were not concerned with how documents looked, only that they were readable and usable. However, as Web browsers became increasingly popular with people outside the scientific community, who started to create their own Web pages, appearance became much more important. To meet the demand, the creators of Web browsers began to add their own tags that allowed users to display ever more appealing documents. This led to the inevitable drawback of Web pages being designed for one browser over the others, so that today we see signs on Web pages advising us to view them in Netscape Navigator or Internet Explorer.
What’s Wrong with HTML?
As more and more users are getting onto the Internet, the inherent weaknesses of HTML are becoming increasingly apparent.
The major failings are:
HTML does not allow you to define your own tags
HTML does not carry any information about the meaning of the content held within tags
A hierarchy of data cannot be defined in HTML
If we're to integrate our computer systems with those of our suppliers (and anyone else we might want to work with), we'll need a different tool from HTML, which makes a document easier for humans to read, but not for computers. If you receive an invoice as an HTML document, it's fairly easy for a human to realize what it is and pull out the relevant information. If a computer is sent an invoice as an HTML document, it has no way of recognizing its purpose or extracting the data it needs.
To get around these problems SGML could be used, but as we discussed earlier, it’s complex and not widely used. This is where XML comes in. XML is a simplified version of SGML, and is much easier to understand and write.
XML is not fixed like HTML. Users of XML can define their own tags, hence the name Extensible Markup Language. These user-defined tags relate to the actual content of the document, not the way the document should appear.
In short, HTML is used to define how a document should be rendered, whereas XML is used to define the data contained within that document.
The best way to learn what XML is and what it does is to actually see some, which is what we'll do now.
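To make the invoice point concrete, here is a minimal sketch in Python of how a program might pull data out of an invoice once it is marked up by meaning rather than by appearance. The invoice layout and every tag name in it (invoice, supplier, item, description, quantity, unitPrice) are invented for illustration and do not come from any standard; the parsing uses the xml.etree.ElementTree module from Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice: the tag names are assumptions made up for this
# sketch, chosen to describe what each value means.
INVOICE_XML = """
<invoice number="1001">
  <supplier>Example Supplies Ltd</supplier>
  <item>
    <description>Widget</description>
    <quantity>3</quantity>
    <unitPrice>2.50</unitPrice>
  </item>
  <item>
    <description>Gadget</description>
    <quantity>1</quantity>
    <unitPrice>10.00</unitPrice>
  </item>
</invoice>
"""


def invoice_total(xml_text):
    """Return (invoice number, supplier name, total) from the invoice text."""
    root = ET.fromstring(xml_text.strip())
    total = 0.0
    # Each <item> is nested inside <invoice>, so the hierarchy of the data
    # is visible to the program as well as to a human reader.
    for item in root.findall("item"):
        quantity = int(item.findtext("quantity"))
        unit_price = float(item.findtext("unitPrice"))
        total += quantity * unit_price
    return root.get("number"), root.findtext("supplier"), total


number, supplier, total = invoice_total(INVOICE_XML)
print(number, supplier, total)
```

Because the tags name the data they hold, the program can ask for the supplier or the quantities directly. With an HTML invoice it would have to guess which bold or table-cell text is which.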
How Does It Work?
When people first started talking about XML about eighteen months ago, they said it would be the next big, important thing to happen to the Internet. Today, that’s starting to become a truism as more and more vendors, developers, and consultants start believing XML is going to do the job it was designed for and start to make it happen.
As a side note, Bill Gates has strongly committed Microsoft to using XML wherever possible. XML is used heavily in Office 2000’s file formats, and SQL Server 2000 will have native support for XML. Additionally, Windows DNA 2000 (the next revision of the Distributed interNet Architecture concept that the majority of the technology of this book borrows from) will make heavy use of XML.
Technically, XML is very simple. It comprises a set of nested tags and data, like this simple <greeting> tag that contains the word Hello:
<greeting>Hello !</greeting>
In HTML, we might want to define how we would like Hello ! to be displayed, perhaps in bold or in a blue font. In XML, we define the content between the tags. Hello ! is a greeting, so the tags state that.
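As a small sketch of what this means for software, the greeting element described above can be read with Python's standard xml.etree.ElementTree module. The variable names are only illustrative; the point is that the tag name tells the program what the text is.

```python
import xml.etree.ElementTree as ET

# The greeting element from the text, written out as a string.
doc = "<greeting>Hello !</greeting>"

element = ET.fromstring(doc)

# The tag says what the content is; the text is the content itself.
print(element.tag)
print(element.text)
```

Nothing here says how the greeting should look on screen. Deciding that is left to whatever program displays it.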
XML is a very flexible language because its structure allows specific data format definitions. Imagine we wanted to describe a web; we might do it something like this:
<web>
<author>Shah Sultan Rony</author>
<title>How to make money at home</title>