XML
Dan Tobias (Add category)
Revision as of 19:16, 28 September 2013
Extensible Markup Language (XML) is a markup language used to encode data.
XML is a language from which languages are made. A body of rules for how an XML document for a specific purpose may be constructed is often called a "language" or a "format" in its own right. These rules may be specified in several different ways, the most common being Document Type Definition (DTD) and Schema. A document which follows the syntactic rules of XML is considered "well-formed." A document which is well-formed and also conforms to its DTD or Schema declarations is considered "valid."
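The well-formedness half of that distinction can be sketched with Python's standard library. This is a minimal illustration, not a full checker: the function name is_well_formed and the sample documents are made up for the example, and the standard-library parser used here is non-validating, so it says nothing about whether a document is "valid" against a DTD or Schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, i.e. follows XML's syntactic rules.

    Hypothetical helper for illustration; it does NOT check validity
    against a DTD or Schema.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Properly nested elements: well-formed.
good = "<note><to>Ann</to></note>"
# Overlapping elements: not well-formed.
bad = "<note><to>Ann</note></to>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

Checking validity would need a validating parser, which is outside what this sketch covers.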
A Document Type Definition may be included in an XML document, kept in an external file, or split between the two. In each case it is introduced by a Document Type Declaration, which confusingly has the same initials: the declaration either contains the definition directly, carries an external reference to it, or does both.
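As a rough illustration of the combined case, the sketch below parses a document whose Document Type Declaration both references an external DTD and carries internal declarations. The file name note.dtd and the element name note are assumptions invented for the example; the parser used here (xml.dom.minidom) records the reference but does not fetch or apply the external file.

```python
from xml.dom import minidom

# Hypothetical document: "note.dtd" is an assumed external DTD that need not exist.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd" [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>"""

doc = minidom.parseString(DOC)
doctype = doc.doctype

# Name of the root element the declaration applies to.
print(doctype.name)
# The external reference (system identifier) as written in the document.
print(doctype.systemId)
# The declarations embedded between the square brackets.
print(doctype.internalSubset)
```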
A Schema, unlike a DTD, is itself written in XML. A document can have a Schema for each of its namespaces. DTDs have been largely superseded by Schemas, because a document can have only one DTD, whereas Schemas support namespaces and have a greater capacity for describing rules.
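Because a Schema is itself XML, an ordinary XML parser can read it. The sketch below assumes the W3C XML Schema (XSD) flavour of Schema and uses a tiny made-up schema; the target namespace under example.invalid and the element name note are placeholders, not anything defined by a real format.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C XML Schema language.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema declaring a single element named "note".
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.invalid/ns/note">
  <xs:element name="note" type="xs:string"/>
</xs:schema>"""

# The schema parses like any other XML document.
root = ET.fromstring(SCHEMA)

# Collect the names of the elements the schema declares.
declared = [el.get("name") for el in root.iter(f"{{{XSD_NS}}}element")]

print(root.tag)
print(declared)
```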
XML documents refer to both Schemas and DTDs by a URI. It is crucial to remember that this reference is a Uniform Resource Identifier (URI), not a Uniform Resource Locator (URL). There is no requirement that the URI point to a resource on the Internet, or even that such a resource exist. This is a potential preservation risk with XML documents, as they may outlive the DTD and Schema documents that characterize them, or those documents may move and become difficult to locate.
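The identifier-versus-locator point can be seen in a small sketch. The document below uses the xsi:schemaLocation attribute from W3C XML Schema, with made-up URIs under example.invalid that deliberately resolve to nothing. A non-validating parser such as Python's ElementTree treats them as plain strings and parses the document without any network access.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes (xsi:...).
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical document; neither example.invalid URI points at a real resource.
DOC = """<note xmlns="http://example.invalid/ns/note"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.invalid/ns/note http://example.invalid/note.xsd">hi</note>"""

# Parsing succeeds even though nothing exists at either URI.
root = ET.fromstring(DOC)

# schemaLocation holds pairs of: namespace URI, then a hint for where a Schema might be.
hint = root.get(f"{{{XSI}}}schemaLocation")
namespace_uri, location_hint = hint.split()

print(root.tag)
print(namespace_uri)
print(location_hint)
```

An archive holding such a document would need to store the Schema alongside it, since the location hint alone gives no guarantee that the Schema can be retrieved later.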
There are variants of HTML which are expressed in XML-compliant syntax, and these are known as XHTML. This syntax requires, for instance, that tag names be consistently lowercase, and that elements with no closing tag have a slash before the right angle bracket at the end of the tag, like <br /> instead of <br>. XHTML may be served under the XML or HTML MIME types, and browsers might treat a document differently depending on which is used.
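The difference in empty-element syntax can be illustrated by serializing one small element tree both ways. This sketch uses Python's ElementTree, whose tostring function accepts a method argument of "xml" or "html"; the paragraph content is made up for the example.

```python
import xml.etree.ElementTree as ET

# Build <p> containing text, a line break, and more text.
p = ET.Element("p")
p.text = "line one"
br = ET.SubElement(p, "br")
br.tail = "line two"

# XML-style output closes the empty element with a slash.
as_xml = ET.tostring(p, encoding="unicode", method="xml")
# HTML-style output writes the same element with no slash and no closing tag.
as_html = ET.tostring(p, encoding="unicode", method="html")

print(as_xml)   # <p>line one<br />line two</p>
print(as_html)  # <p>line one<br>line two</p>
```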
While XML is a text-based format, there are also binary XML representations.