
Exploring the XHTML DTD

Choosing Your DTD

XHTML 1.0 provides three DTDs that describe different sets of XHTML elements and reflect the three
choices provided in HTML 4.0: strict, transitional, and frameset. The strict DTD is probably the one
that the W3C would like to see developers adhere to, but the transitional DTD reflects the reality of
HTML usage much more accurately. Appendix A lists the elements in the three different DTDs, along with
notes regarding attributes. To identify the DTD for a given document, you must use a DOCTYPE declaration
in the prolog of your document. The XHTML 1.0 Recommendation provides three options, one for each DTD.
They look much like their HTML 4.01 predecessors, although their names are slightly different and the
root element name, formerly HTML, is now the lowercase html. For the strict DTD, this HTML 4.01
declaration:

<!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">

becomes this XHTML 1.0 declaration:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

For the transitional DTD, this HTML 4.01 declaration:

<!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

becomes this XHTML 1.0 declaration:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

And for the frameset DTD, this HTML 4.01 declaration:

<!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">

becomes this XHTML 1.0 declaration:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

Whichever declaration you choose, it must appear after the XML declaration (if there is one) and before
the root element of the document. If your document passes through a validating parser, the parser checks
that the document's contents conform to the rules laid out in the DTD.
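
As a minimal sketch of that ordering, the following document places the strict DOCTYPE declaration
between the XML declaration and the html root element. The encoding, title, and paragraph text are
placeholders chosen for illustration, not requirements of the Recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The DOCTYPE declaration follows the XML declaration and precedes the root element -->
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Placeholder title</title>
  </head>
  <body>
    <p>Placeholder paragraph.</p>
  </body>
</html>
```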

Caution The XHTML 1.0 Recommendation doesn't say anything about using another XML feature, the
internal subset of the DOCTYPE declaration. While its use isn't prohibited, you should avoid using it with
XHTML documents.
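
For readers who have not met the internal subset, the sketch below shows what it looks like, so that it
can be recognized and avoided. The entity name company and its value are hypothetical.

```xml
<!-- Declarations between [ and ] form the internal subset; they apply to this document only -->
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
  <!ENTITY company "Example Corp.">
]>
```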

Starting Out

All three DTDs follow roughly the same layout, with a few sections added or omitted depending on the
particular DTD you read. The first few sections of a DTD are often the most frustrating (they often put
people off) because they lay groundwork for later declarations rather than make concrete declarations.
Reading page after page of somewhat abstract declarations outside of their context may not feel
rewarding, but you need to understand these preliminaries to make sense of the concrete declarations
that follow.

Tip While these preliminaries are important in XHTML 1.0, they will become even more important when
XHTML is modularized in XHTML 1.1. Then you may need to choose which modules are used in
documents. Understanding how these pieces fit together is critical as the specification is broken into
smaller pieces.

Including character entities

After some introductory comments, the three XHTML DTDs all start by referencing the entity sets
(character mnemonic entities) supported by HTML: Latin-1, Symbols, and Special. Because these entity
sets are stored in separate files, the DTDs can reference them easily without requiring a special set
for each DTD. (It also means that other XML applications can reference the XHTML entity sets easily
without needing to incorporate the entire DTD.) The declaration for the Latin-1 set, immediately
followed by a parameter entity reference that pulls in the declared material, looks like:

<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin 1 for XHTML//EN"
"xhtml-lat1.ent">
%HTMLlat1;

The entity declaration creates a parameter entity named HTMLlat1. HTMLlat1 references a set of
declarations using two different identifiers. The first is a public identifier (-//W3C//ENTITIES Latin 1
for XHTML//EN) that applications can use if they already know what these entities are and don't want to
retrieve information from the URL. The second is the URL itself (xhtml-lat1.ent), which applications that
don't understand the public identifier, like most XML processors, can use to retrieve the full set of
declarations. Either way, documents that use the XHTML DTDs may use the full set of entities.
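
To give a sense of what the referenced file supplies, here are three declarations of the kind found in
xhtml-lat1.ent. The comments are shortened paraphrases rather than the file's own wording.

```xml
<!-- General entities: each maps a mnemonic name to a numeric character reference -->
<!ENTITY nbsp   "&#160;"> <!-- no-break space -->
<!ENTITY copy   "&#169;"> <!-- copyright sign -->
<!ENTITY eacute "&#233;"> <!-- small letter e with acute accent -->
```

With the set included, a document validated against any of the three DTDs can write &copy; or &eacute;
in its text instead of the numeric forms.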

Note The URLs for the entity set locations are given as relative URLs. If you want to reference these
sets in your own XML declarations, use the full form: http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent. You
also may want to create a local copy, because some users of your XML DTDs may not have access to the
Internet or the W3C site. The copyright statement at the top of the DTD makes it clear that this kind of
usage is acceptable.
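
A sketch of the full form described in the note, as it might appear in a DTD of your own, follows. It
keeps the entity name used by the XHTML DTDs; any other name would work equally well.

```xml
<!-- Same public identifier, but an absolute URL as the system identifier -->
<!ENTITY % HTMLlat1 PUBLIC
  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
```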

Imported names

One declaration in this section, for instance, creates the Character parameter entity; the comment that
accompanies it tells developers that attributes declared using this parameter entity must contain a
single character as defined in ISO 10646.
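
The declaration this passage discusses is not reproduced here. As recalled from the first-edition XHTML
1.0 DTDs, it reads roughly as shown below; check the exact comment wording against the published DTD
files. The shortcut element is hypothetical and is included only to show the entity standing in for an
attribute type.

```xml
<!ENTITY % Character "CDATA">
    <!-- a single character from [ISO10646] -->

<!-- Hypothetical element, used only to show the entity in an attribute declaration -->
<!ELEMENT shortcut EMPTY>
<!ATTLIST shortcut
  accesskey  %Character;  #IMPLIED>
```

Because the entity expands to CDATA, a parser cannot enforce the single-character rule; the comment
records an intention that only applications can check.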

Note Appendix E of the XHTML 1.0 specification mostly omits the specs listed in square brackets, but
they are available at http://www.w3.org/TR/xhtml1/#refs. If you need to look up the RFCs, see
http://www.rfc-editor.org. For more on ISO 10646, see the XML 1.0 references at
http://www.w3.org/TR/REC-xml#sec-existing-stds.

Many of the types are defined more simply, without referring to outside specifications. The Number
entity, for instance, is described as "one or more digits." The Shape entity doesn't have a description,
but its declaration limits it to a small set of well-known types:

<!ENTITY % Shape "(rect|circle|poly|default)">

The transitional and frameset DTDs include two additional entities, ImgAlign and Color, which support
formatting properties left out of the strict DTD. These entities are declared in a slightly different
style, with their descriptive comments preceding the declaration rather than following it. These DTDs
also provide a list of commonly supported colors in comments, although the list isn't formally a part of
the DTD that an XML parser understands.
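
To illustrate the comment-first style, the ImgAlign and Color declarations look roughly like this in the
transitional DTD. This is reproduced from memory, so treat the comment wording as an approximation and
check it against the published file.

```xml
<!-- used for object, applet, img, input and iframe -->
<!ENTITY % ImgAlign "(top|middle|bottom|left|right)">

<!-- a color using sRGB: #RRGGBB as Hex values -->
<!ENTITY % Color "CDATA">
```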

Generic attributes

The next section of each of the DTDs defines entities describing numerous attributes that are applied to
many different elements. For the most part, all three DTDs define the same set of attributes for their
elements. This section, in a sense, defines the framework with which the W3C wants developers to build
XHTML applications. It contains the hooks for styling, internationalization, and scripting, all key tools
for moving beyond static Web pages built for Western organizations. The generic attributes make XHTML
more active and more inclusive at the same time.

The next two sets of entities define attributes used to connect XHTML elements to user interfaces and the
scripts that respond to user activities. The events entity defines a set of attributes that connect
scripts to particular user-driven events, such as onclick and onkeypress, and is employed widely on
elements in the body of HTML documents. The focus entity provides additional hooks for elements that can
receive and lose user-interface focus. (Oddly enough, the focus entity is never used anywhere in the
three DTDs, although its contents appear regularly.) Then, three of these entities (coreattrs, i18n, and
events) are combined into a single large attrs entity for use on many of the textual elements. The
transitional and frameset DTDs also declare the TextAlign entity, which defines the align formatting
attribute for many of the block-level elements.
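
The combination step can be sketched as follows. This is an abbreviated illustration, not a copy of the
DTDs: the events entity is cut down to two attributes, and attribute types are written out as CDATA or
NMTOKEN where the real DTDs use imported-name entities.

```xml
<!-- Abbreviated sketch of the generic attribute entities -->
<!ENTITY % coreattrs
 "id     ID     #IMPLIED
  class  CDATA  #IMPLIED
  style  CDATA  #IMPLIED
  title  CDATA  #IMPLIED">

<!ENTITY % i18n
 "lang      NMTOKEN    #IMPLIED
  xml:lang  NMTOKEN    #IMPLIED
  dir       (ltr|rtl)  #IMPLIED">

<!ENTITY % events
 "onclick     CDATA  #IMPLIED
  onkeypress  CDATA  #IMPLIED">

<!-- The three entities are combined into one -->
<!ENTITY % attrs "%coreattrs; %i18n; %events;">

<!-- A single reference then attaches the whole set to an element -->
<!ELEMENT p (#PCDATA)>
<!ATTLIST p %attrs;>
```

Defining the attributes once and referencing them by name keeps the element declarations short and makes
the shared attribute set easy to change in one place.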

Text elements

The next few sections define element content for various parts of XHTML. The first, text elements,
defines content that is used throughout the set of elements that present text. In this section, the first
large differences between the strict DTD and the transitional and frameset DTDs become apparent. While
all of the DTDs declare the same set of entities, the strict DTD omits several of the element types that
the other DTDs permit in their special and fontstyle entities, effectively abolishing iframe, u, s,
strike, font, and basefont from the XHTML vocabulary. This isn't new (it happened in HTML 4.0), but it
indicates the direction the W3C wants developers to take: away from explicit formatting in markup and
toward a more abstracted approach that applies style sheets to the structures formed by that markup.

The rest of the text elements entities, culminating in the Inline entity, describe different content
models that can appear inside textual content. This section defines markup that you can use inside of
paragraphs and other block-level elements. One entity, misc, provides support for content that may appear
in both the textual and block-level contexts, such as ins, del, script, and noscript.
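
The difference is easiest to see by putting the two versions of the special and fontstyle entities next
to each other. The declarations below are given as recalled from the first-edition DTDs and should be
checked against the published files; the two groups come from separate DTD files and would not appear
together in one.

```xml
<!-- Strict DTD -->
<!ENTITY % special   "br | span | bdo | object | img | map">
<!ENTITY % fontstyle "tt | i | b | big | small">

<!-- Transitional and frameset DTDs: same entity names, longer lists -->
<!ENTITY % special   "br | span | bdo | object | applet | img | map | iframe">
<!ENTITY % fontstyle "tt | i | b | big | small | u | s | strike | font | basefont">

<!-- Shared by all three DTDs -->
<!ENTITY % misc "ins | del | script | noscript">
```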

Block-level elements

The next section describes structures that operate at a higher level than the text elements, creating
the structures in which those text elements can appear. Here the three DTDs almost converge, defining
sets of block-level elements that fit into the relatively neat categories of heading, lists, and
blocktext, and then adding the p, div, fieldset, and table element types to form a main block entity. The
strict DTD leaves out isindex, menu, dir, center, and noframes, which appear in the other two DTDs. These
element models then combine with the misc entity and the form element to create the Block entity.
Remember, XML's case sensitivity means that block and Block are completely different things.

For cases in which an element may contain either block-level or textual content, this section also
defines the Flow entity. This entity adds the inline entity and text to the combination of components
that make up Block. The Flow entity is used in elements that step outside the usual block-text
distinction and permit either form to appear.
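
The layering described above can be sketched with the strict DTD's declarations, given here as recalled
from the first-edition DTD rather than copied from it. The inline and misc entities are the ones built up
in the text elements section.

```xml
<!-- Categories of block-level elements -->
<!ENTITY % heading   "h1|h2|h3|h4|h5|h6">
<!ENTITY % lists     "ul | ol | dl">
<!ENTITY % blocktext "pre | hr | blockquote | address">

<!-- Lowercase block: a list of element names -->
<!ENTITY % block
  "p | %heading; | div | %lists; | %blocktext; | fieldset | table">

<!-- Uppercase Block: a complete content model built from block, form, and misc -->
<!ENTITY % Block "(%block; | form | %misc;)*">

<!-- Flow: Block plus text and the inline elements -->
<!ENTITY % Flow "(#PCDATA | %block; | form | %inline; | %misc;)*">
```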

Content models for exclusions

This is one of the odder sections of the XHTML 1.0 DTDs. Effectively, it declares content models for
particular elements using models much like those in the block-level area, but with minor changes
explained in comments. This section of the DTD is the result of the switch to XML. Older versions of HTML
used a feature of SGML, called exclusions, to specify rules such as "no a element can contain another a
element." XML dropped that feature for the sake of simplicity. As a result, this section of the DTD
redefines a few of the models from the previous section to suit the needs of particular elements: a, pre,
form, and button. There are also some differences among the DTDs. The content model for form, for
instance, is based on the Block model in the strict DTD but on the Flow model in the transitional and
frameset DTDs.
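
A sketch of how the exclusions are approximated follows, again as recalled from the first-edition DTDs
and subject to checking against the published files. Each model repeats an existing one with the excluded
element simply left out of the list.

```xml
<!-- a uses the Inline model with a itself left out -->
<!ENTITY % a.content
  "(#PCDATA | %special; | %fontstyle; | %phrase; | %inline.forms; | %misc;)*">

<!-- Strict DTD: form uses the Block model with form left out -->
<!ENTITY % form.content "(%block; | %misc;)*">

<!-- Transitional and frameset DTDs: form uses the Flow model with form left out -->
<!ENTITY % form.content "(#PCDATA | %block; | %inline; | %misc;)*">
```

This approach only blocks direct nesting. An a element inside a span inside another a still passes
validation, which is why the Recommendation lists such prohibitions separately in prose.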
