What is XML?

Extensible Markup Language, or XML for short, is a new technology for web applications. XML is a World Wide Web Consortium standard that lets you create your own tags. XML simplifies business-to-business transactions on the web.

What is XML?
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML that is
used to describe data
• XML tags are not predefined - you must define your
own tags
• XML uses a Document Type Definition (DTD) or an
XML Schema to describe the data
• XML with a DTD or XML Schema is designed to be
self-descriptive
• XML can use CSS (Cascading Style Sheets) to display information
• “XML is going to be the main language for exchanging financial information between businesses over the Internet. A lot of interesting B2B applications are under development.”
• A Document Type Definition or an XML Schema specifies what tags are allowed or required
• A validation service confirms that a given XML document is syntactically correct
• XML is independent of any hardware, software, operating system, or programming language

Why Do We Need XML?

Why do we need XML when everyone's browser supports HTML today? To answer this question, look at the sample HTML code shown. HTML tags are for browsing; they're meant for interactions between humans and computers.

Rendering HTML
When rendered, the HTML in the previous example looks like this. As you can see, HTML tags describe how something should render. They don't contain any information about what the data is; they only describe how it should look.

Sample XML Code
Now let's look at some sample XML code. With XML, you can understand the meaning of the tags. More importantly, a computer can understand them as well. It's easier for a computer to understand that the tag <zipcode>34829</zipcode> is a zip code.
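
To make the point concrete, here is a minimal sketch of a program reading the zip code by its tag name. It uses Python's standard xml.etree.ElementTree module; the surrounding <address> record and the function name find_zipcode are assumptions made for illustration, and only the <zipcode> element comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical address record; only <zipcode>34829</zipcode> is from the slide.
ADDRESS_XML = """
<address>
  <name>Mrs. Mary McGoon</name>
  <city>Anytown</city>
  <zipcode>34829</zipcode>
</address>
"""

def find_zipcode(xml_text):
    """Return the text of the first <zipcode> child, or None if there is none."""
    root = ET.fromstring(xml_text)
    node = root.find("zipcode")
    return node.text if node is not None else None

print(find_zipcode(ADDRESS_XML))
```

Because the tag names the data, the program needs no knowledge of how the address is laid out on screen.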

Rendering XML
XML from the previous example might be rendered like this. Notice that even though the tags are different, they can still be rendered just like HTML.

XML Document Structure
An XML document consists of three parts, in the order given:

1. An XML declaration (which is technically optional, but recommended in most normal cases)
2. A document type declaration that refers to a DTD (which is optional, but required if you want validation)
3. A body or document instance (which is required)

Collectively, the XML declaration and the document type declaration are called the XML prolog.

XML Declaration
The XML declaration is a piece of markup that identifies this as an
XML document.
The declaration also indicates whether the document can be
validated by referring to an external Document Type Definition
(DTD).
The minimal XML declaration is:
<?xml version="1.0" ?>

The formal definition of an XML declaration, according to the XML 1.0 specification:
XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

The following declaration means that there is an external DTD on which this document depends.
<?xml version="1.0" standalone="no" ?>
If your XML document has no associated DTD, the correct XML declaration is:
<?xml version="1.0" standalone="yes" ?>

The optional encoding part of the declaration tells the XML processor (parser) how to interpret the bytes based on a particular character set. The default encoding is UTF-8.
<?xml version="1.0" encoding="UTF-8" ?>
<?xml version="1.0" encoding="ISO-8859-7" ?>
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

Is the next example valid?
<?xml version="1.0" encoding='UTF-8' standalone='no'?>
Yes: single quotes are allowed, and the parts appear in the order the grammar requires (version, encoding, standalone).

Neither of the following XML declarations is valid, because the names are case-sensitive and standalone accepts only "yes" or "no".
<?XML VERSION="1.0" STANDALONE="no"?>
<?xml version="1.0" standalone="No"?>
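
The rules above can be checked with a parser. The sketch below feeds several declarations to Python's standard xml.etree.ElementTree parser (which is non-validating, so it checks well-formedness only) and reports which ones it accepts. The helper name declaration_is_accepted and the trailing <root/> element are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def declaration_is_accepted(declaration):
    """Return True if the parser accepts a document that starts with this declaration."""
    document = declaration.encode("ascii") + b"<root/>"
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

examples = [
    # Single quotes and the order version, encoding, standalone.
    """<?xml version="1.0" encoding='UTF-8' standalone='no'?>""",
    # standalone placed before encoding: wrong order.
    """<?xml version="1.0" standalone="no" encoding="UTF-8"?>""",
    # Upper-case names.
    """<?XML VERSION="1.0" STANDALONE="no"?>""",
    # standalone value is not exactly "yes" or "no".
    """<?xml version="1.0" standalone="No"?>""",
]

for text in examples:
    print(declaration_is_accepted(text), text)
```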
Document Type Declaration
The purpose of this declaration is to announce the root element and to
provide the location of the DTD.
The general syntax is:
<!DOCTYPE RootElement (SYSTEM | PUBLIC)
ExternalDeclarations? [InternalDeclarations]? >

Examples:
<!DOCTYPE Employees SYSTEM "employees.dtd">
Similarly,
<!DOCTYPE PriceList SYSTEM "prices.dtd">
<!DOCTYPE Employees SYSTEM "../dtds/employees.dtd">
<!DOCTYPE Employees SYSTEM "http://somewhere.com/dtds/employees.dtd">

Processing instructions - An example processing instruction that causes style to be determined by a style sheet:
<?xml-stylesheet type="text/css" href="xmlstyle.css"?>
PUBLIC identifier: This is used in formal environments to declare that a given DTD is available to the public for shared use.

The syntax is a little different:
<!DOCTYPE RootElement PUBLIC PublicID URI>

Example:
<!DOCTYPE Instrument PUBLIC "-//NASA//Instrument Markup Language 0.2//EN" "http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd">
In this case the PublicID is:
"-//NASA//Instrument Markup Language 0.2//EN"
The URI that locates the DTD is:
http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd
The complete prolog for the NASA example is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/css" href="xmlstyle.css"?>
<!DOCTYPE Instrument PUBLIC "-//NASA//Instrument Markup Language 0.2//EN" "http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd">
Document Body

The document body, or instance, is the bulk of the information content of the document.
A complete well-formed XML document may look like:
<?xml version="1.0"?>
<LAND>
  <FOREST>
    <TREE>Oak</TREE>
    <TREE>Pine</TREE>
    <TREE>Maple</TREE>
  </FOREST>
  <MEADOW>
    <GRASS>Bluegrass</GRASS>
    <GRASS>Fescue</GRASS>
    <GRASS>Rye</GRASS>
  </MEADOW>
</LAND>

The document below is not an XML document because it is not well-formed: it has more than one top-level element.

<?xml version="1.0"?>
<FOREST>
<TREE>Oak</TREE>
<TREE>Pine</TREE>
<TREE>Maple</TREE>
</FOREST>
<MEADOW>
<GRASS>Bluegrass</GRASS>
<GRASS>Fescue</GRASS>
<GRASS>Rye</GRASS>
</MEADOW>
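
The single-root rule can be seen with any XML parser. This is a minimal sketch using Python's standard xml.etree.ElementTree; the shortened documents and the helper name is_well_formed are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two documents from the slides.
ONE_ROOT = (
    "<LAND><FOREST><TREE>Oak</TREE></FOREST>"
    "<MEADOW><GRASS>Rye</GRASS></MEADOW></LAND>"
)
TWO_ROOTS = (
    "<FOREST><TREE>Oak</TREE></FOREST>"
    "<MEADOW><GRASS>Rye</GRASS></MEADOW>"
)

def is_well_formed(xml_text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(ONE_ROOT), is_well_formed(TWO_ROOTS))
```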

An example
Rendered outline:
Book Title: My First XML
Chapter 1: Introduction to XML
• What is HTML
• What is XML
Chapter 2: XML Syntax
• Elements must have a closing tag
• Elements must be properly nested

Corresponding XML:
<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>

XML Elements
Elements have Content
Elements can have different content types.
An XML element is everything from (including) the element's start
tag to (including) the element's end tag.
An element can have element content, mixed content, simple
content, or empty content. An element can also have attributes.

In the example above, book has element content, because it contains other elements. chapter has mixed content because it contains both text and other elements. para has simple content (or text content) because it contains only text. prod has empty content, because there is nothing between its start tag and end tag.
In the example above only the prod element has attributes. The attribute named id has the value "33-657". The attribute named media has the value "paper".
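
The four content types can be told apart programmatically. The following is a rough sketch using Python's standard xml.etree.ElementTree on a shortened copy of the book example; the function name content_kind is hypothetical, and treating whitespace-only text as "no text" is an assumption of this sketch, not a rule stated on the slide.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
<title>My First XML</title>
<prod id="33-657" media="paper"></prod>
<chapter>Introduction to XML
<para>What is HTML</para>
<para>What is XML</para>
</chapter>
</book>"""

def content_kind(element):
    """Classify an element as 'mixed', 'element', 'simple', or 'empty' content."""
    has_children = len(element) > 0
    # Text may appear before the first child (.text) or after any child (.tail).
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

book = ET.fromstring(BOOK_XML)
for name in ("book", "chapter", "para", "prod"):
    element = book if name == "book" else book.find(f".//{name}")
    print(name, content_kind(element), element.attrib)
```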
Elements & Attributes:
• XML elements must follow these naming rules:
– Names can contain letters, numbers, and other characters
– Names must not start with a number or punctuation character (an underscore is allowed)
– Names must not start with the letters xml (or XML or Xml ..)
– Names cannot contain spaces
• Use elements to describe data
• Use attributes to present information that is not part of the data
– For example, the file type or some other information that would be useful in processing the data, but is not part of the data.

<note date="12/11/99">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
An Example XML Document
<?xml version="1.0"?>
<artist>
  <name><first>Vincent</first><last>van Gogh</last></name>
  <born><date>1853</date><place>Holland</place></born>
  <died><date>1890</date><place>France</place></died>
  <artwork>
    <artifact>
      <title>The Starry Night</title>
      <date>1889</date>
      <material>Oil on Canvas</material>
      <dim>
        <height metrics_type="in">29</height>
        <width metrics_type="in">36 1/4</width>
      </dim>
      <location>Museum of Modern Art, New York</location>
      <image file="starry-night.jpg"></image>
    </artifact>
  </artwork>
</artist>
Callouts on the slide: xml-declaration (the first line), root element (<artist>), and start-tag, element content, end-tag (for example <title>, The Starry Night, </title>).
An Example XML Document
The same document also illustrates attributes, with callouts for attribute name and attribute value. For example, in <height metrics_type="in">29</height>, metrics_type is the attribute name and "in" is the attribute value.
Document Type Definition (DTD)
• The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements.
• With a DTD, each of your XML files can carry a description of its own format with it.
• With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
• Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
• You can also use a DTD to verify your own data.
• XML files can be validated against the definition before a program tries to process the data.
– Then the program has fewer structural error conditions to account for.
DTD
• Internal
– Put the DTD right into the XML file
– First, there is the definition of the file structure, then the actual data comes later
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>
DTD
• External
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Where the note.dtd file is:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
An Example XML DTD
<?xml version="1.0"?>
<!ELEMENT artist (name , born , died? , artwork?)>
<!ELEMENT name (first , last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT born (date , place?)>
<!ELEMENT died (date , place?)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT place (#PCDATA)>
<!ELEMENT artwork (artifact*)>
<!ELEMENT artifact (title , date? , material , dim , location , image)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT material (#PCDATA)>
<!ELEMENT dim (height , width)>
<!ELEMENT height (#PCDATA)>
<!ATTLIST height metrics_type CDATA #IMPLIED >
<!ELEMENT width (#PCDATA)>
<!ATTLIST width metrics_type CDATA #IMPLIED >
<!ELEMENT location (#PCDATA)>
<!ELEMENT image EMPTY>
<!ATTLIST image file CDATA #IMPLIED >

Callouts on the slide:
• GROUP: element with children (sequences), e.g. name (first , last)
• Declaring zero or one occurrences of the same element (?), e.g. place?
• Declaring zero or more occurrences of the same element (*), e.g. artifact*
• Elements with only character data, e.g. location (#PCDATA)
XML DTD Building Blocks
• Elements
– Elements are the main building blocks of both XML and HTML
documents
– Tags are used to markup elements (e.g. <name>… </name>)
– Elements are nested and may define a specific sequence
• Attributes
– Attributes provide extra information about elements (other than
their content and type)
– Attributes are always placed inside the starting tag of an element
– Attributes always come in name/value pairs
(e.g. <width metrics_type="in">36 1/4</width>)
– May be applied to one specific instance of a given element
• Entities
– Entities are variables used to define common text
XML DTD Elements
Empty elements: <!ELEMENT image EMPTY>, XML example <image />
Elements with only character data: <!ELEMENT height (#PCDATA)>
Elements with any contents: <!ELEMENT note ANY>
Elements with children (sequences): <!ELEMENT name (first, last)>
Declaring only one occurrence of the same element: <!ELEMENT artwork (artifact)>
Declaring minimum one occurrence of the same element: <!ELEMENT artwork (artifact+)>
Declaring zero or more occurrences of the same element: <!ELEMENT artwork (artifact*)>
Declaring either/or content: <!ELEMENT artifact (title, date, (image | location))>
Declaring mixed content: <!ELEMENT artifact (#PCDATA | title | date | image)*>

XML DTD Attributes and Entities
Attribute declaration example: <!ATTLIST square width CDATA "0">
XML example: <square width="100"></square>
Default attribute value: <!ATTLIST payment type CDATA "check">
XML example: <payment type="check" />
Implied attribute: <!ATTLIST contact fax CDATA #IMPLIED>
XML example: <contact fax="555-667788" />
Required attribute: <!ATTLIST person number CDATA #REQUIRED>
XML example: <person number="5677" />
Fixed attribute value: <!ATTLIST sender company CDATA #FIXED "Microsoft">
XML example: <sender company="Microsoft" />
Enumerated attribute values: <!ATTLIST payment type (check|cash) "cash">
XML example: <payment type="check" /> or <payment type="cash" />
Entity declaration: <!ENTITY writer "Donald Duck."> <!ENTITY copyright "Copyright Walt Disney.">
XML example: <author>&writer;&copyright;</author>
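
Entity references are replaced by the parser while it reads the document. The sketch below places the two entity declarations from the table in an internal DTD subset and parses the document with Python's standard xml.etree.ElementTree; wrapping them in a DOCTYPE for a root element named author is an assumption made so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring the two entities from the table above.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Copyright Walt Disney.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOCUMENT)
# The element text holds the replacement text, not the &name; references.
print(author.text)
```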
• Elements can contain text, other elements, or be empty.
Examples of empty HTML elements are "hr", "br" and
"img".
• Tags are used to markup elements.
• Attributes provide extra information about elements.
• Entities are variables used to define common text.
Entity references are references to entities.
• PCDATA - Parsed Character Data
• CDATA - Character Data (text that will not be parsed for markup)

Element with only character data:
<!ELEMENT element-name (#PCDATA)>
example: <!ELEMENT from (#PCDATA)>

Element with any data:
<!ELEMENT element-name ANY>
example: <!ELEMENT note ANY>

Elements with children:
<!ELEMENT element-name (child-element-name)> or
<!ELEMENT element-name (child-element-name,child-element-name, .....)>
example: <!ELEMENT note (to,from,heading,body)>

Full declaration of note:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
XML Namespaces
• XML Namespaces provide a method to avoid
element name conflicts.
• Conflicting names make consistent interpretation
impossible
• XML identifies namespaces to make the usage
clear.
• In XML, element names are defined by the
developer. This often results in a conflict when
trying to mix XML documents from different XML
applications.

An example
<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

If these XML fragments were added together, there would be a name conflict. Both contain a <table> element, but the elements have different content and meaning.

An XML parser will not know how to handle these differences.

Solving the Name Conflict Using a Prefix
Name conflicts in XML can easily be avoided using a name prefix.
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>

In the example above, there will be no conflict because the two <table> elements have different names.
XML Namespaces - The xmlns Attribute
When using prefixes in XML, a so-called namespace for the prefix
must be defined. The namespace is defined by the xmlns attribute in
the start tag of an element.
Syntax: xmlns:prefix="URI".
<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
• In the example above, the xmlns attributes in the two <table> start tags bind the h: and f: prefixes to namespaces.
• When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace.
• Namespaces can be declared in the elements where they are used or in the XML root element.
• The use of a prefix (h or f here) clearly separates the two meanings of the tag “table.”
• The xmlns (XML namespace) attribute identifies each prefix with a unique name. The parser does not use the link to look up a definition; the link may contain information about the namespace.
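
A namespace-aware parser keeps the two kinds of table apart by URI rather than by prefix. The sketch below shows this with Python's standard xml.etree.ElementTree; the lookup prefixes html and furniture are local names chosen for this script (an assumption), and only the URIs have to match the document.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table xmlns:f="http://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>"""

# Prefixes used for searching; they need not equal the h: and f: in the document.
NAMESPACES = {
    "html": "http://www.w3.org/TR/html4/",
    "furniture": "http://www.w3schools.com/furniture",
}

root = ET.fromstring(DOCUMENT)
cells = [td.text for td in root.findall("html:table/html:tr/html:td", NAMESPACES)]
table_name = root.find("furniture:table/furniture:name", NAMESPACES).text
# ElementTree stores each tag as {namespace-URI}local-name.
tags = [child.tag for child in root]
print(cells, table_name, tags)
```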

Default namespace
• Using a default namespace in an element avoids the need to repeat the prefix on each child element.
<table xmlns="http://www.w3.org/TR/html4/">
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

Here, all subelements of “table” have the name related to the namespace indicated in the xmlns attribute.
XML Schema
XML Schema is an XML-based alternative to DTD.
An XML schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema
Definition (XSD).

<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
• Defines the elements and attributes that can appear in a document, and their data types
• Defines relationships (parent-child) and the order and number of child elements
• Defines fixed and default values for elements and attributes
• With XML Schemas, the sender can describe the data in a way that the receiver will understand.
Example:
A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March.
However, an XML element with a data type like this:
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".

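The date ambiguity described above is easy to reproduce. This sketch uses Python's standard datetime module rather than an XML Schema validator (an assumption, since no validator is named in the slides) to show that "03-11-2004" yields two different dates depending on the convention, while the "YYYY-MM-DD" form has a single reading.

```python
from datetime import date, datetime

ambiguous = "03-11-2004"
# Day-first convention reads the string as 3 November 2004.
as_day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()
# Month-first convention reads the same string as 11 March 2004.
as_month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()

# The YYYY-MM-DD form has only one interpretation.
unambiguous = date.fromisoformat("2004-03-11")

print(as_day_first, as_month_first, unambiguous)
```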
Syntax and semantics
• Syntax determines the rules for a well-
formed statement.
• Semantics determines how a well-formed
statement will be used.
• Use of a schema can help detect errors in
use within a well-formed statement.
– Ex: May include a range of legal values.

Schema example for “note”
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">

  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The <schema> Element
The <schema> element is the root element of every XML Schema:
<?xml version="1.0"?>
<xs:schema>
... ...
</xs:schema>
The <schema> element may contain some attributes. A schema
declaration often looks something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
...
</xs:schema>
Reference to a schema
<?xml version="1.0"?>
<note
  xmlns="http://www.w3schools.com"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3schools.com note.xsd">

  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

A document using the note schema. The xmlns attribute sets the default namespace: all elements used in this XML document are declared there.

Common simple data types
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time

Default or fixed values:
• <xs:element name="color" type="xs:string" default="red"/>
• <xs:element name="color" type="xs:string" fixed="red"/>

Attributes
All attributes are declared as simple types. Elements with attributes are considered complex type.

Syntax for defining an attribute:
<xs:attribute name="xxx" type="yyy"/>

Defining an attribute for an element:
<lastname lang="EN">Smith</lastname>

Corresponding attribute definition:
<xs:attribute name="lang" type="xs:string"/>

Displaying raw XML documents
Viewing XML Files in Firefox and Internet Explorer: Open the
XML file (typically by clicking on a link) - The XML document will
be displayed with color-coded root and child elements. A plus (+) or
minus sign (-) to the left of the elements can be clicked to expand or
collapse the element structure. To view the raw XML source (without
the + and - signs), select "View Page Source" or "View Source" from
the browser menu.

Viewing XML Files in Netscape 6: Open the XML file, then right-
click in XML file and select "View Page Source". The XML
document will then be displayed with color-coded root and child
elements.

Viewing XML Files in Opera 7: Open the XML file, then right-click
in XML file and select "Frame" / "View Source". The XML document
will be displayed as plain text.
XML support in IE 5.0+
Internet Explorer 5.0 has the following XML support:
• Viewing of XML documents
• Full support for W3C DTD standards
• XML embedded in HTML as Data Islands
• Binding XML data to HTML elements
• Transforming and displaying XML with XSL
• Displaying XML with CSS
• Access to the XML DOM (Document Object Model)

*Netscape 6.0 also has full XML support

Viewing XML Documents with IE 5.0

Displaying XML documents with CSS

What is CSS ?
• CSS stands for Cascading Style Sheets
• Styles define how to display HTML/XML elements
• Styles are normally stored in Style Sheets
• Styles were added to HTML 4.0 to solve a problem
• External Style Sheets can save you a lot of work
• External Style Sheets are stored in CSS files
• Multiple style definitions will cascade into one
– cascading order

CSS Syntax
• Use of External Style Sheets
<?xml version="1.0"?>
<!DOCTYPE artist SYSTEM "artist.dtd">
<?xml-stylesheet type="text/css" href="artist.css"?>
<artist>
. . .

• Basic syntax
title {
font-family: Palatino;
font-size: 16pt;
font-weight: bold;
text-align: center;
color: #6699CC;
}

• Grouping
artifact, artist, artwork, name, title {
font-family: Arial;
font-size: 16pt;
display: block;
}
CSS Display Properties
• Background properties (color, image, position)
• Text properties (color, alignment, direction, letter-spacing)
• Font properties (family, size, style, weight)
• Border properties (color, style, width)
• Margin properties
• Padding properties
• Classification properties (visibility, inline, block)
• Pseudo-elements (:before, :after)

The following document has a link to a cascading style sheet:
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="xmlstyle.css"?>
<DATABASE>
  <TITLE>List of Items Important to Markup Languages</TITLE>
  <TITLE2>Languages</TITLE2>
  <LANGUAGES>SGML</LANGUAGES>
  <LANGUAGES>XML</LANGUAGES>
  <LANGUAGES>HTML</LANGUAGES>
  <TITLE2>Other</TITLE2>
  <OTHER>DTD</OTHER>
  <OTHER>DSSL</OTHER>
  <OTHER>Style Sheets</OTHER>
</DATABASE>
The line below, which is part of the XML document above, is a processing instruction and is part of the prolog.
<?xml-stylesheet type="text/css" href="xmlstyle.css"?>
xmlstyle.css
DATABASE
{ display: block }
TITLE {
  display: block;
  font-family: arial;
  color: #008000;
  font-weight: 600;
  font-size: 22;
  text-align: center }
TITLE2
{ display: block;
  font-family: arial;
  color: #000080;
  font-weight: 400;
  font-size: 20 }
LANGUAGES
{
  display: block;
  list-style-type: decimal;
  font-family: arial;
  color: #000000;
  font-weight: 400;
  font-size: 18 }
OTHER {
  display: block;
  list-style-type: square;
  font-family: arial;
  color: #0000ff;
  font-weight: 200;
  font-size: 14 }

Displaying XML Documents
There are several ways you can display XML documents. If your browser can display XML, you can simply send the document out to the browser. Or use an XSL stylesheet to transform the XML into something your browser can handle. An XSL stylesheet contains some number of templates that define how the elements in an XML document should be transformed.
Displaying XML Documents
If you want to do complicated sorting or restructuring that's beyond the realm of XSL, use DOM. In this method, you parse the XML document, then write Java code to manipulate the DOM tree in whatever way you wish. Your code has complete access to the DOM and all of its methods, so you're not bound by the limitations or design decisions of XSL.

Interpreting XML Documents
When you need to interpret an XML document, there are two APIs you can use:
1. The Document Object Model, or DOM, and
2. the Simple API for XML, or SAX.
The DOM is a standard of the World Wide Web Consortium that creates a tree view of your XML document. The DOM provides standard functions for manipulating the elements in your document.
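
As a rough illustration of the DOM approach, the sketch below builds a tree, reads from it, and modifies it. The slides mention Java; this sketch instead uses Python's standard xml.dom.minidom, which exposes the same W3C DOM method names. The two-CD catalog is a shortened, assumed sample.

```python
from xml.dom import minidom

DOCUMENT = (
    "<catalog>"
    "<cd><title>Empire Burlesque</title></cd>"
    "<cd><title>Hide your heart</title></cd>"
    "</catalog>"
)

# Parsing builds the whole tree in memory.
dom = minidom.parseString(DOCUMENT)
titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]

# Manipulate the tree: append a new <cd> element with a <title> child.
new_cd = dom.createElement("cd")
new_title = dom.createElement("title")
new_title.appendChild(dom.createTextNode("Greatest Hits"))
new_cd.appendChild(new_title)
dom.documentElement.appendChild(new_cd)

cd_count = len(dom.getElementsByTagName("cd"))
print(titles, cd_count)
```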
Interpreting XML Documents

The SAX API notifies you when certain events happen as it parses your document. When you respond to an event, any data you don't specifically store is discarded.
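
The event model can be sketched with Python's standard xml.sax module. The handler below keeps only the text of <title> elements and lets everything else pass by, which is why SAX needs little memory. The class name TitleCollector and the sample catalog are assumptions for illustration.

```python
import xml.sax

class TitleCollector(xml.sax.ContentHandler):
    """Store the text of every <title> element; ignore all other data."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

DOCUMENT = (
    b"<catalog>"
    b"<cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>"
    b"<cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>"
    b"</catalog>"
)

handler = TitleCollector()
xml.sax.parseString(DOCUMENT, handler)
print(handler.titles)
```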

SAX or DOM?
Why would you use SAX or DOM? If your document is very large, using SAX will save significant amounts of memory when compared to using DOM. This is especially true if you only need a few elements in a large document. On the other hand, the rich set of standard functions provided by the DOM isn't available when you use SAX.

A Simple Example
Callouts on the slide: the first line is the XML declaration ("this is XML"), and its encoding attribute names the binary encoding used in the file.

<?xml version="1.0" encoding="iso-8859-1"?>
<partorders
    xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets,
           with matching hamster
    </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    . . . Order something else . . .
  </order>
</partorders>
Example Revisited
Callouts on the same document: <partorders>, <order>, <desc> and the other names in angle brackets are element tags; units="gross" is an attribute of the quantity element. Together they form hierarchical, structured information.
XML Data Model - A Tree
<partorders xmlns="...">
  <order date="..." ref="...">
    <desc> ..text.. </desc>
    <part />
    <quantity />
    <delivery-date />
  </order>
  <order ref=".." .../>
</partorders>

The same document as a tree:
partorders
  xmlns=
  order
    ref=
    date=
    desc
      text
    part
    quantity
      text
    delivery-date
  order
    ref=
    date=

XSLT style sheets
XSL (Extensible Stylesheet Language) consists of three parts:
• XSLT - a language for transforming XML documents
• XPath - a language for navigating in XML documents
• XSL-FO - a language for formatting XML documents

XSLT is the recommended style sheet language of XML.
XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS.
XSLT is a language for transforming XML documents into XHTML documents or into other XML documents.
With XSLT you can add/remove elements and attributes to or from the output file. You can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more.
XSL Style Sheet Declaration

The root element that declares the document to be an XSL style sheet is <xsl:stylesheet> or <xsl:transform>.
Both are completely synonymous and either can be used!

The correct way to declare an XSL style sheet:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
or:
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

To get access to the XSLT elements, attributes and features we must declare the XSLT namespace at the top of the document.

XSLT ELEMENTS
<xsl:template> Element
The <xsl:template> element is used to build templates.
The match attribute is used to associate a template with an XML element. The match attribute can also be used to define a template for the entire XML document. The value of the match attribute is an XPath expression (e.g. match="/" defines the whole document).

<xsl:value-of> Element
The <xsl:value-of> element can be used to extract the value of an XML element and add it to the output stream of the transformation.

<xsl:for-each> Element
The <xsl:for-each> element allows you to do looping in XSLT. It can be used to select every XML element of a specified node-set.
<xsl:sort> Element
The <xsl:sort> element is used to sort the output. To sort the output, simply add an <xsl:sort> element inside the <xsl:for-each> element in the XSL file.

<xsl:if> Element
The <xsl:if> element is used to put a conditional test against the content of the XML file. To do so, add an <xsl:if> element to the XSL document.

<xsl:choose> Element
The <xsl:choose> element is used in conjunction with <xsl:when> and <xsl:otherwise> to express multiple conditional tests.
Syntax
<xsl:choose>
  <xsl:when test="expression">
    ... some output ...
  </xsl:when>
  <xsl:otherwise>
    ... some output ....
  </xsl:otherwise>
</xsl:choose>
<xsl:apply-templates> Element

The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes. If we add a select attribute to the <xsl:apply-templates> element it will process only the child elements that match the value of the attribute. We can use the select attribute to specify the order in which the child nodes are processed.

Examples on XSLT: consider transforming the following XML document ("cdcatalog.xml") into XHTML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <country>UK</country>
    <company>CBS Records</company>
    <price>9.90</price>
    <year>1988</year>
  </cd>
  <cd>
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <country>USA</country>
    <company>RCA</company>
    <price>9.90</price>
    <year>1982</year>
  </cd>
  <cd>
    <title>Still got the blues</title>
    <artist>Gary Moore</artist>
    <country>UK</country>
    <company>Virgin Records</company>
    <price>10.20</price>
    <year>1990</year>
  </cd>
  … … …
</catalog>
Create an XSL Style Sheet ("cdcatalog.xsl") with a transformation template:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>My CD Collection</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th>Title</th>
        <th>Artist</th>
      </tr>
      <xsl:for-each select="catalog/cd">
      <tr>
        <td><xsl:value-of select="title"/></td>
        <td><xsl:value-of select="artist"/></td>
      </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
Link the XSL Style Sheet to the XML Document
Add the XSL style sheet reference to your XML document ("cdcatalog.xml"):

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
  .
  .
</catalog>
OUTPUT:

My CD Collection
Title Artist

Empire Burlesque Bob Dylan

Hide your heart Bonnie Tyler

Greatest Hits Dolly Parton

Still got the blues Gary Moore

Eros Eros Ramazzotti

One night only Bee Gees

Sylvias Mother Dr.Hook

Maggie May Rod Stewart

Romanza Andrea Bocelli

When a man loves a woman Percy Sledge

Black angel Savage Rose

1999 Grammy Nominees Many

Modifying cdcatalog.xsl
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td>.</td>
<td>.</td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
OUTPUT:

My CD Collection
Title Artist

. .

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<xsl:sort select="artist" />
<tr> <td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td> </tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
OUTPUT:

My CD Collection
Title                       Artist
Romanza                     Andrea Bocelli
One night only              Bee Gees
Empire Burlesque            Bob Dylan
Hide your heart             Bonnie Tyler
The very best of            Cat Stevens
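The effect of the xsl:sort element can be approximated in Python as a sketch. It assumes the standard-library ElementTree module and a four-record excerpt of the catalog; note that Python compares strings by code point, whereas an XSLT processor may apply language-aware ordering.

```python
import xml.etree.ElementTree as ET

# Four records from the catalog, deliberately not in artist order.
CATALOG_XML = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
  <cd><title>Romanza</title><artist>Andrea Bocelli</artist></cd>
  <cd><title>One night only</title><artist>Bee Gees</artist></cd>
</catalog>"""

root = ET.fromstring(CATALOG_XML)

# Plays the role of <xsl:sort select="artist"/> inside the for-each loop.
cds_by_artist = sorted(
    root.findall("cd"),
    key=lambda cd: cd.findtext("artist", default=""),
)

sorted_rows = [(cd.findtext("title"), cd.findtext("artist")) for cd in cds_by_artist]
for title, artist in sorted_rows:
    print(f"{title:<28}{artist}")
```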

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html> <body> <h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32"> <th>Title</th> <th>Artist</th> </tr>
<xsl:for-each select="catalog/cd">
<tr> <td><xsl:value-of select="title"/></td>
<xsl:choose>
<xsl:when test="price > 10">
<td bgcolor="#ff00ff">
<xsl:value-of select="artist"/> </td>
</xsl:when>
<xsl:when test="price > 9">
<td bgcolor="#cccccc">
<xsl:value-of select="artist"/></td>
</xsl:when>
<xsl:otherwise>
<td><xsl:value-of select="artist"/></td>
</xsl:otherwise>
</xsl:choose>
</tr> </xsl:for-each> </table> </body> </html> </xsl:template>
</xsl:stylesheet>
OUTPUT:
My CD Collection
Title                       Artist
Empire Burlesque            Bob Dylan
Hide your heart             Bonnie Tyler
Greatest Hits               Dolly Parton
Still got the blues         Gary Moore
Eros                        Eros Ramazzotti
One night only              Bee Gees
Sylvias Mother              Dr.Hook
Maggie May                  Rod Stewart
Romanza                     Andrea Bocelli
When a man loves a woman    Percy Sledge
Black angel                 Savage Rose
1999 Grammy Nominees        Many
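The plain-text output cannot show the cell colours, so here is a small Python sketch of the decision made by xsl:choose. It assumes the standard-library ElementTree module, a two-record excerpt of the catalog, and a hypothetical helper name `artist_cell_color`; the tests are tried in order and the first match wins, as in the stylesheet.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
</catalog>"""


def artist_cell_color(price):
    """Return the background colour of the artist cell, or None for a plain cell."""
    if price > 10:   # first xsl:when
        return "#ff00ff"
    if price > 9:    # second xsl:when, reached only if the first test failed
        return "#cccccc"
    return None      # xsl:otherwise


root = ET.fromstring(CATALOG_XML)
colors = {
    cd.findtext("title"): artist_cell_color(float(cd.findtext("price")))
    for cd in root.findall("cd")
}
print(colors)
```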

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html> <body> <h2>My CD Collection</h2>
<xsl:apply-templates/>
</body> </html>
</xsl:template>
<xsl:template match="cd">
<p>
<xsl:apply-templates select="title"/>
<xsl:apply-templates select="artist"/>
</p>
</xsl:template>
<xsl:template match="title">
Title: <span style="color:#ff0000">
<xsl:value-of select="."/></span> <br />
</xsl:template>
<xsl:template match="artist">
Artist: <span style="color:#00ff00">
<xsl:value-of select="."/></span> <br />
</xsl:template>
</xsl:stylesheet>
OUTPUT:

My CD Collection
Title: Empire Burlesque
Artist: Bob Dylan

Title: Hide your heart
Artist: Bonnie Tyler

Title: Greatest Hits
Artist: Dolly Parton

Title: Still got the blues
Artist: Gary Moore

Title: Eros
Artist: Eros Ramazzotti

Title: One night only
Artist: Bee Gees
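The template-matching idea behind xsl:apply-templates can be sketched in Python as a lookup from element name to handler. This is a simplified analogue under stated assumptions: the function and table names are hypothetical, text nodes are ignored, and elements without a handler simply pass processing on to their children (roughly what the built-in XSLT rule does for elements).

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><country>USA</country></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><country>UK</country></cd>
</catalog>"""


def render_title(el):
    return [f"Title: {el.text}"]


def render_artist(el):
    return [f"Artist: {el.text}"]


def render_cd(el):
    # Like the "cd" template: only title and artist are selected, so
    # other children such as <country> produce no output.
    lines = []
    for name in ("title", "artist"):
        for child in el.findall(name):
            lines.extend(apply_templates(child))
    return lines


# Element name -> handler, standing in for <xsl:template match="...">.
TEMPLATES = {"cd": render_cd, "title": render_title, "artist": render_artist}


def apply_templates(el):
    handler = TEMPLATES.get(el.tag)
    if handler is not None:
        return handler(el)
    # No matching template: descend into the children.
    lines = []
    for child in el:
        lines.extend(apply_templates(child))
    return lines


output_lines = apply_templates(ET.fromstring(CATALOG_XML))
print("\n".join(output_lines))
```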
XML processors

XML processors are software programs that are used to parse XML
documents and provide access to the documents' content and
structure.
XML processors can also validate the structure of XML documents
and transform the documents into a variety of formats.
Examples of XML processors:
• IBM XML4J
• Microsoft MSXML in Java
• Oracle XML Parser
• Sun “Java Project X”
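As an illustration of what a processor gives an application, here is a minimal sketch using Python's standard-library ElementTree parser, which is not one of the processors listed in these slides. It checks well-formedness only and does not validate against a DTD or schema; the helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


doc = "<catalog><cd><title>Empire Burlesque</title><year>1985</year></cd></catalog>"

# Access to structure: walk every element in document order.
root = ET.fromstring(doc)
tags = [el.tag for el in root.iter()]

# Access to content: read the text of a nested element by path.
title = root.findtext("cd/title")

print(tags, title)
print(is_well_formed(doc), is_well_formed("<a><b></a>"))
```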

What is a web service?
• “Web services” is an effort to build a distributed
computing platform for the Web.
• Web-based applications that dynamically interact with other Web
applications using open standards that include XML, UDDI and
SOAP.
• Web Services is a technology applicable for
computationally distributed problems, including access to
large databases
– A Service-Oriented Architecture (SOA) is a collection of services
or software agents that communicate freely with each other.
– Sub-topic definition: Web Services protocols and standards are the
technologies that promote the sharing and distribution of
information and business data.

Today’s Web
• The Web was designed for application-to-human interactions
• It has served its purpose very well:
– Information sharing: a distributed content library.
– Enabled B2C e-commerce.
– Non-automated B2B interactions.
• How did it happen?
– Built on very few standards: HTTP + HTML
– Shallow interaction model: very few assumptions made about
computing platforms.
– Result was ubiquity.

What’s next?
• The Web is everywhere. There is a lot more we can do!
– E-marketplaces.
– Open, automated B2B e-commerce.
– Business process integration on the Web.
– Resource sharing, distributed computing.
• Current approach is ad-hoc on top of existing standards.
– e.g., application-to-application interactions with HTML forms.
• Goal:
enabling systematic application-to-application interaction
on the Web.

Designing Web Services
• Goals
– Enable universal interoperability.
– Widespread adoption, ubiquity: fast!
• Compare with the good but still limited adoption of the OMG’s OMA.
– Enable (Internet scale) dynamic binding.
• Support a service oriented architecture (SOA).
– Efficiently support both open (Web) and more constrained
environments.
• Requirements
– Based on standards. Pervasive support is critical.
– Minimal amount of required infrastructure is assumed.
• Only a minimal set of standards must be implemented.
– Very low level of application integration is expected.
• But may be increased in a flexible way.
– Focuses on messages and documents, not on APIs.

Web Services Model

Web service applications are encapsulated, loosely coupled Web
“components” that can bind dynamically to each other

Why use more than one computer?
• Distributed resources
– access to shared data
– access to shared programs
– access to CPU (e.g. many desktop PCs together),
to memory, to special devices (e.g. printer)
• Complete independence from the internal
implementation

Distributed architecture
• gives
– access to distributed resources
– development encapsulation
• maintainability, re-usability, legacy-awareness
– implementation independence
• requires
– adding a communication layer between parts
– synchronization of efforts
• including such nasty things as distributed garbage
collection
Distributed architecture
[Figure: a server waiting for requests (known location, known port) and a client sending requests and getting results, linked by a communication protocol and data format]

• Basic questions are:
– What kind of protocol to use, and what data to
transmit
– What to do with requests on the server side
Traditional CGI-based approach
[Figure: a server waiting for requests (known location, known port) and a client sending requests and getting results; data travels as name/value pairs]

• cgi-bin scripts:
– Data transmitted as name-value pairs (HTML forms)
– Transport over the (stateless) HTTP protocol
– no standards for keeping user sessions (statefulness)
– server side: a script is called
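A short sketch of the name/value encoding that HTML forms use, written with Python's standard-library urllib.parse module. The field names and values are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields, as a browser would collect them from an HTML form.
form_fields = {"artist": "Bob Dylan", "year": "1985"}

# Client side: encode the fields as name=value pairs joined by "&".
encoded = urlencode(form_fields)

# Server side: the script decodes the pairs again. Each name maps to a
# list because a form may send the same name more than once.
decoded = parse_qs(encoded)

print(encoded)
print(decoded)
```

Everything arrives as flat strings, which is one reason structured data is awkward to exchange this way.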
CORBA-based approach
[Figure: a server waiting for requests (known location, known port) and a client sending requests and getting results; data travels in binary format]

• CORBA:
– Data transmitted as objects (at least it looks like that)
– Transport (usually) over well standardised IIOP protocol
– user sessions (statefulness) are very interoperable
– server side: an RPC call is made
SOAP-based communication
[Figure: a server waiting for requests (known location, known port) and a client sending requests and getting results; data travels in XML format]

• SOAP:
– Data in a well-defined XML format
– Transport over various protocols
• HTTP and SMTP are the most used, perhaps because they are
firewall-friendly
– server side: either an RPC call or a message delivered
Web services
• A collection of XML-based technologies developed
by the e-business community to address issues of:
– service discovery
– interoperable data exchange and/or application invocation
– service compositions (workflow, business processes)
• Major developers include:
– Apache, IBM, HP, SUN & Microsoft (.NET)
• http://www.webservices.org/

Web Services Architecture

Let a program “click on a web page”
Web Services Stack

SOAP
• Simple Object Access Protocol
– http://www.w3c.org/TR/SOAP/
• A lightweight protocol for exchange of
information in a decentralised, distributed
environment
• Two different styles to use:
– to encapsulate RPC calls using the extensibility and
flexibility of XML
– …or to deliver a whole document without any
method calls encapsulated
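A minimal sketch of the RPC style, built with Python's standard-library ElementTree rather than a SOAP toolkit. The envelope namespace is the SOAP 1.1 one; the operation name is borrowed from the request diagram in these slides, while the application namespace, the parameter name, and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:hello"                            # hypothetical application namespace

ET.register_namespace("soap", SOAP_NS)


def build_rpc_request(operation, params):
    """Wrap a method call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_rpc_request(message):
    """Server side: recover the operation name and parameters."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]                          # first child of Body is the call
    operation = call.tag.split("}", 1)[1]   # strip the "{namespace}" part
    params = {child.tag: child.text for child in call}
    return operation, params


message = build_rpc_request("setHelloMessage", {"message": "Hello, world"})
operation, params = read_rpc_request(message)
print(message)
print(operation, params)
```

In the document style, the Body would carry a whole business document instead of an element named after a method.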
Request:
setHelloMessage

Request:
getHelloMessage

XML Messaging Using SOAP

WSDL
• Web Services Description Language
– http://www.w3.org/TR/wsdl/
• An XML-based language for describing Web
Services
– what the service does (description)
– how to use it (method signatures)
– where to find the service
• It does not depend on the underlying protocol
• But: it is not very human-readable
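To make the "what, how, where" points concrete, here is a sketch that reads a heavily abbreviated WSDL 1.1 fragment with Python's standard-library ElementTree. The service, port, and operation names and the address are hypothetical, and a real WSDL file would also contain message and binding sections that are left out here.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical WSDL 1.1 fragment (messages and binding omitted).
WSDL_TEXT = """<definitions name="HelloService"
    targetNamespace="urn:example:hello"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example:hello">
  <portType name="HelloPortType">
    <operation name="setHelloMessage"/>
    <operation name="getHelloMessage"/>
  </portType>
  <service name="HelloService">
    <port name="HelloPort" binding="tns:HelloBinding">
      <soap:address location="http://example.org/hello"/>
    </port>
  </service>
</definitions>"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

root = ET.fromstring(WSDL_TEXT)

# How to use it: the operations offered by the port type.
operations = [op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)]

# Where to find it: the endpoint address of the service port.
address = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")

print(operations, address)
```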
UDDI (and alternatives)
• Universal Description, Discovery and Integration
– http://www.uddi.org
• UDDI creates a platform-independent, open
framework & registry for:
– Describing services
– Discovering businesses
– Integrating business services
• The UDDI may be less used than predicted, especially
on the Internet level
• BioMoby - an alternative for Life Sciences domain?
A Web Service example in Java
[Figure: an HTTP server running a servlet engine (e.g. Apache Tomcat). A SOAP-aware servlet (e.g. Apache Axis) accepts requests from a client (sending requests, getting results) and passes them to any class processing the incoming requests (the “business logic”)]

Why use Web Services…
(compared to CORBA)
• WS are easier to deploy because of their
firewall-friendliness
• WS are quite well marketed (both by IT
companies and Open Source projects)
• However:
– user sessions are less standardised
– many parts yet-to-be-done (notification,
transactions, security, etc.)
• The programming effort and maintainability are
similar to other distributed technologies
1. What is similar
• The programming effort and maintainability are roughly
the same for both Web Services and CORBA
– For CORBA I need an ORB
• …but do you know anybody doing WS without a SOAP toolkit?
– For CORBA I need an IDL compiler
• …not always (ask Perl folks)
• …for WS you frequently use stubs generated from WSDL
– …similar answers for valuetype/custom encoding, etc.

2. What is better
• WS are easier to deploy because of their
firewall-friendliness
• WS are quite well marketed (both by IT
companies and Open Source projects)
• Integration of WS into workflows seems to
be a very dynamic and very real topic
– compared with CORBA Components

3. What is worse
• Peer-to-peer access is problematic
– notification by “server-push” is harder to achieve
• User sessions (server’s statefulness) are less
standardised
– …and therefore less inter-operable
• Many parts yet-to-be-done, or they are quite
complex (notification, transactions, security, etc.)

So what?
• Don't throw the baby out with the bathwater
– combine the existing projects with a new Web
Services layer; in most cases it is not so difficult
• Apply existing standards to new Web Services
projects
– think MDA – it may help, even without the whole
OMG adoption process
