
Introduction to Semistructured Data and XML


Chapter 27, Part E
Based on slides by Dan Suciu
University of Washington

Database Management Systems, R. Ramakrishnan

Management of XML and Semistructured Data
Based upon slides by Dan Suciu


Path Expressions
Examples:
Bib.paper
Bib.book.publisher
Bib.paper.author.lastname
Given an OEM instance, the value of a path expression p is a set of objects

Path Expressions
Examples, evaluated on the OEM instance DB (root Bib = &o1):

[Figure: OEM graph of the bibliography database DB. Edge labels: paper, book, references, author, title, year, http, page, publisher, firstname, lastname, first, last.]

Bib.paper = {&o12, &o29}
Bib.book.publisher = {&o51}
Bib.paper.author.lastname = {&o71, &206}
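The evaluation rule above (a path expression denotes the set of objects reached by following the labels from the root) can be sketched in Python. This is a minimal illustration, not the slides' own code: the graph fragment `DB` below is an assumption, with only enough edges to reproduce the three results, and the edge placement below the root is guessed rather than read from the figure.

```python
from typing import Dict, List, Set, Tuple

# Object id -> list of (label, target id) edges.
Graph = Dict[str, List[Tuple[str, str]]]

# Assumed fragment of the OEM instance: only the edges needed for the
# three example paths. The exact wiring below &o1 is hypothetical.
DB: Graph = {
    "&o1": [("paper", "&o12"), ("book", "&o24"), ("paper", "&o29")],
    "&o12": [("author", "&o52")],
    "&o24": [("publisher", "&o51")],
    "&o29": [("author", "&o70")],
    "&o52": [("lastname", "&206")],
    "&o70": [("lastname", "&o71")],
}


def eval_path(graph: Graph, root: str, path: str) -> Set[str]:
    """Return the set of objects reached from root by following path."""
    labels = path.split(".")[1:]  # the first component names the root itself
    current = {root}
    for label in labels:
        # Follow every edge carrying this label from every current object.
        current = {
            target
            for obj in current
            for (edge_label, target) in graph.get(obj, [])
            if edge_label == label
        }
    return current


print(eval_path(DB, "&o1", "Bib.paper"))
```

Because the result is a set, an object reachable along two different routes appears only once.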

XQuery
Summary:

FOR-LET-WHERE-ORDERBY-RETURN = FLWOR

FOR/LET Clauses
    -> List of tuples
WHERE Clause
    -> List of tuples
ORDERBY/RETURN Clause
    -> Instance of XQuery data model

XQuery

FOR $x IN expr -- binds $x to each value in the list expr

LET $x := expr -- binds $x to the entire list expr
    Useful for common subexpressions and for aggregations

FOR vs. LET

FOR $x IN document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns:
<result> <book>...</book></result>
<result> <book>...</book></result>
<result> <book>...</book></result>
...

LET $x := document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns:
<result> <book>...</book>
         <book>...</book>
         <book>...</book>
         ...
</result>
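The difference between the two bindings can be mimicked in Python with `xml.etree.ElementTree`. This is only a sketch of the semantics, assuming a tiny in-memory document in place of bib.xml; the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for bib.xml with three books.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>abc</title></book>"
    "<book><title>def</title></book>"
    "<book><title>ghi</title></book>"
    "</bib>"
)


def for_query(bib):
    """FOR: $x is bound to each book in turn, so RETURN runs once per book."""
    results = []
    for x in bib.findall("book"):
        result = ET.Element("result")
        result.append(x)
        results.append(result)
    return results


def let_query(bib):
    """LET: $x is bound once to the whole list, so RETURN runs exactly once."""
    x = bib.findall("book")
    result = ET.Element("result")
    result.extend(x)
    return [result]


print(len(for_query(BIB)), len(let_query(BIB)))
```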

Path Expressions

Abbreviated Syntax

/bib/paper[2]/author[1]
/bib//author
paper[author/lastname="Vianu"]
/bib/(paper|book)/title

Unabbreviated Syntax
child::bib/descendant::author
child::bib/descendant-or-self::*/child::author
Other axes: parent, self, descendant-or-self, attribute
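Python's `xml.etree.ElementTree` implements a small subset of the abbreviated syntax, which is enough to try the first two expressions directly. The sketch below assumes a made-up sample document. The last two expressions (a path inside a predicate, and a union step) are not part of ElementTree's subset, so they are emulated with ordinary Python filtering.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; names and titles are invented for illustration.
bib = ET.fromstring(
    "<bib>"
    "<paper><author><lastname>Abiteboul</lastname></author><title>p1</title></paper>"
    "<paper><author><lastname>Vianu</lastname></author>"
    "<author><lastname>Hull</lastname></author><title>p2</title></paper>"
    "<book><author><lastname>Ullman</lastname></author><title>b1</title></book>"
    "</bib>"
)

# /bib/paper[2]/author[1]  (bib is already the context node here)
second_paper_first_author = bib.findall("paper[2]/author[1]")

# /bib//author
all_authors = bib.findall(".//author")

# paper[author/lastname="Vianu"], emulated with a Python filter
vianu_papers = [
    p for p in bib.findall("paper")
    if any(l.text == "Vianu" for l in p.findall("author/lastname"))
]

# /bib/(paper|book)/title, emulated by testing the tag of each child
titles = [
    t for child in bib if child.tag in ("paper", "book")
    for t in child.findall("title")
]

print([t.text for t in titles])
```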

XQuery
Find all book titles published after 1995:

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title

Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>
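The same FOR / WHERE / RETURN pipeline reads naturally as a Python list comprehension. A minimal sketch, assuming a small in-memory document in which each book has a year child element:

```python
import xml.etree.ElementTree as ET

# Assumed sample data standing in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>abc</title><year>1997</year></book>"
    "<book><title>old</title><year>1990</year></book>"
    "<book><title>def</title><year>2001</year></book>"
    "</bib>"
)


def titles_after(bib, year):
    """FOR each book, keep it WHERE year > given year, RETURN its title."""
    return [
        book.find("title")
        for book in bib.findall("book")                  # FOR $x IN .../bib/book
        if book.find("year") is not None
        and int(book.findtext("year")) > year            # WHERE $x/year > 1995
    ]                                                    # RETURN $x/title


print([t.text for t in titles_after(BIB, 1995)])
```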

XQuery
For each author of a book by Morgan Kaufmann, list all books she published:

FOR $a IN distinct(document("bib.xml")
              /bib/book[publisher="Morgan Kaufmann"]/author)
RETURN <result>
           $a,
           FOR $t IN /bib/book[author=$a]/title
           RETURN $t
       </result>

distinct = a function that eliminates duplicates

XQuery
Result:
<result>
<author>Jones</author>
<title> abc </title>
<title> def </title>
</result>
<result>
<author> Smith </author>
<title> ghi </title>
</result>

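The nested query and its result can be reproduced with a short Python sketch. The sample document is an assumption, chosen so that the output matches the result shown above. Note that the inner query lists all of an author's books, not only those from Morgan Kaufmann.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; "Other Press" is a made-up second publisher.
BIB = ET.fromstring(
    "<bib>"
    "<book><publisher>Morgan Kaufmann</publisher><author>Jones</author><title>abc</title></book>"
    "<book><publisher>Other Press</publisher><author>Jones</author><title>def</title></book>"
    "<book><publisher>Morgan Kaufmann</publisher><author>Smith</author><title>ghi</title></book>"
    "</bib>"
)


def books_by_mk_authors(bib):
    books = bib.findall("book")
    # Outer FOR: distinct authors of Morgan Kaufmann books, in document order.
    authors = []
    for book in books:
        if book.findtext("publisher") == "Morgan Kaufmann":
            for author in book.findall("author"):
                if author.text not in authors:
                    authors.append(author.text)
    # Inner FOR: every title of every book that lists this author.
    return [
        (a, [b.findtext("title") for b in books
             if any(x.text == a for x in b.findall("author"))])
        for a in authors
    ]


print(books_by_mk_authors(BIB))
```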

XQuery
<big_publishers>
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")/bib/book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p
</big_publishers>

count = a (aggregate) function that returns the number of elements
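The FOR / LET / WHERE combination above can be sketched in Python as follows. The `threshold` parameter stands in for the constant 100, and the sample document is an assumption kept small so the behaviour is easy to check.

```python
import xml.etree.ElementTree as ET

# Assumed sample data: three books from publisher "A", one from "B".
BIB = ET.fromstring(
    "<bib>"
    + "<book><publisher>A</publisher></book>" * 3
    + "<book><publisher>B</publisher></book>"
    + "</bib>"
)


def big_publishers(bib, threshold):
    books = list(bib.iter("book"))
    # FOR $p IN distinct(//publisher)
    publishers = {p.text for p in bib.iter("publisher")}
    result = []
    for p in sorted(publishers):
        # LET $b := the whole list of books with this publisher
        b = [bk for bk in books if any(x.text == p for x in bk.findall("publisher"))]
        # WHERE count($b) > threshold
        if len(b) > threshold:
            result.append(p)
    return result


print(big_publishers(BIB, 2))
```

LET is what makes the aggregate possible here: `b` holds the whole list at once, so its length can be tested.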

XQuery
Find books whose price is larger than average:

LET $a := avg(document("bib.xml")/bib/book/price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/price > $a
RETURN $b
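A Python sketch of the same query, assuming a small sample document in which every book has exactly one numeric price:

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Assumed sample data; the average price is (10 + 20 + 60) / 3 = 30.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>a</title><price>10</price></book>"
    "<book><title>b</title><price>20</price></book>"
    "<book><title>c</title><price>60</price></book>"
    "</bib>"
)


def above_average(bib):
    books = bib.findall("book")
    # LET $a := avg(.../book/price), computed once over the whole list
    a = mean(float(b.findtext("price")) for b in books)
    # FOR $b ... WHERE $b/price > $a RETURN $b
    return [b for b in books if float(b.findtext("price")) > a]


print([b.findtext("title") for b in above_average(BIB)])
```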

FOR vs. LET

FOR
    Binds node variables -> iteration
LET
    Binds collection variables -> one value

Collections in XQuery

Ordered and unordered collections
    /bib/book/author = an ordered collection
    distinct(/bib/book/author) = an unordered collection

LET $a := /bib/book     -- $a is a collection
$b/author               -- a collection (several authors...)

RETURN <result> $b/author </result>

Returns:
<result> <author>...</author>
         <author>...</author>
         <author>...</author>
         ...
</result>

Collections in XQuery
What about collections in expressions ?

$b/price                 -- list of n prices
$b/price * 0.7           -- list of n numbers ??
$b/price * $b/quantity   -- list of n x m numbers ??

Valid only if the two sequences have at most one element
Atomization

$book1/author eq "Kennedy" - Value Comparison
$book1/author = "Kennedy" - General Comparison
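The difference between the two comparison forms can be sketched in Python, with lists standing in for atomized XQuery sequences. The function names are illustrative, and `None` is used here as a stand-in for the empty sequence.

```python
def general_eq(seq, value):
    """'=' (general comparison): true if SOME item in the sequence equals value."""
    return any(item == value for item in seq)


def value_eq(seq, value):
    """'eq' (value comparison): the operand must hold at most one item."""
    if len(seq) > 1:
        raise TypeError("eq requires at most one item per operand")
    if not seq:
        return None  # empty operand: the comparison yields the empty sequence
    return seq[0] == value


authors = ["Kennedy", "Smith"]        # a book with two authors
print(general_eq(authors, "Kennedy"))  # the general comparison still succeeds
```

With two authors, `value_eq(authors, "Kennedy")` raises an error, which mirrors the rule above that such expressions are valid only when each sequence has at most one element.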

Sorting in XQuery
<publisher_list>
    FOR $p IN distinct(document("bib.xml")//publisher)
    ORDERBY $p
    RETURN <publisher>
               <name> $p/text() </name>,
               FOR $b IN document("bib.xml")//book[publisher = $p]
               ORDERBY $b/price DESCENDING
               RETURN <book>
                          $b/title,
                          $b/price
                      </book>
           </publisher>
</publisher_list>
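The two levels of ordering (publishers ascending, each publisher's books by descending price) can be sketched in Python. The sample document and its publisher names are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed sample data.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>t1</title><publisher>Zeta</publisher><price>10</price></book>"
    "<book><title>t2</title><publisher>Alpha</publisher><price>5</price></book>"
    "<book><title>t3</title><publisher>Alpha</publisher><price>50</price></book>"
    "</bib>"
)


def publisher_list(bib):
    books = list(bib.iter("book"))
    # Outer FOR ... ORDERBY $p: distinct publisher names, ascending.
    names = sorted({p.text for p in bib.iter("publisher")})
    out = []
    for name in names:
        mine = [b for b in books
                if any(p.text == name for p in b.findall("publisher"))]
        # Inner ORDERBY $b/price DESCENDING
        mine.sort(key=lambda b: float(b.findtext("price")), reverse=True)
        out.append((name, [(b.findtext("title"), float(b.findtext("price")))
                           for b in mine]))
    return out


print(publisher_list(BIB))
```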

If-Then-Else

FOR $h IN //holding
ORDERBY $h/title
RETURN <holding>
           $h/title,
           IF $h/@type = "Journal"
               THEN $h/editor
               ELSE $h/author
       </holding>

Existential Quantifiers

FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
          contains($p, "sailing")
          AND contains($p, "windsurfing")
RETURN $b/title

Universal Quantifiers

FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
          contains($p, "sailing")
RETURN $b/title
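SOME and EVERY correspond closely to Python's `any` and `all`. The sketch below assumes a small sample document. The third book has no paragraphs at all, which shows that EVERY over an empty sequence is true while SOME is false.

```python
import xml.etree.ElementTree as ET

# Assumed sample data.
LIB = ET.fromstring(
    "<lib>"
    "<book><title>A</title><para>sailing and windsurfing</para><para>sailing trip</para></book>"
    "<book><title>B</title><para>sailing</para><para>windsurfing</para></book>"
    "<book><title>C</title></book>"
    "</lib>"
)


def paras(book):
    """String value of every para descendant of the book ($b//para)."""
    return ["".join(p.itertext()) for p in book.iter("para")]


def some_titles(lib):
    # SOME $p SATISFIES contains($p, "sailing") AND contains($p, "windsurfing")
    return [b.findtext("title") for b in lib.iter("book")
            if any("sailing" in p and "windsurfing" in p for p in paras(b))]


def every_titles(lib):
    # EVERY $p SATISFIES contains($p, "sailing")
    return [b.findtext("title") for b in lib.iter("book")
            if all("sailing" in p for p in paras(b))]


print(some_titles(LIB), every_titles(LIB))
```

Book B is excluded from the existential query because both words must occur in the same paragraph.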

Other Stuff in XQuery

If-then-else
Universal and existential quantifiers
Sorting
Before and After
    for dealing with order in the input
Filter
    deletes some edges in the result tree
Recursive functions

Group-By in XQuery ??

No GROUPBY currently in XQuery
A recent proposal (next)
    What do YOU think ?

Group-By in XQuery ??

with GROUPBY:

FOR $b IN document("http://www.bn.com")/bib/book,
    $y IN $b/@year
WHERE $b/publisher = "Morgan Kaufmann"
RETURN GROUPBY $y
       WHERE count($b) > 10
       IN <year> $y </year>

Equivalent SQL:

SELECT year
FROM Bib
WHERE Bib.publisher = 'Morgan Kaufmann'
GROUP BY year
HAVING count(*) > 10
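The SQL form can be run as-is against an in-memory SQLite database. This is a sketch only: the three-column layout of the `Bib` table and the sample rows are assumptions, and the publisher name and threshold are passed as parameters.

```python
import sqlite3

# Assumed sample rows: (title, publisher, year).
ROWS = [
    ("abc", "Morgan Kaufmann", 1999),
    ("def", "Morgan Kaufmann", 1999),
    ("ghi", "Morgan Kaufmann", 1999),
    ("jkl", "Morgan Kaufmann", 2000),
    ("mno", "Other Press", 2000),
    ("pqr", "Other Press", 2000),
]


def years_with_many_books(rows, publisher, min_count):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Bib (title TEXT, publisher TEXT, year INTEGER)")
    conn.executemany("INSERT INTO Bib VALUES (?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT year FROM Bib WHERE publisher = ? "
        "GROUP BY year HAVING count(*) > ? ORDER BY year",
        (publisher, min_count),
    )
    years = [row[0] for row in cur]
    conn.close()
    return years


print(years_with_many_books(ROWS, "Morgan Kaufmann", 2))
```

The WHERE clause filters rows before grouping and HAVING filters whole groups afterwards, which matches the two WHERE clauses in the GROUPBY proposal.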

Group-By in XQuery ??

with GROUPBY:

FOR $b IN document("http://www.bn.com")/bib/book,
    $a IN $b/author,
    $y IN $b/@year
RETURN GROUPBY $a, $y
       IN <result> $a,
                   <year> $y </year>,
                   <total> count($b) </total>
          </result>

Without GROUPBY:

FOR $a IN document("http://www.bn.com")/bib/book/author,
    $y IN $a/../@year
LET $b := document("http://www.bn.com")/bib/book[author=$a and @year=$y]
RETURN <result> $a,
                <year> $y </year>,
                <total> count($b) </total>
       </result>

Correct if the GROUPBY is node-identity based
Not equivalent if the GROUPBY is value-based

Group-By in XQuery ??

with GROUPBY:

FOR $b IN document("http://www.bn.com")/bib/book,
    $a IN $b/author,
    $y IN $b/@year
RETURN GROUPBY $a, $y
       IN <result> $a,
                   <year> $y </year>,
                   <total> count($b) </total>
          </result>

Without GROUPBY:

FOR $a IN distinct(document("http://www.bn.com")/bib/book/author),
    $y IN distinct(document("http://www.bn.com")/bib/book/@year)
LET $b := document("http://www.bn.com")/bib/book[author=$a and @year=$y]
RETURN IF count($b) > 0
       THEN <result> $a,
                     <year> $y </year>,
                     <total> count($b) </total>
            </result>
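The rewriting without GROUPBY pairs every distinct author with every distinct year and then discards the empty combinations. A Python sketch of that strategy, assuming a small sample document with year as an attribute of book:

```python
import xml.etree.ElementTree as ET

# Assumed sample data.
BIB = ET.fromstring(
    "<bib>"
    '<book year="1999"><author>Jones</author><author>Smith</author></book>'
    '<book year="1999"><author>Jones</author></book>'
    '<book year="2000"><author>Smith</author></book>'
    "</bib>"
)


def group_counts(bib):
    books = bib.findall("book")
    # FOR $a IN distinct(.../author), $y IN distinct(.../@year)
    authors = sorted({a.text for b in books for a in b.findall("author")})
    years = sorted({b.get("year") for b in books if b.get("year") is not None})
    results = []
    for a in authors:
        for y in years:
            # LET $b := book[author = $a and @year = $y]
            n = sum(
                1 for b in books
                if b.get("year") == y
                and any(x.text == a for x in b.findall("author"))
            )
            # IF count($b) > 0: skip (author, year) pairs that never co-occur
            if n > 0:
                results.append((a, y, n))
    return results


print(group_counts(BIB))
```

Without the final test, the pair (Jones, 2000) would appear with a total of 0, which the GROUPBY version never produces.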

Group-By in XQuery ??

with GROUPBY:

FOR $b IN document("http://www.bn.com")/bib/book,
    $a IN $b/author,
    $y IN $b/@year
RETURN GROUPBY $a, $y
       IN <result> $a,
                   <year> $y </year>,
                   <total> count($b) </total>
          </result>

Without GROUPBY:

FOR $Tup IN distinct(FOR $b IN document("http://www.bn.com")/bib/book,
                         $a IN $b/author,
                         $y IN $b/@year
                     RETURN <Tup> <a> $a </a> <y> $y </y> </Tup>),
    $a IN $Tup/a/node(),
    $y IN $Tup/y/node()
LET $b := document("http://www.bn.com")/bib/book[author=$a and @year=$y]
RETURN <result> $a,
                <year> $y </year>,
                <total> count($b) </total>
       </result>

Group-By in XQuery ??

Nested GROUPBYs:

FOR $b IN document("http://www.bn.com")/bib/book,
    $a IN $b/author,
    $y IN $b/@year,
    $t IN $b/title,
    $p IN $b/publisher
RETURN
    GROUPBY $p, $y
    IN <result> $p,
                <year> $y </year>,
                GROUPBY $a
                IN <authorEntry>
                       $a,
                       GROUPBY $t
                       IN $t
                   </authorEntry>
       </result>
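One way to read the nested GROUPBY is as a nested dictionary: group by (publisher, year), then by author inside each group, then by title inside each author. The sketch below assumes value-based grouping and works on flat tuples in place of the bound variables; the sample rows are made up.

```python
# Assumed sample bindings: one (publisher, year, author, title) tuple per
# combination produced by the FOR clause.
ROWS = [
    ("MK", 1999, "Jones", "abc"),
    ("MK", 1999, "Jones", "def"),
    ("MK", 1999, "Smith", "abc"),
    ("AW", 2000, "Smith", "ghi"),
]


def nested_groups(rows):
    """Group by (publisher, year), then by author, then by title."""
    out = {}
    for publisher, year, author, title in rows:
        # Outer GROUPBY $p, $y and middle GROUPBY $a
        titles = out.setdefault((publisher, year), {}).setdefault(author, [])
        # Inner GROUPBY $t: each distinct title is listed once
        if title not in titles:
            titles.append(title)
    return out


print(nested_groups(ROWS))
```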
