XQuery

From Training Material
Revision as of 12:04, 14 January 2021 by Lsokolowski (talk | contribs) (→‎Useful Sources, Examples)


Title: XQuery
Authors: Lukasz Sokolowski, Filip Stachecki

XQuery

XQuery Training Materials

XQuery Introduction ⌘

What Is XQuery?

  • XQuery is a language for querying and extracting data from XML documents.
  • It does for XML what SQL does for relational databases.
  • Documentation http://www.w3.org/XML/Query/
  • Example:
for $x in doc("exercises.xml")/exercises/exercise
where $x/number>5
order by $x/answer
return $x/answer

Path Expressions ⌘

  • used to traverse an XML tree to select elements and attributes
  • similar to paths used for file names in operating systems
  • consist of a series of steps, separated by slashes, that traverse the elements and attributes in the XML documents.
  • widely implemented
    • programming languages (PHP), tools (Selenium), etc.
  • Paths in XPath
  • https://www.w3schools.com/xml/xpath_intro.asp

FLWOR ⌘

  • For, Let, Where, Order by, Return
  • Expressions are similar to SELECTs from SQL
  • Syntax: (ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle
    • for – binds a variable to each item of a sequence in turn (iteration)
    • let – binds a variable to a whole value once, without iterating
    • where – filters the items in the loop/sequence
    • order by – sorts the sequence
    • return – expression evaluated for every item in the loop/sequence
  • group by – grouping output tuples by keys from input tuples (XQuery 3.0 and later)
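
A minimal sketch of all five clauses working together. The inline $exercises element and its content are made up for illustration and stand in for an input document, so the query runs without any external file.

```xquery
(: Hypothetical inline data instead of doc("exercises.xml") :)
let $exercises :=
  <exercises>
    <exercise><number>7</number><answer>delta</answer></exercise>
    <exercise><number>3</number><answer>alpha</answer></exercise>
    <exercise><number>9</number><answer>beta</answer></exercise>
  </exercises>
for $x in $exercises/exercise        (: for: iterate over each exercise :)
let $n := xs:integer($x/number)      (: let: bind a value once per iteration :)
where $n > 5                         (: where: keep only matching tuples :)
order by $x/answer                   (: order by: sort what is left :)
return <answer number="{$n}">{string($x/answer)}</answer>
```

The exercise with number 3 is filtered out, and the two remaining answers come back sorted alphabetically: beta (number 9) first, then delta (number 7).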

Functions ⌘

  • There are over 100 functions built into XQuery, shared also with XPath and XSLT.
  • We can define our own - in the query or in an external library.
  • Can be called from almost any place in a query.
  • Can be used to:
    • manipulate strings and dates
    • perform mathematical calculations
    • combine sequences of elements
    • perform many other useful jobs
  • XPath Functions
  • https://www.w3schools.com/xml/xsl_functions.asp

Joins ⌘

FLWORs can easily join data from multiple sources.

  • Example:
    • Suppose we want to join information from courses.xml and venues.xml.
    • We want a list of all the Courses along with their name and name of related Venue.
for $course in doc("courses.xml")//course
let $vname := doc("venues.xml")//venue[courseId = $course/courseId]/name
return <item courseId="{$course/courseId}"
             cname="{$course/name}"
             vname="{$vname}"/>

XQuery Basics ⌘

The Design of the XQuery Language:

  • design work started in 1999; the language is based on Quilt, which derives from XQL and XML-QL
  • useful for highly structured and semi-structured documents
  • protocol-independent, can be evaluated on any system
  • declarative language (not procedural)
  • strongly typed, queries can be compiled (finding errors early, optimizing evaluation)
  • allows querying across collections of documents
  • compatible with, and shares technology with, XML 1.0, Namespaces, XML Schema, XSLT and XPath

XQuery in Context ⌘

XQuery depends on/is related to XPath, XSLT, SQL, and XML Schema.

  • XPath selects elements and attributes from an XML document (traversing the hierarchy, filtering out unwanted content)
    • XQuery 1.0 and XPath 2.0 have the same data model and the same set of built-in functions and operators
  • XSLT transforms XML into other XML or into any other text format
    • XQuery is geared towards selecting from collections of documents, XSLT towards transforming one entire document; they share the data model, built-in functions and many expressions
  • SQL is for highly structured relational data, XQuery for less-structured data
  • XML Schema defines schemas to validate XML and to assign types to XML elements and attributes
    • XQuery uses the XML Schema type system, which enables better optimization and error detection

Processing Queries (model) ⌘

[Figure: Xquery processing.png]

Processing Queries ⌘

  • Input - XML document, fragments of XML from a URI, set of XML documents, native XML database, relational database with an XML front end, in-memory XML
  • Query - text file, embedded in program code, generated dynamically, stored in a query library, input by a user, composed from modules
    • prolog - optional, declarations separated by semicolons (namespaces, schemas, variables, functions, etc.)
    • body - usually a single expression, can be a sequence of expressions separated by commas
  • Context - set outside of the query or in the prolog (date, time, time zone, variables, external libraries, etc.)
  • Processor - parses, analyzes, and evaluates the query
    • analysis phase (like compiling), finds static errors (syntax, etc.) and type errors
    • evaluation phase (like executing), raises dynamic errors (missing input, division by zero, etc.) and type errors
    • all errors have eight-character names (for example XPST0001)
  • Results - sequence of values, can be:
    • written to a physical XML file(serialization)
    • sent to a user interface
    • passed to another application for further processing

Processing Queries (query example) ⌘

Query with prolog and body:

declare boundary-space preserve;
declare namespace cslist = "http://npdatatypes.com/courses";
declare variable $courses := doc("courses.xml")//courses;

<allCourses>{count($courses/course)}</allCourses>,
<cslist:courseId>{$courses/course/courseId}</cslist:courseId>

The XQuery Data Model ⌘

  • XDM describes the structure of the inputs and outputs of a query.
  • It plays the role that tables, columns and rows play for SQL.
  • It differs from the Infoset (the XML model) because it covers more than complete XML documents:
    • nodes - XML constructs (element, attribute, document, text, processing instruction, comment)
    • items - generic term for a node or an atomic value
    • sequences - ordered collections of zero, one or many items; a sequence of elements needs no single outermost element
    • atomic values - simple data with no markup
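
A small sketch of how items and sequences behave. The values are arbitrary; the point is that sequences are flat and may mix atomic values with nodes.

```xquery
(: Nested parentheses do not create nested sequences :)
let $seq := (1, (2, 3), (), "four", <five/>)
return (
  count($seq),                        (: 5 items: three integers, a string, an element :)
  $seq[2],                            (: the item at position 2, the integer 2 :)
  count($seq[. instance of node()])   (: 1, only the element five is a node :)
)
```

The empty sequence () disappears when it is combined with other items, which is why the count is 5 and not 6.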

XDM nodes ⌘

  • Returned by many expressions (path expressions, constructors, etc).
  • Related to each other through the hierarchy of the XML document - children, parent, ancestors, descendants, siblings
  • Root is the document node of an XML document, or the top-level element (or other node) of a document fragment
  • Every node has a unique identity assigned by the query processor (compared with the is operator); its name is returned by the functions node-name, name and local-name
  • Kinds of values: string and typed
    • string(doc("courses.xml")/courses/course[2]/courseId) - returns 102 as a string value
    • data(doc("courses.xml")/courses/course[2]/courseId) - returns 102 as an integer (if the schema declares that type)

Types ⌘

  • Strongly typed, functions and operators expect their arguments or operands to be of a particular type.
  • Based on XML Schema:
    • basic types - xs:integer, xs:string, xs:date
    • assigned to items during optional schema validation(or untyped without schema)
  • Untyped values are converted automatically - doc("venues.xml")//venue/substring(@id, 1, 2)
  • Typed values have to be converted explicitly - doc("venues.xml")//venue/substring(xs:string(@id), 1, 2)
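
The conversions above can be tried without a schema. This sketch assumes a made-up inline venue element, which is untyped because no validation takes place.

```xquery
let $venue := <venue id="2044"><rooms>3</rooms></venue>
return (
  substring($venue/@id, 1, 2),      (: untyped attribute is cast to a string: "20" :)
  $venue/rooms + 1,                 (: untyped content is cast to xs:double for arithmetic: 4 :)
  xs:integer($venue/rooms) + 1,     (: explicit cast, the result is the xs:integer 4 :)
  "3a" castable as xs:integer       (: false, so a cast would raise a dynamic error :)
)
```

Checking with castable as before casting is a common way to avoid dynamic errors on dirty input.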

Namespaces ⌘

  • They identify the vocabulary to which XML elements and attributes belong
  • Disambiguate names from different vocabularies
  • These are namespace-qualified:
    • Elements and attributes from an input document
    • Elements and attributes in the query results
    • Functions, variables, and types

Namespaces (query example) ⌘

  • Input document with namespaces (courses_ns.xml)
<nptraining:course xmlns:nptraining="http://npdatatypes.com/nptraining">
  <nptraining:courseId>101</nptraining:courseId>
  <nptraining:name language="en">Drupal 7 for developers</nptraining:name>
</nptraining:course>
  • Query and its results:
Query:
  declare namespace myprefix = "http://npdatatypes.com/nptraining";
  for $course in doc("courses_ns.xml")//myprefix:course
  return $course/myprefix:name
Results:
 <nptraining:name xmlns:nptraining="http://npdatatypes.com/nptraining"
   language="en">Drupal 7 for developers</nptraining:name>

Expressions ⌘

  • Basic unit of evaluation
  • Can contain sub-expressions which also can have other sub-expressions, etc

Categories of Expressions ⌘

Category          Description                                                     Operators, keywords
primary           literals, variables, function calls, parenthesized expressions
comparison        based on value, node identity, document order                   =, !=, <, <=, >, >=, eq, ne, lt, le, gt, ge, is, <<, >>
conditional       if-then-else expressions                                        if, then, else
logical           boolean and/or                                                  or, and
path              selecting nodes from XML docs                                   /, //, .., ., child::, etc.
constructor       adding XML to the results                                       <, >, element, attribute
flwor             controlling the selection and processing of nodes               for, let, where, order by, return
quantified        determining whether sequences fulfill specific conditions       some, every, in, satisfies
sequence-related  creating and combining sequences                                to, union (|), intersect, except
type-related      casting and validating values based on type                     instance of, typeswitch, cast as, castable, treat, validate
arithmetic        adding, subtracting, multiplying, dividing                      +, -, *, div, idiv, mod

Keywords and Names ⌘

  • Keywords are case-sensitive, generally lowercase
    • symbols and keywords with several uses (*, in) are never ambiguous, the processor resolves them from context
  • Names are case-sensitive, they identify: elements, attributes, types, variables, functions
    • XML qualified names (can start with a letter or underscore and contain letters, digits, underscores, dashes, and periods)
    • no reserved words
    • also namespace-qualified

Whitespace in Queries ⌘

  • spaces, tabs, line breaks - allowed almost anywhere in a query
  • break up expressions, make queries more readable
  • required between keywords: orderby is an error, it must be order by
  • optional around most operators: x=y and x = y are equivalent
  • significant in quoted strings, constructed elements and attributes
  • newline and carriage return are treated the same way as other whitespace characters

Literals ⌘

  • directly represented constant values: 101, "DFG", etc.
    • string literals - must be enclosed in quotes (single or double)
    • numeric literals - 20.5E2, 20, 20.6, etc.
      • the query processor infers the type from the format (xs:double, xs:integer, xs:decimal respectively)
  • values of other types are created with constructors and functions - xs:date("2014-07-03"), true(), false(), etc.

Variables ⌘

  • preceded by a dollar sign ($)
  • XML-qualified names
  • value can be any sequence: single node, single atomic, empty, multiple nodes/atomic
  • a variable cannot be assigned a new value once bound; bind a new variable instead
  • can be bound in - global variable declarations, FLWOR (for, let), quantified expressions, typeswitch expressions
  • function declarations bind parameter variables to the argument values of each call

Function Calls ⌘

  • Example - substring($courseId, 102, 103)
    • substring is the name of the function
    • three arguments are separated by commas and surrounded by parentheses
    • $courseId is a variable reference, the other two are numeric literals

Comments ⌘

  • delimited by (: and :)
  • ignored during processing
  • can contain any text, even XML markup
  • can be nested
  • we can still use this xml comment <!-- ... -->
    • it will appear in the result
    • can be useful debugging tool (if we include evaluated expressions)
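
A short sketch contrasting the two kinds of comments. The element names are arbitrary.

```xquery
(: This XQuery comment is ignored (: and may be nested :) by the processor :)
<result>
  <!-- This XML comment is copied into the result -->
  <note>(: inside element content this is literal text, not a comment :)</note>
  { 1 + 1 (: comments work again inside an enclosed expression :) }
</result>
```

The result contains the XML comment, the note element with the literal text, and the value 2.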

Evaluation Order ⌘

  • usually it is straightforward
    • if ($courseId = 101 or $courseId = 102) then $delegatesNo + 2 else $delegatesNo - 2
  • nested expressions are evaluated from the innermost one outwards
  • precedence can be forced with parentheses ( )
    • false() or true() and false() - and will be evaluated before or
    • ( false() or true() ) and false() - or will be evaluated first

Comparison Expressions (general) ⌘

  • they compare values - general, value, node
    • general - atomic values or nodes with atomic values(operators: =, !=, <, <=, >, >=):
      • no need to escape < via &lt;
      • multi-item sequences:
        • (1,2) > (3,4) => false (none of left values is bigger than on the right)
        • doc("courses.xml")/courses/course/@cat = 'XML' => true (if at least one of the cat attributes is equal to XML)
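      • (untyped, typed) = (typed, typed) - the untyped value is cast to the type of the other operand (to xs:double when that operand is numeric)
      • untyped = untyped - compared as strings (the same holds for value comparisons)
      • incomparable values and types without ordering:
        • (101, "XML") = (102, "UML") => type error (a number cannot be compared with a string)
        • only = and != (no <, <=, >, >=) for xs:hexBinary, xs:base64Binary, xs:NOTATION, xs:QName, xs:duration, all the date component types starting with g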
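
General comparisons are existential: they are true as soon as one pair of items satisfies the operator. A sketch on made-up inline data:

```xquery
let $courses :=
  <courses>
    <course cat="XML"><courseId>101</courseId></course>
    <course cat="UML"><courseId>102</courseId></course>
  </courses>
return (
  (1, 2) > (3, 4),                  (: false: no pair (left, right) satisfies > :)
  (1, 5) > (3, 4),                  (: true: 5 > 3 :)
  $courses/course/@cat = "XML",     (: true: at least one cat attribute equals "XML" :)
  $courses/course/@cat != "XML",    (: also true: the "UML" attribute differs from "XML" :)
  $courses/course/courseId = 102    (: true: untyped "102" is cast to a number :)
)
```

Note that = and != can both be true for the same operands, so != is not the negation of = on sequences.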

Comparison Expressions (value, node) ⌘

  • value - operate only on single atomic values(operators: eq, ne, lt, le, gt, ge)
    • single node with single atomic value or empty sequence
    • () eq () => empty sequence (behaves like NULL in SQL)
    • examples - "ghi" lt "jkl"(true), <ID>102</ID> gt <nid>101</nid>(true), (101,102) eq (101, 102)(type error)
    • (untyped with number) ge (untyped with number) - will be compared as strings (need to cast them to numbers)
  • node - determine if we have the same node
    • is operator - compares by nodes identity, not by their values
    • $x is $y - they both must be one of: single node, empty sequence
    • (not empty) is (empty) => empty sequence
    • deep-equal() function - compares two nodes by name, attributes and content instead of identity
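
Identity, value and deep equality give different answers for nodes that merely look alike. A sketch with a made-up list:

```xquery
let $doc := <list><item>a</item><item>a</item></list>
let $first := $doc/item[1]
let $second := $doc/item[2]
return (
  $first is $first,               (: true: the very same node :)
  $first is $second,              (: false: two nodes, although their content is equal :)
  $first eq $second,              (: true: the atomized values "a" and "a" are equal :)
  deep-equal($first, $second),    (: true: same name, attributes and content :)
  () eq 1                         (: empty sequence, contributes nothing to the result :)
)
```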

"if-then-else" Expressions ⌘

  • if (expr) then expr else expr
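  • the condition must be in parentheses; the then and else expressions need them only when they return multiple comma-separated expressions
  • else expr - required, expr can be the empty sequence ()
  • the condition is reduced to its effective boolean value (false, 0, NaN, "", () => false, most others => true)
  • nesting via else if (two keywords; there is no elseif)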
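
A minimal sketch of chained conditions. The thresholds and labels are invented for the example.

```xquery
for $duration in (1, 3, 10)
return
  if ($duration le 1) then "short"
  else if ($duration le 5) then "medium"
  else "long"
```

The query returns "short", "medium" and "long". Each else simply starts a new if expression, and the final else is still mandatory.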

"if-then-else" Expressions (query example) ⌘

Query

for $exerc in (doc("exercises.xml")/exercises/exercise)
return if ($exerc/@difficulty = 'standard')
       then (<q>{data($exerc/question)}</q>,
             <a>{data($exerc/answer)}</a>)
       else <n>{data($exerc/number)}</n>

Results

<q>What's the exact number of stones in The Stonehenge?</q>
<a>30</a>
<q>How many books are in the Harry Potter series?</q>
<a>7</a>
<n>3</n>
<n>4</n>
<n>5</n>
<n>6</n>

"and/or" Expressions ⌘

  • combine Boolean values with and, or
  • usually used in: (if-then-else), flwor's where, path expression predicates
  • false, 0, NaN, "", () => false, others usually => true
  • both have lower precedence than comparisons; and has higher precedence than or
  • negation via not() - takes also a sequence as an argument
  • examples:
    • if ($isOwned and $rooms < 2) then 1 else $rooms
    • $venue/room and $courseId => true (if at least 1 room child and $courseId is not 0 or NaN)
    • not(doc("courses.xml")/courses/course)
    • $venue/@id != '22' is different from not($venue/@id = '22')
      • for a venue without an id they return false and true respectively
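
The difference between != and not(... = ...) shows up only when the attribute is missing. A sketch on three made-up venues:

```xquery
let $venues :=
  <venues>
    <venue id="22"/>
    <venue id="23"/>
    <venue/>
  </venues>
for $venue in $venues/venue
return <check ne="{$venue/@id != '22'}" not-eq="{not($venue/@id = '22')}"/>
```

The first venue yields false and false, the second true and true, and the venue without an id yields false for != but true for not(=), because a comparison against an empty sequence is always false.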

Paths ⌘

Path Expressions

  • one or more steps separated by (/) or (//)
  • return nodes in document order
  • evaluated relative to some context item, which is starting point for the relative path
    • doc("courses.xml")/courses/course
    • $courses/course
    • course
  • result of step serves as context for eval of next step
  • processor tracks: context node, context sequence, position of context node in context sequence

XML Path Language (XPath) 2.0⌘

  • XPath is used for selecting parts of an XML document.
  • XML document is modelled as a tree of nodes.
  • XSLT and XQuery are built on XPath expressions.

XPath 2.0 Specification⌘

  • XPath 2.0 is a recommendation of World Wide Web Consortium (W3C)
http://www.w3.org/TR/xpath20/

XPath 1.0 vs. 2.0⌘

What's New in XPath 2.0?

  • Sequences
  • New data model
  • New operators and functions

XPath 2.0 Data model⌘

[Figure: XqueryTypeModelHier.png]

XQuery 1.0 and XPath 2.0 Data Model

http://www.w3.org/TR/xpath-datamodel/

Node Types⌘

  • Document node (root node in XPath 1.0) - an entire XML document (not its outermost element)
  • Element node - an XML element
  • Attribute node - an XML attribute
  • Text node - character data content of an element
  • Processing instruction node - an XML processing instruction
  • Comment node - an XML comment

Atomic Value⌘

  • simple data value (no markup)
  • can have a specific type (e.g. xs:integer), or be untyped (xs:untypedAtomic)

Sequences⌘

  • ordered collections of zero or more items (nodes or atomic values)
  • examples: (1, 2, 3, 4), (1 to 10)
  • sequence constructor - some values, delimited by commas, surrounded by parentheses

Node sets vs. Sequences⌘

  • Path expressions in XPath 1.0 return node sets, in XPath 2.0 return sequences
  • Node sets
    • no duplicates
    • no order
  • Sequences
    • ordered collection (list)
    • zero, one, or more items (not just nodes)
    • may have duplicates

Path expressions⌘

  • A path can be absolute or relative.
  • An absolute path starts with a slash ( / )
  • A path consists of one or more steps, each separated by a slash:
    • An absolute path: /step/step/...
    • A relative path: step/step/...

Step⌘

[Figure: XPathStepSyntax.png]

  • A step consists of:
    • an axis - defines the tree relationship between the selected nodes and the context node
    • a node test - identifies one or more nodes within an axis
    • zero or more predicates - further conditions that filter the selected nodes

Axes⌘

[Figure: XPathAxis.png]

Axes Description⌘

  • self - the context node itself
  • child - the children of the context node. Only document nodes and element nodes have children.
  • descendant - the descendants of the context node (the children, the children of the children, and so on).
  • descendant-or-self - the context node and the descendants of the context node.
  • parent - the parent of the context node, or an empty sequence if the context node has no parent.
  • ancestor - the ancestors of the context node (the parent, the parent of the parent, and so on).
  • ancestor-or-self - the context node and the ancestors of the context node.
  • preceding-sibling - children of the context node's parent that occur before the context node in document order.
  • preceding - all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes.
  • following-sibling - children of the context node's parent that occur after the context node in document order.
  • following - all nodes in the document after the closing tag of the current node except the context node’s descendants.
  • attribute - the attributes of the context node.
  • namespace - the namespace nodes of the context node (deprecated in XPath 2.0)
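
A sketch that walks several axes from one context node. The inline data is made up, and the query assumes a processor that supports the full set of axes (in XQuery 1.0 the ancestor, preceding, following and sibling axes belong to the optional Full Axis Feature).

```xquery
let $doc :=
  <courses>
    <course id="c1"><name>UML</name><type>Open</type></course>
    <course id="c2"><name>Drupal</name><type>Closed</type></course>
  </courses>
let $name2 := $doc/course[2]/name
return (
  name($name2/parent::*),                                    (: course :)
  string($name2/following-sibling::type),                    (: Closed :)
  string($name2/preceding::name),                            (: UML :)
  (for $a in $name2/ancestor-or-self::* return name($a)),    (: courses, course, name :)
  string($name2/../@id),                                     (: c2, parent step plus attribute axis :)
  count($doc/descendant::*)                                  (: 6 :)
)
```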

Node Test⌘

  • A condition that must be true for each node selected by a step.
  • The condition may be based on:
    • the kind of the node (element, attribute, text, document, comment, or processing instruction),
    • the name of the node
    • the type of the node

Predicates ⌘

  • Filter results with specific criteria, can appear at the end of any step
  • [ Expression ] - examples: [answer = "30"], [number < 3], [@lang = "pl"], [question], course[@category]/courseId
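  • value comparisons behave differently from general comparisons
    • they allow only single values - course[type eq "Remote"] raises an error when a course has several type children, course[type = "Remote"] does not
    • they treat untyped values as strings - venue[@id eq 20] is a type error for an untyped @id, unlike venue[@id = 20]
  • positional predicates (position within the sequence being filtered, not among all children of the parent) - course/type[3] versus (course/type)[3]
  • useful functions: position(), last() - course[position() > 2], course[last()-2]
  • on reverse axes positions are counted in reverse document order - doc("courses.xml")//type/ancestor::*[last( )] selects the outermost ancestor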
  • can be multiple - course[type = "Closed"][2] versus course[2][type = "Closed"]
  • can contain any expression - course[if ($sf) then duration else true()], course[@category = ("uml", "drupal")], course[*[4][self::type]]
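
A sketch of how positional predicates depend on what they are applied to. The two courses below are invented for the example.

```xquery
let $courses :=
  <courses>
    <course><type>Open</type><type>Closed</type></course>
    <course><type>Remote</type><type>Closed</type></course>
  </courses>
return (
  count($courses/course/type[2]),                (: 2: the second type child of every course :)
  string(($courses/course/type)[2]),             (: Closed: second item of the whole sequence of types :)
  string(($courses/course/type)[3]),             (: Remote :)
  count($courses/course[type = "Closed"][2]),    (: 1: the second of the courses that have a Closed type :)
  count($courses/course[2][type = "Open"]),      (: 0: the second course has no Open type :)
  string($courses/course[last()]/type[1])        (: Remote :)
)
```

The order of multiple predicates matters because each one filters the result of the previous one.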

Dynamic Paths ⌘

Input Documents ⌘

  • usually well-formed physical XML documents
  • accessed via: the doc() function (single document), the collection() function, a context item set outside the query, external variables

Context ⌘

  • Only slash and square brackets(in predicates) can change context node
    • doc("exercises.xml")/exercises/exercise/(if (answer) then answer else points)
    • ../exercise[number > 5]
  • (.) represents the context node itself in: predicates, paths, functions parameters
    • ../exercise/number[. > 5]
    • ../exercise/answer[starts-with(., "A")]
  • Some functions by default use the context node as their argument
    • ../exercise/question[string-length( ) > 15]
  • root() function
    • doc("venues.xml")/venues/venue/root()

Constructors ⌘

  • It's possible to create new elements and attributes in query results
  • Types of constructors:
    • direct - use an XML-like syntax, fixed names
    • computed - dynamically generated names

Including Elements and Attributes from the Input Document ⌘

Query

for $course in doc("courses.xml")/courses/course[@category = 'drupal']
return $course

Result

<course category="drupal">
  <name lang="en">Drupal 7 for developers</name>
  <type>Closed</type>
  <courseId>102</courseId>
  <duration>2</duration>
</course>

Direct Constructors ⌘

  • When an element's opening tag appears in the query, everything from it down to the matching end tag is treated as a constructor
  • The element is included in the query result with all of its attributes and text
  • Can contain
    • literal characters, entity references, CDATA sections, escaped curly braces ({{ and }}), other element constructors, etc.
    • enclosed expressions that evaluate to: elements, attributes, atomic values, multiple subexpressions
  • Specifying Attributes Directly, Declaring Namespaces in Direct Constructors
  • Use Case: Modifying an Element from the Input Document
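
One possible way to handle the "modifying an element" use case with a direct constructor, sketched on a made-up inline course: copy the attributes and children that stay the same, and rebuild only the part that changes. The status and unit attributes are hypothetical additions.

```xquery
let $course :=
  <course category="drupal">
    <name lang="en">Drupal 7 for developers</name>
    <duration>2</duration>
  </course>
return
  <course status="updated">{
    $course/@*,                        (: copy the existing attributes first :)
    $course/*[not(self::duration)],    (: copy all children except duration :)
    <duration unit="days">{ $course/duration + 1 }</duration>
  }</course>
```

Attribute nodes must come before any other content in the enclosed expression. The new course element carries both attributes, the unchanged name element, and a duration of 3.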

Direct Constructor (xml) ⌘

<book isbn="isbn-0060229357">
  <title>Harold and the Purple Crayon</title>
  <author>
    <first>Crockett</first>
    <last>Johnson</last>
  </author>
</book>

Direct Constructor (example) ⌘

  • Constructor of element placed in the query
declare variable $device := doc("../source_files/courses.xml");

<result>{
  for $el in $device//* return
  <elem depth="{ count( $el/ancestor::node() ) }">
  Element with name: {name($el)}</elem>
}</result>

Direct Constructor (multiple example) ⌘

for $entry in doc("source_files/svnlist.xml")/lists/list/entry
return <li>{$entry/@kind, "string", 4+2, $entry/size}</li>

Computed Constructors ⌘

  • Compute element and attr names dynamically
  • Can create those types of nodes:
    • element, attribute, document, text, processing-instruction, comment
  • Use cases:
    • to make minor changes to document's content
    • to move all the elements to a different namespace
    • to rename elem or attr with input content
    • for language translation (look up of elem names in separate dict)

Computed Constructor ⌘

element {name expression} {content expression}

element book {
  attribute isbn {"isbn-0060229357" },
  element title { "Harold and the Purple Crayon"},
  element author {
    element first { "Crockett" },
    element last {"Johnson" }
  }
}

Computed Constructor (element examples) ⌘

element h2 { "Courses Catalog" }

element {concat("h",$levl)} { "Exercises List" }

element {node-name($mineNds)} { "fresh meat" }

element li {"course ID:", data($courses/courseId), ", name:", data($courses/name)}

element li {concat("course ID:", data($courses/courseId), ", name:", data($courses/name))}

Computed Constructor (attr examples) ⌘

attribute mineAttr { $course/@category }

attribute {concat("mine", "Attr")} { $course/@category }

<result>{attribute {concat("mine", "Attr")} { "yum" } }</result>

Invalid use (namespace):

attribute xmlns:nptraining { "http://npdatatypes.com/nptraining" }

Use a computed namespace constructor instead (XQuery 3.0 and later): http://www.w3.org/TR/xquery-30/#id-computed-namespaces

Computed Constructor (query example)⌘

Query

for $cat in 
  distinct-values( doc("../source_files/courses.xml")/courses/course/@category )
return
  element { $cat } 
          { doc("../source_files/courses.xml")/courses/course[@category = $cat]/name }

Results

<uml>
  <name lang="en">UML basics for managers</name>
</uml>
<drupal>
  <name lang="en">Drupal 7 for developers</name>
</drupal>
(...)

Selecting and Joining Using FLWORs ⌘

  • Selecting with Path Expressions
  • FLWOR Expressions
  • Quantified Expressions
  • Selecting Distinct Values
  • Joins

Selecting with Path Expressions ⌘

  • A path expression can be the entire content of a query; a FLWOR expression is not required
  • Useful for queries where no new elements and attributes are constructed and the results don't need to be sorted
  • Can be preferable to a FLWOR because it is more compact and some implementations evaluate it faster
    • doc("courses.xml")//course[@category = "uml"]/name
    • doc("courses.xml")//course[@category = "uml" or @category = "drupal"]/name

FLWOR Expressions ⌘

  • more readable and structured selections
  • joining data from multiple sources
  • constructing new elements and attributes
  • evaluating functions on intermediate values
  • sorting results

FLWOR Expressions (example) ⌘

for $course in doc("courses.xml")//course
let $courseCat := $course/@category
where $courseCat = "drupal" or $courseCat = "uml"
return $course/name

FLWOR Expressions (variables scope) ⌘

  • If bound in for or let clause, can be referenced anywhere in that FLWOR after the clause that binds it
  • This includes other subsequent let or for clauses, the where clause, or the return clause
  • It cannot be referenced in a for clause that precedes the let clause,
  • and it is not yet in scope inside its own let clause
    • let $c := $c + 2 - the right-hand $c refers to an earlier variable $c, or is an error if none exists
  • two variables with the same name in the same expression
    • the second masks the first and makes it inaccessible from then on (a common source of unexpected results)
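
A sketch of masking in practice. Reusing the name $c is done here only to show the scoping rules; distinct names are clearer in real queries.

```xquery
let $c := 1
let $c := $c + 2     (: a NEW variable $c; the right-hand side still sees the first $c :)
for $i in (1 to 2)
let $c := $c * $i    (: bound afresh in every iteration, nothing is carried over :)
return $c
```

The result is 3, 6 and not 3, 18, because the third $c is computed from the second one in each iteration, not from its own previous value.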

For clause ⌘

  • Range expressions
for $k in 1 to 3
return <oneCircle>{$k}</oneCircle>
  • Multiple for clauses
for $m in (1, 2)
for $k in ("x", "y")
return <oneCircle>m is {$m} and k is {$k}</oneCircle>

Let clause ⌘

  • Range expression
let $d := (1 to 3)
return <oneCircle>{$d}</oneCircle>
  • Several bindings in one let clause, separated by commas
let $courseCat := $course/@category, $courseName := $course/name

Where clause ⌘

  • multiple expressions
for $course in doc("courses.xml")//course
let $courseCat := $course/@category
where $course/courseId > 99
  and starts-with($course/name, "U")
  and exists($course/duration)
  and ($courseCat = "uml" or $courseCat = "linux")
return $course

Return clause ⌘

  • multiple expressions
for $d in (1 to 3)
return (<first>{$d}</first>, <second>{$d}</second>)

Quantified Expressions ⌘

  • determine whether some or all of the items in a sequence meet a particular condition
  • always evaluate to a Boolean value (true or false), so they cannot report which items met the condition
  • can be replaced by FLWORs or plain path expressions
  • can be more compact and easier for implementations to optimize

Quantified Expressions (examples) ⌘

some $cat in doc("courses.xml")//course/@category
satisfies ($cat = "uml")

every $cat in doc("courses.xml")//course/@category
satisfies ($cat = "uml")

not(some $cat in doc("courses.xml")//course/@category
satisfies ($cat = "uml"))

some $k in (2 to 5), $m in (12, 15)
satisfies $m - $k = 13

Selecting Distinct Values ⌘

distinct-values(doc("courses.xml")//course/@category)

let $courses := doc("courses.xml")//course
for 
  $ca in distinct-values($courses/@category),
  $id in distinct-values($courses[@category = $ca]/courseId)
return <result cat="{$ca}" courseId="{$id}"/>

Useful function - distinct-deep (http://www.functx.com/)

Joins - the fun of FLWORs ⌘

  • join data from multiple sources via FLWORs
  • Two-way join - use a where clause for many conditions, a predicate for simple conditions (less verbose)
    • In some implementations, predicates perform faster than where clauses
    • Two-way join in a predicate
    • Two-way join in a where clause
    • Three-way join (exercise it!)
  • Outer join
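
A sketch of a two-way join written with a predicate. The inline $courses and $venues elements are made up and stand in for courses.xml and venues.xml.

```xquery
let $courses :=
  <courses>
    <course><courseId>101</courseId><name>UML basics</name></course>
    <course><courseId>102</courseId><name>Drupal 7</name></course>
  </courses>
let $venues :=
  <venues>
    <venue><courseId>102</courseId><name>London</name></venue>
  </venues>
for $course in $courses/course
for $venue in $venues/venue[courseId = $course/courseId]   (: join condition as a predicate :)
return <item cname="{$course/name}" vname="{$venue/name}"/>
```

This is an inner join: course 101 has no matching venue, so only one item (Drupal 7, London) is returned.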

Two-way join in a where clause (example) ⌘

for
  $course in doc("courses.xml")//course,
  $venue in doc("venues.xml")//venue
where
  $venue/courseId = $course/courseId
return
  <item courseId="{$course/courseId}"
        cname="{$course/name}"
        vname="{$venue/name}"/>

Outer join (example) ⌘

for $course in doc("../source_files/courses.xml")//course
return 
<course id="{$course/courseId}">
{
  attribute booking
  {
    for $big in doc("../source_files/bookings.xml")//booking
    where $big/@cid = $course/courseId
    return $big/client
  }
}
</course>

Sorting and Grouping ⌘

  • Sorting in XQuery
  • Grouping
    • with FLWOR and distinct-values()
    • with group by - "generates an output tuple stream in which each tuple represents a group of tuples from the input tuple stream that have equivalent grouping keys" (XQuery 3.0 and later)
  • Aggregating Values
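
A sketch that combines the three topics in XQuery 1.0 style: grouping with distinct-values(), sorting with order by, and aggregating with count() and sum(). The inline data and the names of the result attributes are invented for the example.

```xquery
let $courses :=
  <courses>
    <course category="uml"><name>UML basics</name><duration>2</duration></course>
    <course category="drupal"><name>Drupal 7</name><duration>3</duration></course>
    <course category="uml"><name>UML advanced</name><duration>4</duration></course>
  </courses>
for $cat in distinct-values($courses/course/@category)
let $group := $courses/course[@category = $cat]
order by $cat
return
  <category name="{$cat}"
            count="{count($group)}"
            totalDuration="{sum($group/duration)}"/>
```

The result lists drupal (1 course, total duration 3) before uml (2 courses, total duration 6).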

Functions - don't reinvent the wheel ⌘

Built-in Functions ⌘

User-Defined Functions ⌘

  • We can define our own functions and then use them in expressions (also in other functions)
  • Functions should be declared in the XQuery prolog
  • Our own functions should be defined in our own namespaces or in the predefined namespace with the prefix local:
  • nice bunch of useful examples - http://www.xqueryfunctions.com/

User-Defined Functions - syntax ⌘

declare function
prefix:function_name($parameter as datatype)
as returnDatatype
{
  ...function code here...
};

User-Defined Functions (example 1) ⌘

  • Declaration:
declare function local:doubleIt($x as xs:double)
as xs:double
{ 2* $x };
  • Call:
<result>
{ local:doubleIt(100) }
</result>

User-Defined Functions (example 2) ⌘

  • Declaration:
(: Filters values from sequence and leaves only smaller than provided parameter :)
declare function local:onlySmaller(
$list, $value as xs:decimal)
{
for $el in $list
where $el < $value
return $el
};
  • Call:
<result>{
let $seq := (1, 20, 3, 40, 5, 60)
return local:onlySmaller($seq, 10)
}</result>

Regular Expressions ⌘

  • The Structure of a Regular Expression
  • Representing Individual Characters
  • Representing Any Character
  • Representing Groups of Characters
  • Character Class Expressions
  • Reluctant Quantifiers
  • Anchors
  • Back-References
  • Using Flags
  • Using Sub-Expressions with Replacement Variables
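
A sketch touching several of the topics above with the built-in functions matches(), replace() and tokenize(). The sample strings are arbitrary. Inside an XQuery string literal the backslash needs no escaping.

```xquery
(
  matches("isbn-0060229357", "^isbn-\d{10}$"),    (: true: anchors and a quantified character class escape :)
  matches("XQuery", "xquery", "i"),               (: true: the "i" flag ignores case :)
  matches("abcabc", "^(abc)\1$"),                 (: true: \1 is a back-reference to the first group :)
  replace("2014-07-03", "(\d{4})-(\d{2})-(\d{2})", "$3.$2.$1"),   (: "03.07.2014" :)
  replace("aaa", "a+?", "b"),                     (: "bbb": the reluctant quantifier matches one "a" at a time :)
  tokenize("uml, drupal,linux", ",\s*")           (: "uml", "drupal", "linux" :)
)
```

In the replacement string of replace(), $1 to $3 refer to the parenthesized sub-expressions of the pattern.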

Useful Sources, Examples

Date and Time Types ⌘

  • Extracting Components of Dates, Times, and Durations
  • Using Arithmetic Operators on Dates, Times, and Durations
  • The Date Component Types
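
A sketch of the three topics using built-in functions and operators only. The dates are arbitrary.

```xquery
let $start := xs:date("2014-07-03")
let $end := xs:date("2014-07-10")
return (
  year-from-date($start),                     (: 2014 :)
  month-from-date($start),                    (: 7 :)
  $end - $start,                              (: the xs:dayTimeDuration P7D :)
  days-from-duration($end - $start),          (: 7 :)
  $start + xs:dayTimeDuration("P2D"),         (: 2014-07-05 :)
  $start + xs:yearMonthDuration("P1Y2M"),     (: 2015-09-03 :)
  xs:gYearMonth("2014-07") eq xs:gYearMonth("2014-07")   (: true; date component types support only eq and ne :)
)
```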

Examples