XQuery
XQuery is a functional programming language designed for querying and transforming XML data. Where XSLT is built around templates, XQuery uses a compact, SQL-like expression syntax, which makes it well suited to data-oriented XML processing tasks.
XQuery Basics
Sample XML Data
We'll use this library XML throughout our examples:
<?xml version="1.0" encoding="UTF-8"?>
<library>
<book id="1" category="fiction" published="1960">
<title>To Kill a Mockingbird</title>
<author>
<first>Harper</first>
<last>Lee</last>
</author>
<price currency="USD">12.99</price>
<pages>376</pages>
<subjects>
<subject>Literature</subject>
<subject>Drama</subject>
</subjects>
</book>
<book id="2" category="science" published="1988">
<title>A Brief History of Time</title>
<author>
<first>Stephen</first>
<last>Hawking</last>
</author>
<price currency="USD">15.99</price>
<pages>256</pages>
<subjects>
<subject>Physics</subject>
<subject>Cosmology</subject>
</subjects>
</book>
<book id="3" category="fiction" published="1949">
<title>1984</title>
<author>
<first>George</first>
<last>Orwell</last>
</author>
<price currency="USD">13.99</price>
<pages>328</pages>
<subjects>
<subject>Dystopian Fiction</subject>
<subject>Political Fiction</subject>
</subjects>
</book>
</library>
Basic Query Structure
XQuery expressions range from simple path expressions to complex FLWOR statements:
(: Simple path expression :)
//book/title
(: Extract all book titles :)
for $book in //book
return $book/title
(: More complex query with conditions :)
for $book in //book
where $book/@category = "fiction"
return $book/title
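The queries in this section start with `//book`, which only works when the processor has been given a context item (the document being queried). How that happens depends on the processor. A minimal sketch of an alternative that binds the document explicitly, assuming the sample data is saved as `library.xml` where the query can resolve it (the file name and the `$library` variable are illustrative, not part of the original examples):

```xquery
(: Assumption: the sample document is stored as library.xml and is
   resolvable relative to the query's base URI. :)
declare variable $library := doc("library.xml")/library;

for $book in $library/book
where $book/@category = "fiction"
return $book/title
```

With the sample data this returns the `title` elements of the two fiction books.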
FLWOR Expressions
FLWOR (For, Let, Where, Order by, Return) expressions are XQuery's main construct for iterating over sequences, filtering, sorting, and building results:
Basic FLWOR
for $book in //book
let $author := concat($book/author/first, " ", $book/author/last)
where $book/price < 15
order by xs:integer($book/@published)
return
<book-info>
<title>{$book/title/text()}</title>
<author>{$author}</author>
<year>{string($book/@published)}</year>
</book-info>
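One detail to watch when copying attribute values into new elements: an attribute node placed directly in an enclosed expression becomes an attribute of the constructed element rather than its text. A short sketch of the difference, using an inline `$book` element (a cut-down copy of the first sample book) so that it runs without the sample file:

```xquery
let $book := <book published="1960"><title>To Kill a Mockingbird</title></book>
return (
  (: The attribute node itself is copied onto <year>: <year published="1960"/> :)
  <year>{$book/@published}</year>,
  (: string() converts it to text first: <year>1960</year> :)
  <year>{string($book/@published)}</year>
)
```

`data($book/@published)` or `$book/@published/string()` have the same effect as `string()` here.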
Multiple For Clauses
for $book in //book
for $subject in $book/subjects/subject
where $book/@category = "fiction"
return
<fiction-subject>
<book>{$book/title/text()}</book>
<subject>{$subject/text()}</subject>
</fiction-subject>
Complex Conditions
for $book in //book
let $fullName := concat($book/author/first, " ", $book/author/last)
where $book/price > 13 and $book/@published > 1950
order by xs:decimal($book/price) descending
return
<expensive-book>
<title>{$book/title/text()}</title>
<author>{$fullName}</author>
<price>{$book/price/text()}</price>
</expensive-book>
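One caveat about `order by`: values taken from untyped XML are compared as strings unless they are cast, so prices and years should be converted to a numeric type before sorting. A small sketch with inline, hypothetical prices that makes the difference visible:

```xquery
(: Hypothetical prices, chosen so that string order and numeric order differ :)
let $prices := (<price>9.50</price>, <price>13.99</price>)
return (
  (: String comparison: "13.99" sorts before "9.50" -> 13.99 9.50 :)
  <as-string>{for $p in $prices order by $p return string($p)}</as-string>,
  (: Numeric comparison after the cast -> 9.50 13.99 :)
  <as-decimal>{for $p in $prices order by xs:decimal($p) return string($p)}</as-decimal>
)
```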
Data Types and Functions
String Functions
(: String manipulation :)
for $book in //book
let $title := $book/title/text()
return
<book>
<original-title>{$title}</original-title>
<uppercase>{upper-case($title)}</uppercase>
<lowercase>{lower-case($title)}</lowercase>
<length>{string-length($title)}</length>
<substring>{substring($title, 1, 10)}</substring>
<contains-time>{contains($title, "Time")}</contains-time>
</book>
Numeric Functions
(: Statistical operations :)
<statistics>
<total-books>{count(//book)}</total-books>
<total-pages>{sum(//book/pages)}</total-pages>
<average-price>{avg(//book/price)}</average-price>
<max-price>{max(//book/price)}</max-price>
<min-pages>{min(//book/pages)}</min-pages>
</statistics>
Date and Time Functions
(: Working with dates :)
for $book in //book
let $published := xs:integer($book/@published)
let $age := year-from-date(current-date()) - $published
where $age > 30
return
<classic-book>
<title>{$book/title/text()}</title>
<published>{$published}</published>
<age>{$age} years old</age>
</classic-book>
Conditional Expressions
If-Then-Else
for $book in //book
return
<book>
<title>{$book/title/text()}</title>
<price-category>
{if ($book/price > 15) then "expensive"
else if ($book/price > 12) then "moderate"
else "budget"}
</price-category>
</book>
Typeswitch Expression
for $item in //book/*
return
typeswitch($item)
case element(title) return <book-title>{$item/text()}</book-title>
case element(author) return <book-author>{concat($item/first, " ", $item/last)}</book-author>
case element(price) return <book-price currency="{$item/@currency}">{$item/text()}</book-price>
default return <other-element>{local-name($item)}</other-element>
Grouping and Aggregation
Group By (XQuery 3.0+)
<categories>
{
for $book in //book
group by $category := $book/@category
return
<category name="{$category}">
<count>{count($book)}</count>
<books>
{for $b in $book return <title>{$b/title/text()}</title>}
</books>
<average-price>{avg($book/price)}</average-price>
</category>
}
</categories>
Manual Grouping (Pre-3.0)
<categories>
{
let $categories := distinct-values(//book/@category)
for $cat in $categories
let $books := //book[@category = $cat]
return
<category name="{$cat}">
<count>{count($books)}</count>
<average-price>{avg($books/price)}</average-price>
</category>
}
</categories>
User-Defined Functions
Function Declaration
declare function local:full-name($author as element()) as xs:string {
concat($author/first, " ", $author/last)
};
declare function local:format-price($price as element()) as xs:string {
concat($price/@currency, " ", $price/text())
};
(: Usage :)
for $book in //book
return
<formatted-book>
<title>{$book/title/text()}</title>
<author>{local:full-name($book/author)}</author>
<price>{local:format-price($book/price)}</price>
</formatted-book>
Recursive Functions
declare function local:factorial($n as xs:integer) as xs:integer {
if ($n <= 1) then 1
else $n * local:factorial($n - 1)
};
(: Calculate factorials for page counts :)
for $book in //book
where $book/pages < 10 (: Keep inputs small; no book in the sample data qualifies, so this returns nothing :)
return
<book>
<title>{$book/title/text()}</title>
<pages>{$book/pages/text()}</pages>
<factorial>{local:factorial(xs:integer($book/pages))}</factorial>
</book>
Advanced Querying
Joins
(: Self-join: Find books by the same author :)
for $book1 in //book
for $book2 in //book
where $book1/author/last = $book2/author/last
and xs:integer($book1/@id) < xs:integer($book2/@id) (: report each pair once, not twice :)
return
<same-author>
<author>{concat($book1/author/first, " ", $book1/author/last)}</author>
<book1>{$book1/title/text()}</book1>
<book2>{$book2/title/text()}</book2>
</same-author>
Exists and Quantifiers
(: Books with multiple subjects :)
for $book in //book
where count($book/subjects/subject) > 1
return $book/title
(: Books where every subject contains "Fiction" :)
for $book in //book
where every $subject in $book/subjects/subject satisfies contains($subject, "Fiction")
return $book/title
(: Books where some subject contains "Physics" :)
for $book in //book
where some $subject in $book/subjects/subject satisfies contains($subject, "Physics")
return $book/title
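The quantifiers have an edge case worth knowing: `every` is satisfied vacuously when the sequence is empty, so a book without any subjects would pass the "every subject" test. `exists()` and `empty()` test for presence explicitly. A sketch using a hypothetical book that is not part of the sample data:

```xquery
(: Hypothetical record with an empty <subjects> element :)
let $book := <book id="4"><title>Untitled Draft</title><subjects/></book>
let $subjects := $book/subjects/subject
return (
  (: true: no subject fails the test, because there are none :)
  every $s in $subjects satisfies contains($s, "Fiction"),
  (: false: exists() additionally requires at least one subject :)
  exists($subjects) and (every $s in $subjects satisfies contains($s, "Fiction")),
  (: true: empty() is the complement of exists() :)
  empty($subjects)
)
```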
Window Clauses (XQuery 3.0+)
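(: Sliding windows of two consecutive books, sorted by the first book's publication year :)
(: $w is not in scope in the start/end conditions, so item positions are compared instead :)
for sliding window $w in //book
start at $s when true()
only end at $e when $e - $s eq 1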
order by xs:integer($w[1]/@published)
return
<window>
{for $book in $w return <title>{$book/title/text()}</title>}
</window>
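For comparison, a tumbling window splits the input into non-overlapping groups. A minimal sketch over a plain integer sequence (chosen only to keep the example self-contained); `$s` and `$e` are the positions of the first and last item of each window:

```xquery
(: Non-overlapping windows of up to two items :)
for tumbling window $w in (1 to 5)
start at $s when true()
end at $e when $e - $s eq 1
return <window>{$w}</window>
```

This yields three windows containing `1 2`, `3 4`, and `5`. Writing `only end` instead of `end` would drop the final, incomplete window.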
XML Construction
Element Construction
(: Creating new XML structures :)
<book-catalog generated="{current-dateTime()}">
<summary>
<total-books>{count(//book)}</total-books>
<categories>{count(distinct-values(//book/@category))}</categories>
</summary>
<books>
{for $book in //book
order by $book/title
return
<book id="{$book/@id}">
<info>
<title>{$book/title/text()}</title>
<author>{concat($book/author/first, " ", $book/author/last)}</author>
<published>{string($book/@published)}</published>
</info>
<details>
<category>{$book/@category/string()}</category>
<price currency="{$book/price/@currency}">{$book/price/text()}</price>
<page-count>{$book/pages/text()}</page-count>
</details>
</book>
}
</books>
</book-catalog>
Computed Constructors
(: Dynamic element and attribute names :)
for $book in //book
let $elementName := concat("book-", $book/@category)
return
element {$elementName} {
attribute {"book-id"} {$book/@id},
attribute {"year-published"} {$book/@published},
element title {$book/title/text()},
element author-info {
attribute full-name {concat($book/author/first, " ", $book/author/last)},
$book/author/first/text()
}
}
Modules and Imports
Module Declaration
(: library-utils.xq :)
module namespace lib = "http://example.com/library";
declare function lib:format-author($author as element()) as xs:string {
concat($author/last, ", ", $author/first)
};
declare function lib:calculate-discount($price as xs:decimal, $rate as xs:decimal) as xs:decimal {
$price * (1 - $rate)
};
declare function lib:isbn-checksum($isbn as xs:string) as xs:boolean {
(: Simplified ISBN validation logic :)
string-length($isbn) = 13
};
Importing Modules
import module namespace lib = "http://example.com/library" at "library-utils.xq";
(: Using imported functions :)
for $book in //book
return
<book>
<title>{$book/title/text()}</title>
<author>{lib:format-author($book/author)}</author>
<discounted-price>{lib:calculate-discount($book/price, 0.1)}</discounted-price>
</book>
Error Handling
Try-Catch Expressions
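(: Only some processors predeclare the err prefix; declaring it keeps the query portable :)
declare namespace err = "http://www.w3.org/2005/xqt-errors";
for $book in //book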
return
try {
<book>
<title>{$book/title/text()}</title>
<numeric-price>{xs:decimal($book/price)}</numeric-price>
<published-year>{xs:integer($book/@published)}</published-year>
</book>
} catch * {
<error>
<message>Error processing book: {$book/title/text()}</message>
<error-code>{$err:code}</error-code>
<error-description>{$err:description}</error-description>
</error>
}
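Errors can also be raised deliberately with `fn:error` and caught by name instead of with `catch *`. A sketch under stated assumptions: the `app` namespace URI, the error name `app:invalid-price`, and the helper `local:checked-price` are all hypothetical and only serve the illustration.

```xquery
declare namespace err = "http://www.w3.org/2005/xqt-errors";
(: Hypothetical application error namespace :)
declare namespace app = "http://example.com/errors";

(: Returns the price as a decimal, or raises app:invalid-price :)
declare function local:checked-price($price as xs:string) as xs:decimal {
  if ($price castable as xs:decimal)
  then xs:decimal($price)
  else error(xs:QName("app:invalid-price"), concat("Not a number: ", $price))
};

for $raw in ("12.99", "n/a")
return
  try {
    <price>{local:checked-price($raw)}</price>
  } catch app:invalid-price {
    (: Only this error is handled here; any other error still propagates :)
    <error code="{$err:code}">{$err:description}</error>
  }
```

The first value produces a `price` element; the second is reported as an `error` element carrying the code and description passed to `error()`.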
Performance Optimization
Index Usage
(: Use specific paths instead of descendant axis when possible :)
(: Better: :)
/library/book[@category = "fiction"]
(: Avoid: :)
//book[@category = "fiction"]
Let vs For
(: Use 'let' for expensive calculations :)
for $book in //book
let $authorName := concat($book/author/first, " ", $book/author/last)
let $priceInEuros := $book/price * 0.85 (: Stand-in for an expensive calculation; 0.85 is an illustrative rate :)
where $book/@category = "fiction"
return
<book>
<title>{$book/title/text()}</title>
<author>{$authorName}</author>
<price-eur>{$priceInEuros}</price-eur>
</book>
Practical Examples
Report Generation
<library-report date="{current-date()}">
<overview>
<total-books>{count(//book)}</total-books>
<total-value>{sum(//book/price)}</total-value>
<average-pages>{round(avg(//book/pages))}</average-pages>
<oldest-book>{min(//book/@published)}</oldest-book>
<newest-book>{max(//book/@published)}</newest-book>
</overview>
<by-category>
{for $category in distinct-values(//book/@category)
let $books := //book[@category = $category]
return
<category name="{$category}">
<count>{count($books)}</count>
<total-value>{sum($books/price)}</total-value>
<average-price>{round-half-to-even(avg($books/price), 2)}</average-price>
<titles>
{for $book in $books
order by $book/title
return <title>{$book/title/text()}</title>}
</titles>
</category>
}
</by-category>
<authors>
{for $author in distinct-values(//book/author/last)
let $books := //book[author/last = $author]
return
<author name="{$author}">
<book-count>{count($books)}</book-count>
<books>
{for $book in $books return <title>{$book/title/text()}</title>}
</books>
</author>
}
</authors>
</library-report>
Data Transformation
(: Transform to different XML structure :)
<catalog>
<metadata>
<created>{current-dateTime()}</created>
<source>Library Management System</source>
</metadata>
<items>
{for $book in //book
return
<item type="book" id="{$book/@id}">
<name>{$book/title/text()}</name>
<creator role="author">
<given-name>{$book/author/first/text()}</given-name>
<family-name>{$book/author/last/text()}</family-name>
</creator>
<publication-info>
<year>{string($book/@published)}</year>
<category>{$book/@category/string()}</category>
</publication-info>
<physical-info>
<page-count>{$book/pages/text()}</page-count>
</physical-info>
<pricing>
<amount currency="{$book/price/@currency}">{$book/price/text()}</amount>
</pricing>
<subjects>
{for $subject in $book/subjects/subject
return <topic>{$subject/text()}</topic>}
</subjects>
</item>
}
</items>
</catalog>
XQuery vs XSLT
| Aspect | XQuery | XSLT |
|---|---|---|
| Paradigm | Functional | Template-based |
| Syntax | SQL-like | XML-based |
| Learning Curve | Moderate | Steeper |
| Best For | Data extraction, reports | Document transformation |
| Grouping | Natural with FLWOR | More complex |
| Recursion | Function-based | Template-based |
| Performance | Generally good | Optimized processors |
Best Practices
Query Design
- Use specific paths: Avoid // when structure is known
- Filter early: Apply where clauses as early as possible
- Cache expensive calculations: Use let for complex expressions
- Order efficiently: Order by simple expressions when possible
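The guidelines above can be combined in a single query. A minimal sketch, assuming the context item is the sample library document:

```xquery
(: Assumption: the context item is the sample library document :)
for $book in /library/book[@category = "fiction"]  (: specific path; filter in the predicate :)
let $price := xs:decimal($book/price)              (: cast once, reuse below :)
where $price > 13
order by $price descending                         (: simple, typed sort key :)
return <book price="{$price}">{$book/title/text()}</book>
```

Whether the predicate is actually faster than a `where` clause depends on the processor; many optimizers treat the two forms the same way.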
Code Organization
(: Use meaningful variable names :)
for $book in //book
let $authorFullName := concat($book/author/first, " ", $book/author/last)
let $isClassic := xs:integer($book/@published) < 1960
where $book/@category = "fiction"
return (: ... :)
Error Prevention
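(: Validate data existence :)
(: Declare the err prefix for processors that do not predeclare it :)
declare namespace err = "http://www.w3.org/2005/xqt-errors";
for $book in //book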
where exists($book/title) and exists($book/author)
return
try {
<valid-book>
<title>{$book/title/text()}</title>
<author>{concat($book/author/first, " ", $book/author/last)}</author>
</valid-book>
} catch * {
<invalid-book id="{$book/@id}">
Error: {$err:description}
</invalid-book>
}
Conclusion
XQuery provides a powerful, functional approach to XML processing that excels at data extraction, transformation, and analysis. Its SQL-like syntax makes it accessible to developers familiar with database querying, while its functional programming features enable sophisticated data processing workflows.
Next Steps
- Practice with XPath for node selection
- Compare with XSLT for transformation tasks
- Explore XML Processing for implementation details