I've said before how thrilled I am will the new extended event UI in SSMS for SQL Server 2012. However, you might be one of the early adopters who made up their own scripts to define extended event sessions, and use hardcoded scripts to harvest the results. So, you may run into this problem if you're using what I call "lazy XML" in the event harvesting script. Take, as an example, an extended event session defined as follows with 3 actions:

create event session errorsession on server
add event sqlserver.error_reported
(
action
(
package0.callstack,   
sqlserver.session_id,  
sqlserver.sql_text    
)
where error = 547 and package0.counter <= 3 
)
add target package0.ring_buffer
with (max_dispatch_latency = 1 seconds)
go

In previous versions, you could pretty much depend on the XML presenting the actions in order. So the following XML would return a subset of the event information in rows and columns.

SELECT
  Data2.Results.value ('(data/.)[1]', 'int') AS ErrorNumber,
  Data2.Results.value ('(action/.)[2]', 'int') AS Session,
  Data2.Results.value ('(action/.)[3]', 'nvarchar(max)') AS SQLStatement
from
(
select CAST(xet.target_data as xml) as data
from sys.dm_xe_session_targets xet
join sys.dm_xe_sessions xe
on (xe.address = xet.event_session_address)
where xe.name = 'errorsession') events
cross apply Data.nodes ('//RingBufferTarget/event')
  AS Data2 (Results)

The XML above is based on the assumption that action 2 is the sqlserver.session_id and action 3 is the sqlserver.sql_text, because it doesn't check names, just blindly uses the ordinal number in XPath. It makes the XML easier to write and a bit faster to execute, but its "lazy" XML. This order of actions was a valid assumption in SQL Server 2008; its not valid in SQL Server 2012. The data fragment containing the action data appears like this:

    <action name="sql_text" package="sqlserver">
      <type name="unicode_string" package="package0" />
      <value> 
        DELETE pubs.dbo.jobs
      </value>
    </action>
    <action name="session_id" package="sqlserver">
      <type name="uint16" package="package0" />
      <value>53</value>
    </action>
    <action name="callstack" package="package0">
      <type name="callstack" package="package0" />
      <value>..callstack elided...</value>
    </action>

So now, sql_text (the third action defined) is the first action presented. The fragile harvesting script will break. So, to be one the safer side, if you have any such scripts change them to actually look for the element for want using a named XPath predicate, like this:

SELECT
  Data2.Results.value ('(data[@name="error_number"]/.)[1]', 'int') AS ErrorNumber,
  Data2.Results.value ('(action[@name="session_id"]/.)[1]', 'int') AS Session,
  Data2.Results.value ('(action[@name="sql_text"]/.)[1]', 'nvarchar(max)') AS SQLStatement
from
(
select CAST(xet.target_data as xml) as data
from sys.dm_xe_session_targets xet
join sys.dm_xe_sessions xe
on (xe.address = xet.event_session_address)
where xe.name = 'errorsession') events
cross apply Data.nodes ('//RingBufferTarget/event')
  AS Data2 (Results);

Note that it's probably better to do this with the "data" items too (ErrorNumber in the query above), although you may be a little safer with these, as they have a defined schema per-event. But BOL does point out that events have a "versioned" schema. The actions can be defined in any order, so make your scripts more robust. Use XPath predicates. And if you're not on SQL Server 2012 yet, change your scripts now.

@bobbeauch

SQL Server 2008 adds support for the 'let' clause in FLWOR expressions. The for and let clauses have a similar purpose, to bind content (tuples) to variables.  Either one can begin a FLWOR expression:

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
return $i
');
> returns 1 2 3

declare @x xml = '';
select @x.query('
let $i := (1,2,3)
return $i
');
> returns 1 2 3

The distinction is that let is an assignment clause, in the simple statement using 'let' above, $i refers to the entire sequence (1,2,3). The for clause sets up an iterator. The simple statement above using 'for' loops 3 times and each time through the loop $i refers to a single member of the sequence. So, if I add an 'order by' clause, the results are quite different.

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
order by $i descending
return $i
');
> returns 3 2 1

declare @x xml = '';
select @x.query('
let $i := (1,2,3)
order by $i descending
return $i
');
> error:
> XQuery [query()]: 'order by' requires a singleton (or empty sequence), found operand of type 'xs:integer +'

One limitation on the XQuery let clause is that it does not support constructed elements. So this statement works fine:

declare @x xml = '';
select @x.query('
let $x := 1
return $x
');
> returns 1

but this statement does not:

declare @x xml = '';
select @x.query('
let $x := ( <foo>2</foo>, <bar>2</bar> )
return $x
');
> error:
XQuery [query()]: 'let' is not supported with constructed XML

So Let the use of the let clause begin...no longer do I have to explain what a "FWOR" expression is, hooray. However, nota bene. When 'let' is used inside a loop, it's evaluated each time around the loop:

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
let $j := 42
return ($i, $j)
');
> returns 1 42 2 42 3 42
> $j is evaluated three times

 

Michael Rys (and who would know better/sooner about this) just announced on his blog that XQuery 1.0 and associated specs (including XPath 2.0 and XSLT 2.0) are now official W3C recommendations. Congratulations to the working group on this. The specification process is also underway for a standard XQuery Update Facility and XQuery/XPath Full-Text query facility. The specs are available on the W3C website.

I'll have to revisit/reread the specs, now that they are finalized. The one that always seem to cause the most controversy in classes that I've taught has been the XQuery 1.0/XPath 2.0 Data Model, which is based on sequences, rather than being a document object model, Infoset, or Post Schema Validation Infoset, or a description of the serialized form of XML (Extensible Markup Language (XML) spec). The inclusion of both nodes and atomic values is usually a bit disconcerting, and I get comments like "that's not XML" when people see it. Although SQL Server 2005's XML data type doesn't exactly follow the XQuery 1.0/XPath 2.0 Data Model, rumor has it that the next version of the ISO/ANSI SQL spec (SQL2007?) may have some something to say about this, as well as something to say about XQuery in general. Right now, the SQL2003 spec doesn't specify a query language.

It will also be interesting to see what the SQL Server folks do with regards to updates to support the new specs in the next release, and support of a larger portion of the language constructs. Also, it would be interesting to see support for sequences in other XML specs and/or client and middle-tier APIs. So that "other folks" can read the results...

Categories:
SQL Server XML

There's an interesting article that caught my attention in this month's (July 2006) MSDN magazine by John Mollman about building the MSDN aggregation system. The system uses SQL Server 2005 Service Broker for reliable messaging, activation procedural written in T-SQL, SQLCLR for interactions with the world outside the database and the XML data type and schema support built in to SQL Server.

I found the use of SQL Server Service Broker because of its scalability potential, the data dependent routing implementation, and the plug-in object and interface based provider model most interesting. Check it out.

I've been looking at the DLINQ syntax a bit more (I'm participating in a BOF at TechEd next week), and its always struck me how, at its most basic, its reminiscent of XQuery FWOR. Yes I did leave out the L (Let) on purpose. I guess its the relative placement of the SELECT part of the query:

FWOR:

for $x in /customers/customer/address
where $x/city = "Portland"
order by $x/zipcode
return $x

DLINQ:

from a in db.Address
where a.City == "Portland"
orderby a.Zipcode
select a

I've done the FLWOR/SQL comparison quite a bit. Just replace "for" by "from" and "return" by "select". The ordering of the keywords in DLINQ just makes it a bit more obvious. Or maybe its the "in" keyword in the from/for clause. The superficial similarity kind of ends there, however.

While teaching SQL Server 2005 to developers, I'd always point out the irony in the fact that the new EVENTDATA() function, used to return information in a DDL trigger or EVENT NOTIFICATION, returns XML and would be used almost exclusively by DBAs. This seems almost like a nefarious plot to force DBAs, who would be the most XML-phobic, to learn XML and XQuery/XPath in order to deal with these in their daily jobs. In fact, I once wrote a SQL statement using XQuery to coerce a common variant of EVENTDATA() output into tabular format.

After spending the last few months wallowing in query plans, deadlock graphs, and the like, I've actually become convinced that XML and XQuery in SQL Server is *primarily a DBA tool*. I knew that would catch your attention. Besides event and query notifications, blocked process events, query plans, deadlock graphs, and EVENTDATA(), Database Tuning Advisor and new bulkload format files also use XML format. XML query plans can be studied by using XQuery/XPath and can then be modified (slightly) for use in "plan forcing" where the XML plan is used as a "query hint", USE PLAN. And there's probably some uses that I missed. SSIS files and Reporting Services RDLs are XML format. I'm becoming convinced that T-SQL error messages should be in XML format too.

So DBAs: if you haven't yet learned XQuery/XPath and the SQL Server XML methods, its not too late. It's there just to make you more productive (well, maybe "just" is an overstatement, but you get the point).

Just to whet your appitite, here's a little stored procedure that uses dynamic management views, the new CROSS APPLY operator and XQuery to look for physical operations in query plans and correlate it with the SQL query. Could be useful, perhaps sorted by execution count. Cheers.

CREATE PROCEDURE LookForPhysicalOps (@op VARCHAR(30))
AS
SELECT sql.text, qs.EXECUTION_COUNT, qs.*, p.*
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) sql
CROSS APPLY sys.dm_exec_query_plan(plan_handle) p
WHERE query_plan.exist('
declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan";
/ShowPlanXML/BatchSequence/Batch/Statements//RelOp/@PhysicalOp[. = sql:variable("@op")]
') = 1
GO

EXECUTE LookForPhysicalOps 'Clustered Index Scan'
EXECUTE LookForPhysicalOps 'Hash Match'
EXECUTE LookForPhysicalOps 'Table Scan'

Categories:
Performance | SQL Server XML

If I store XML in an XML data type column its sometimes useful for performance to be able to hoist the value of a particular item to its own persisted computed column. The column can then be used in SQL queries without having to recalculate the XQuery each time its accessed. For example, if I want to use an attribute at the root named orderid:

CREATE TABLE dbo.xml_order (
  id INT PRIMARY KEY IDENTITY, -- to enable XML Index creation
  order_doc XML,
  orderid -- this should be obtained from XML document
)

I must first define a SQL Server UDF to calculate the value.

CREATE FUNCTION dbo.getorderid (@data XML)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
  RETURN @data.value('/*[1]/@orderid', 'INT')
END

and then use it:

CREATE TABLE xml_order (
  id INT PRIMARY KEY IDENTITY, -- to enable XML Index creation
  order_doc XML,
  orderid AS dbo.getorderid(order_doc) PERSISTED
)

The "WITH SCHEMABINDING" is needed if I want to the column a persisted computed column. The persisted computed column requires the function to be deterministic (returns same output give same input).

This got me thinking...which types of XQuery statements are deterministic and what determines determinism in an XQuery statement? Turns out that ALL XQuery statements are deterministic, because the current SQL Server XQuery dialect doesn't contain non-deterministic XQuery functions like current-dateTime() and friends. Note that this could change as SQL Server XQuery adds (I hope) more of the standard XQuery functions and operators in future. But for now, all is safe and determnistic.

Categories:
SQL Server XML

I've always been slightly puzzled by the permission "EXECUTE" on an XML SCHEMA COLLECTION. Say I have an XML SCHEMA COLLECTION named MySchemas and a table that uses it:

CREATE TABLE MyDocuments (
  id INT PRIMARY KEY IDENTITY,
  thexml XML (MySchemas)
)

And suppose I have a user named FRED that I grant access to the TABLE:

GRANT SELECT, INSERT, UPDATE, DELETE on MyDocuments to FRED

Now FRED can DELETE and SELECT from the table, but because he does not have EXECUTE permission on the XML SCHEMA COLLECTION, he can't INSERT or UPDATE the column that uses it with a schema-valid document. He also can't use typed parameters or variables. UNLESS the value of the XML column, variable, or parameter is NULL. So this works for FRED...

INSERT MyDocuments VALUES(NULL) -- INSERTing a non-NULL would fail.

So does this:

DECLARE @x XML (MySchemas)

But assignment of a document that corresponds to the XML SCHEMA COLLECTION to the variable @x fails unless FRED has EXECUTE on the XML SCHEMA COLLECTION. You grant the permission like this:

GRANT EXECUTE ON XML SCHEMA COLLECTION::MySchemas TO FRED

I can only guess that the reason for this privilege is to keep folks from "probing" the XML SCHEMA COLLECTION (perhaps there is sensitive information in the restrictions) by seeing what works on a column insert or variable assignment and what doesn't. And following the validation error messages. Perhaps.

BTW, the Books Online (I have the December update) gets it right except for SELECT. It says:

Denying the EXECUTE permission denies the principal the ability to insert or update the values in columns, variables, and parameters that are typed or constrained by the XML schema collection. It also denies the principal the ability to query the values in those same xml type columns and variables.

Categories:
Security | SQL Server XML

After speaking yesterday and recanting my story (for about the 100th time) about EventData being XML as a “nefarious plot” to require DBAs to learn XML, I wrote this fairly simple XQuery (actually dead simple XPath) function to change EventData into a rowset. It works based on the fact that you've deposited EventData into a table that looks like this:

CREATE TABLE ddl_log (
 id int primary key identity,
 data XML
);

CREATE TRIGGER mytrig
ON DATABASE
FOR CREATE_TABLE
AS
INSERT ddl_log VALUES(EventData);

This makes your table into a rowset. You could also use a variation of it in your event notification handler, DDL trigger itself. Just leave out the cross apply. I thought I'd seen this before, but can never seem to ever have found it. Now that I've done this, ....learn XML why don't 'ya... There can only be more of it in future. Cheers.

SELECT id, 
 Tab.Col.value('./EventType[1]','nvarchar(50)') AS 'EventType',
 Tab.Col.value('./PostTime[1]','datetime') AS  'PostTime',
 Tab.Col.value('./SPID[1]','nvarchar(50)') AS  'SPID',
 Tab.Col.value('./ServerName[1]','nvarchar(50)') AS  'ServerName',
 Tab.Col.value('./LoginName[1]','nvarchar(50)') AS 'LoginName',
 Tab.Col.value('./UserName[1]','nvarchar(50)') AS 'UserName',
 Tab.Col.value('./DatabaseName[1]','nvarchar(128)') AS 'DatabaseName',
 Tab.Col.value('./SchemaName[1]','nvarchar(128)') AS 'SchemaName',
 Tab.Col.value('./ObjectName[1]','nvarchar(128)') AS 'ObjectName',
 Tab.Col.value('./ObjectType[1]','nvarchar(50)') AS 'ObjectType',
 Tab.Col.value('./TSQLCommand[1]/CommandText[1]','nvarchar(4000)') AS 'CommandText',
 Tab.Col.value('./TSQLCommand[1]/SetOptions[1]/@ANSI_NULLS','nvarchar(3)') AS 'ANSI_NULLS_OPTION',
 Tab.Col.value('./TSQLCommand[1]/SetOptions[1]/@ANSI_NULL_DEFAULT','nvarchar(3)') AS 'ANSI_NULL_DEFAULT_OPTION',
 Tab.Col.value('./TSQLCommand[1]/SetOptions[1]/@ANSI_PADDING','nvarchar(3)') AS 'ANSI_PADDING_OPTION',
 Tab.Col.value('./TSQLCommand[1]/SetOptions[1]/@QUOTED_IDENTIFIER','nvarchar(3)') AS 'QUOTED_IDENTIFIER_OPTION',
 Tab.Col.value('./TSQLCommand[1]/SetOptions[1]/@ENCRYPTED_OPTION','nvarchar(4)') AS 'ENCRYPTED_OPTION'
FROM ddl_log
CROSS APPLY
 data.nodes('/EVENT_INSTANCE') AS Tab(Col)
GO

-- standalone trigger

CREATE TRIGGER mytrig
ON DATABASE
FOR CREATE_TABLE
AS
DECLARE @x XML
SET @x = Eventdata()
SELECT
 Tab.Col.value('./EventType[1]','nvarchar(50)') AS 'EventType',
 Tab.Col.value('./PostTime[1]','datetime') AS  'PostTime',
 -- rest of columns deleted for brevity
FROM @x.nodes('/EVENT_INSTANCE') AS Tab(Col)
GO

OK, because it was Michael that asked originally, XML/XQuery wishes for SQL Server 200x. In no particular order.

Full compliance with "standard" W3C XQuery. If the standards committee finishes before SQL Server 200x ships. Else it will become a "standard" no one follows. Hmm...like SQL. Reason for this is twofold.
1. Folks who get a "generic" XQuery book at the store are now totally confused. Many give up after the third or fourth "not supported" message.
2. RCG (recent college grads) who may learn "generic" XQuery in college. There's a reason why C# looks somewhat like another popular programming language. Because it makes it easier to switch environments. And code.

XQuery editor. That does schema or schemaless. Strong typing. Files or the SQL table row of your choice. They had a good start in the beta, but had to finish VS XML tools.

Client side XQuery. How can I write code that might run on server or client, when it can't run on the client? XLINQ is nice, but it's not a substitute. Should be compatible with server, modulo input and ouput.

Output in encodings other than UTF-16. A CONVERT with extra parm, like DATETIME has.

Full support for XSD schema productions. Yes, I know that its techinically outrageous. But its another source of confusion/frustration. Get an industry standard schema (even Office 2003 schemas). Can't store it without changing "lax" to "skip". But then its not the same schema if I change it.

Standard grammer for composition and decomposition. Now, I'm a big fan of SELECT...FOR XML. It's easier and more powerful than the alternative. But let's talk conversions again. Just layer on XmlElement(), XmlAttribute, etc. Kind of like CAST vs CONVERT in TSQL, they implement both.

A XML.validate method.

A more powerful modify method. But again, this could wait (but not long) for the W3C. Most people don't know that modify IS based on a standard proposal. But XDR and XML Patterns were proposals too. So maybe implement "update" (in addition to insert, replace value of, and delete) for next time. Bet SQL Server is finished before W3C. Multiple node inserts/updates/replace value ofs.

XQuery collection() semantics over rows in a table. That could be considered a collection.

XQuery 1.0 and XPath 2.0 data model (or ANSI SQL 200x data model) support. They're relatively close now.

FOR XML works with UDTs.

Ability to specify DOCUMENT on well-formed data. Like it works on schema-valid. This is a nit, I can do my own constraint. Arguably.

Finally, Node-level locking or Node-level versioning. Yes, I know. Technically outrageous. But it IS a WISH list after all.

I've been working on a student question about using Impersonation inside of a stored procedure. This one's worth sharing.

You can do impersonation using the .NET SqlClient data provider using code roughly like this:

WindowsIdentity w = SqlContext.WindowsIdentity;
WindowsImpersonationContext c = w.Impersonate();
// do something here
c.Undo();

The rule is that in the "do something here" part, I'm allowed to do things like access the file system and these happen using the correct identity. But I'm NOT allowed to do data access. I'd always thought that "data access" meant using the classes in System.Data.SqlClient to access database data. But using the System.Data.SqlTypes.SqlXml class (which uses XmlReader) is also considered data access. So this code fails:

public static void LoadSomeXML(SqlXml thexml)
{
// impersonate
// do something here is:
   XmlDocument doc = new XmlDocument();
   doc.Load(thexml.Value);
// undo
}

interestingly, this code works:
public static void LoadSomeXML(SqlString thexml)
{
// impersonate
// do something here is:
   XmlDocument doc = new XmlDocument();
   doc.Load(thexml.Value);
// undo
}

because it doesn't use XmlReader to do the load. So if you pass in a SqlXml type parameter and use this class inside an impersonation context, it will fail. The error message says "Can't revert thread token in UDF/UDP..." so I wonder if this isn't related to some other threading issues reported using the impersonation context.

A good rule of thumb is to only do the minimum number of operations required while in the impersonation context and revert back (Undo) as soon as possible. In this case, all I really wanted to do was call doc.Save("somefile.xml") to save to the filesystem. If I move the declaration of XmlDocument and doc.Load() outside the impersonation context, doc.Save() works perfectly.

Here's something I've noticed you can do with SQL Server 2005 and Visual Studio 2005.

Take a set of XML documents with the same basic structure. Load one into Visual Studio 2005. Click on the XML menu, Create XML Schema. Make any refinements to the XML schema that is produced that you want, based on your knowledge of the document content. Note that the XML schema is created without the "targetNamespace" attribute. Save to disk.

Open an SSMS 2005 (that's SQL Server Management Studio) query window. Paste in your XML schema and use it to create an XML SCHEMA COLLECTION (of one XML schema). Now you can use the XML SCHEMA COLLECTION to strongly type an XML data type, in a SQL table column, variable, what-have-you.

You can also do XQuery (strongly typed) without using namespace declarations in XQuery preface or namespace prefixes in your query text.

This works because each XML SCHEMA COLLECTION is permitted to have a single "no namespace" schema. SQL Server wouldn't recognize the noNamespaceSchemaLocation (yes, schemas with no namespace are allowed by the XML Schema spec) even if you had it because it does not resolve schemas that don't "live in" SQL Server. And the VS-produced schema specifies attributeFormDefault="unqualified" and elementFormDefault="qualified" so the queries work.

That's for the truly lazy and after all, probably evil to true schema afficianados...so go back and put a namespace in your XML Schema, and use prefixes or default element namespace in XQuery preface.

One of the nice surprises in the April CTP is that SQL Server 2005 XML schema validation errors come with a location now. That's handy. Here's an example:

Msg 6926, Level 16, State 1, Line 1
XML Validation: Invalid simple type value: '1134'.
Location: /*:Invoice[1]/*:LineItems[1]/*:LineItem[1]/*:Sku[1]

Even better would be a line and column number that points out where the processor thinks this error in a value() method happened.

Msg 2389, Level 16, State 1, Line 4
XQuery [xmlinvoice.invoice.value()]: 'value()' requires a singleton (or empty sequence), found operand of type 'xdt:untypedAtomic *'

I know what was wrong with my query (i.e. what the message means) its become one of my favorite error message because it reminds me that this implementation does static type checking, but WHERE is there an xdt:untypedAtomic *? I guess there's no way to tell me where in a single XPath expression I went wrong, but if there are multiple lines or FLWOR expressions, give me a hint. 

The revised (I think) error at least tells me which "value()" function has a bad query. So if there are multiple value()s in the SQL query and one is wrong, I know which one.

Of course, if I got all my XQuery statements right, I wouldn't have this problem...

I've always liked the graphic showplan in SQL Server query analyzer. The biggest hassle with it came when you wanted to send the plan to a friend. Or maybe MS support, but support is your friend too... right? You could send screenshots (which had the annoying habit of never displaying those hover-over stats) or go back to textual showplan.

SQL Server 2005 has XML showplan and I'd once gone as far as to attempt to write a transform to display things nicely. No need. You can do the following from SQL Server Management Studio.

1. Turn on the XML showplan
   -- show estimated plan
   SET SHOWPLAN_XML ON
   GO

   -- or execute statement and show real plan
   SET STATISTICS XML ON
   GO

2. This puts out your showplan as an XML data type column. Click the hyperlink to display the file.

3. Save the XML showplan file with the magic suffix .SQLPlan

4. Now when you double-click on the .SQLPlan file, it opens in SSMS as the interactive showplan with the hover-over stats.

Cool, eh? You can do a variation of this with SQL Profiler too. In fact its easier with SQL Profiler. Now you can mail the .SQLPlan to a friend. With full fidelity.

When reading the CTPNotes file from the new Feb CTP build I stumbled across the fact that the XML schema that contains SQL data types (http://schemas.microsoft.com/sqlserver/2004/sqltypes) is now built-in to the server. Although this may not mean much to most people, it gave me the chance to try something that Dan Sullivan thought up for the first rev of our SQL Server 2005 class. It works now.

One of the enhancements to SELECT...FOR XML is ability to request that the XML it produces be prepended by an XML schema that describes it. A recent change allows you to choose the namespace for that schema. Dan's idea was to add the prepended schema to create a schema collection. After storing the FOR XML outside in an XML schema-valid column, you could make updates to the column that would be validated by the schema. You'd set this up like this:

declare @x xml
select @x = (select * from authors for xml auto, type, xmlschema('urn:authors')).query('*[1]')
create xml schema collection authorsxsd
as @x
go

create table authorsxml (
 id int primary key identity, -- primary key required if XML index needed
 authors xml(authorsxsd))
go

declare @x xml(authorsxsd)
set @x = (select * from authors for xml auto, type, xmlschema('urn:authors')).query('/*[position()>1]')
insert authorsxml values(@x)

The XML Schema produced in the first step will now validate any information entered or updated in the table.

Why the Feb CTP change makes this work is FOR XML....XMLSCHEMA uses the SQL data types schema that's now built in. In previous betas, you could use this schema (error: not built in) or add the schema manually (error: it is built in [but it wasn't]). Thanks SQL Server 2005 XML folks, for this.

Here's the answers to the question from Fun With static XQuery evaluation - 2

-- start with a schema collection

CREATE XML SCHEMA COLLECTION ages AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:ages"
   xmlns:tns="urn:ages">
<xs:element name="age" type="xs:int"/>
</xs:schema>
'
GO

DECLARE @x xml(ages)
SET @x = '<age>12</age>'
-- fails ??!
SELECT @x.query('string(/age)')
GO

This fails because there can be more than one <age> element and fn:string requires a singleton or empty sequence.

--- These work ---

-- this query restricts it to the first age element
DECLARE @x xml(ages)
SET @x = '<age>12</age>'
SELECT @x.query('string(/age[1])')
GO

-- this restricts the variable to XML documents. Fragments disallowed.
-- This means there can be only ONE (or zero) age elements.
-- No subscript is needed on the query then.
DECLARE @x xml(document ages)
SET @x = '<age>12</age>'
SELECT @x.query('string(/age)')
GO

The second one was a bit harder if you haven't run across the (document schemacollection) construct. Remember that XML data type can contain documents or fragments. Putting "document " before the schema collection name in any typed XML declaration restricts instances to an XML document (ie, single root element). The default is "content" so:

declare @x xml(content ages)    -- use ages xml schema collection, allow fragments or documents
declare @x xml(document ages) -- disallow fragments; documents only
declare @x xml(ages)               -- equals using "content"

Note that you can only enforce "document only" using this keyword with TYPED XML. It's not supported on untyped XML instances. You can do the equivalent enforcement with an untyped XML column in a table by using an XML check constraint, like this:

create table foo (
  xmlcol xml constraint mycontr
         xmlcol.value('count(/*)', 'int') = 1 and xmlcol.exist('/text()')=0

Hope you've enjoy this foray into static typing and XQuery. Because this is a "implementation decision" you won't find much about this in the W3C spec. The best information about this is in the excellent XML Best Practices for Microsoft SQL Server 2005 document.

BTW, in case you collect W3C specs for your own offline reference (like I do), bear in mind that the final SQL Server 2005 implementation of XQuery will be aligned with the W3C July 2004 XQuery spec series. XQuery is still a W3C "work in progress". SQL Server 2005 implements a subset of the functions and operators, adds functions to access T-SQL variables and SQL columns, and also implements static typing. So it's not a 1-to-1 match with the spec, but if you like W3C specs, July 2004 is the one you want. For now.

After the last two entries, you might be thinking "I guess I can never use text() as a node test with typed XML again". Not so. The error message reads: 'text()' is not supported on simple typed or 'http://www.w3.org/2001/XMLSchema#anyType' elements. So what's left? Mixed content, for one thing. Mixed content consists of a mixture of text and also embedded subelements.

If we change the schema to allow mixed content (this schema also allows a particular subelement):

CREATE XML SCHEMA COLLECTION mixedage AS
'<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="urn:ages"
xmlns:tns="urn:ages">
  <xs:complexType name="age" mixed="true">
    <xs:complexContent mixed="true">
      <xs:restriction base="xs:anyType">
         <xs:sequence>
           <xs:element name="dogyears" type="xs:int"/>
         </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

<xs:element name="age" type="tns:age"/>
</xs:schema>
'

Then the text() node test works with typed XML just fine:

DECLARE @x xml(mixedage)
SET @x = '
<ag:age xmlns:ag="urn:ages">This is the age in dog years<dogyears>3</dogyears></ag:age>'
-- now it works OK
SELECT @x.query('
declare default namespace "urn:ages";
/age/text()')
GO

Reference back to the previous entry. Now that we know the rules, let's try them out:

-- snip --

Data(),text() and string() accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.

Untyped XML: The path expression /age/text() returns the text node "12". The function fn:data(/age) returns the string value "12" and so does fn:string(/age).

Typed XML: The expression /age/text() returns static error for any simple typed <age> element. On the other hand, fn:data(/age) returns integer 12, while fn:string(/age) yields the string "12".

-- snip --

Try this:

DECLARE @x xml
SET @x = '<age>12</age>'
-- works as expected
SELECT @x.query('data(/age)')
GO

DECLARE @x xml
SET @x = '<age>12</age>'
-- fails
-- Msg 2211, Level 16, State 1, Line 6
-- XQuery [query()]: Singleton (or empty sequence) required, found operand of type 'element(age,xdt:untypedAny) *'
SELECT @x.query('string(/age)')
GO

Oh. This message looks familiar. It turns out that XQuery functions are strongly typed also. Here's the definition of fn:string and fn:data:

fn:string($arg as item()?) as xs:string

fn:data($arg as item()*) as xdt:anyAtomicType*

The "item()*" means that data takes a sequence of 0-n items. "item()?" means that string only takes a sequence of 0-1 item. Let's fix it then.

SELECT @x.query('string(/age[1])')

Cool. Here's the test for comprehension. Let's try this with typed XML.

-- start with a schema collection

CREATE XML SCHEMA COLLECTION ages AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:ages"
   xmlns:tns="urn:ages">
<xs:element name="age" type="xs:int"/>
</xs:schema>
'
GO

DECLARE @x xml(ages)
SET @x = '<age xmlns="urn:ages">12</age>'
-- fails as expected
SELECT @x.query('
declare default namespace "urn:ages";
/age/text()')
GO

DECLARE @x xml(ages)
SET @x = '<age xmlns="urn:ages">12</age>'
-- works as expected
SELECT @x.query('
declare default namespace "urn:ages";
data(/age)')
GO

DECLARE @x xml(ages)
SET @x = '<age>12</age>'
-- fails ??!
SELECT @x.query('string(/age)')
GO

Why does the last query (against strongly typed XML) fail, even though there is a schema? How can you fix it? There are two different "right answers".

There's been lots of puzzled faces lately when I try to explain doing XQuery against strongly typed XML (XML typed by a SCHEMA COLLECTION) vs untyped XML. The largest FAQ is why when I have this document:

<person>
  <name>bob</name>
  <age>51</age>
</person>

using the value function (after assignment to @person) @person.value('/person/age', 'int') returns my favorite error:

Msg 2389, Level 16, State 1, Line 6
XQuery [value()]: Operator 'value()' requires a singleton (or empty sequence), found operand of type 'xdt:untypedAtomic *'

Huh? Although you know by looking at the document that there is only one age element, the XQuery parser uses static evaluation. It doesn't read your document (or read your mind) and assumes there can be more than one age element. After all, there's no schema to enforce the singleton age element, I could have 3 or 4 ages. It doesn't want to guess and be wrong at execution time. Using:

@person.value('/person[1]/age[1]', 'int')

works. I can see why age[1] is required, but why person[1]? Doesn't XML have a single root element? Actually, no. SQL Server 2005 supports fragments (well-formed, multiple root) as well as documents. Fragment support is part of the XQuery 1.0/XPath 2.0 data model.

Most people get by that. The real fun starts when you do examples using untyped XML and XPath expressions with the text() node test. text() works just fine when using untyped XML, but fails against typed XML with simple content. Here's an example (the result of a discussion with Dan Sullivan):

CREATE XML SCHEMA COLLECTION root AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:geo"
   xmlns:tns="urn:geo">
<xs:element name="Root" type="xs:string"/>
</xs:schema>
'
GO

-- UNTYPED
-- this works
DECLARE @x  xml
set @x = '<g:Root xmlns:g="urn:geo">asdf</g:Root>'
select @x.query('
 declare namespace g="urn:geo"
 /g:Root/text()')

-- TYPED
-- Msg 9312, Level 16, State 1, Line 4
-- XQuery [query()]: 'text()' is not supported on simple typed
-- or 'http://www.w3.org/2001/XMLSchema#anyType' elements,
-- found 'element(g{urn:geo}:Root,xs:string) *'.

DECLARE @x  xml(root)
-- same document
set @x = '<g:Root xmlns:g="urn:geo">asdf</g:Root>'
select @x.query('declare namespace g="urn:geo"
/g:Root[1]/text()')

But why? Isn't text() a node test that returns the value of a text() node. After casting about in XQuery specs, and SQL BOL, I finally came across this in the XML Best Practices paper.

-- snip --

Data(),text() and string() accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.

Untyped XML: The path expression /age/text() returns the text node "12". The function fn:data(/age) returns the string value "12" and so does fn:string(/age).

Typed XML: The expression /age/text() returns static error for any simple typed <age> element. On the other hand, fn:data(/age) returns integer 12, while fn:string(/age) yields the string "12".

-- snip --

Well, that was confusing. But now I think I get it. When does a element not have a text() node (or more preicsely, not allow the text() node test)?? When it's a strong-typed query using a simple type element...that's when. But why? Although I know the rules now, I'm still somewhat baffled.

This is getting pretty long, more on this topic in a bit...

Haven't been blogging as much recently, as I've been teaching and travelling quite a bit. Last week though, I received an interesting query on XQuery query plans (that's more uses of the word query in one sentence than I've seen before). We were looking at SQL query plans when XQuery is involved. I'd remembered that:

1. The primary XML index builds a node table over an XML column
2. If there is no XML index, the node table must be built at query time

This seems to quite a big effect on number of estimated rows. Take a 1-row table with an untyped XML column. The row contains a fairly simple document from the W3C XQuery use cases. Let's do a simple query (like /BookStore/Book) over the document.

Without the primary XML index, the execution plan contains 3 UDX expressions, two of them have a large number of estimated rows, 1000 and 10000 estimated rows. The plan step that estimates 1000 rows returns 3 actual rows; the 10000 estimate step returns 114 rows. On the other hand, once the primary index is added, these two UDX expressions are replaced by clustered index seeks, with a fairly close estimate of rows vs the actual rows.

Looks like, if you're going to be doing any non-trivial amount of queries against the XML data type a primary XML index is pretty close to being a requirement. Building it as an index bodes much better than building it at execution time.

HTTP Endpoints in SQL Server 2005 are normally a way to support SQL Server clients that speak the SOAP protocol. It turns out that, although the HTTP endpoint requires a SOAP request, the response need not necessarily be SOAP or even XML. You can return something other than the vanilla document-literal SOAP format; the way to accomplish this is to use FORMAT=NONE in your WEBMETHOD declaration.

Usually HTTP endpoints return an XML complex type, SqlResultStream; you get this behavior with FORMAT=ALL_RESULTS, the default. FORMAT=ROWSETS_ONLY produces only SqlRowSet complex types, this shows up in .NET clients as a DataSet. FORMAT=NONE is the alternative. When you specify NONE, SQL Server doesn't even bother to wrap the result in a SOAP packet. You can use this for sending *anything* back to the client. Although your response can be a nicely formatted SOAP packet, it can also be any XML, more complex SOAP (e.g., additional SOAP Headers). The response doesn't even have to be XML at all, anything can be streamed back to the client, although this has a potential for surprising the receiver.

You use FORMAT=NONE with a stored procedure that has no input or output parameters. It must return a resultset with a single "column" with the special name "XML_F52E2B61-18A1-11d1-B105-00805F49916B". This name is actually an indicator to the underlying protocols that the result should be streamed back to the client, rather than formatted into columns and rows. You might be wondering where you've seen this column name before; if you use SELECT...FOR XML queries, this is the “column” that these queries return. You can also return raw SELECT...FOR XML output using FORMAT=NONE.

When I first heard of this option, the use that immediately came to mind was support of advanced web service procotols. Although I can *output* non-vanilla SOAP using FORMAT=NONE, there doesn't currently seem to be a way for a "webmethod" stored procedure to get ahold of the raw *input* message to an HTTP Endpoint. So, right now, I can't send complex input and have the stored procedure respond to it. Hmm...

Another little thing I'd found had changed in Dec CTP. I'd reported a bug on this one and knew it was gonna be fixed eventually, but better sooner than later. I came up when Dan Sullivan suggested his "universal web service" based on SQL Server's XML data type and XML Schema Collections. Here's the prototype:

CREATE PROCEDURE some_ws (
 @somexmlinout   OUT xml(someothercoll))
AS
-- your code goes here
-- or your external name does

In previous versions, the HTTP Endpoint that defined this web service generated WSDL that, when uses with VS-proxy-generator, looked (coersed into objects) like this:

some_wssomexmlinout i = new some_wssomexmlinout();
some_wsResponsesomexmlinout j = new Responsesomexmlinout();

Object[] o = endpoint.some_ws(i, out j);

Representing the SQL OUTPUT variable as two variables not only made this more complex, but freaked out VB.NET proxy which doesn't really have a built-in concept of output-only parameters (though its possible to do with an attribute).

In Dec CTP, the "universal web service" builds proxies that look like this:

some_wsTypesomexmlinout i = new some_wsTypesomexmlinout();

Object[] o = endpoint.some_ws(ref i);

That's better. Even VB.NET likes it. Now if only I could provide a schema collection (or indicate in the endpoint definition) so that the WSDL would reflect (the only) two types in the namespace. Rather than it's (completely correct) definition of this as "any" from a collection of namesapces (those defined by the schema collection).

Well, unitl that happens, there is always custom WSDL generation...

Notice that this MUST have been a fix to HTTP Endpoint WSDL-generator, because my version of Visual Studio web service proxy generator (Oct CTP) was unchanged. Cool.

OK, I couldn't help myself. The Dec 2004 CTP-specific readme had quite a number of things that are new. There are many improvements in SSIS, for example. The “real“ readme and the known issues file (sqlki.chm) still have July dates on them. But there are always a few little things I have queued up to try with a new build. Here's some that took 5 minutes after installing to test:

1. "Smiley face" XQuery comments still don't work. {-- comment --} still does.
2. INTERSECT and EXCEPT still not there. They are however, still in BOL.
3. FOR XML still does not work with UDTs. However, convert UDT to XML type DOES.

create table ctab (cnum complexnumber)
insert ctab values('1:1i')
insert ctab values('2:2i')

-- works! 2 rows returned with nicely formatted XML in them
select convert(xml, cnum) from ctab

-- fails, Msg 6865, FOR XML does not support CLR User Defined Types
select * from cnum for xml auto

I wonder why one works and not the other. Should use almost the same codepath for serializing UDT. Oh well. That's all for now. More later.

When you retreive a rowset that contains an XML data type column or use an XML data type column as parameter input, you have two choices. You can use the vanilla .NET type, System.String, or a special types from the System.Data.SqlTypes namespace called SqlXml. ADO.NET 2.0 handles conversion when you use strings (as long as they're valid XML, naturally). SqlXml has a method that produces an XmlReader and a constructor that takes XmlReader as input. Richer datatype.

Recently I wrote code that used an XML data type as an output parameter from a stored procedure. I specified the output parameter as SqlDbType.Xml and went to cast the result to the rich SqlXml type...invalid cast. Output parameters appear to always be returned as System.String or System.Data.SqlTypes.SqlString, NOT SqlXml. After the call, the SqlDbType is still reported as SqlDbType.Xml. *Input* parameters work fine as SqlXml, parameters with Direction InputOutput or Output are SqlString. Hmmm... Let's see if the upcoming Community Technology Preview, due out soon, helps this.

Categories:
Data Access | SQL Server XML

Every time I demonstrate using XQuery against a table that contains an SQL Server XML data type to a new class, a get about one out of four people with puzzled looks. And they're the ones who know XPath and XSLT. The part that seems weird to them is that a XQuery against what they see as "a collection of documents in a table" produces a one-row answer for each row in the table. You can tell they were expecting a single document (or fragment) answer. So,

SELECT xmlcol.query('/somequery') FROM xmltable

produces a rowset with a single column; there is one row in the "answer" for each row in the table. Eventually someone pipes up with: "well, I want just one answer". Can't you do that?

You can actually get just one answer in a few different ways. The easiest is to query the entire table using SELECT...FOR XML and do the query on the result:

DECLARE @x XML
SET @x = (SELECT xmlcol from xmltable FOR XML RAW, TYPE)
SELECT @x.query('/somequery')

The only problem you can run into with this is the maximum size of an SQL Server XML data type instance, which is ~2gig. You also have to take the "extra" elements that FOR XML generates into account in your final query. You can also do the original XQuery, then do a FOR XML to combine the results together.

But, you can even cut it finer than that. Using an XML.nodes query with CROSS APPLY or OUTER APPLY, you can generate one row in the answer for each node in the sequence that XML.nodes returns (including or ignoring empty-sequence rows, that's what CROSS vs OUTER apply does). These "intermediate rows" from XML.nodes actually contain "virtual documents" with the context node (starting point of the XQuery) possibly pointing somewhere other than the root. You then do XQuery against the virtual document on the left-hand side of the CROSS APPLY. Like this:

SELECT custid, tab.col.query('itemno') --get itemno subelement of lineitem
  FROM xmlinvoices
  CROSS APPLY (SELECT * FROM
    xmlinv.nodes('//lineitem')) as tab(col)

In this case, if you have a 10-row table, but the 10 XML documents in the table contain 42 lineitem elements, you get 42 rows in the answer. Cool 'eh? If you currently use OPENXML, learn XML.nodes, because this will eventually replace OPENXML in our toolbox. It's MUCH less memory-intensive than OPENXML.

Finally (someone asked this yesterday), if you DO want to start with 1-row/answer per row in table and just throw away the rows that contain empty sequence answer, a simple subquery will do:

SELECT * FROM
  (SELECT xmlcol.query('/somequery') AS col FROM xmltable) AS x
  WHERE col <> ''

There. Exactly the answer you want...whatever the problem. Have fun.

And now, for something a little technically lighter... I've taken to using a convention when writing statements that involve XML/XQuery and SQL together. I put the XML input or XQuery program on separate lines than the SQL part of the statement, this seems to make both query languages and also inline XML data easier to read when they're in the same statement. Like this:

-- SQL part
SELECT invoice.query('
{-- XQuery part, smiley face comments still unsupported --}
declare namespace inv = "urn:www-develop-com:invoices";
for $i in /inv:Invoice
return <foo></foo>
{-- more SQL follows --}
')
FROM invoices

I don't necessarily use the XQuery comments at the end (or at the beginning either), but you get the gist. If you adapt this convention, don't EVER try this:

INSERT INTO xmltable VALUES('
<?xml version="1.0" encoding="UTF-8"?>
<doc></doc>
')

Looks like a variation of the first example, but it won't work. The ?xml declaration (it's not a processing instruction) must be *exactly* the first characters in an XML document if used (its optional). The parser uses the first few characters in the declaration, if you use it, to identify the XML document, realizing that the declaration also specifies the document encoding. "Pretty formatting" XML in the example above produces a carriage return-line feed followed by XML declaration, which is forbidden. Error is:

Msg 9438, Level 16, State 1, Line 1
XML parsing: line 2, character 6, text/xmldecl not at the beginning of input

Either this:

INSERT INTO xmltable VALUES('<?xml version="1.0" encoding="UTF-8"?>
<doc></doc>
')

or leaving the declaration out if possible:

INSERT INTO xmltable VALUES('
<doc></doc>
')

will work fine.

 

Rereading the post about this from last night, it appears that I may have used an ambiguous analogy when attempting to figure why this feature works the way it does, and given folks the wrong idea. It has to do with whether the results of the query would be wrong or the query itself is “incorrect”. Using the invoice example from the previous post:

-- this SQL query would fail to compile
SELECT invoice.query('
declare namespace inv = "urn:www-develop-com:invoices";
for $i in /fooelement
return <foo></foo>
')
FROM invoices

-- this SQL query would compile, execute,
-- and produce correct (but not schema-valid) results
SELECT invoice.query('
declare namespace inv = "urn:www-develop-com:invoices";
for $i in /inv:Invoice
return <foo></foo>
')
FROM invoices

The reason for this is that the SQL Server “query” method on SQL Server's XML data type always produces *untyped* XML by definition. So the first SQL statement fails because the query itself is wrong, not because the results would be incorrect.

In the analogy to SQL, the statement: “SELECT foo FROM invoices” fails not because there can't be a “foo” column in the output rowset but because selecting foo on the input in invalid. Actually, you can produce a “foo” column in the output:

SELECT invoiceid AS foo FROM invoices

so perhaps that wasn't such a bad analogy after all.

Dan Sullivan pointed this out to me a couple of weeks ago. I'd forgotten about it, but it came back to bite me in a demo last week. It's a feature.

If you have an XML column, variable, or UDF return value that's tied to an XML SCHEMA COLLECTION, any XQuery against that XML will be checked for compliance to the schema as part of SQL query parsing. That's right, *XQuery* code will be checked. Here's an example:

I have an XML SCHEMA COLLECTION named invoice_xsd that contains one XML schema that defines the types in an invoice. The schema contains no definition for a "foo" element, just "invoice-like things" like PartNumber and LineItem. And I have a table

CREATE TABLE invoices (xmlinvoice xml(invoice_xsd))

The following SQL statement:

SELECT xmlinvoice.exist('/foo') FROM invoices

produces an error at *query parse time*. Not the answer "false". The error is:

Msg 2260, Level 16, State 1, Line 2
XQuery: There is no element named 'foo'
Msg 9504, Level 16, State 0, Line 2
Errors and/or warnings occurred when processing the XQuery statement for xml data type method 'exist', invoked on column 'invoice', table 'invoices'. See previous error messages for more details.

The SQL query *doesn't even execute*.

Dan's original "proof" of this behavior was that this function doesn't even catalog, ie, CREATE FUNCTION fails:

CREATE FUNCTION DoSomethingWithInvoice(@a xml(invoice_xsd))
RETURNS int
AS
-- other code elided
DECLARE @x XML
SET @x = @a.query('/foo')

Now that's what I'd call *early* validation. I guess its the same as the fact that "SELECT foo FROM invoices" also fails with "invalid column name 'foo'" error. Very cool.

I've known for a couple of weeks now that XQuery and the new XML-SQL client mapping have been dropped from System.Xml in .NET 2.0. The XQuery implementation over the XML data type in SQL Server 2005 is NOT going away, of course. Just the client-side bits. Folks are encouraged to keep using XSLT 1.0 and XPath 1.0 on the client for a while. Hmmm...

After reading (more than twice) through all of the reasons for this decision, the one that makes the most sense to me is the product schedules. The reason that doesn't quite ring true to me is "folks are happy with the XML DOM". I can't help thinking that folks were quite happy doing everything through cursors in the early days of SQL because they didn't quite grok where the power of the relational model was yet. SQL cursors were "comfortable", as the XMLDOM is to XML programmers today. The schedule argument is more reasonable. If XQuery spec won't be "done" until next year, there is hesitation about producing an implementation based on an "in progress" spec that could change at the last minute. If you remember XSL Patterns and XDR schemas in Microsoft APIs you'll understand why. But...

This week at Win-Dev folks "across the hall" were lecturing in earnest about the WSE 2.0 offering. The WSE (web service extension) offerings are supported add-ons, outside of the "core" .NET APIs, and mostly implementations of various WS-* specs-in-progress. They're not guarenteed to be compatible with future offerings or with Indigo, the next generation WS-*++ implementation. In fact, some of the specs that were supported in WSE 1.0 have already completely vanished from the WS-* landscape. There's *way* more churn in this space.

Got me thinking...why not a similar model for XQuery? That is, a supported add-on implementation of the current specification with namespaces that begin with Microsoft.* rather than System.Xml.*. Guarenteed to change, at least subtly, but existing to get folks used to using it. The alternatives, that is, using Saxon.NET or working on a community implementation of XSLT 2.0/XQuery 1.0/XPath 2.0 are already happening. How about it...Microsoft.Xml.Query/Microsoft.Xml.Mapping anyone?

A few weeks ago, I was surprised by an error message when attempting to create an HTTP endpoint with CREATE ENDPOINT. The error was "You do not have permission to perform this operation". The reason I was surprised was that SQL Server 2005 was running in a domain environment and I was logged on as domain administrator at the time. This meant I had sa-level database privledges, and privileges on the OS as well. Hmmm...

The reason for this turned out to be pretty straightforward. I was running the SQL Server process as a relatively unprivileged account, principal of least privilege and all that. When you create an HTTP endpoint, SQL Server issues a "namespace reservation" for part of the HTTP namespace. The reservation is used when other applications (like IIS 6.0) use the HTTP.sys implementation at the same time. It attempts to issue the reservation *using the identity of the principal this is running the SQL Server process*, not as your currently logged on user.

The way to accomplish the reservation under these conditions is to use a system stored procedure, sp_reserve_http_namespace.
It looks like this:

sp_reserve_http_namespace N'http://mymachine.mydomain.com:80/mydir'

Note that in order for this procedure to work, you must be logged in to SQL Server as a Windows login that has OS admin privileges. And so I was, and it worked. So did CREATE ENDPOINT... FOR HTTP. However I noticed that, in my CREATE ENDPOINT DDL statement I had to use the exact machine domain name for the SITE operand, rather than the default ('*' which means "use all machine names not otherwise reserved"). Oh.

That had everything to do with my input to sp_reserve_http_namespace. Using:

sp_reserve_http_namespace N'http://*:80/mydir'

instead, set things up so that I could use '*' as a SITE operand. Whew.

Back to XML and schema collections in SQL Server 2005...

After my experiences with Office 2003 schemas yesterday (thanks for getting me going, Dan), last night I went out onto the Internet, prowling around at 1-2am looking for XML schema repositories. The most likely candidates are mostly gone or inactive. Microsoft's http://www.biztalk.org now links to http://www.microsoft.com/biztalk, information about the Biztalk product. No schema repository here, that I could see. James Tauber's http://www.schema.net has a "returns soon" sign on its digital door. The XML.ORG Registry seems to have the most, although less than I'd expected. There were actually some with namespace URI's for in 1999 XMLSchema namespace, and some DTDs too. Hmm...

Why all this interest and schemas and schema repositories? I'm looking for some non-trivial XML schemas (preferably interlocked sets of schemas, ie. with imports) along with exemplar documents, to try out with SQL Server 2005's CREATE XML SCHEMA COLLECTION and static-typed XQuery. If you know of any, send me links. Should make for some enjoyable times on planes in the near future, like puzzle books.

And what does this have to do with Peaches en Regalia? Once upon a time, in a "previous life", I wrote a program called SMUS2T that converted Amiga SMUS music format to a format for a popular sequencer program, Dr T's KCS. SMUS2T is so obscure that a Google search returns no hits. I then set out to find use cases in test data to break it, naturally. Settled on Peaches en Regalia (by Frank Zappa) because, if you've ever heard it, it contains almost every musical sequence that my spec had. Quintuplets, septuplets, perfect test data. And I liked listening to it, at least I did for the first 50 iterations or so. Two days later, the program could convert it and all 4 tracks played synchronized from beginning to end. And it could convert Yusef Lateef tunes correctly as well.

So now I'm looking for test data again. XML schema collections this time. To test something new. I'm sitting here listening to the digitally remastered version of Peaches from the Hot Rats album, sounds different after he re-produced it for the CD medium. Although I think my favorite track is now "The Gumbo Variations".

After lots of interesting (but fruitful) experiences with the XQuery engine in SQL Server 2005, I decided that I'd rather have an XML schema collection for everything than put [1] (that's the numeric predicate meaning "item 1 using 1-based ordinals") everywhere in the XQuery that I needed a singleton. It's a good idea to use "typed XML" anyway. Dan Sullivan thought a good non-trivial use-case would be Office 2003 docs and went forth to create SQL Server 2005 XML schema collections from the publically available schema documents.

First thing he ran into is that specifying processContents="lax" now causes an error when cataloging the XML schema collection. It used to catalog without error and then use processContents="skip" behavior instead in beta1. The Office schemas use this construct, so he's created some code to change it to processContents="skip" so SQL Server 2005 would "like" it. We've run into a couple more cataloging problems, repeated attributes on the same element, etc... Wonder where you report Office schema problems.

If you're wondering what this is about technically, pulling out my favorite XML Quick Reference (by Aaron and Gudge), they define processContents (on xsd:any or xsd:anyAttribute) as "Specifies whether a schema processor should validate the elements/attributes appearing in place of the wildcard". The default is "strict", meaning validate them. Lax means the schema processor *should* (but is not required to) validate if schema information is available. Skip means a schema processor should not validate these.

BTW guys, when's Essential XML Quick Reference Volume 2 due out? There *lot's* more specs now. Maybe you could publish an Essential XML Reference Encyclopedia?

With the introduction of the XML data type (using an XML data model) comes there's a choice of pulling XML out of the database as a stream or column. Using an XML data type as a column type will get you a column of type XML. That's SqlXml to you in .NET. SELECT ... FOR XML gets you a stream. In SQL Server 2005 there a new keyword TYPE on SELECT ... FOR XML that makes the stream into an XML type. Some/most tools make the stream look like a column with up to a size of 4k per "row". SSMS in SQL Server 2005 does better with it than Query Analyzer.

There's different APIs for these too. In ADO.NET there is ExecuteXmlReader vs ExecuteReader/ExecuteScalar. In OLE DB you use a COM IStream implementation, there was a Stream class added to ADO classic. Although you can use the either API for stream or column, it appears to be a stretch to use “other one“ in both cases. As an example, using a 'SELECT xmlcol from xmltab' with ExecuteXmlReader only gets you the XML in the first row. As you'd expect, if you remember ExecuteXmlReader returns a singleton like ExecuteRow in ADO.NET 2.0 does. At this point, I'm recommending using the "right" API for the representation. And you can always translate to a string. And, using the client XML APIs, any of the supported XML data models (such as document or Infoset).

SQL Server 2005's HTTP endpoint functionality uses http.sys, the HTTP stack in the kernel, to do it's processing. Until lately Windows Server 2003 was the only OS to implement http.sys and HTTP endpoints only worked when SQL Server 2005 ran on this OS.

Windows XP SP2 shipped with http.sys and HTTP endpoints are now enabled on this OS as well. There is a difference between the two OSes that makes this feature subtely different on each one. The difference is in the version of IIS that runs on each OS. If you are running IIS at the same time, XP SP2 runs IIS 5.1 and Windows Server 2003 runs IIS 6.0. The difference is that IIS 5.1 does not use http.sys even if it is available and therefore grabs all of the HTTP traffic on whatever ports it's listening on. In most cases this is port 80 and/or 443. This interferes with SQL Server HTTP endpoints trying to use these ports. Dan Sullivan confirmed this when running the class on his laptop running Windows XP SP2 last week.

There are a couple of ways to get around this. You can specify a port other than port 80/443 for SQL Server's HTTP endpoints and run both endpoints and IIS at the same time. Another workaround is to shutdown IIS and only run SQL Server's HTTP endpoints.

Some folks have asked why I appear to use the XQuery data() function on occasion, but not always, when I'm doing demos. If you want the atomic value of an attribute, rather than the attribute node itself, data() is your friend. data() makes an atomic value from any XQuery item, but with attribute nodes it's particularly useful. SQL Server's XQuery functions don't allow returning bare attributes at the root level (bare text nodes are allowed), and the value function is always looking for a singleton atomic value. The data() function helps in these situations; here's an example:


declare @x xml
set @x = '<foo bar="baz"/>'
-- returns error: "Attribute may not appear outside of an element"
select @x.query('/foo/@bar')


declare @x xml
set @x = '<foo bar="baz"/>'
-- returns baz
select @x.query('data(/foo/@bar)')

Using the data() function is different from using the string() function. data() takes a sequence of items (nodes or atomic values) and atomizes them, ie. returns a sequence of atomic values. string() returns the string value of a single item.  Here's an example comparing data() to string():

declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('data(/*)')

returns a sequence of two string values:
helloworld again


declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('string(/*)')

returns a static typing error, because string() requires a singleton or empty sequence as input

declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('string(/*[1])')

returns a single string value:
helloworld

I'll have more to say about static typing in SQL Server 2005 in the future. It can seem like it makes simple queries, where you “know” the data but don't use a schema harder to write, but I'm getting to like it.

I've been trying out the new session support in HTTP endpoints. You manipulate this through SOAP headers, and looking through the list of SOAP headers supported allowed me to understand this feature from an entirely different perspective. It looks like the session support makes this into more of an alternate client stack than a mechanism to serve web services on the internet or (more likely) intranet.

The HTTP endpoints session support is meant to overcome the limitation that HTTP is a stateless protocol, while TDS/SNI/Netlibs is a stateful one. That is, you normally login to SQL Server perform multiple operations and logoff. With HTTP endpoint sessions, you can do this. There are SOAP headers for "connection string parameters" like initial catalog and application name, there are initiate (login) and terminate (logoff) attributes. There are headers exposing concepts like database transactions and prepared statements. There are even headers for new SQL Server 2005 features like database mirroring partner (for failover) and notification request (for query notifications).

I'll post some samples to show what's possible. They use the SoapHeader definition ("include file") starter (tweaked a bit) from the BOL. Bear in mind that if you define the endpoint on your own machine (you can't come over to my house and use my machine, sorry) you need to update the web reference. Doing so wipes out the SOAP Headers that you need in the proxy class (Reference.cs) to make this work. So don't forget to add 'em back. Save the original code in case you forget. For a more detailed explanation than that, refer to BOL or come to the class and I'll show you how. Or come to Windev and Dan Sullivan will, he's doing the talk on this feature. And have a think about why this is useful, maybe I'll discuss this next.

As Mark Fussell will tell you, one of the coolest things in System.Xml 2.0 is the static creator methods on XmlReader and XmlWriter. It made his top 10 list, albeit at number 10.

I've been looking into how the client side handles SQL Server 2005's XML data type and was bemoaning the fact that System.Data.SqlTypes.SqlXml had a CreateReader method that probably didn't set what I wanted set - ConformanceLevel.Fragment. Sounded like just the ticket for SQL Server 2005. Checked with Reflector, and lo and behold, they *do* use XmlReader.Create with ConformanceLevel.Fragment. Very cool, and the information was helpful to Kent Tegels on the beta newsgroup as well, just this morning. My research timing karma seems pretty good lately.

Now, I want an overload that looks like SqlXml.CreateReader(XmlReaderSettings). And something that handles sequences of atomic types. Maybe in .NET 2.0 beta2. Hey Mark... maybe I'll hit him up for it at WinDev.

About a month ago, there was another chapter in the long discussion about DataSets and web services. An old cohort Scott Hanselman fired off the first salvo, which was taken up by Ted Neward. Doug Purdy responds that Purchase Orders are root of all evil and lots of other chimed in. I'm just catching up...

This comes up again as I work with SQL Server 2005 web services. They too produce “SqlRowset == Schema + Any" in WSDL. Both XML and object afficianados tend to dislike this, but it exactly reflects the relational resultset, rowset, whatever you'd like to call it.

Relational results are columns and rows, where each row contains the same number of columns and each column contains the same data type for each row. Been around for a while now. The resultset is a "generic container class" or special 2-dimensional array if you want to think in those terms, in which each instance can contain different types, but the shape is rectangular. Resultsets, looking at an Open Data Services program which makes the protocol explicit, are preceeded by a sequence of describe packets that contain the name, data type, and other information about each column. So "Schema + Any" means exactly that. *Instance* schema, followed by the instance it describes. Corresponds quite nicely with relational databases, where the majority of corporate data is stored.

The problems stem from the fact that lots of web services toolkits, that are doing XML-to-Object mapping, don't recognize this paradigm. They are looking for an individual (static) schema, not an instance (dynamic) schema. In addition, the XML Schema spec states that how you find a schema from an instance of an XML document is implementation dependent. It mentions a hint (xsi:schemaLocation) and some location strategies for schemas with namespaces that are URLs, but says processors don't have to support the hint. Most WSDL that uses "Schema + Any" doesn't provide the hint. With Microsoft XML processors, this may be because "Schema + Any" is part of the implementation, an "implementation dependent" location strategy.

There's a few ways to resolve this, if you're looking for strongly typed objects at compile time. One way is strongly-typed DataSets, which fetch out the information at compile time, make a DataSet wrapper class, and, for web services, write WSDL that amounts to "Any from a specific namespace", and import the namespace schema, which includes the strong type. Another way is to write your own WSDL for each resultset.

If you write a stored procedure in SQL Server that looks like this:

CREATE PROCEDURE getauthors
AS
SELECT au_id, au_fname, au_lname FROM authors

you know exactly what that resultset will look like. Unless the DBA changes the column type, in which case you'll have other problems. Even if you use the "SELECT * FROM ..." concept (which is not usually recommended) this works unless the DBA adds or deletes a column, or changes the order of definition. I haven't been able to find a metadata table in SQL Server (or in the ANSI/SQL spec) that lists the shape of all resultsets returned by stored procedures (there is one for *parameter* information) , so you have to use your knowledge of the resultset to handcraft the WSDL. In ADO.NET, you can use the DataSet's "WriteXmlSchema" method as a starter. Or use "SET FMTONLY ON;SET NO_BROWSETABLE ON;EXECUTE...". SQL Server 2005 web services that map stored procedure invocation to web service methods will let you specify a custom WSDL-generating stored procedure. Statically store the metadata (instance schemas) in the database or dynamic generate them at WSDL generation time.

BTW, this mapping of resultsets to XML schemas amounts to relational-to-XML mapping. So web service toolkits can accomplish their XML-to-object mapping. I'll return to this theme in future...

I was checking some of the SQL Server 2005 features on the way over on the plane. The latest build that I have is SQL Express build, fairly recent. Many of the XQuery features have been updated to Nov 2003 XQuery spec. Very cool. But comments wasn't one of them. A nit, I know.

The {-- a comment --} style comments still work when using SQL Server XQuery but the new style, smiley-face comments (: a comment :) don't. They generate a syntax error.

Now I *liked* the old style comments (don't like to see my code grin at me, call me sensitive), but the smiley-face ones are part of the spec revision, unless I'm missing something. And they are supported in the client-side System.Data.SqlXml's XQueryCommand dialect of XQuery. That's even more puzzling because the two dialects (SQL Server and System.Data.SqlXml) are synchronized in other ways. For example, System.Data.SqlXml no longer supports the XQuery “LET” verb and has compile-time strong type checking like SQL Server.

I lag in reporting things I find, because I know beta 2 is still a work in progress. Thought it would be added. All of the book examples have the smiley face comments, just do “find and replace” if you run 'em. Beta 2 still isn't out yet, maybe... :)

Theme design by Nukeation based on Jelle Druyts