Skip to content

Dynamic Join and Compare with TSQL and XQuery

  • by

This topic was inspired by a simple process that seems common on the database world, trigger based auditing. Now, their are probably better ways to do change auditing, depending on the DBMS you are using, but lets ignore that as that’s not the point of this article. The focus of this article will be about using XML and XQuery to eliminate some dynamic SQL. A big drive in programming is to be dynamic, meaning that the same chunk of code should work for many situations without modification and automatically handle schema changes applied to the affected table. This isn’t always easy, and in the case of a generically coded audit trigger, would normally require executing dynamic SQL strings using sp_executesql, which isn’t too bad if you can parameterize everything. However, if you want to join the deleted and inserted tables in a trigger, that’s problematic as you can’t parameterize JOIN clause (since you are trying to be generic, you don’t know the key columns). Apart from any security issues with concatenating SQL strings, the big issue is that you will hit a compilation performance hit each time you run your generated string. Also, in my case I was looking at doing a column wise audit, which meant looping along the table columns, and even more SQL concatenation. Like the man on the TV says, “There has to be a better way!”

Since this wasn’t a real problem that needed solving (just a though inspired by something I saw), I used this as exercise to expand my knowledge. I’m not sure how exactly I decided to look at XML as an alternative, other than it was something I had only just touched on a few times in TSQL and didn’t fully understand. It seemed like it could contain all the magic I needed. I’ll include a full example script at the end, but I will walk through each of the steps below. As a disclaimer, I am not an XQuery expert, and I learned a lot by creating the below code, so if you feel any of my explanations are wrong, let me know in the comments please.

First thing, for this example and proof of concept, we just need a example table with a primary key. We won’t put any data in it, we will just populate example inserted and deleted tables to simulate the trigger (or any other situation where you may have before/after data to compare, this isn’t limited to just trigger use). The main goals we are working towards are a dynamic join, and a dynamic column to row transpose operation.

CREATE TABLE TempKeyTableBefore (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)
                                    

That’s our example table. We’ve created a table, because we wan’t to get the primary key information from the INFORMATION_SCHEMA tables. The key information will be needed to perform the join.

/* Get Dynamic Primary Key for Join */
DECLARE @PKey AS varchar(max) = ''
SELECT @PKey =  @PKey + '<' + u.COLUMN_NAME + ' />'
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk 
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE u 
   ON pk.CONSTRAINT_TYPE = 'PRIMARY KEY'
  AND u.TABLE_NAME = pk.TABLE_NAME
  AND u.CONSTRAINT_NAME = pk.CONSTRAINT_NAME
WHERE pk.TABLE_NAME = 'TempKeyTableBefore'
ORDER BY u.ORDINAL_POSITION 

/* Change our value to XML */
DECLARE @PKeyXML XML = CAST(@PKey AS XML)

I tried to eliminate any concatenation and casting to XML, but this was the easiest way to get the key column names. With @PKey initialize, we use the SELECT = method to concatenate the column names together as XML tags, and afterwards we cast that as actual XML type variable.

We will also need our example before(deleted) and after(inserted) data for the rest of the experiment.

                  
DECLARE @INSERT TABLE (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)                                    
                                    
DECLARE @DELETE TABLE (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)   
                 
--Prepare Test Data                                  
insert into @INSERT
      values (1, 'ABC', 'Agency', 1)
      , (2, 'ABC', 'Agency', 1)
insert into @DELETE
      values (1, 'XYZ', 'Agency', 2)
      , (2, 'XYZ', 'Agency Z', 5)

With our example tables, we are going to transform the data to XML documents, using the FOR XML AUTO syntax. FOR XML causes the results of a query to be create as XML data, and the AUTO basically means that the schema is automatic based on the data.

/* Create Ins/Del XML */
declare @InsXMLValues as XML = (select * from @INSERT ins                                                                       
                        FOR XML AUTO)
 
declare @DelXMLValues as XML = (select * from @DELETE del                                                                       
                        FOR XML AUTO)

This generates an XML document like this. Each row is an XML node named after the table alias, and each column is an attribute. There is no root node.

<ins Code="1" Value="ABC" Description="Agency" SortOrder="1" />
<ins Code="2" Value="ABC" Description="Agency" SortOrder="1" />

With the data, now we need to include the key information, so we can do the join in XQuery. This is done by using the modify XML function provided by TSQL. This can be called where TSQL expects data to be changed, such as an UPDATE or SET statement.

 select @InsXMLValues 
/* Append Key info To XML data so we can gernerate a Key String using XQuery */
declare @InsValuesWithKey  XML = '<root><values /><pkey /></root>'
set @InsValuesWithKey.modify('insert sql:variable("@InsXMLValues") into (/root/values)[1]')
set @InsValuesWithKey.modify('insert sql:variable("@PKeyXML") into (/root/pkey)[1]')

/* Append Key info To XML data so we can gernerate a Key String using XQuery */
declare @DelValuesWithKey XML = '<root><values /><pkey /></root>'
set @DelValuesWithKey.modify('insert sql:variable("@DelXMLValues") into (/root/values)[1]')
set @DelValuesWithKey.modify('insert sql:variable("@PKeyXML") into (/root/pkey)[1]')

So in the above code we are first setting our variable to a basic XML document to put our data into, and using the insert statement inside modify to put the data and key into the document. The sql:variable syntax allows us to grab data from other variables and use it in the XQuery, as by default only the XML data of the column/variable we are in the context of is avaliable. The into clause specifies the node we are inserting into. Using XPath syntax we use (/root/values)[1] as the location for our data. This means that from the root(/), get the node root then the node values and use the first one you find [1]. The same is done to insert the Key names into the document. This results in the below XML.

<root>
  <values>
    <ins Code="1" Value="ABC" Description="Agency" SortOrder="1" />
    <ins Code="2" Value="ABC" Description="Agency" SortOrder="1" />
  </values>
  <pkey>
    <Code />
  </pkey>
</root>

Now here’s were it starts getting fun with XQuery. Using the TSQL query function, we can run XQuery to generate new XML based on our existing XML. As a side note, this function only accepts string literals. That means no dynamic XQuery from columns or variables, you would have to do a dynamic SQL statement to achieve that. We are going to use this query to grab our data and append a key value to it by joining the attributes from the data nodes, to the primary key column name nodes. This makes use of FLWOR syntax to provide looping functionality that we need to pull this off. It’s also important to note here that XQuery is a functional language, which means you have to think only in terms of loops and recursion. The syntax here I struggled with the most, as it wasn’t always clear from the error messages what I was doing wrong.

/* Transform the XML into values with a Key string */
set @InsValuesWithKey = @InsValuesWithKey.query('for $r1 in /root/values/*
                                                return <ins> {$r1}
                                                      <keyval>{for $a1 in $r1/@*,
                                                              $ka in /root/pkey/*
								where(local-name($a1) = local-name($ka) )
								  return concat("''", data($a1), "'',")
								}</keyval>                      
                                                      </ins> ')
                                                     
/* Transform the XML into values with a Key string */
set @DelValuesWithKey = @DelValuesWithKey.query('for $r1 in /root/values/*
                                                return <del> {$r1}
                                                      <keyval>{for $a1 in $r1/@*,
                                                              $ka in /root/pkey/*
								 where(local-name($a1) = local-name($ka) )
								  return concat("''", data($a1), "'',")
								}</keyval>    
                                                      </del> ')
                                               

The above is the same thing twice, once for each set of data. To summarize what the XQuery is doing, we are iterating using the $r1 variable to loop across all the nodes underneath /root/values/, which will be all of the ins or del nodes. We are then returning a a new del/ins node, that contains $r1. The curly braces are used to show that you are changing back to an expression from literal XML tags. Without them, it would return the literal string $r1 to the XML output. Now, we attach our keyvalue node that will contain the a generated key string for the row based on its primary key columns. This is done by using a double for loop across all the attribute nodes @* of $r1 as $a1 and all of the nodes that are below /root/pkey/ as $ka. The were clause of the loop specfies the criteria for returning the data in the loop. In this case we are using the local-name function to match the attribute name to the primary key node name. These means that for every column that is part primary key, we will enter the loop body. Inside the loop we use the concat function to create a string that will be a single quote and comma delimited list of values that will be a single unique key value string for the row data. The data function is used to return the value of the attribute node. This results in the below XML.

<ins>
  <ins Code="1" Value="ABC" Description="Agency" SortOrder="1" />
  <keyval>'1',</keyval>
</ins>
<ins>
  <ins Code="2" Value="ABC" Description="Agency" SortOrder="1" />
  <keyval>'2',</keyval>
</ins>

Now that we have a unique key value inserted into the before and after data, we can do a join to match up each delete with it’s insert.

/* Using the generated Key Value, join the Ins to the Delete table */
declare @Join XML = (select m.query('.'), n.query('.')
                        from @InsValuesWithKey.nodes('*') T(m), 
				@DelValuesWithKey.nodes('*') U(n)
                              WHERE m.value('(./keyval/text())[1]', 'varchar(max)') =
                                          n.value('(./keyval/text())[1]', 'varchar(max)')
                        FOR XML AUTO)

The nodes function basically creates a table where each row is the individual node returned by the query expression. These tables must be aliased and the expression T(m) creates it as the table alias T with column m. Poorly named, I know. So, we are basically joining two tables where each row is the ins or del node that contains the child ins or del node and the keyval node. The value function is used to retrieve a node value based on an XPath, and requires a second argument for what value to cast to. In this case we are saying that from the current node (.) get me the keyval node below it, and then the text() node below that, and return the first result [1]. In this case I’m doing a where clause join, and will only detect changes. You would have to alter there join in order to return new records or deleted records. From here I simply select the current node for both rows and use FOR XML AUTO to get yet another XML document. The resulting XML would look like this.

<T>
  <ins>
    <ins Code="1" Value="ABC" Description="Agency" SortOrder="1" />
    <keyval>'1',</keyval>
  </ins>
  <del>
    <del Code="1" Value="XYZ" Description="Agency" SortOrder="2" />
    <keyval>'1',</keyval>
  </del>
</T>
<T>
  <ins>
    <ins Code="2" Value="ABC" Description="Agency" SortOrder="1" />
    <keyval>'2',</keyval>
  </ins>
  <del>
    <del Code="2" Value="XYZ" Description="Agency Z" SortOrder="5" />
    <keyval>'2',</keyval>
  </del>
</T>

The T node comes from the poor table alias I assigned, but we will roll with that. So, now we have matched before/after (deletes/inserts) values. Now the second part of this was to transpose the columns to rows, to to a column wise result. This is done with the below XQuery.


--Transpose the results, Each column becomes a row with before/after value                                                                           
declare @Transpose as XML                      
select @Transpose =  @Join.query('<changes> {for $r1 in /*
                                            return <change>
                                                        <Key>{ data($r1/ins/keyval/text()) }</Key>
                                                              {for $a2 in $r1/del/del/@*,
                                                                    $a1 in $r1/ins/ins/@*
									where(local-name($a1) = local-name($a2)
										and
										$a1 != $a2)
									return <item>
                                                                              <ColumnName>{ local-name($a1) }</ColumnName>
                                                                              <OldValue>{ data($a1) }</OldValue>
                                                                              <NewValue>{ data($a2) }</NewValue>                                                                                                                 
                                                                         </item>}
											</change>
                                      }</changes>')    

So for this one we start by looping on each node under the root of the XML as $r1. We are returning each of these under a change node, which will contain a key node, and several item nodes for each column that has changed. One thing I have noticed, is that context can affect the performance of an XQuery quite a bit. When watching these queries using SQL Server Profiler, reorganizing your queries to avoid expressions that return to the root of the document and instead work off of the current context can help speed them up quite a bit. For our item loop, we are going to do another double loop to iterate across all the attributes for the before and after values using the $r1/del/del/@* and $r1/ins/ins/@* XPaths. The where condition again uses the local-name function to match the attribute names, and since we are only interested in changes, we only wan’t to look at items where $a1 is not equal to $a2. In this case we return an item node that contains the column name as well as the old and new value. This results in the below XML.

<changes>
  <change>
    <Key>'1',</Key>
    <item>
      <ColumnName>Value</ColumnName>
      <OldValue>ABC</OldValue>
      <NewValue>XYZ</NewValue>
    </item>
    <item>
      <ColumnName>SortOrder</ColumnName>
      <OldValue>1</OldValue>
      <NewValue>2</NewValue>
    </item>
  </change>
  <change>
    <Key>'2',</Key>
    <item>
      <ColumnName>Value</ColumnName>
      <OldValue>ABC</OldValue>
      <NewValue>XYZ</NewValue>
    </item>
    <item>
      <ColumnName>Description</ColumnName>
      <OldValue>Agency</OldValue>
      <NewValue>Agency Z</NewValue>
    </item>
    <item>
      <ColumnName>SortOrder</ColumnName>
      <OldValue>1</OldValue>
      <NewValue>5</NewValue>
    </item>
  </change>
</changes>

The last part is just to select this change data in table format, so it could be inserted into a logging table for later review.

select n.value('(./ColumnName/text())[1]', 'varchar(max)') AS ColumnName,
         n.value('(./OldValue/text())[1]', 'varchar(max)') AS OldValue,
         n.value('(./NewValue/text())[1]', 'varchar(max)') AS NewValue ,
         n.value('(../Key/text())[1]', 'varchar(max)') AS KeyValue
            FROM @Transpose.nodes('/changes/change/item') T(n)         

Again, we use the nodes function to return a table based on the nodes we want to select. In this case we specify an XPath that takes us right to each item node. We then use the value function again along with a relative XPath to get each of the columns we want. The . refers to the current node, which will be item, and the .. refers to the parent of the current node with will be the change node. This is important as we stuck the key value (used to identify what entity the the change applied to) in the change node above the item nodes. This results in the below output table.

ColumnName OldValue NewValue KeyValue
Value ABC XYZ ‘1’,
SortOrder 1 2 ‘1’,
Value ABC XYZ ‘2’,
Description Agency Agency Z ‘2’,
SortOrder 1 5 ‘2’,

And that’s it. I learned quite a bit writing that, and I hope you learned something from reading it. Below is the complete example code, which was tested on SQL Server 2008 Express. Enjoy.


/* Create example table to read Primary Key from */
IF OBJECT_ID('TempKeyTableBefore') IS NOT NULL
	DROP TABLE TempKeyTableBefore
	
CREATE TABLE TempKeyTableBefore (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)
                                    
                                    
                                    
DECLARE @INSERT TABLE (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)                                    
                                    
DECLARE @DELETE TABLE (Code int PRIMARY KEY,
                                    Value varchar(30),
                                    Description varchar(255),
                                    SortOrder int)   
                 
--Prepare Test Data                                  
insert into @INSERT
      values (1, 'ABC', 'Agency', 1)
      , (2, 'ABC', 'Agency', 1)
insert into @DELETE
      values (1, 'XYZ', 'Agency', 2)
      , (2, 'XYZ', 'Agency Z', 5)
 
/* Get Dynamic Primary Key for Join */
DECLARE @PKey AS varchar(max) = ''
SELECT @PKey =  @PKey + '<' + u.COLUMN_NAME + ' />'
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk 
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE u 
   ON pk.CONSTRAINT_TYPE = 'PRIMARY KEY'
  AND u.TABLE_NAME = pk.TABLE_NAME
  AND u.CONSTRAINT_NAME = pk.CONSTRAINT_NAME
WHERE pk.TABLE_NAME = 'TempKeyTableBefore'
ORDER BY u.ORDINAL_POSITION 

/* Change our value to XML */
DECLARE @PKeyXML XML = CAST(@PKey AS XML)

/* Create Ins/Del XML */
declare @InsXMLValues as XML = (select * from @INSERT ins                                                                       
                        FOR XML AUTO)
 
declare @DelXMLValues as XML = (select * from @DELETE del                                                                       
                        FOR XML AUTO)

/* Append Key info To XML data so we can gernerate a Key String using XQuery */
declare @InsValuesWithKey  XML = '<root><values /><pkey /></root>'
set @InsValuesWithKey.modify('insert sql:variable("@InsXMLValues") into (/root/values)[1]')
set @InsValuesWithKey.modify('insert sql:variable("@PKeyXML") into (/root/pkey)[1]')

/* Append Key info To XML data so we can gernerate a Key String using XQuery */
declare @DelValuesWithKey XML = '<root><values /><pkey /></root>'
set @DelValuesWithKey.modify('insert sql:variable("@DelXMLValues") into (/root/values)[1]')
set @DelValuesWithKey.modify('insert sql:variable("@PKeyXML") into (/root/pkey)[1]')
 

 
/* Transform the XML into values with a Key string */
set @InsValuesWithKey = @InsValuesWithKey.query('for $r1 in /root/values/*
                                                return <ins> {$r1}
                                                      <keyval>{for $a1 in $r1/@*,
                                                              $ka in /root/pkey/*
																 where(local-name($a1) = local-name($ka) )
																		  return concat("''", data($a1), "'',")
																		  }</keyval>                      
                                                      </ins> ')
                                                     
/* Transform the XML into values with a Key string */
set @DelValuesWithKey = @DelValuesWithKey.query('for $r1 in /root/values/*
                                                return <del> {$r1}
                                                      <keyval>{for $a1 in $r1/@*,
                                                              $ka in /root/pkey/*
																 where(local-name($a1) = local-name($ka) )
																		  return concat("''", data($a1), "'',")
																		  }</keyval>    
                                                      </del> ')
                                               
                                                     
 
/* Using the generated Key Value, join the Ins to the Delete table */
declare @Join XML = (select m.query('.'), n.query('.')
                        from @InsValuesWithKey.nodes('*') T(m), 
								@DelValuesWithKey.nodes('*') U(n)
                              WHERE m.value('(./keyval/text())[1]', 'varchar(max)') =
                                          n.value('(./keyval/text())[1]', 'varchar(max)')
                        FOR XML AUTO)
                       

                       
--Transpose the results, Each column becomes a row with before/after value                                                                           
declare @Transpose as XML                      
select @Transpose =  @Join.query('<changes> {for $r1 in /*
                                            return <change>
                                                        <Key>{ data($r1/ins/keyval/text()) }</Key>
                                                              {for $a2 in $r1/del/del/@*,
                                                                    $a1 in $r1/ins/ins/@*
																  where(local-name($a1) = local-name($a2)
																			  and
																		  $a1 != $a2)
																		return <item>
                                                                              <ColumnName>{ local-name($a1) }</ColumnName>
                                                                              <OldValue>{ data($a1) }</OldValue>
                                                                              <NewValue>{ data($a2) }</NewValue>                                                                                                                 
                                                                         </item>}
											</change>
                                      }</changes>')          
 
        
--Select results                   
select n.value('(./ColumnName/text())[1]', 'varchar(max)') AS ColumnName,
         n.value('(./OldValue/text())[1]', 'varchar(max)') AS OldValue,
         n.value('(./NewValue/text())[1]', 'varchar(max)') AS NewValue ,
         n.value('(../Key/text())[1]', 'varchar(max)') AS KeyValue
            FROM @Transpose.nodes('/changes/change/item') T(n)                           
 
/* Cleanup */ 
DROP TABLE TempKeyTableBefore

Leave a Reply

Your email address will not be published. Required fields are marked *

 characters available