Getting started with XML in SQL Server – PART 2 – Import XML files into your SQL Server

In order to parse XML data you need to have it ready in your SQL server. Since you will not always want to use SSIS or the Import Wizard let’s have a look at 2 different ways to import XML data into a table using TSQL.

 

1. OPENROWSET:

Probably the most common and easiest way to load any sort of BLOB data into your SQL server is the OPENROWSET function that comes with SQL server. To utilize it you need to ensure that it is enabled. Use the following code to enable it:

Once that is taken care of you can import any XML file residing on your machine into a table. Let’s assume you have an XML file in your C:\ drive in a folder called SQL with the filename Collection.xml. The code below will insert it into a temporary table which we create on the fly:

You can now query the column XML_Data_Column and display the file you just loaded.

 

2.  OLE Automation Stored Procedures

Not always does the XML data reside on your server or local machine, often times it needs to be retrieved from the internet using various APIs out there. Even though this is a little more complicated than using OPENROWSET there is no need to worry, SQL server got you covered. Using OLE Automated Stored Procedures you will be able to retrieve your data directly with a few lines of TSQL code. Once again we need to ensure that the functionality is enabled in our SQL server, to make sure it works use the code below:

In order to make the method call we need to do a few things:

  1. we need to create an object
  2. we need to open this object
  3. we need to let the object know what exactly we want to do
  4. we need to send the request
  5. optionally we can check the response status to see if our request was successful
  6. finally we need to save the response to a table.

Now this seems to be an awful lot to do, but it allows us very conveniently to do everything straight out of SQL server without having to set up a retrieval process outside of our code, so jumping through a few extra hoops might be worth it.

Let’s have a look at an example. The code below makes an API call to the website www.boardgamegeek.com and retrieves a list of all board games I own and have actually found the time to play. Since the site usually does not send the response we are looking for in the first go – they send an acknowledgement response first – we need to put the request in a loop until we get the output we really want. The code looks like this:

You can filter out any responses that do not have a HTTPStatus of 200 so you won’t save any trash in your DB, but for this example I decided it makes sense to leave it in there.

And there you have it. 2 ways to import XML using TSQL into your database. I am sure there are other ways (using a linked server connection comes to mind *ugh*) but for now these 2 will have to do.

Feel free to leave any feedback in the comments below and look out for Part 3 where I’ll show you how to get XML into your SQL server using SSIS.

 

Getting started with XML in SQL Server

Part 1 – Parsing XML

Part 3 (not available yet) will use SSIS to achieve the same results as in Part 1 & 2

Part 4 (not available yet) will show you how to use XQuery and create XML files out of SQL Server

Getting started with XML in SQL Server – Part 1 – Parsing XML

A lot of people don’t like XML. It has a lot of overhead, it’s not easy to read for those unfamiliar with the structure and parsing it is a pain in the ass. Well, let’s see what SQL Server offers to load it anyways.

 

Let’s use this simple XML as an example:

This example represents the data of 2 items. Each item has 3 data elements and one attribute. It is important to know the difference between elements and attributes as they are parsed differently. To keep it simple this example does not use a namespace.

One thing you will notice quickly when dealing with XML is that you need to know the structure as well as the datatypes, you will not be able to do much without that knowledge.

The code below assigns our sample data to a variable with the datatype XML, a SELECT statement displays the assigned value of the variable:

 

XML output columns usually are represented as hyperlinks in SSMS (XML columns can – just like VARBINARY – hold quite a lot of data, so think twice before including them in your queries directly). If the XML is not too big you can click on it and a new tab will pop up with the XML code inside:

XML_in_SSMS

XML displayed in SSMS

 

Notice something? SQL Server got rid of the XML declaration. Keep this in mind when dealing with XML files, sometimes SQL Server refuses to convert a string to XML because of the declaration. In those cases you will have to get rid of it. A simple statement using STUFF in combination with PATINDEX will do the trick:

 

No how do we parse our variable? SQL Server offers different ways to parse XML in SQL server, see 2 different approaches below.

 

1. Using CROSS APPLY and the value() method:

If you execute this statement and have a look at the result you will hopefully see why more than one CROSS APPLY is needed to actually parse the data. Value() requires to specify an ordinal number to ensure it will return a singleton, this is done in square brackets. [1] will return the first available value matching the expression used, [2] the second and so on. To flatten the output a CROSS APPLY for each child element with multiple elements will do the trick, if only the root node is referenced the query will always return the first value taken from the first match for all results (take a look at the output on the columns id_without_crossaply / name_without_crossapply / name_with_one_crossapply).

As mentioned earlier attributes and elements are treated differently. To retrieve the value of an attribute you will add the @ sign to the name of the attribute. Elements will require the exact path, their value is retrieved using text().

 

2. Using OPENXML:

This approach is quite different. The system stored procedure sp_xml_preparedocument is used to create an object, referenced by an integer that is stored in a variable. This integer is required when using OPENXML to reference the right rowset provider. To access the data a WITH clause is used to define the output columns, allowing the use of aliases or the given name of attributes and elements. If you chose to use the same name for attributes as used in the XML you do not need to specify the path to the attribute (‘./@id’ is not required in this case). Elements however require a path and text() is used once again to retrieve the value.

Note that you should always destroy the object using sp_xml_removedocument once the XML is no longer required to free system resources.

In this example we chose to directly use the /items/item path, which allows to flatten the XML directly. See how the result changes if you change the path to /items (adding /item to the path of each element in the WITH clause): only the first row will be displayed.

 

Which approach is better? I don’t know. Each XML will be different, these example will not work for every XML out there. But at least now you have the basic means to start tackling XML.

 

Getting started with XML in SQL Server

Part 2 – Import XML files into your SQL Server

Part 3 (not available yet) will use SSIS to achieve the same results as in Part 1 & 2

Part 4 (not available yet) will show you how to use XQuery and create XML files out of SQL Server