Wednesday, September 23, 2009

XML Default Namespaces and XPath Queries

Namespaces in XML schemas are useful and even necessary, but they cause a lot of confusion! This is compounded (confounded?) by the XPath API applying namespaces by different rules than XML documents do.

The biggest confusion for me came from the use of the default namespace. A default namespace simplifies writing an XML document because it limits or eliminates the need to prefix element names (unprefixed attributes are never in the default namespace). However, it places a bigger burden on the reader, because any namespace in the document requires a namespace manager in the API calls that execute queries. Any namespace that may be used in the document must be added to the namespace manager's map, including the default namespace. Be aware that the default namespace is not the same as no namespace; this is a vitally important principle of XPath queries.

Additionally, the query itself must include a namespace prefix for every namespaced element it names. Regardless of whether the XML document prefixes the element or relies on the default namespace, the XPath query must qualify it. An unqualified element in the query will search the "no namespace" or "null" namespace, not the default namespace; XPath has no concept of a default namespace. This short blurb is found in the Microsoft documentation: XPath treats the empty prefix as the null namespace. So if your schema specifies a target namespace, an unqualified query element will never match anything in it.
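To make the mismatch concrete, here is a minimal sketch that runs the same query with and without a prefix against a small inline document. The class name DefaultNamespaceDemo, the method CountMatches, and the "default" prefix are my own illustrative choices, and the namespace URI is just a placeholder.

```csharp
using System.Xml;

public static class DefaultNamespaceDemo
{
    // Hypothetical sample document that declares only a default namespace.
    public const string SampleXml =
        "<parent xmlns=\"http://tempuri.org/sample.xsd\">" +
        "<child id=\"1\"><item>Item 1</item><item>Item 2</item></child>" +
        "</parent>";

    // Returns { unqualified match count, qualified match count }.
    public static int[] CountMatches()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);

        // Map the document's default namespace to an arbitrary, non-empty prefix.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("default", "http://tempuri.org/sample.xsd");

        // Unqualified names are looked up in the null namespace: no matches.
        int unqualified = doc.SelectNodes("/parent/child/item").Count;

        // Qualified names are looked up in the mapped namespace: both items match.
        int qualified = doc.SelectNodes(
            "/default:parent/default:child/default:item", ns).Count;

        return new int[] { unqualified, qualified };
    }
}
```

Calling DefaultNamespaceDemo.CountMatches() should give 0 for the unqualified query and 2 for the qualified one, even though the document itself never uses a prefix.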

What's the solution? Here are two.

1. Remove all namespaces from your XML document. For simple documents you don't plan on validating, this may be an acceptable alternative. But it's probably not a good idea for complex documents or documents where the construction will be validated; in these cases, the use of namespaces is almost mandated.

Sample XML document with no namespaces:

<?xml version="1.0" encoding="utf-8"?>
<parent>
  <child id="1">
    <item>Item 1</item>
    <item>Item 2</item>
  </child>
</parent>


Sample reader code:

using System.Xml;

XmlDocument doc = new XmlDocument();
doc.Load("/hasnodefault.xml");

// Query some elements. This is what you would naturally expect.
// Namespace prefixes cannot be used in these queries.
XmlNode root = doc.SelectSingleNode("/parent");
foreach (XmlNode node in root.SelectNodes("child/item"))
{
  // Do something with the child items.
  ;
}


2. Use a namespace manager when you are parsing your documents. If you use a default namespace, you must map it to some prefix (something other than an empty string); I will often use "default", which makes it obvious where I'm looking. Your XPath queries, unfortunately, will necessarily be more complicated, as you will have to include the namespace prefix on all elements you will be querying.

Sample XML document using a default namespace (xmlns=...):

<?xml version="1.0" encoding="utf-8"?>
<parent xmlns="http://tempuri.org/sample.xsd">
  <child id="1">
    <item>Item 1</item>
    <item>Item 2</item>
  </child>
</parent>


Sample reader code:

using System.Xml;

XmlDocument doc = new XmlDocument();
doc.Load("/hasdefault.xml");
XmlNamespaceManager nsmanager = new XmlNamespaceManager(doc.NameTable);

// Map the default namespace. Not optional.
// Note that the namespace URI is significant, not the prefix.
nsmanager.AddNamespace("default", "http://tempuri.org/sample.xsd");

// Query some elements.
// The namespace prefixes are required; without them SelectSingleNode
// returns null and SelectNodes returns an empty list.
XmlNode root = doc.SelectSingleNode("/default:parent", nsmanager);
foreach (XmlNode node in root.SelectNodes("default:child/default:item", nsmanager))
{
  // Do something with the child items.
  ;
}


The short version of this story is: if you use namespaces in your document, even if it is a default namespace, you must qualify the elements used in your XPath queries and use a namespace manager with them.

Obviously, things can get more complicated than these simple examples. Understanding these fundamental concepts about namespaces is critical to maintaining your sanity.

For .NET users new to XML: XPath is a W3C standard for addressing parts of XML documents. In the .NET framework, XPath queries are implemented by the XmlNode.SelectNodes() and XmlNode.SelectSingleNode() methods and by their counterparts built on the XPathNavigator and XPathExpression classes.
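The same prefix rule applies to the navigator-based API. Here is a rough sketch, assuming the same sample namespace URI as above; the class name NavigatorDemo and the method ReadItems are hypothetical names of my own.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorDemo
{
    // Returns the text of every <item> element in the sample namespace.
    public static List<string> ReadItems(string xml)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // The navigator needs the same prefix-to-URI mapping as XmlDocument does.
        XmlNamespaceManager ns = new XmlNamespaceManager(nav.NameTable);
        ns.AddNamespace("default", "http://tempuri.org/sample.xsd");

        List<string> items = new List<string>();
        XPathNodeIterator it = nav.Select(
            "/default:parent/default:child/default:item", ns);
        while (it.MoveNext())
        {
            items.Add(it.Current.Value);
        }
        return items;
    }
}
```

XPathDocument is read-only, so this form suits documents you only need to query, not modify.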

More information about XPath 1.0 is available at http://www.w3.org/TR/xpath.

Tuesday, September 15, 2009

Convincing Visual Studio 2005 that SqlClient is a valid namespace

While getting a project started to connect to and use a SQL Server database from a mobile device, I ran into the same problem that many others apparently have, too: you had to force Visual Studio to include some framework components before you were allowed to use them. I happened to be using Visual Studio 2005 (aka Visual Studio 8.0) to build an application for a .NET Compact Framework 2.0 platform that exchanged data with a Microsoft SQL Server 2005 database over a wireless network.

Executing a query directly against the remote database, without using the SQL Compact (aka SQL Mobile aka SQL CE) SDK and without using dataset objects, required SqlConnection, SqlCommand, SqlParameter, and several other classes in the System.Data.SqlClient namespace. Unfortunately, you couldn't just type your using statements and have VS magically link them to your application. You expected Microsoft to make it automatic? No way. That's where the Add Reference... feature came in, and it was a source of confusion (days!) in my early development with the VS .NET IDE.

Without adding a proper reference, the statement:

using System.Data.SqlClient;

caused the following compilation error and would not complete the build:

The type or namespace name 'SqlClient' does not exist in the namespace 'System.Data' (are you missing an assembly reference?)

Part of the problem is the multiplicity of .NET CF versions. As is standard with Microsoft, things get moved, renamed, and otherwise mangled between versions. In .NET CF 2.0, SqlClient is part of the System.Data namespace and the System.Data.dll library file; in later versions, there is a separate System.Data.SqlClient.dll. To allow this necessary namespace to be used in your project, you must add a reference to the System.Data.SqlClient component. The trick is learning a couple of things about the components listed on the .NET tab. The "Runtime" column, not the "Version" column, identifies the framework version a component belongs to; "Version" is the version of the component itself. Also, the "Path" column refers to the component as used by the IDE, not by the target device; the path does not even exist on the target device. There were a few clues, but putting them together to arrive at a coherent solution took experience or lots of trial and error.

At the time I wrote this, the SqlClient namespace for .NET CF 2.0 was added by referencing the component named System.Data.SqlClient, version 3.0.3600.0, runtime version v2.0.50727, whose (IDE) path was C:\Program Files\Microsoft Visual Studio 8\SmartDevices\SDK\\SQL Server\Client\v2.0\System.Data.SqlClient.dll. And before you point out my "typo", yes, that was the real path.

Adding this component as a reference finally allowed me to compile without VS coughing up the error.

Tuesday, August 4, 2009

Visual Studio Resource Designer

Visual Studio 2008 was designed to make your life easier.
Now that you're done laughing, it is actually a very nice environment, if you don't mind wasting hours working around a few big bugs and mind-twisting procedures for simple objectives. It takes me back to the good ole days of programming, where the words to live by were, "Save early, save often."
VS2008 is a great reminder that no non-trivial piece of software is perfectly bug-free. Although it tries to be helpful by automatically rippling your edits through to the other classes, files, and tool panels in your project, it is very irritating to add a static modifier to a method only to have the entire IDE crash seconds later. I have learned that, when modifying existing classes, a quick tap of CTRL-S to save the edit before VS applies the change in a dozen other places saves me about 10 minutes per crash. I only have 2 GB of memory and a dual-core processor, so obviously my development system is a little slow :-(.
Oh well, next problem. Following some of Microsoft's tutorials is an exercise in abnormal psychology. How many different write-ups explaining the same procedure can you follow before you realize they are all leading you down a blind alley? That was how I felt when I realized today that there is only one correct way to add an embedded resource to a project so that it will actually be available at runtime.
I kept seeing the same incorrect instructions over and over: from the Resources folder in the Solution Explorer, select Add > New Item... from the folder's context menu. Great, my bitmap file just got included as an embedded resource, right? Wrong.
The right way: In your project's Properties folder there is a resource object named Resources.resx. Double-click this object to open the Resource Designer; alternatively, you can also double-click the Properties folder and select the Resources tab. At the top of the pane is a toolbar with an Add Resource dropdown. Select Add and choose or create your new resource. Now it gets included as an embedded resource, right? Sorry, wrong again.
Once your new item has been included as a resource in your project, you have to visit the object in the Resources folder. Select the item. In the Properties list, change Build Action to Embedded Resource. OK, now is it an embedded resource? Wait, don't forget to hit Save. Yes, it is at long last, after executing multiple steps, an embedded resource.
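If you want to confirm at runtime that the file really was embedded, a small check like the following can help. This is only a sketch: ResourceCheck and HasEmbeddedResource are names I made up, and it assumes the usual convention that an embedded file's manifest name ends with its file name (typically the project's default namespace, then the folder, then the file name).

```csharp
using System;
using System.Reflection;

public static class ResourceCheck
{
    // Returns true if the assembly contains an embedded resource whose
    // manifest name ends with the given file name.
    public static bool HasEmbeddedResource(Assembly assembly, string fileName)
    {
        foreach (string name in assembly.GetManifestResourceNames())
        {
            if (name.EndsWith(fileName, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

For example, ResourceCheck.HasEmbeddedResource(Assembly.GetExecutingAssembly(), "logo.bmp") would tell you whether a hypothetical logo.bmp made it into the build; if it returns false, the Build Action is the first thing to recheck.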
So, how much time did you save today? :-/