Introduction
While working on a project that relied heavily on XML, XML Schemas and Web services, the need for a class library to handle schema validation of XML files became apparent. I created a class that validates an XML file against any number of given schema documents.
Why XML Schemas?
XML documents are human-readable text representations of data. They often need a well-defined structure in order to be portable across platforms and development systems, and one way to provide that structure is an XML Schema. A schema describes a class of XML documents: it uses XML markup to constrain and document the elements, data types and content those documents may contain, and how they relate to one another. The best analogy I have seen comes from object-oriented programming: a schema is like a class and an XML document is like an object. A class serves as a template from which objects are instantiated, and a schema plays the same role for XML documents, which is why a document that conforms to a schema is called an instance of it. Schemas can therefore be used to catalogue classes of XML documents.
Schema Usage Scenarios
XML is now used for so many different purposes that reasons for implementing XML Schema validation tools are easy to find.
Data exchange is the most common scenario. XML is often used to transport data from one platform to another. Take, for example, a company that publishes a Web service to accept orders for products. Any customer can submit an XML order document to place an order. A tool that validates the order document before it is passed on to the order processor ensures that the document has the structure and data types the business rules require.
More information about XML and XML Schemas can be found at http://www.w3.org/XML/ and http://www.w3.org/XML/Schema.
Coding a Validation Class Library
To begin, I created a class called XMLUtils to encapsulate the validation functionality for my XML and XSD documents. I also borrowed parts of a class that my friend Robert Chartier wrote to use as our basic HttpRequest wrapper. Thanks Rob! I will outline the classes and describe their functionality below. For brevity I haven't included full namespace references in the listings; they are included in the downloadable code.
Figure 1.1 - XMLUtils.cs Class File
public class XMLUtils
{
    public XMLUtils()
    {
    }

    public static Stream GetXMLStream(string XMLSource)
    {
        //check for raw XML, XML file, or XML url
        Stream XMLStream = null;
        if(XMLSource.StartsWith("<?xml") ||
            XMLSource.StartsWith("<schema"))
        {
            //raw XML: wrap the encoded bytes in a stream that is
            //positioned at the start and ready to be read
            XMLStream = new MemoryStream(Encoding.UTF8.GetBytes(XMLSource));
        }
        else
        {
            //lets try System.Uri
            try
            {
                System.Uri XMLUri = new System.Uri(XMLSource);
                if(XMLUri.IsFile)
                {
                    //local file
                    XMLStream = new FileStream(XMLUri.LocalPath,
                        FileMode.Open, FileAccess.Read);
                }
                else
                {
                    //url
                    Santra.Common.HttpRequest http = new
                        Santra.Common.HttpRequest();
                    http.RequestUrl=XMLSource;
                    XMLStream = http.GetRequestStream();
                }
            }
            catch(Exception)
            {
                //the source could not be opened; the caller receives null
            }
        }
        return XMLStream;
    }

    public static System.Boolean Validate(String XML, String[] schemas)
    {
        System.Boolean isValid = false;
        XmlSchemaCollection xsc = new XmlSchemaCollection();
        try
        {
            foreach (String xsd in schemas)
            {
                //null namespace: use the targetNamespace declared in the schema
                xsc.Add(null,xsd);
            }
            Stream XMLStream = GetXMLStream(XML);
            if(XMLStream != null)
            {
                XmlValidatingReader vreader = new
                    XmlValidatingReader(XMLStream, XmlNodeType.Element, null);
                vreader.Schemas.Add(xsc);
                //with no event handler attached, Read throws on the
                //first validation error
                while (vreader.Read())
                    isValid = true;
                vreader.Close();
            }
        }
        catch (Exception)
        {
            isValid = false;
        }
        return isValid;
    }
}
The first method in this class is GetXMLStream. It takes one parameter, a string containing raw XML, an XML file name, or the URL of an XML document, and returns a System.IO.Stream representation of that document. The method first checks whether raw XML has been passed, and if so converts it to a stream object. Otherwise it attempts to determine whether the source is a local file or a URL. If it is a file, it creates a FileStream over the contents of the file. If it is a URL, it uses the HttpRequest object to return a stream of the HTTP response. If the source cannot be opened, the method returns null.
Next is the Validate method. This method takes two parameters and returns a Boolean. The first parameter is a string representing raw XML, an XML file name, or the URL of an XML document. The second parameter is a String array of XSD file names or URLs. A local variable, isValid, tracks the state of validation.
An XmlSchemaCollection is created to hold each of the XSD schema documents that I passed in. Next I get a stream object representing the XML document to validate. If the stream is not null (which means XML data exists), I create an XmlValidatingReader. This object provides our class library with all the logic needed to validate XML documents against schema documents. I add the schema collection to the Schemas property of the validating reader, then use a while loop to read each node in the XML document. The XmlValidatingReader handles the actual validation by checking that each node meets the constraints established by the schema documents. Because no validation event handler is attached, the reader throws an exception on the first node that violates a schema; the catch block turns that exception into a return value of false.
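XmlValidatingReader and XmlSchemaCollection were marked obsolete starting with .NET Framework 2.0. On later versions, the same read-until-it-throws approach can be sketched with XmlReaderSettings and XmlSchemaSet. This is a minimal sketch and not part of the original library: the ModernValidator class, its sample constants, and the choice to take raw markup for both arguments are assumptions made to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the article's XMLUtils class.
public static class ModernValidator
{
    // Sample schema: an <order> element holding one positive integer <quantity>.
    public const string OrderSchema =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='order'><xs:complexType><xs:sequence>" +
        "<xs:element name='quantity' type='xs:positiveInteger'/>" +
        "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    public const string ValidOrder = "<order><quantity>3</quantity></order>";
    public const string InvalidOrder = "<order><quantity>three</quantity></order>";

    // Returns true when xml is well formed and no validation error is raised
    // against the given schemas. Both arguments hold raw markup here.
    public static bool Validate(string xml, params string[] xsdSources)
    {
        try
        {
            XmlSchemaSet schemas = new XmlSchemaSet();
            foreach (string xsd in xsdSources)
            {
                using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
                {
                    // null means "use the targetNamespace declared in the schema"
                    schemas.Add(null, schemaReader);
                }
            }

            XmlReaderSettings settings = new XmlReaderSettings();
            settings.ValidationType = ValidationType.Schema;
            settings.Schemas = schemas;

            // No ValidationEventHandler is attached, so the first
            // validation error surfaces as an exception from Read().
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlSchemaException) { return false; } // schema or validation error
        catch (XmlException) { return false; }       // malformed XML
    }
}
```

Calling ModernValidator.Validate(ModernValidator.ValidOrder, ModernValidator.OrderSchema) returns true, while the same call with InvalidOrder returns false because "three" is not a positive integer.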
Next I will detail the HttpRequest class used to get a stream of XML from a remote URL.
Coding the Http Request Wrapper
The HttpRequest class is a basic wrapper for System.Net.HttpWebRequest. I use this class to get a stream of XML when the URL of a remote XML document is passed in. Again, for brevity some namespace references may not be present, but they are included in the downloadable source.
Figure 1.2 - HttpRequest.cs Class File
public class HttpRequest
{
    //public enumerations
    public enum HttpRequestMethodValues
    {
        POST,
        GET
    }

    //internal property values
    private HttpRequestMethodValues httpMethod = HttpRequestMethodValues.GET;
    protected System.String method;
    protected System.String url;

    public HttpRequest()
    {}

    public HttpRequest(System.String requestUrl, System.String requestMethod)
    {
        //Set the internals
        this.url = requestUrl;
        this.method = requestMethod;
    }

    public System.IO.Stream GetRequestStream()
    {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create(this.url);
        //the enum member names match the HTTP verbs, so ToString()
        //yields "GET" or "POST"
        request.Method = this.HttpMethod.ToString();
        //despite the method name, this returns the stream of the
        //server's response
        WebResponse response = request.GetResponse();
        return response.GetResponseStream();
    }

    public System.String RequestMethod
    {
        get{return method;}
        set{if (value != method) method = value;}
    }

    public System.String RequestUrl
    {
        get{return url;}
        set{if (value != url) url = value;}
    }

    public HttpRequestMethodValues HttpMethod
    {
        get { return httpMethod; }
        set { httpMethod = value; }
    }
}
This class has two constructors: a default one and an overload that allows us to pass parameters in during creation. I have created some fields to hold the values of the RequestMethod, RequestUrl and HttpMethod properties. The GetRequestStream method creates an HttpWebRequest from the URL and HTTP method assigned to those properties, makes a Web request for the given URL (in the case of our application, an XML file), and returns the response stream containing the contents of that file to the calling application. Next I will explain the client console application written to access the class library we have created.
Figure 1.3 - SchemaVal.cs Client application
class SchemaVal
{
    [STAThread]
    static void Main(string[] args)
    {
        Int32 argCount = args.Length;
        //fewer than two arguments: show the usage text and stop
        if (argCount < 2)
        {
            DisplayUsage();
            return;
        }
        //1st argument is XML document name
        String XMLDoc = args[0];
        //2nd argument is ,-delimited string of schema docs
        String[] xsdDocs;
        //check for single or multiple xsd docs
        String tempXsd = args[1];
        //IndexOf returns -1 when no comma is present
        if (tempXsd.IndexOf(",") >= 0)
        {
            //multiple xsd docs exist
            xsdDocs = tempXsd.Split(',');
        }
        else
        {
            //single xsd doc exists
            xsdDocs = new String[] {tempXsd};
        }
        System.Boolean isValid = XMLUtils.Validate(XMLDoc,xsdDocs);
        if (isValid)
        {
            Console.WriteLine("The XML document and Xsd documents are valid.");
        }
        else
        {
            Console.WriteLine("The XML document and Xsd documents were not valid.");
        }
    }

    private static void DisplayUsage()
    {
        Console.WriteLine("Usage : ");
        Console.WriteLine(" schemaValidator XMLFile xsdFiles");
        Console.WriteLine("");
        Console.WriteLine("Examples : ");
        Console.WriteLine(@" schemaValidator c:\products.xml c:\products1.xsd,c:\products2.xsd");
        Console.WriteLine(@" schemaValidator c:\products.xml http://www.domain.com/products.xsd");
    }
}
This client console application demonstrates the use of our schema validation class library. First we handle the arguments that are passed to the application: the XML file to validate, and one or more XSD schema documents, separated by commas, to validate against. Next we create a string array of XSD documents and pass it to our XMLUtils.Validate method. This method returns a Boolean true/false value depending on the validation results. If the XML file fails against one of the schemas, false is immediately returned. Finally, at the end of the class file, I have created a method that displays the correct usage to the user.
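The argument handling described above can also be sketched as a single helper, since String.Split returns a one-element array when the delimiter is absent. The SchemaArgs class below is a hypothetical illustration, not part of the downloadable source; trimming whitespace and skipping empty entries are additions the client application does not perform.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for turning the second command-line argument
// into the array of schema documents expected by Validate.
public static class SchemaArgs
{
    public static string[] Parse(string arg)
    {
        List<string> names = new List<string>();
        // Split yields a single entry when arg contains no comma,
        // so the single-schema case needs no separate branch.
        foreach (string part in arg.Split(','))
        {
            string name = part.Trim();
            if (name.Length > 0)
            {
                names.Add(name); // ignore empty entries such as a trailing comma
            }
        }
        return names.ToArray();
    }
}
```

For example, SchemaArgs.Parse(@"c:\products1.xsd,c:\products2.xsd") yields two entries, and SchemaArgs.Parse(@"c:\products1.xsd") yields one.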
Conclusion
In this article I have given you the basis for developing your own class library for validating XML documents against schema documents. By encapsulating our validation logic in a class library, we make this code easy to reuse from other applications. This example could be extended with a method that returns more information about a failed validation, such as the failing node or element and the schema file that caused the validation error.
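One way such an extension might look is sketched below. It assumes .NET Framework 2.0 or later and uses XmlReaderSettings with a ValidationEventHandler instead of the XmlValidatingReader used in this article. The ValidationReport class and its Collect method are hypothetical names, and both arguments are taken as raw markup to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical extension, not part of the article's class library.
public static class ValidationReport
{
    // Validates xml against one schema and returns one message per problem.
    // An empty list means no errors were reported.
    public static List<string> Collect(string xml, string xsd)
    {
        List<string> problems = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            settings.Schemas.Add(null, schemaReader);
        }

        // With a handler attached, validation errors are reported here
        // instead of being thrown, so reading continues past the first one.
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            problems.Add(string.Format("{0} at line {1}, position {2}: {3}",
                e.Severity, e.Exception.LineNumber, e.Exception.LinePosition, e.Message));
        };

        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { }
            }
        }
        catch (XmlException ex)
        {
            // Malformed XML still throws; record it alongside the schema errors.
            problems.Add(string.Format("Error at line {0}, position {1}: {2}",
                ex.LineNumber, ex.LinePosition, ex.Message));
        }
        return problems;
    }
}
```

The design choice here is to keep reading after an error so that a single pass can report every problem in the document, at the cost of no longer stopping at the first failure.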
About the Author
Jeff Gonzalez has been working in the IT industry for the last six years. He started his IT career as an NT4 administrator and network engineer. While working for a hosting company, he recognized the power of Windows DNA and set out to learn everything he could about it. Since his foray into the Internet development world, he has worked on several e-commerce, e-business, and intranet applications. He can be reached at rig444@hotmail.com.