asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
The X-Factor in SOA
By Joseph Poozhikunnel

    Introduction

    The three X's often used in an SOA are XML, XML Schema and XSLT. This article examines each of them in the context of SOA, as shown in figure 1. An SOA need not use any of them, but as more teams standardize their messaging mechanism, XML is used more and more; it is, for example, the payload of every SOAP message. That creates the need to validate these messages, which can be done with XML Schema, although XML Schema has limitations. Fortunately, XSLT can work around many of these limitations. XSLT can also transform an XML message into the format the receiver expects.

    That is how all three technologies are used in an SOA. Because they play a critical role in any SOA system, it is important to analyze and design how they are used. The situation is similar to designing a database for a project: the effort is considerable and the involvement of a data architect is a necessity. Designing an SOA around these XML technologies requires the same sort of effort from the teams involved.


    Figure 1: SOA with the three X's

    Message

    A message can carry XML data as its payload; all SOAP messages do. The sending service therefore has to supply the necessary information in XML format, and a typical design process starts by writing the application that generates the XML message. XML then becomes the integration solution, so sound strategies are needed to make the best use of XML in the integration mechanisms. The ease of use and compatibility of XML bring problems of their own: hierarchical data is hard to visualize once a message becomes complex, and the result can be an unstable tree structure in the XML messages.

    In designing an XML message mapping between two systems, a one-to-one or one-to-many mapping is easy, but a many-to-many mapping is difficult. The mistake most developers make is to think in terms of file IO rather than in terms of the data transferred, where IO is only part of the operation. The data is in XML format, but it behaves like data in a database. Modeling it the way database data is modeled is therefore more likely to lead to a good design.

    The application code on either side also has to do a lot of parsing. This can be extremely complicated, and there is no defined optimal path to choose as there is with databases. Experienced developers take a long time to design a data model, and it takes just as long to design an XML message that applications can use optimally, so that they are not bogged down by an unwieldy XML structure.

    It is therefore important to settle the design decisions behind the XML structure, which is the core of the integration message. The factors to consider are performance, security, extensibility, reusability and data access.

    With these factors in mind, several issues need to be determined:

    1. The bandwidth needed for the transmission: This depends on the size of the message. An XML message can become large enough to hinder the performance of the application and of other applications sharing resources on the server. It is therefore important to study both the limitations of the system and the size of the message. It's better to reduce the size of the message, and one way to do so is to split it into a multipart message sent as smaller packets.
    2. Corresponding schema definition: Each XML message needs a corresponding schema definition. The schema type-checks the message and ensures the proper constraints are applied to its values, leaving less room for errors after the message is received. It also helps services model themselves within the boundaries the schemas define. Schemas are the contract between the messages; more on that later in this article. The decision to make here is whether the schema should accompany the message or live in a repository that the message references. The best practice is to keep the schema in a central repository and validate against it. This reduces the chance of schemas proliferating across applications, which would be unwieldy to handle.
    3. Corresponding transformation: Transformation of messages plays an important role in any application integration. There are several reasons for it, but the most important is that the receiving service already exists and accepts messages only in its own format, so transforming the message avoids modifying the service. Transformation is also necessary where an XML message is mapped from one to many or even many to many. Such transformations involve a performance hit, so good structuring of the XML message is important; otherwise too much is left to the transformation, affecting performance considerably.
    4. Security of the message: Security is a critical aspect of any message. Individual parts of an XML message can be encrypted, which protects the actual values from being read by unauthorized users. SOAP messages need to be selectively encrypted, allowing intermediate nodes to operate on the unsigned data while providing persistent protection of the SOAP message. Most security algorithms are tuned for performance, so the performance hit should be minimal.
    5. Ability to extend the message: A message is very likely to be extended, so the teams designing an XML message should be prepared for change at any time. Think of it as similar to changes to a database schema or to services when business rules change. Some changes could break the services, so versioning is the better approach. Versions apply not only to the services but also to the XML message, and that includes the schemas as well. It is not possible to account for every change when designing an XML message, but the factors that could vary and may need extending can at least be taken into consideration. The message also has to validate against the schema, which needs to be versioned too. Using well-defined patterns in creating XML messages alleviates some of these issues. Because the user communities have no standard accepted pattern, a common vocabulary is hard to establish, but understanding the patterns at least gives the message designer options. Simple things such as choosing between global and local attribute declarations need to be planned, and further development should then follow these standards rigidly.
    6. Loosely typed: XML messages should not be loosely typed. For example, if the sending application has a value of type string and the receiving application expects an integer, a transform could convert it from string to integer. This should not be done, because the transformation mechanism then carries too much domain-specific information. That dependency keeps the systems from being truly decoupled, and many developers create it without realizing that it is a dependency. It is therefore important for the teams to identify the business terms and associate each value with its type.

    Extensibility and Reuse

    Certain aspects need to be considered for the extensibility and reuse of messages. Some of the important ones are:

    1. Prefer attributes over child elements: When attributes are used instead of child elements, further nesting of XML messages can be done easily.
    2. Large messages should have modular schemas: A large message can itself be modularized, and its schema should be modularized so that it can be reused by other messages, reducing schema duplication. Portions of a modularized message can be reused in new messages, so fewer elements are repeated. This needs to be weighed against the performance cost to the services of merging these messages.
    3. Bind XML messages to namespaces dynamically: This helps in reusing XML messages in conjunction with other XML messages and across different environments. It also raises an important design point: standards should be established for the namespaces to be used, so that namespaces do not proliferate and eventually make XML message management chaotic.

    Some other means of providing extension should be avoided:

    1. Element recursion: Do not let elements nested within an element contain the parent element again. Recursion is bad design and should be avoided; it also hurts the performance of any system using such a message.
    2. Using the "any" type: This lets any type be substituted for the value. It should be avoided because it leads to mismatches and is not good design.

    Performance

    Good structuring of the message ensures efficient data flow within the messaging system itself. XSLT can be used to extract the relevant data along a message flow path and transform it into another message suitable for the receiving application. If the message is well structured, XSLT processing is faster, so this is one area to focus on to improve performance.

    The size of a message also matters for performance. It should be reduced to essential data only. If a large message has to be sent, it can be split into smaller messages that are concatenated later. The smaller messages can then be processed on multiple threads, which improves overall performance and keeps a single large message from blocking the whole system.
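
    The splitting idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the order/item message shape, the qty attribute and the chunk size are hypothetical and do not come from any particular messaging product.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def split_message(xml_text, chunk_size):
    """Split the children of the root element into smaller messages."""
    root = ET.fromstring(xml_text)
    children = list(root)
    parts = []
    for start in range(0, len(children), chunk_size):
        # Each part repeats the root element so it is a complete document.
        part = ET.Element(root.tag, root.attrib)
        part.extend(children[start:start + chunk_size])
        parts.append(ET.tostring(part, encoding="unicode"))
    return parts


def process_part(part_xml):
    """Hypothetical per-part work: total the quantities in one part."""
    part = ET.fromstring(part_xml)
    return sum(int(item.get("qty")) for item in part)


def process_in_parallel(xml_text, chunk_size=2):
    """Process the parts on a thread pool and combine the partial results."""
    parts = split_message(xml_text, chunk_size)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the order the parts were submitted.
        results = list(pool.map(process_part, parts))
    return sum(results)


# Hypothetical sample message with five line items.
ORDER = (
    '<order id="42">'
    '<item sku="A" qty="1"/><item sku="B" qty="2"/>'
    '<item sku="C" qty="3"/><item sku="D" qty="4"/>'
    '<item sku="E" qty="5"/>'
    '</order>'
)
```

    Each part is a well-formed document in its own right, so a receiver can validate and process it without waiting for the rest of the message.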

    Additionally, do not make message element names too verbose, because this also adds to the performance hit.

    Standards

    Establishing standards at the beginning helps clear up several misconceptions about designing, deploying and versioning these messages.

    • The XML message itself should be built on established standards that are adopted company-wide. Some of them are:

      1. Commenting the XML message which would help in ascertaining its use as well as future reusability of the message.
      2. Using patterns like exception handling patterns.
      3. Naming standards.
      4. Restricted types.
      5. Local or Global usage of attributes.
      6. Namespace usage conventions.

      These mechanisms may be difficult to adopt in the initial design, but that design can be used as a baseline for establishing the standards.

    • Architectural standards, such as developing the conceptual and physical representations of these messages, should also be initiated.

    • Infrastructure standards, such as the server configurations and the environment in which these messages flow, should be set. They are needed in the early stages to define the metrics for the production environment, so that there are no surprises with regard to performance.

    • Maintenance standards such as versioning also need to be adopted; versioning is discussed later in this article.

    Storage

    The XML messages being transferred need to be stored at some point during transmission. They can be stored as XML files or in a database. An XML file can be dropped off at a queue, and the messaging bus then processes the files in that order. The message could also be a SOAP message that needs to be stored temporarily, sometimes for a longer time; this is a good case for using a database. Efficient storage depends on an efficient database schema. Since an XML message can contain a varying set of values depending on its schema, it is good to design the database using attribute-based database design. Such a database design mimics the XML message very closely, so it is also a good indicator of the quality of the XML message design. This article will not delve into it; the mechanism probably needs an article of its own.
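
    As a rough illustration only, the sketch below assumes that "attribute-based design" means something like an entity-attribute-value layout, which the article does not spell out. The table name message_value, its columns and the sample message are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def flatten(element, path=""):
    """Yield (path, value) pairs for every attribute and text value."""
    here = f"{path}/{element.tag}"
    for name, value in element.attrib.items():
        yield f"{here}/@{name}", value
    text = (element.text or "").strip()
    if text:
        yield here, text
    for child in element:
        yield from flatten(child, here)


def store_message(conn, message_id, xml_text):
    """Store one row per value, so new elements need no new columns."""
    rows = [(message_id, p, v) for p, v in flatten(ET.fromstring(xml_text))]
    conn.executemany(
        "INSERT INTO message_value (message_id, path, value) VALUES (?, ?, ?)",
        rows,
    )
    return len(rows)


# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_value (message_id TEXT, path TEXT, value TEXT)"
)
stored = store_message(
    conn,
    "msg-1",
    '<order id="42"><customer>Acme</customer><item sku="A">2</item></order>',
)
```

    Because every value becomes a row keyed by its path, a message with a deeply nested or irregular structure shows up immediately as long, inconsistent paths, which is one way such a table can reflect the quality of the message design.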

    Contract

    The messages sent across the messaging channels should conform to a contract. Being SOAP messages, they carry an XML payload as the data transferred between the services.

    These messages should fulfill a contract agreed upon by the services bilaterally or even multilaterally. A contract is an obligation that two or more services agree upon for sending and receiving messages. If a message does not fulfill the contract, it is rejected. Such a contract is very important because it sets the boundary within which each application can change independently of the others. It also guarantees that the message is fully acceptable to each service that has entered into the agreement. The question that arises is how a contract can be designed to fulfill this need.

    In this article I assume all the messages sent across are well-formed XML messages. Non-XML documents can be transformed into XML documents by intermediary layers. Today it is standard to use SOAP as the transport mechanism, and the payload of a SOAP message contains XML.

    XML Schema

    A schema describes the structure of the information. It establishes a standard vocabulary for creating XML messages. It sets the constraints used to check the validity of the messages passed between systems, and so defines the message boundaries. For example, if a message contained an order for an item but the system accidentally added duplicates, multiple items could be sold for a single unit price. Validating the XML message against a well-defined schema would have prevented this. Indirectly, the schema encapsulates a business rule: in a customer order, the same item cannot be duplicated under a single unit price.
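
    In XML Schema a rule like this is typically expressed as an identity constraint such as xsd:unique. The Python sketch below performs the equivalent check by hand to show what the constraint catches; the order/item structure and the sku attribute are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_skus(order_xml):
    """Return the SKUs that appear on more than one <item> line, sorted."""
    root = ET.fromstring(order_xml)
    counts = Counter(item.get("sku") for item in root.iter("item"))
    return sorted(sku for sku, n in counts.items() if n > 1)


def is_valid_order(order_xml):
    """An order is acceptable only if no item is listed twice."""
    return not duplicate_skus(order_xml)
```

    A receiver that rejects any order for which is_valid_order returns False enforces the business rule at the message boundary rather than deep inside the application.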

    There are basically two kinds of validity -- the validity of content models and the validity of specific units of data. Content model validity tests whether the order and nesting of tags is correct. Datatype validity is the ability to test whether specific units of information are of the correct type and fall within the specified legal values.

    The vocabulary of an XML Schema document is comprised of about thirty elements and attributes.

    The datatyping power of XML Schema can be seen in the declaration for <phoneNumber> in figure 2. We begin by defining a phoneNumberType datatype: a string that must consist of exactly three digits, a hyphen, exactly three more digits, another hyphen and then exactly four digits. Both declarations sit inside an xsd:schema element, with the xsd prefix bound to the XML Schema namespace:

    <xsd:simpleType name="phoneNumberType">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="\d{3}-\d{3}-\d{4}"/>
      </xsd:restriction>
    </xsd:simpleType>
    

    Figure 2: Phone Datatype

    With the phoneNumberType datatype defined, it's now easy to declare a <phoneNumber> element that must be of that type, as seen in figure 3.

    <xsd:element name="phoneNumber" type="phoneNumberType"/>
    

    Figure 3: The phone Element Type

    Using this schema, any XML message can be validated against the phone number structure.
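
    To make the datatype check concrete, here is a small Python sketch that applies the same rule (three digits, a hyphen, three digits, a hyphen, four digits) with a regular expression. It is a hand-rolled stand-in for schema validation, not a schema validator, and the contact wrapper element in the usage note is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Three digits, hyphen, three digits, hyphen, four digits.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def invalid_phone_numbers(xml_text):
    """Return the text of every <phoneNumber> that breaks the pattern."""
    root = ET.fromstring(xml_text)
    return [
        el.text or ""
        for el in root.iter("phoneNumber")
        if not PHONE_PATTERN.fullmatch(el.text or "")
    ]
```

    For a message such as <contact><phoneNumber>5551234567</phoneNumber></contact> the function returns the offending value, which the receiving side can use to reject the message.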

    Typical Design

    At first the applications are created to generate the XML messages, and the schema is created as an afterthought. The schemas then turn out to be mismatched, and transformations are created to integrate them. The complexity of the message is thereby moved into the transformation system. Serious incompatibilities between the schemas make the transforms harder still to create. This results in extremely complex transformation logic with bugs that are difficult to track. To correct the system already built, some of the schemas are remodeled and complex exceptions are added to the transformation logic. Finally the integration is completed despite the irregularities in the XML message.

    Improved Design

    For an improved design the following steps need to be followed:

    1. Identify the XML instance.
    2. Write a schema based on this XML structure.
    3. Use this schema as the contract between the messages.
    4. Transform the message as needed.
    5. Create similar messages that use the same data and more.
    6. Add on to the existing schema.
    7. Repeat the steps iteratively.

    The application should be written only after this flow is established; if necessary, a simple application can be created to test the messages. Sometimes the applications already exist. Even then, start by creating an efficient XML structure and follow the steps as enumerated.

    Location Transparency

    The sending message can carry a 'SchemaLocation' attribute that indicates the location of the schema. With location transparency, however, the message need not know where the schema is. The schema information can be stored in a database together with the receiving address, and the appropriate schema is chosen to validate the message based on that address. This means the SOAP message is interrogated for its end point address, and the routing mechanism uses this information to query the database for the schema to validate against. The purpose of keying on the receiving end point is that the same message can be validated only for the information that end point needs, so more specific schemas can be created as required.
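
    A minimal sketch of that lookup is shown below. It assumes the end point address travels in a WS-Addressing To header inside a SOAP 1.1 envelope, and it uses an in-memory dictionary where the article describes a database; the endpoint URLs and schema file names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "wsa": "http://www.w3.org/2005/08/addressing",
}

# Hypothetical registry; in the design described above this is a database table.
SCHEMA_REGISTRY = {
    "http://example.org/services/billing": "schemas/billing-order.xsd",
    "http://example.org/services/shipping": "schemas/shipping-order.xsd",
}


def schema_for_message(soap_xml, registry=SCHEMA_REGISTRY):
    """Read the end point address from the header and look up its schema."""
    envelope = ET.fromstring(soap_xml)
    to = envelope.find("soap:Header/wsa:To", NS)
    if to is None or not (to.text or "").strip():
        raise ValueError("message has no endpoint address")
    endpoint = to.text.strip()
    try:
        return registry[endpoint]
    except KeyError:
        raise LookupError(f"no schema registered for {endpoint}") from None


# Hypothetical sample message addressed to the billing end point.
MESSAGE = (
    '<soap:Envelope'
    ' xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"'
    ' xmlns:wsa="http://www.w3.org/2005/08/addressing">'
    '<soap:Header>'
    '<wsa:To>http://example.org/services/billing</wsa:To>'
    '</soap:Header>'
    '<soap:Body/>'
    '</soap:Envelope>'
)
```

    The message itself never names a schema file; only the router knows the mapping, so schemas can be moved or specialized per end point without touching the senders.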

    Versioning

    Versioning is another critical component of the schema. A message can change over time, and there can be a requirement for the same system to emit both the old and the new message. Versioning also gives flexibility when several receiving points differ in the data they need, so that only portions of the schema have to be extended.

    There are multiple approaches to versioning of a schema.

    1. Using Schematron: Schematron is another schema language. Its rules can be embedded in an XSD by placing them inside <appinfo> elements in the XML Schema document. Constraints are expressed using <assert> elements. A Schematron tool extracts these directives from the XSD document and creates a Schematron schema, which is then used to validate the instance document.

      Figure 4 below shows an example of Schematron embedded in a schema that can be used to validate an XML message for version 2.0.0.

      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          targetNamespace="http://www.mySchema.org"
          xmlns="http://www.mySchema.org" 
          xmlns:sc="http://www.ascc.net/xml/schematron"
          elementFormDefault="qualified">
        <xsd:annotation>
          <xsd:appinfo>
            <sc:title>Version Validation with Schematron</sc:title>
            <sc:ns prefix="v" uri="http://www.mySchema.org"/>
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:element name="ver">
          <xsd:annotation>
            <xsd:appinfo>
              <sc:pattern name="Version check">
                <!-- The rule applies to every ver element in the instance -->
                <sc:rule context="v:ver">
                  <!-- The assert text is reported when the test is false -->
                  <sc:assert test="v:Ver = '2.0.0'" 
                        diagnostics="equivalent">
                      The version must be 2.0.0
                  </sc:assert>
                </sc:rule>
              </sc:pattern>
              <sc:diagnostics>
                <sc:diagnostic id="equivalent">
                  The version is incorrect 
                  v = <sc:value-of select="v:Ver"/>
                </sc:diagnostic>
              </sc:diagnostics>
            </xsd:appinfo>
          </xsd:annotation>
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Ver" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
          <!-- Other schema entries for the xml document -->
        </xsd:element>
      </xsd:schema>
      

      Figure 4: Schematron in an XML Schema.

    2. Using an XSLT expression: An XSLT stylesheet is a good way to check the version. The only disadvantage of this option is that an XSLT stylesheet has to be maintained for each XML Schema document. This increases the complexity of maintaining the documents and matching the right ones to each other.

      Figure 5 is an example of using XSLT to verify the version.

      <?xml version="1.0"?>
      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
              xmlns:xs="http://www.mySchema.org"
              version="1.0">
      
        <xsl:output method="text"/>
      
        <!-- Compare the text of the root ver element with the expected version.
             A missing ver element is also reported as incorrect. -->
        <xsl:template match="/">
          <xsl:if test="not(normalize-space(/xs:ver) = '2.0.0')">
            <xsl:text>The version is incorrect</xsl:text>
          </xsl:if>   
          <xsl:if test="normalize-space(/xs:ver) = '2.0.0'">
            <xsl:text>The version is correct</xsl:text>
          </xsl:if>   
         </xsl:template>
      </xsl:stylesheet>
      

      Figure 5: XSLT example for checking the version in an XML message.

    3. Using a programming language: This is done by creating another application layer that verifies the schema version of each message by preprocessing it in the messaging pipeline. The disadvantage is that this layer could become a bottleneck for the overall performance of the system. It should be the least encouraged mechanism.
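
    For comparison, the third approach might look like the Python sketch below. It assumes the message layout used in figure 4 (a ver root element in the http://www.mySchema.org namespace with a Ver child); the function names and the set of supported versions are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"v": "http://www.mySchema.org"}

# Versions this (hypothetical) pipeline stage is willing to pass on.
SUPPORTED_VERSIONS = {"2.0.0"}


def message_version(xml_text):
    """Return the text of the first Ver element, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find(".//v:Ver", NS)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def is_supported(xml_text):
    """Preprocessing step: accept only messages with a supported version."""
    return message_version(xml_text) in SUPPORTED_VERSIONS


# Hypothetical sample message following the structure in figure 4.
SAMPLE = '<ver xmlns="http://www.mySchema.org"><Ver>2.0.0</Ver></ver>'
```

    Every message has to be parsed once more just to read its version, which is the extra cost the article warns about for this approach.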

    Maintenance

    As the applications grow, the number of schemas increases. There will also be enhancements to existing schemas, and multiple versions may have to be maintained. All this calls for a sound maintenance plan for the schemas. Some of the best practices are:

    1. Create modular schemas so they can be highly normalized and hence reused easily. Because they are modular, they can be combined into larger schemas as needed by using the include and import elements.
    2. Use redefine, with which one schema document can override definitions taken from another.

    In conclusion, using XML Schema as the contract for the messages avoids data redundancy and misaligned vocabularies, and enforces business rules.

    Transform

    A message transmitted across the pipeline may not have a structure the receiver can accept. The receiver may need a different structure for various reasons. Most commonly the receiving application already exists, and passing the message in a format it accepts avoids changing it. It could also be that the application does not need all the information that was transmitted and would like to receive it in a different format.

    XSLT is used to perform this transformation. It provides the following capabilities:

    1. Conversion of the message received.
    2. It can be used in dynamic filtering.
    3. Schema versioning.

    Transformation is processor intensive and hence can degrade performance. Therefore the following needs to be taken into consideration when designing an XSLT system:

    1. Introduce a queuing mechanism. A queue lets each document be processed in order, so that several documents are not transformed at the same time, which could leave the messaging bus unable to perform its tasks because of the processor-intensive XSLT work.
    2. Model the XML into a manageable size. Smaller XML messages are less processor intensive to process with XSLT.
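
    The queuing idea can be sketched with the Python standard library as follows. The transform function is only a stand-in, since the standard library has no XSLT processor; a real system would call its XSLT engine at that point. All names are hypothetical.

```python
import queue
import threading


def transform(message):
    """Stand-in for an XSLT transformation (hypothetical)."""
    return message.upper()


def run_transform_worker(inbox, results):
    """Process messages one at a time, in arrival order, until None arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        results.append(transform(message))


def transform_in_order(messages):
    """Feed messages through a single worker so transforms never overlap."""
    inbox = queue.Queue()
    results = []
    worker = threading.Thread(
        target=run_transform_worker, args=(inbox, results)
    )
    worker.start()
    for message in messages:
        inbox.put(message)
    inbox.put(None)  # sentinel that tells the worker to stop
    worker.join()
    return results
```

    A single worker keeps only one transformation running at a time, so a burst of incoming documents waits in the queue instead of competing for the processor.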

    Conclusion

    The three X's are used extensively in SOA-compatible systems. They should be treated as a major component, since the success of an SOA within an organization rests on them more than on any other infrastructure. Creating them may look easy, but a good design takes considerable effort from the teams involved. If these considerations are not carefully evaluated in the design, the result can be major issues in the performance, maintenance and extensibility of the system.

    About the Author

    You can find Joseph Poozhikunnel's blog on software development at http://thenoproblemguy.blogspot.com.
