Wednesday, 11 June 2014

Panicking about a Large Schema? VIEW Is Here


Many complex data schemas, such as industry-standard XML schemas, are very large and contain complex structures with many element and attribute declarations. The input XML files that users receive from different sources typically contain only a few of those elements and attributes, so the whole schema is not required for parsing and composing. When a large schema is imported into the Hierarchical Data stage, the stage presents every element and attribute to the user. Even though you need only part of the schema to parse or compose your document, you have to deal with all of that information, which is overwhelming and makes designing parsing and composing jobs cumbersome. The Hierarchical Data stage provides a feature called a view to handle such scenarios.

A schema view is a subset of a complex schema. You create a view by selecting the types, elements, and attributes from a complex schema that describe your documents. You can then design jobs that parse and compose complex data using the view rather than the original schema, so you are not overwhelmed by the large amount of information in the original schema.
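To make the idea of a subset concrete, here is a minimal Python sketch that prunes a small XML document down to a chosen set of element paths. It only illustrates the concept and is not how the Hierarchical Data stage implements views; the sample document, the `VIEW` set, and the `prune` helper are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """<departments>
  <department id="10">
    <name>Research</name>
    <budget>500000</budget>
    <employee>
      <name>Asha</name>
      <salary>70000</salary>
      <address><city>Pune</city><zip>411001</zip></address>
    </employee>
  </department>
</departments>"""

# The "view": paths (from the root) of the elements we want to keep.
VIEW = {
    "departments",
    "departments/department",
    "departments/department/employee",
    "departments/department/employee/name",
}

def prune(element, path, view):
    """Remove every child element whose path is not listed in the view.

    Attributes are left untouched in this sketch.
    """
    for child in list(element):  # copy, because we remove while iterating
        child_path = f"{path}/{child.tag}"
        if child_path in view:
            prune(child, child_path, view)
        else:
            element.remove(child)
    return element

root = prune(ET.fromstring(SAMPLE), "departments", VIEW)
kept = [el.tag for el in root.iter()]
print(kept)
```

Everything outside the selected paths (budget, salary, address, and the department-level name) disappears, which is the same effect a view has on what you see while designing the job.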

Creating a view

You create a view in the Schema Library Manager. Before creating a view, you need to import the schema into the Schema Library Manager. Once the schema is imported, open the library to start creating a view. There are 2 ways to create a view:

1. Select a global element and then click the “Create View” button in the top-right corner.

2. Right-click any element in the schema tree and select one of the “Create a view” options from the context menu. The context menu offers 2 options, as shown in the figure below. The first option creates a view whose root is the global element that contains the selected element. The second option creates a view whose root is the selected element itself.

After you select one of the above 2 options, a window called Create View appears, as shown in the figure below.

The Name and Description fields allow you to specify a name and description for the view. The Find option on the right side allows you to search for any node in the schema. It offers 2 options for how the search is performed:

1. Search for the node within the children of open nodes in the schema shown in the table below.

2. Search for the node up to the selected number of levels deep within closed nodes. The default level is 4 and the maximum is 9.
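The difference between the 2 search options can be pictured with a small tree walk. This is only a sketch of one plausible reading of the behaviour, not the product's actual search code: the `Node` class, its `expanded` flag, and the `find` function are hypothetical, and a `closed_levels` of 0 stands in for the first option.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    expanded: bool = False  # True if the node is open in the tree
    children: List["Node"] = field(default_factory=list)

def find(node, target, closed_levels=0, path=""):
    """Return the paths of nodes named `target`.

    closed_levels=0 looks only at children of open nodes (option 1).
    closed_levels=N also looks up to N levels inside closed nodes (option 2);
    the dialog described above defaults to 4 and allows at most 9.
    """
    here = f"{path}/{node.name}"
    hits = [here] if node.name == target else []
    if node.expanded:
        budget = closed_levels      # children are visible, nothing is spent
    elif closed_levels > 0:
        budget = closed_levels - 1  # spend one level to look inside
    else:
        return hits                 # closed node and no levels left
    for child in node.children:
        hits += find(child, target, budget, here)
    return hits

# Hypothetical schema tree: employee and address are collapsed.
tree = Node("departments", True, [
    Node("department", True, [
        Node("name"),
        Node("employee", False, [
            Node("name"),
            Node("address", False, [Node("city")]),
        ]),
    ]),
])

print(find(tree, "city"))                   # option 1
print(find(tree, "city", closed_levels=2))  # option 2, two levels deep
```

With the tree above, `city` is hidden two closed levels down, so the first call finds nothing and the second finds it.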

The table has 4 columns named Source Node, Include in View, Include All Descendants and Chunk.

1. The Source Node column shows the structure of the actual schema on which you want to create your view.

2. The Include in View column contains check boxes next to all the nodes. Select the check boxes for the nodes that you want to include in your view. When a list or a group is included in your view, check boxes also appear in the “Include All Descendants” and “Chunk” columns. When you select any node, all the mandatory nodes are selected automatically, so you do not need to worry about the validity of the resulting view.

3. The Include All Descendants check box allows you to include all the descendant nodes nested within a list or group in your view.

4. The Chunk check box allows you to chunk a node selected in the view. When a node is chunked, the descendants of the node are not included in the view. During data processing, the data described by the node and its descendants is passed as a simple text string. This feature is useful for defining a view that combines several document parts into a document: in the view, you chunk the nodes that represent the document parts.
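The effect of chunking can be sketched in a few lines of Python. This is an illustration of the idea only, assuming a made-up employee document and a hypothetical `parse_with_chunks` helper; it does not reflect how the stage passes chunks internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; "address" plays the role of the chunked node.
SAMPLE = (
    "<employee><name>Asha</name>"
    "<address><city>Pune</city><zip>411001</zip></address></employee>"
)

def parse_with_chunks(element, chunked_tags):
    """Return child values as a dict; chunked children stay as raw XML text.

    Repeated child tags would overwrite each other in this simple sketch.
    """
    row = {}
    for child in element:
        if child.tag in chunked_tags:
            # Chunked: do not descend, hand the whole subtree over as a string.
            row[child.tag] = ET.tostring(child, encoding="unicode")
        elif len(child):
            row[child.tag] = parse_with_chunks(child, chunked_tags)
        else:
            row[child.tag] = child.text
    return row

row = parse_with_chunks(ET.fromstring(SAMPLE), {"address"})
print(row)
```

The `address` subtree arrives as one text value instead of separate `city` and `zip` fields, so a later step can drop it into a larger document unchanged.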

To create a view, select the elements and attributes you need and click OK to save the view. In the figure below, a view is created for employee. You can see that only a few elements are selected.
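The automatic selection of mandatory nodes in the Include in View column can be pictured as follows. This sketch rests on an assumption, since the exact rule the stage applies is not spelled out here: selecting a node also pulls in its ancestors and every mandatory descendant of each included node. The `SCHEMA` table and both functions are hypothetical.

```python
# Hypothetical schema: path -> is the node mandatory within its parent?
SCHEMA = {
    "departments": True,
    "departments/department": True,
    "departments/department/name": True,
    "departments/department/budget": False,
    "departments/department/employee": False,
    "departments/department/employee/id": True,
    "departments/department/employee/name": False,
    "departments/department/employee/address": False,
    "departments/department/employee/address/city": True,
}

def children(path):
    """Direct children of `path` in the schema table."""
    prefix = path + "/"
    return [p for p in SCHEMA
            if p.startswith(prefix) and "/" not in p[len(prefix):]]

def add_with_mandatory(path, view):
    """Add a node and, recursively, its mandatory children."""
    if path in view:
        return
    view.add(path)
    for child in children(path):
        if SCHEMA[child]:
            add_with_mandatory(child, view)

def select(path, view):
    """Tick one node: include its ancestors and whatever keeps the view valid."""
    parts = path.split("/")
    for i in range(1, len(parts) + 1):
        add_with_mandatory("/".join(parts[:i]), view)
    return view

view = select("departments/department/employee/name", set())
for path in sorted(view):
    print(path)
```

Ticking only the employee name brings along the department name and the employee id because they are marked mandatory, while the optional budget and address stay out of the view.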

The view appears under the global element for which it was created. In the figure below, the view created for employee appears under the departments node.

The view can then be used in the Parser and Composer steps to parse and compose the required documents, which makes life a lot easier.


If the schema is large, a view can be created and used in the Parser and Composer steps.


"The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions."

1 comment:

  1. My name is Dinakaran Subburaj. Creating a view of a large XML schema was my idea. I worked at DTCC from Mar-2011 to Jun-2013, and we had issues processing a large schema. So it was my recommendation to IBM to create a sub-schema that has only the elements that are to be processed. I am happy to see that my idea has been implemented.