Tips for choosing the right schema and input XML file to parse
with the XML Parser step
Overview:
This post offers a few tips on choosing a valid schema, a matching
input XML file, and the node to use as the document root in the XML Parser step.
The XML stage is a schema-driven stage. To parse or compose XML
data, you must first import the schema into the Schema Library Manager. Before
importing the schema into the Schema Library Manager, check the following
points.
- Make sure the namespaces used in the XML file are the same as the ones declared in the XSD file.
Suppose the XSD declares the following namespace:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://ibm.com/infosphere/xml/Employee_2013" xmlns:tns="http://ibm.com/infosphere/xml/Employee_2013" elementFormDefault="unqualified">
The XML file should use the same namespace.
Incorrect usage:
<?xml version="1.0" encoding="UTF-8"?>
<tns:employees xmlns:tns="http://ibm.com/infosphere/xml/Employee_2014">
Here the namespace ends with 2014, whereas the namespace in the schema ends with 2013, so the two do not match.
Correct usage:
<?xml version="1.0" encoding="UTF-8"?>
<tns:employees xmlns:tns="http://ibm.com/infosphere/xml/Employee_2013">
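The comparison above can be sketched as a quick pre-check before importing anything. This is only a minimal Python sketch using the standard library; the helper names (target_namespace, root_namespace, namespaces_match) are my own for illustration and are not part of the XML stage. It only compares the namespace of the root element with the targetNamespace of the schema, and the XML declaration is left out of the sample strings for brevity.

```python
import xml.etree.ElementTree as ET

# Sample schema header and documents taken from the example above.
XSD_TEXT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://ibm.com/infosphere/xml/Employee_2013"
    xmlns:tns="http://ibm.com/infosphere/xml/Employee_2013"
    elementFormDefault="unqualified"/>"""

XML_WRONG = '<tns:employees xmlns:tns="http://ibm.com/infosphere/xml/Employee_2014"/>'
XML_RIGHT = '<tns:employees xmlns:tns="http://ibm.com/infosphere/xml/Employee_2013"/>'


def target_namespace(xsd_text):
    """Return the targetNamespace attribute of the schema element ('' if absent)."""
    return ET.fromstring(xsd_text).get("targetNamespace", "")


def root_namespace(xml_text):
    """Return the namespace URI of the document's root element ('' if none)."""
    tag = ET.fromstring(xml_text).tag  # ElementTree stores tags as '{uri}local'
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def namespaces_match(xsd_text, xml_text):
    """True when the root element lives in the schema's target namespace."""
    return target_namespace(xsd_text) == root_namespace(xml_text)


if __name__ == "__main__":
    print("2014 file matches:", namespaces_match(XSD_TEXT, XML_WRONG))
    print("2013 file matches:", namespaces_match(XSD_TEXT, XML_RIGHT))
```

A mismatch here is a hint that the wrong schema version or the wrong input file was picked.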
- The XML file should declare every namespace that applies to the XML elements or nodes used in the file, as defined in the XSD.
Suppose the XSD file contains the following lines:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns="http://www.ibm.com/Schema.xsd" xmlns:tns="http://www.ibm.com/Schema.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ibm.com/Schema.xsd" elementFormDefault="qualified">
<xs:element name="employee" type="tns:EMP_Name" />
Incorrect usage:
<employee>
<firstName>Cynthia</firstName>
<middleName>P</middleName>
<lastName>Donald</lastName>
</employee>
Here the XML file does not declare the namespace, so its elements do not belong to the target namespace of the schema.
Correct usage:
<employee
xmlns="http://www.ibm.com/Schema.xsd"
xmlns:tns="http://www.ibm.com/Schema.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema" >
<firstName>Cynthia</firstName>
<middleName>P</middleName>
<lastName>Donald</lastName>
</employee>
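To spot a missing namespace declaration before running the job, the check above can be sketched in Python. This is a minimal sketch under the assumption that the schema uses elementFormDefault="qualified", as in the example, so every element is expected to be in the target namespace. The function name elements_outside_namespace is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Target namespace from the example schema above.
TARGET_NS = "http://www.ibm.com/Schema.xsd"

XML_WITHOUT_NS = (
    "<employee>"
    "<firstName>Cynthia</firstName>"
    "<middleName>P</middleName>"
    "<lastName>Donald</lastName>"
    "</employee>"
)

XML_WITH_NS = (
    '<employee xmlns="http://www.ibm.com/Schema.xsd">'
    "<firstName>Cynthia</firstName>"
    "<middleName>P</middleName>"
    "<lastName>Donald</lastName>"
    "</employee>"
)


def elements_outside_namespace(xml_text, expected_ns):
    """List the tags of all elements that are not in expected_ns.

    Only meaningful when the schema sets elementFormDefault="qualified",
    because then local elements must also carry the target namespace.
    """
    prefix = "{" + expected_ns + "}"
    root = ET.fromstring(xml_text)
    # root.iter() walks the root and all descendants in document order.
    return [el.tag for el in root.iter() if not el.tag.startswith(prefix)]


if __name__ == "__main__":
    print(elements_outside_namespace(XML_WITHOUT_NS, TARGET_NS))
    print(elements_outside_namespace(XML_WITH_NS, TARGET_NS))
```

An empty list means every element picked up the expected namespace, here through the default xmlns declaration on the root.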
- When the schema is imported into the Schema Library Manager, all the global elements in the schema file appear as root nodes. Figure 1 below shows the schema in the Schema Library Manager.
In the XML Parser step, you need to
choose a global element of the schema as the document root to parse the input XML file.
Choose the global element that matches the root node of the input file.
Suppose the root node in the input file is
departments. You then need the global element whose root is departments
and which contains employees as a child.
You need to choose the node
departments as document root in the XML Parser step.
If the input XML file instead starts with the
employee node, then employee should be chosen as the document root.
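The selection rule can be sketched outside the tool as well. The Python below is a rough illustration, not how the XML Parser step works internally: it lists the global elements of a schema (the direct children of the schema element) and returns the one that matches the root of the input file. The sample schema and the names global_elements and pick_document_root are hypothetical, made up to mirror the departments and employee case described above.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema with two global elements: departments and employee.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="departments">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="employees" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="employee" type="xs:string"/>
</xs:schema>"""


def global_elements(xsd_text):
    """Names of the global elements, i.e. xs:element children of xs:schema."""
    schema = ET.fromstring(xsd_text)
    # findall with a plain tag only matches direct children, so the nested
    # 'employees' declaration is not reported as a global element.
    return [el.get("name") for el in schema.findall("{%s}element" % XS)]


def pick_document_root(xsd_text, xml_text):
    """Return the global element matching the input file's root, else None."""
    root_tag = ET.fromstring(xml_text).tag
    local_name = root_tag.split("}", 1)[-1]  # drop a '{uri}' prefix if present
    return local_name if local_name in global_elements(xsd_text) else None


if __name__ == "__main__":
    print(global_elements(SAMPLE_XSD))
    print(pick_document_root(SAMPLE_XSD, "<departments><employees>x</employees></departments>"))
    print(pick_document_root(SAMPLE_XSD, "<employee>x</employee>"))
```

If the function returns None, the input file's root is not a global element of that schema, which usually means a different schema or a different input file is needed.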
Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”