Drive a way to JSON Parsing and Composing
Overview:
JSON (JavaScript Object Notation) is a lightweight data-interchange format with a simple syntax, and many applications store their data in it. The XML stage in the Information Server DataStage product can parse and compose JSON data. The XML stage is schema driven, so a schema is required to parse or compose JSON data. Importing a JSON file into the Schema Library Manager generates a schema that can then be used in the JSON Parser and JSON Composer steps.
Import Schema to Schema Library Manager
The JSON file that you want to parse can be used to create a schema in the Schema Library Manager. Open the Schema Library Manager from the DataStage Designer client with the Import->Schema Library Manager menu option. Create a library and import the JSON file; the import generates the schema.
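To make the idea of deriving a schema from a sample JSON file more concrete, here is a minimal Python sketch. It is not how the Schema Library Manager works internally; it only walks a sample document and records the type of each element. The sample document, its field names, and the `infer_schema` helper are all assumptions made for illustration.

```python
import json

def infer_schema(value):
    """Return a nested description of the JSON types found in `value`."""
    if isinstance(value, dict):
        return {"type": "object",
                "properties": {k: infer_schema(v) for k, v in value.items()}}
    if isinstance(value, list):
        # Assumption: arrays are homogeneous, so the first element describes them.
        return {"type": "array",
                "items": infer_schema(value[0]) if value else {}}
    if isinstance(value, bool):  # checked before numbers, since bool is an int
        return {"type": "boolean"}
    if isinstance(value, (int, float)):
        return {"type": "number"}
    if value is None:
        return {"type": "null"}
    return {"type": "string"}

# Hypothetical sample document, not taken from the original post.
sample = json.loads(
    '{"contacts": [{"firstName": "Ann", "lastName": "Lee", '
    '"phoneNumbers": [{"type": "home", "number": "555-0100"}]}]}'
)
schema = infer_schema(sample)
print(json.dumps(schema, indent=2))
```

The printed structure shows the same hierarchy that the parser and composer steps rely on: a root object, a list of contacts, and a nested list of phone numbers under each contact.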
Design to parse the JSON file
Figure1 illustrates the design of a job that reads contact information stored in JSON format. The JSON Parser step in the XML stage reads the file directly and parses it, and the parsed records are written as relational data to two flat files, one holding contacts and the other holding phone numbers.
Figure1: Design to parse the JSON file
Figure2 displays the assembly editor of the XML stage, which is opened by clicking Edit assembly in the stage editor properties. In the assembly editor, add the JSON Parser step from the palette to the assembly outline. On the JSON source tab of the JSON Parser step, specify where the input to be parsed is read from. Here, choose the Single file option, because the input is a single JSON file, and specify the path of the file.
Figure2: Assembly editor design
On the Document root tab, choose the schema that was imported at the beginning. Make sure that the right schema is chosen as the document root for the input file to be parsed. On the Output tab, map the parsed data to the columns of the sequential files.
Figure3 displays the mapping of parsed data to the sequential file columns.
Figure3: Mapping of source and target elements
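For readers who want to see what this parse job does to the data, the following Python sketch performs the same kind of flattening outside DataStage. It is only an illustration under stated assumptions: the input document, the field names, and the `to_csv` helper are hypothetical, and CSV text stands in for the two sequential files.

```python
import csv
import io
import json

# Hypothetical input; the original post does not show the JSON file.
doc = json.loads("""
{"contacts": [
  {"firstName": "Ann", "lastName": "Lee",
   "phoneNumbers": [{"type": "home", "number": "555-0100"},
                    {"type": "work", "number": "555-0101"}]},
  {"firstName": "Bob", "lastName": "Ray",
   "phoneNumbers": [{"type": "mobile", "number": "555-0199"}]}
]}
""")

contact_rows, phone_rows = [], []
for contact in doc["contacts"]:
    key = (contact["firstName"], contact["lastName"])
    contact_rows.append(key)
    for phone in contact.get("phoneNumbers", []):
        # Repeat the parent key on each child row so the files can be rejoined.
        phone_rows.append(key + (phone["type"], phone["number"]))

def to_csv(header, rows):
    """Render rows as CSV text, standing in for a sequential file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

contacts_csv = to_csv(["firstName", "lastName"], contact_rows)
phones_csv = to_csv(["firstName", "lastName", "type", "number"], phone_rows)
print(contacts_csv)
print(phones_csv)
```

Carrying the contact's name columns into every phone number row is what lets the two flat outputs be related to each other again later.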
Design to Compose the Relational Data to a JSON file
Figure4 illustrates the design of a job that reads contact and phone number information from sequential files and composes it into a JSON file using the JSON Composer step in the XML stage.
Figure4: Design to Compose a JSON file
The XML stage in the job reads data from two sources, the sequential file stages Contacts and PhoneNumbers. The records arrive in two separate files, but the schema represents them hierarchically, so the data must be joined using the HJoin step in the assembly editor.
Add the HJoin step to the assembly outline. It requires a parent list and a child list. In our example, the contacts list is the parent and the phoneNumbers list is the child. If the two lists need to be joined on a key, the key must be specified. In our example, the lists are joined on the firstName and LastName columns. Figure5 illustrates the design of the HJoin step.
Figure5: Design of HJoin Step
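The effect of a parent and child join on a key can be sketched in a few lines of Python. This is not the HJoin implementation, only an illustration of the idea: the rows, the field names, and the `hierarchical_join` helper are hypothetical, and the sketch simply ignores child rows that match no parent, which is an assumption of this example.

```python
from collections import defaultdict

# Hypothetical rows, as they might arrive from the two sequential file stages.
contacts = [
    {"firstName": "Ann", "lastName": "Lee"},
    {"firstName": "Bob", "lastName": "Ray"},
]
phone_numbers = [
    {"firstName": "Ann", "lastName": "Lee", "type": "home", "number": "555-0100"},
    {"firstName": "Ann", "lastName": "Lee", "type": "work", "number": "555-0101"},
    {"firstName": "Bob", "lastName": "Ray", "type": "mobile", "number": "555-0199"},
]

def hierarchical_join(parents, children, keys, child_name):
    """Nest each child row under the parent row that shares its key values."""
    by_key = defaultdict(list)
    for child in children:
        key = tuple(child[k] for k in keys)
        # Drop the key columns from the child; the parent already carries them.
        by_key[key].append({k: v for k, v in child.items() if k not in keys})
    joined = []
    for parent in parents:
        key = tuple(parent[k] for k in keys)
        joined.append({**parent, child_name: by_key.get(key, [])})
    return joined

nested = hierarchical_join(contacts, phone_numbers,
                           keys=("firstName", "lastName"),
                           child_name="phoneNumbers")
print(nested)
```

Each contact now carries its own list of phone numbers, which is the hierarchical shape that the schema expects before the data is composed.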
The data is composed into a JSON file using the JSON Composer step. Add the JSON Composer step to the assembly outline after the HJoin step. Select the Write to file option and specify the directory where the JSON file is to be created. On the Document root tab, choose the schema on which the file is to be based. Here we choose the schema imported at the beginning. The chosen schema then appears as the target on the Mappings tab, where the input elements at the source must be mapped to the schema elements at the target so that the data can be composed into the file. Figure6 illustrates the mappings of source and target elements.
Figure6: Mapping of source and target elements.
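As a rough analogy for the compose step, the Python sketch below writes already nested records to a JSON file in a chosen directory. It is an illustration only: the records, the root element name `contacts`, the file name `contacts.json`, and the `compose_json` helper are assumptions, and a temporary directory stands in for the output directory.

```python
import json
import os
import tempfile

# Hypothetical nested records, shaped like the output of a parent/child join.
contacts = [
    {"firstName": "Ann", "lastName": "Lee",
     "phoneNumbers": [{"type": "home", "number": "555-0100"}]},
    {"firstName": "Bob", "lastName": "Ray", "phoneNumbers": []},
]

def compose_json(records, directory, filename="contacts.json"):
    """Wrap the records in a root object and write them as one JSON file."""
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"contacts": records}, handle, indent=2)
    return path

out_dir = tempfile.mkdtemp()  # stand-in for the output directory
out_path = compose_json(contacts, out_dir)

# Read the file back to confirm that it round-trips.
with open(out_path, encoding="utf-8") as handle:
    round_trip = json.load(handle)
print(out_path)
```

Reading the file back and comparing it with the input is a simple way to check that the composed document has the intended structure.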
Conclusion:
Using the JSON Parser and JSON Composer steps, relational data can be transformed into the hierarchical JSON format and vice versa.
Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”