Configure the Unstructured Data stage to extract the data
from the multiple Microsoft Excel sheets.
Procedure
- Double-click the Unstructured Data stage to open the stage
properties.
- Click Configure.
Note: Do
not configure any stage properties in this step because you can configure
all the required configurations in the Configuration window.
- In the Configuration window, specify the full file path
of the Microsoft Excel input file Employee2.xls.
- From the Range option, select Specify
the start row.
- In the Range expression field, specify A3:H3 .
When the stage runs with Specify the first row option and no specific
sheet name is specified in the Range expression,
the job finds the last row dynamically and extracts rows to the last
row at runtime.
- In Column header, select First
row of data ranges.
- Click Load. The Excel columns that
exist in the specified data range are listed in the Import pane.
- On the Property tab, select the
checkbox next to the property, to extract the property value. In this
example, select the Sheetname as the property.
- Click Import. The column mappings
are generated by the stage.
- To make the SheetNname column the first column in the list:
- Select the SheetName column.
- Click Up until the SheetName
column is the first column in the list.
- In the mapping table, insert a row for ADDRESS column in
the input file that has hyperlink.
- Click Insert.
- In the Excel item option, select Column
ADDRESS.
- In the Import option cell in
the new row, select Hyper link address.
- Specify the DataStage column name EMAIL_ADDRESS for
the new row.
- Click OK.
- Confirm that the values that you entered on the Configuration
window are saved on the Property tab of the
stage editor.
- Click tab to change the data type or other attributes. Change the type of EMP_NO column to Integer.
- Click OK.